Skyline Technology Solutions · 1 month ago

Problem Manager

Skyline Technology Solutions is seeking a Problem Manager to lead their IT Operations Problem Management function. This strategic role involves architecting and implementing problem management processes, leading investigations of major incidents, and driving continuous improvement across the organization.

Consulting, Cyber Security, Information Technology, Security, Video

Responsibilities

Architect and implement end-to-end problem management processes aligned with ITIL 4 best practices and integrated with existing ITSM workflows
Lead problem management from identification and logging through root cause analysis, resolution implementation, and closure
Personally lead investigations of major incidents, chronic issues, and high-impact problems requiring deep technical analysis
Lead blameless After Incident Reviews that extract maximum learning and drive actionable improvements
Apply structured methodologies (5 Whys, Fishbone, Kepner-Tregoe, Fault Tree Analysis) to identify true root causes versus symptoms
Own creation of internal and customer-facing After Incident Summary reports within customers' contractual Service Level Agreements (SLAs)
Establish and maintain the known error database with comprehensive documentation of problems, workarounds, and permanent solutions
Work with teams to analyze incident trends, monitoring alerts, and system telemetry to identify emerging problems before they cause a major impact
Ensure problems are resolved with permanent fixes, not workarounds; verify effectiveness and prevent recurrence
Establish problem prioritization criteria, SLAs, escalation paths, and review boards
Understand and map complex interdependencies across Skyline's infrastructure, applications, data flows, and integration points spanning all divisions
Analyze system architectures, code, configurations, logs, and performance metrics
Leverage Linux knowledge to direct teams investigating kernel issues, performance bottlenecks, resource contention, and system-level failures
Lead troubleshooting reviews of networking issues across enterprise wired/wireless networks and service provider connections; analyze routing, switching, firewall, and load balancing configurations
Identify single points of failure, cascading failure risks, and resilience gaps
Decompose systemic issues into discrete technical tasks and effectively assign work to specialized technical resources
Evaluate proposed fixes for completeness, sustainability, and potential unintended consequences
Participate in the Change Management process and assess how changes, upgrades, and architectural decisions affect system stability and problem recurrence
Develop comprehensive dashboards and reports showing problem trends, team performance, business impact, and ROI
Provide regular updates to leadership on problem management effectiveness, achievements, and strategic recommendations
Continuously improve problem management processes, tools, and procedures based on lessons learned and industry best practices
Mentor incident managers and technical teams on effective problem identification, escalation, and initial triage
Develop training materials, playbooks, and workshops to elevate organizational problem-solving capabilities

Qualifications

ITIL 4, Linux expertise, Problem management, DevOps practices, ServiceNow, Automation scripting, Network troubleshooting, Infrastructure design, Analytical skills, Attention to detail

Required

5+ years of hands-on experience in a problem management or leadership role, with demonstrated success improving reliability
3+ years with DevOps practices, CI/CD pipelines, infrastructure as code (Terraform, Ansible), and configuration management
3+ years using tools like Splunk, LogicMonitor, Prometheus, Grafana, or similar platforms for diagnostics and problem identification
3+ years working with ServiceNow or similar ITSM platforms for problem, incident, and change management
3+ years of deep experience with Linux system architecture, kernel operations, performance tuning, and troubleshooting in enterprise production environments
Experience in designing and managing complex infrastructure environments, including servers, storage, virtualization, and cloud platforms (AWS, Azure, GCP)
Experience in enterprise networking and connectivity solutions, encompassing wired/wireless technologies, WAN architectures, and advanced protocols (MPLS, BGP, SD-WAN)
Experience in security and monitoring, with hands-on experience in network security technologies, physical security systems, and performance diagnostics
Adept at automation and scripting (Python, Bash, PowerShell) to streamline operations and support modern containerized and microservices-based architectures
Ability to analyze complex, ambiguous situations and extract meaningful patterns and insights
Ability to challenge assumptions, ask probing questions, and distinguish correlation from causation
Capability to understand and troubleshoot issues spanning multiple interconnected systems and technology domains
Strong attention to detail in documenting investigations, findings, and solutions for future reference

Preferred

ITIL 4 Managing Professional (MP) or ITIL 4 Strategic Leader (SL) certification
Linux certifications (RHCE, LFCS, LFCE)
Network certifications (CCNA, CCNP, CWNA, or equivalent)
Cloud platform certifications (AWS Certified Solutions Architect, Azure Administrator, GCP Professional)
DevOps or SRE certifications (Certified Kubernetes Administrator, DevOps Institute)
Six Sigma or other continuous improvement methodologies

Benefits

Medical Insurance
Vision Insurance
Dental Insurance
FSA Plan
Paid Time Off
401K Retirement Savings Plan
Training & Tuition Assistance
Disability & Life Insurance

Company

Skyline Technology Solutions

Skyline Technology Solutions is a technology consulting firm focusing on IT services, video sharing, and cybersecurity.

Funding

Current Stage
Growth Stage

Leadership Team

Mia Millette
Chief Executive Officer
Paul Lennon
Chief Technology Officer
Company data provided by Crunchbase