This job has closed.

Wellmark Blue Cross and Blue Shield · 4 weeks ago

Lead Software Engineer - Observability

Wellmark Blue Cross and Blue Shield is a mutual insurance company focused on providing best-in-class service and innovative solutions. The Lead Software Engineer - Observability will design, build, and maintain observability platform tools and frameworks to enhance system performance, availability, and reliability while collaborating with various teams to ensure a seamless observability ecosystem.

Financial Services · Health Insurance · Insurance · Personal Health
No H1B

Responsibilities

Design, build, and maintain observability platforms with reusability across services in mind
Develop scalable, automated pipelines for ingesting, transforming, and visualizing telemetry data
Integrate observability tools (e.g., Dynatrace, Splunk, Prometheus, Grafana, Datadog, New Relic, OpenTelemetry) with existing infrastructure and applications
Enable root cause analysis through correlation of metrics, logs, and traces
Analyze telemetry data to identify performance bottlenecks and optimize resource allocation for improved efficiency
Define SLIs, SLOs, and error budgets with stakeholders for critical services
Improve incident response by enhancing monitoring dashboards, alerts, and automated notifications

Qualifications

Observability platforms · Site Reliability Engineering · DevOps practices · Cloud platforms · Programming languages · Agile methodologies · Incident management · Containerization · Coaching · Communication skills · Problem-solving skills · Mentoring · Team collaboration

Required

Bachelor's degree in Computer Science, MIS, or related field of study and at least 5 years of development experience (e.g., Angular, NodeJS, TypeScript, C++, .NET, Java, SQL) OR 9 years of related and applicable experience
Strong analytical problem-solving skills. Accuracy and high attention to detail. Previous experience troubleshooting and developing creative technical solutions. Ability to provide innovative solutions to complex issues
Demonstrated experience in software development lifecycle methodologies
Demonstrated ability to communicate with and coach/mentor team members, while setting an example by maintaining a positive attitude, staying calm under pressure, being approachable and respectful, and taking responsibility for failures
Big picture thinker with the ability to translate the value of the Wellmark as a Service (WaaS) strategy to company strategy when making design and development decisions
Demonstrated strong ability to gather information and perform the research needed for root cause analysis and problem definition and formulation, and to recommend solution implementation, verification, and ongoing optimization, using data to support recommendations
Demonstrated ability to build relationships to reach outcomes that gain the support and acceptance of all parties. Ability to communicate key information in a timely manner to the appropriate stakeholder audience, adjusting communication style to best suit the audience
Ability to thrive in a fast-paced environment with changing priorities. Excellent organizational skills. Strong time management skills with the ability to set and meet established timeframes with little direction, while assuring data and information integrity
Eagerness to learn and stay current on industry trends, with a continuous learning mindset
Ability to collaborate and work as a team to accomplish goals and/or solve problems. Ability to earn trust and respect from peers, leadership, and stakeholders. Ability to learn by actively listening and applying coaching feedback
Ability to lead, support, and work within a diverse development team model including global staffing, crowd sourcing, etc.

Preferred

3–5 years of experience in Site Reliability Engineering, DevOps, or Observability/Monitoring engineering roles
Proven experience building or administering observability platforms in production environments
Track record of improving system reliability and reducing mean time to resolution (MTTR)
Hands-on experience with one or more observability platforms: Dynatrace, Prometheus, Grafana, OpenTelemetry, Elastic Stack, Splunk, Datadog, New Relic, AppDynamics, Honeycomb
Strong knowledge of observability concepts: metrics, logs, traces, SLOs/SLIs, error budgets
Experience working within an Agile team environment
Experience deploying and maintaining OpenTelemetry-based observability pipelines
Prior experience working in highly regulated environments with compliance observability needs
Contributions to observability open-source projects
Familiarity with chaos engineering practices to validate monitoring and resilience
Certifications from AWS, Microsoft Azure, or Google Cloud
Demonstrated experience coaching/mentoring others by providing guidance and feedback to help an employee or groups of employees strengthen their knowledge and skills to accomplish a task or solve a problem
Excellent problem-solving skills with a strong analytical mindset
Strong written and verbal communication skills, including the ability to explain complex technical topics to both engineers and business stakeholders
Proven experience with designing technical architecture and keeping abreast of existing and emerging technologies
Experience consulting with stakeholders to understand their needs and provide advice and counsel, and interacting appropriately with others to guide individuals or groups to accomplish work, reach consensus, or take action
Proficiency in programming or scripting languages (Python, Go, Java, Bash, etc.) for observability automation
Experience with containerization and orchestration platforms (Docker, Kubernetes)
Deep knowledge of cloud platforms (AWS, Azure, GCP), observability/monitoring services, operating systems (Windows/Linux), networking, and containerization
Strong understanding of distributed systems, microservices, and cloud-native architectures
Proficiency in CI/CD pipelines and how observability integrates into DevOps workflows
Knowledge of incident management and on-call practices
Experience with supporting observability and monitoring for Artificial Intelligence agents

Company

Wellmark Blue Cross and Blue Shield

Wellmark Blue Cross and Blue Shield and its subsidiaries provide health coverage to more than 2 million members in Iowa and South Dakota.

Funding

Current Stage
Late Stage

Leadership Team

John Forsyth
Chief Executive Officer
Andrew Neller
Deputy Chief Information Security Officer
Company data provided by Crunchbase