PDS · 7 hours ago

Site Reliability Engineer

Fort Mill, SC

Contract

Hybrid

Senior Level

5+ years exp

PDS is seeking an experienced Site Reliability Engineer with strong observability expertise to enhance transaction traceability, performance, and resiliency across a complex enterprise environment. The role focuses on building visibility into critical transaction flows and collaborating with cross-functional teams to implement observability frameworks and optimize system performance.

Computer · Information Technology · Software · Staffing Agency

Responsibilities

Design and implement observability frameworks for full transaction traceability across microservices, APIs, databases, and third-party integrations

Utilize tools such as Dynatrace, OpenTelemetry, ELK, and Grafana to visualize dependencies and build actionable dashboards, alerts, and real‑time performance insights

Monitor latency, throughput, and failures to identify bottlenecks

Use telemetry and distributed tracing to troubleshoot and optimize transaction performance

Partner with application and database teams to improve system efficiency

Work with architects, engineering teams, and stakeholders to define observability standards and resiliency requirements

Establish monitoring best practices and provide training across teams

Identify and prioritize business‑critical transaction paths

Implement redundancy, failover strategies, and fault‑tolerant architectures

Support chaos engineering initiatives and resiliency testing

Define and measure SLOs and SLIs for critical transaction paths

Maintain detailed documentation of transaction flows and monitoring configurations

Produce regular reporting on system performance, resiliency, and improvement initiatives

Create incident playbooks and reusable observability frameworks

Achieve a 30% reduction in MTTD and MTTR within the first year

Identify the offending service/root cause for at least 70% of incidents within one hour

Detect 90% of issues through automated monitoring

Contribute to a culture of continuous improvement and knowledge sharing

Qualifications

Dynatrace · AWS · Observability frameworks · Microservices · Scripting languages · Chaos engineering · Collaboration · Documentation

Required

5+ years in SRE, Observability, or related engineering roles

Hands-on experience with Dynatrace, ELK, Datadog, Splunk, OpenTelemetry, Jaeger, or similar tools

Strong background with AWS, Azure, or GCP

Solid understanding of microservices, APIs, and distributed systems

Proficiency with scripting or programming languages (Python, Go, Java)

Preferred

Dynatrace Associate or Professional Certification

Experience with OpenTelemetry and observability standards

Familiarity with chaos engineering practices

Experience with AIOps and automation-driven monitoring

Company

PDS

PDS is one of the leading Aerospace, Information Technology (IT) & Engineering consulting firms in the Western United States.

Founded in 1987

Arvada, Colorado, USA

201-500 employees

http://pdsinc.com

Funding

Current Stage

Growth Stage

Leadership Team

Thomas Sweetman

President & Chief Executive Officer

Company data provided by Crunchbase