TEC Group, Inc. · 21 hours ago
Senior Observability Architect (Datadog)
TEC Group, Inc. is seeking a Senior Observability Architect specializing in Datadog to design and implement end-to-end observability architecture across various platforms. The role involves defining monitoring standards, deploying Datadog solutions, and optimizing observability practices to enhance system reliability and performance.
Staffing & Recruiting
Responsibilities
Design end-to-end observability architecture using Datadog across Azure cloud, containers, Kubernetes, and on-prem workloads
Define monitoring standards, SLIs/SLOs, dashboards, alerting strategy, and tagging governance
Design and architect an end-to-end solution to integrate mainframe platforms
Architect log ingestion pipelines, retention policies, and cost-optimized indexing strategies
Build scalable APM instrumentation patterns for microservices, serverless, and distributed environments
Deploy Datadog agents, integrations, and custom checks across large-scale infrastructure
Configure APM, RUM, Logs, SIEM, Synthetics, Network Performance Monitoring, and CI/CD Observability
Work closely with DevOps, SRE, Cloud, and Application teams to instrument services and ensure visibility
Analyze and optimize Datadog costs: usage, retention settings, indexing, and billing insights
Establish organization-wide tagging standards, dashboards, alerting guardrails, and onboarding processes
Create reusable templates, Terraform modules, and automation scripts for Datadog deployment
Ensure compliance with security and observability best practices
Mentor teams on Datadog usage and train engineers on dashboards, logs, traces, and alerts
Lead RCA investigations using Datadog metrics, traces, logs, and correlated events
Collaborate with engineering teams to improve system reliability, resilience, and performance
Identify gaps in observability and propose improvements across the stack
Qualifications
Required
6 years of experience in Observability, Monitoring, SRE, DevOps, or Cloud Engineering
3+ years of hands-on experience with Datadog
Strong understanding of distributed systems, microservices, and cloud-native architectures
Expertise with Kubernetes, Docker, and AWS/Azure/GCP cloud services
Experience with Infrastructure as Code (Terraform preferred)
Strong knowledge of APM, Metrics, Logs, RUM, Synthetics, and Security Monitoring
Deep experience with Datadog dashboards, alerting, monitors, service maps, event correlation, and notebooks
Proficiency with Python, Bash, or similar scripting languages
Strong analytical, communication, and problem-solving skills
Preferred
Datadog Certifications (Datadog Fundamentals, APM, Log Management, or Observability)
Experience with observability tools in the retail industry
CI/CD observability experience (GitHub Actions, Jenkins, GitLab CI, etc.)
Background in Performance Engineering, Reliability Engineering, or Platform Engineering
Company
TEC Group, Inc.
TEC Group recruits and employs quality talent for companies from diverse industries across the U.S.
H1B Sponsorship
TEC Group, Inc. has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2025 (5)
2024 (10)
2023 (3)
2022 (7)
2021 (18)
2020 (15)
Funding
Current Stage
Late Stage
Company data provided by Crunchbase