Elluminates Software · 4 months ago
Senior Monitoring Engineer
Elluminates Software provides innovation for Federal customers, including AI-driven SaaS, Cloud and On-Prem transformation, and advanced Infrastructure Automation. The Senior Monitoring Engineer is a senior-level technical expert responsible for advanced troubleshooting, performance analysis, and optimization of enterprise monitoring platforms.
Information Services, Information Technology, Software
Responsibilities
Serve as the Tier 3 escalation point for issues related to any of the monitoring/observability platforms and tools
Lead root cause analysis (RCA) for major incidents and recurring performance issues
Maintain, configure, and optimize monitoring tool deployments across cloud (e.g., AWS, Azure), on-premises, and VMware environments
Design and implement custom dashboards, synthetic monitoring, and service-level objectives (SLOs)
Develop and maintain alerting strategies that reduce noise and ensure actionable notifications
Work closely with application, infrastructure, DevOps, and security teams to define monitoring requirements and integrate observability into CI/CD pipelines
Analyze metrics, logs, and traces to ensure end-to-end service visibility and performance optimization
Assist in onboarding applications and teams into the observability platform
Provide training and mentorship to Tier 1 and Tier 2 support teams
Ensure platform resilience, availability, and compliance with internal standards and SLAs
Participate in on-call rotations for high-priority incidents as needed
Qualifications
Required
5+ years of experience in IT infrastructure, application performance monitoring, or site reliability engineering (SRE)
2+ years of hands-on experience using platforms such as Dynatrace, Zabbix, and monitoring tools in VMware Cloud Foundation (VCF)
Solid understanding of observability concepts including metrics, logs, traces, and user experience monitoring
Experience supporting complex, distributed systems in cloud and hybrid environments
Proficient with scripting and automation (e.g., PowerShell, Python, Bash, or Ansible)
Strong understanding of networking, Linux/Windows systems, containers, and application architectures (microservices, APIs, etc.)
Bachelor's degree and nine (9) or more years of experience; Master's degree and seven (7) or more years of experience; PhD or JD and four (4) or more years of experience. Additional experience is accepted in lieu of a degree
Clearance: Secret (with ability to obtain TS)
Preferred
Dynatrace Associate or Professional Certification
Experience with Dynatrace, including OneAgent deployment, Smartscape, PurePath, and Davis AI
Experience with integration of Dynatrace with tools such as ServiceNow, Splunk, Jira, or CI/CD pipelines
Experience with other observability tools (e.g., Prometheus, Grafana, New Relic, AppDynamics, Splunk, Elastic)
Familiarity with DevOps practices and Infrastructure-as-Code (e.g., Terraform)
Understanding of ITIL framework and change management processes
Excellent troubleshooting and problem-solving skills
Strong written and verbal communication
Ability to work independently and collaboratively across teams
Customer-focused mindset and attention to detail
Continuous learning and adaptability in a fast-paced environment