Matlen Silver · 15 hours ago
Site Reliability Engineer (Application Operations)
Matlen Silver is seeking a skilled and proactive Application Operations Engineer to support and enhance the reliability, performance, and compliance of application-specific services. This role involves managing the full lifecycle of application operations, including deployment, monitoring, incident response, automation, and stakeholder communication in the Treasury application space.
Responsibilities
Own and manage deployment pipelines tailored to individual applications. This includes coordinating release schedules, validating deployment artifacts, and ensuring smooth rollout processes across environments. Collaborate with development and QA teams to ensure deployments meet quality and compliance standards.
Design, implement, and maintain monitoring solutions that provide visibility into application health and performance. Develop dashboards, configure alerts, and integrate telemetry to proactively detect issues. Ensure monitoring coverage aligns with business-critical functions and SLAs.
Establish and operate anomaly detection mechanisms to identify and respond to incidents quickly. Develop triage workflows that streamline root cause analysis and resolution. Work closely with support and engineering teams to minimize downtime and improve incident response times.
Create and maintain automation scripts and tools that enhance operational efficiency and reduce manual effort. Integrate automation into CI/CD pipelines, monitoring systems, and incident response processes. Continuously evaluate and improve tooling to support evolving application needs.
Develop comprehensive runbooks, operational guides, and knowledge base articles for supported applications. Ensure documentation is up to date, accessible, and aligned with best practices. Promote knowledge sharing across teams to improve onboarding and reduce operational risk.
Ensure applications adhere to internal and external regulatory requirements, including security and risk management standards. Implement controls and audit mechanisms to maintain compliance. Collaborate with InfoSec and Risk teams to address vulnerabilities and enforce governance policies.
Maintain regular communication with business and technical stakeholders regarding application performance, incidents, and operational metrics. Prepare and deliver reports that highlight key trends, risks, and improvement opportunities. Foster strong relationships to align operational priorities with business goals.
Manage the lifecycle of application certificates, including issuance, renewal, and monitoring. Ensure certificates are properly configured to support secure communication and meet compliance requirements. Automate certificate processes where possible to reduce risk and overhead.
Apply SRE principles to improve application reliability, scalability, and resilience. Define and monitor Service Level Objectives (SLOs) and error budgets. Implement strategies for fault tolerance, capacity planning, and performance optimization. Collaborate with engineering teams to embed reliability into application design.
Qualifications
Required
Bachelor's degree in Computer Science or a related field is required
A minimum of 5 years of broad, relevant engineering experience in application operations, DevOps, or site reliability engineering is essential
Demonstrated experience with the architecture and design of n-tier systems and microservices
Hands-on experience with monitoring and observability tools such as Splunk, Dynatrace, or similar platforms
Ability to design and implement telemetry solutions that provide actionable insights into application performance and reliability
A strong passion for automation and continuous improvement
Proven ability to develop scripts, tools, and integrations that reduce manual effort and improve operational efficiency
Experience working in agile development environments, with a strong understanding of iterative delivery, sprint planning, and cross-functional collaboration
Excellent written and verbal communication skills
Capable of producing clear documentation, runbooks, and reports for both technical and non-technical audiences
Strong analytical and troubleshooting skills
Experience applying SRE principles such as SLOs, error budgets, and resilience engineering to improve system reliability and performance
Preferred
Familiarity with CI/CD pipelines, cloud-native technologies, and infrastructure-as-code practices is highly desirable
Company
Matlen Silver
Matlen Silver is a staffing agency for IT firms.
H1B Sponsorship
Matlen Silver has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (34)
2024 (10)
2023 (8)
2022 (8)
2021 (17)
2020 (35)
Funding
Current Stage
Late Stage
Company data provided by Crunchbase