Undisclosed · 9 hours ago
DevOps Engineer
Our client is seeking a DevOps Engineer to design, build, and maintain scalable, reliable, and secure infrastructure that supports the development and delivery of modern software products. This role plays a critical part in enabling engineering velocity, system resilience, and operational excellence by bridging development and operations through automation, observability, and best-in-class DevOps practices.
Financial Services
Responsibilities
Design, implement, and maintain cloud-based infrastructure to support scalable, high-availability applications
Build and manage CI/CD pipelines to enable fast, reliable, and repeatable software deployments
Automate infrastructure provisioning and configuration using Infrastructure as Code (IaC) tools
Partner with engineering teams to improve application performance, reliability, and deployment processes
Implement monitoring, logging, and alerting solutions to ensure system observability and rapid incident response
Support production operations, including on-call rotations, incident management, and root cause analysis
Collaborate with Security teams to enforce best practices around access control, secrets management, and compliance
Optimize cloud costs through capacity planning, usage analysis, and infrastructure optimization
Document systems, processes, and operational runbooks to support scalability and knowledge sharing
Continuously evaluate and introduce tools and practices that improve system resilience and developer productivity
Qualifications
Required
Strong experience with cloud platforms such as AWS, GCP, or Azure
Proficiency in Infrastructure as Code tools (e.g., Terraform, CloudFormation, Pulumi)
Hands-on experience building and maintaining CI/CD pipelines (e.g., GitHub Actions, GitLab CI, Jenkins, CircleCI)
Solid understanding of Linux systems, networking, and cloud security fundamentals
Experience with containerization and orchestration technologies (Docker, Kubernetes)
Familiarity with monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, Datadog, ELK)
Strong scripting or programming skills (e.g., Bash, Python, Go)
Ability to troubleshoot complex systems and perform root cause analysis under pressure
Excellent collaboration and communication skills, with a strong sense of ownership
Preferred
4–8+ years of experience in DevOps, Site Reliability Engineering, or Infrastructure roles
Experience supporting SaaS or high-traffic, customer-facing applications
Exposure to security best practices, compliance frameworks, or regulated environments
Experience with service mesh, zero-trust networking, or advanced Kubernetes operations
Prior experience working in distributed or fully remote engineering teams