Lumin Digital · 7 hours ago
Senior Site Reliability Engineer
Financial Services, FinTech
Growth Opportunities
Responsibilities
Develop and manage CI/CD pipelines, ensuring efficient deployment and system updates.
Monitor and troubleshoot application and infrastructure issues across all environments, proactively ensuring SLOs and uptime requirements are met.
Collaborate with development and security teams to integrate best practices and ensure system resilience.
Engage in capacity planning and demand forecasting to anticipate performance bottlenecks and proactively scale the environment.
Manage change and configuration, ensuring stability and consistency across deployments.
Provide metrics to track system performance and identify areas for improvement.
Implement monitoring and alerting strategies that promote automation, self-healing, and effective incident response.
Participate in a 24x7 on-call rotation to support system reliability and availability.
Perform other duties as assigned.
Qualifications
Required
Strong problem-solving skills with an operations mindset and an ability to anticipate issues in large-scale systems.
Proficiency with configuration management tools such as Chef, Ansible, or Puppet.
Knowledge of standard networking protocols and components (HTTP, DNS, TCP/IP, ICMP).
Expertise in AWS or other cloud hosting environments, with a security-focused approach to data integrity and availability.
Hands-on experience with containerization and orchestration technologies, including Docker and Kubernetes.
Advanced understanding of Terraform, CI/CD architecture, and the ability to automate workflows.
Ability to respond to incidents during off hours.
Company
Lumin Digital
Digital banking, online banking, financial technology
Funding
Current Stage
Growth Stage
Company data provided by Crunchbase