CareerUS Solutions · 3 days ago
Site Reliability Engineering
Responsibilities
Infrastructure Management: Design, implement, and manage infrastructure using cloud platforms (AWS, Azure, GCP) and configuration management tools (Terraform, Ansible).
Monitoring & Incident Response: Develop and maintain monitoring solutions (using tools like Prometheus, Grafana, and Datadog) to ensure system reliability. Respond to and troubleshoot incidents promptly.
Automation: Automate repetitive tasks to improve operational efficiency, including CI/CD pipelines and deployment processes.
Performance Optimization: Analyze system performance and recommend optimizations to enhance reliability and scalability.
Collaboration: Work closely with software development teams to ensure reliable and scalable application deployments and architecture.
Documentation: Maintain clear documentation of systems, processes, and incident responses for knowledge sharing within the team.
Security: Collaborate with security teams to ensure best practices are followed in infrastructure and application security.
Qualifications
Required
Programming and Scripting: Python, Go, Ruby, Java, and scripting in Shell or Bash.
Systems Administration: Linux/Unix systems, including system performance tuning and troubleshooting.
Cloud Platforms: AWS, Google Cloud Platform (GCP), Microsoft Azure, or other cloud service providers.
Infrastructure as Code (IaC): Terraform, Ansible, or CloudFormation for managing infrastructure.
CI/CD Pipelines: CI/CD tools (e.g., Jenkins, GitLab, CircleCI) to automate and manage deployments.
Containerization and Orchestration: Docker, Kubernetes, and other containerization technologies.
Monitoring and Alerting: monitoring tools such as Prometheus, Grafana, ELK Stack, New Relic, or Datadog to track system health and performance.
Load Balancing and Caching: NGINX, HAProxy, and Redis to improve system resilience and performance.
Networking: networking protocols, DNS, TCP/IP, load balancing, and VPNs.
Database Management: SQL and NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB).
Preferred
Certifications: Relevant certifications (AWS Certified Solutions Architect, Google Cloud Professional DevOps Engineer, etc.) are a plus.
Agile Methodologies: Familiarity with Agile and DevOps practices.
Experience with Microservices: Understanding of microservices architecture and associated challenges.
Company
CareerUS Solutions
We are a consulting and staffing company that specializes in providing top-notch services to our clients.
Funding
Current Stage
Growth Stage
Company data provided by Crunchbase