Precision Solutions · 6 hours ago

Site Reliability Engineer

USA

Full-time

Remote

Mid Level

3+ years exp

AppsStaffing Agency

No H1B

Security Clearance Required

Responsibilities

Monitor the performance and reliability of our client's Kubernetes clusters, software, websites, and applications

Automate routine maintenance tasks to ensure system stability and performance

Respond to and resolve incidents in a timely manner, minimizing downtime and impact on users

Conduct root cause analysis to identify and address underlying issues

Develop and implement strategies to prevent future incidents and improve system resilience

Design, build, and maintain automated systems and processes to improve efficiency and reduce manual intervention

Manage cloud infrastructure, including provisioning, scaling, and optimizing resources

Collaborate with development teams to ensure seamless deployment and integration of new features and updates

Analyze system performance and identify areas for improvement

Implement performance tuning and optimization techniques to enhance system efficiency

Collaborate with cross-functional teams to ensure optimal performance of all components

Ensure compliance with security best practices and industry standards

Implement and maintain security measures to protect systems and data

Conduct regular security audits and vulnerability assessments

Maintain accurate and up-to-date documentation of systems, processes, and procedures

Generate and analyze reports on system performance, incidents, and other key metrics

Provide regular updates to management and stakeholders on system health and performance

Identify opportunities for improving system reliability, performance, and scalability

Stay up to date with industry trends and best practices in site reliability engineering

Participate in training and development opportunities to enhance skills and knowledge

Qualifications

Kubernetes

Cloud infrastructure

Automation tools

Monitoring tools

Logging tools