Site Reliability Engineer @ Precision Solutions | Jobright.ai

Precision Solutions · 5 hours ago

Site Reliability Engineer

Advertising, Business Development
No H1B; Security Clearance Required

Responsibilities

Monitor the performance and reliability of our client's Kubernetes clusters, software, websites, and applications
Automate routine maintenance tasks to ensure system stability and performance
Respond to and resolve incidents in a timely manner, minimizing downtime and impact on users
Conduct root cause analysis to identify and address underlying issues
Develop and implement strategies to prevent future incidents and improve system resilience
Design, build, and maintain automated systems and processes to improve efficiency and reduce manual intervention
Manage cloud infrastructure, including provisioning, scaling, and optimizing resources
Collaborate with development teams to ensure seamless deployment and integration of new features and updates
Analyze system performance and identify areas for improvement
Implement performance tuning and optimization techniques to enhance system efficiency
Collaborate with cross-functional teams to ensure optimal performance of all components
Ensure compliance with security best practices and industry standards
Implement and maintain security measures to protect systems and data
Conduct regular security audits and vulnerability assessments
Maintain accurate and up-to-date documentation of systems, processes, and procedures
Generate and analyze reports on system performance, incidents, and other key metrics
Provide regular updates to management and stakeholders on system health and performance
Identify opportunities for improving system reliability, performance, and scalability
Stay up to date with industry trends and best practices in site reliability engineering
Participate in training and development opportunities to enhance skills and knowledge

Qualifications

Site Reliability Engineering, Kubernetes, Cloud Infrastructure, Automation Tools, Monitoring Tools, Logging Tools

Required

3+ years of experience in site reliability engineering, Kubernetes administration, or a related role
Deep expertise in Kubernetes and containers is required
Strong understanding of cloud infrastructure, automation tools, and best practices for maintaining high availability and performance
Experience with monitoring and logging tools such as Loki and Grafana
Excellent problem-solving skills and attention to detail
Strong communication and interpersonal skills, with the ability to work effectively with cross-functional teams
US Citizenship - Clearable; Ability to obtain a Secret Clearance

Preferred

Candidates local to Washington, D.C. are preferred
Experience working within a start-up environment is highly preferred

Company

Precision Solutions

Precision Solutions is a marketing company that offers management training, business development, and direct sales job opportunities.

Funding

Current Stage
Early Stage

Leadership Team

Alex Roberts
Owner and Director Of Operations
Company data provided by Crunchbase