Peraton · 15 hours ago
Site Reliability Engineer (SRE)
Information Technology, Robotics
Actively Hiring, No H1B, Security Clearance Required
Responsibilities
Manage, support and maintain a reliable environment for the site to ensure the stability and security of multiple systems/platforms that are run or operated in that environment
Develop or contribute to solutions to a variety of problems of moderate scope and complexity
Oversee the development of more robust systems by building a resilient infrastructure
Build in redundancy, implement monitoring tools, and automate wherever possible; reduce toil by scripting routine tasks and automating self-repair
Run the production environment by monitoring availability and taking a holistic view of system health
Build software and systems to manage/operate platform infrastructure and applications
Improve reliability, quality, and time-to-market of our suite of software solutions
Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
Provide primary operational support and engineering for multiple large, distributed software applications
Ensure production readiness for releases, including performance and usability testing
Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
Partner with development teams to improve services through rigorous testing and release procedures
Participate in system design consulting, platform management, and capacity planning
Create sustainable systems and services through automation and uplifts
Perform root cause analyses (RCAs) of production incidents and conduct post-incident reviews
Experience developing operational playbooks/runbooks and conducting disaster recovery testing
A strong focus on achieving value for business objectives
Comfort with collaboration, open communication and reaching across functional borders
Strong analytical, communication, and decision-making skills
Proficiency in a variety of computer programs and applications, including VMware, Windows, Linux, Oracle, and Solaris
Qualification
Required
5 years with BS/BA; 3 years with MS/MA; 0 years with PhD
Bachelor's degree in computer science or other highly technical, scientific discipline
Current GSA Public Trust or ability to obtain GSA Public Trust
Minimum of 5 years of programming experience with operations of enterprise systems with over a million users
5+ years of experience working within a software engineering team that leveraged DevOps practices in development
5+ years of experience with Cloud Architecture, preferably AWS
5+ years of experience in DevSecOps
3+ years of experience with microservices
Experience using a wide variety of open source and COTS technologies and tools
Strong background working in an agile development environment, collaborating with Application Development and Architecture Teams
Experience working in a High Availability environment with 99.99%+ uptime
Ability to program (structured and OO) with one or more high level languages, such as Python, Java, C/C++, Ruby, and JavaScript
Experience with cloud storage technologies as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN)
A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
Preferred
AWS SysOps Administrator Certificate or AWS Developer - Associate Certificate
AWS Certified DevOps Engineer - Professional
Master's degree preferred, or bachelor's degree with equivalent years of work experience
Benefits
Paid Time-Off and Holidays
Retirement
Life & Disability Insurance
Career Development
Tuition Assistance and Student Loan Financing
Paid Parental Leave
Additional Benefits
Medical, Dental, & Vision Care
Company
Peraton
Fearlessly solving the toughest national security challenges.
Funding
Current Stage: Late Stage
Company data provided by crunchbase