Senior ML Reliability Engineer @ CereCore | Jobright.ai
Senior ML Reliability Engineer jobs in United States
42 applicants
CereCore · 1 week ago

Senior ML Reliability Engineer

Consulting · Information Services
Growth Opportunities

Responsibilities

Tool Development and Management: Build, manage, and maintain tools for system reliability, including dashboards, logging systems, and pager systems.
Infrastructure Maintenance: Help maintain and enhance CI/CD pipelines, logging infrastructure, and other operational systems crucial for MLOps.
Monorepo Management: Keep the monorepo up-to-date with the latest dependency and security updates, ensuring a secure and efficient development environment.
Vendor Collaboration: Assist in implementing and maintaining infrastructure and systems managed by external vendor teams.
Incident Management: Lead and participate in incident management processes, including troubleshooting, root cause analysis, and implementing corrective measures to prevent future occurrences.

Qualifications

MLOps · Google Cloud Platform · Vertex AI

Required

Hands-on technical engineering, architecture, and development experience with Google Cloud Platform Vertex AI MLOps, running live in production at scale.
AI/ML Knowledge: Solid understanding of AI/ML principles and technologies.
System Monitoring and Tools: Experience with system monitoring tools and observability. Knowledge of GCP, Vertex AI, or other cloud platforms is highly beneficial.
Programming and Scripting: Proficiency in programming languages such as Python and scripting for automation.
Problem-Solving Skills: Strong analytical and problem-solving skills, with the ability to work under pressure.
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
5+ years of experience in the technology field.
Proven experience in a reliability engineering role, preferably with a focus on AI/ML systems.
Experience in incident management and performance optimization.
Excellent communication and teamwork skills.

Company

CereCore

CereCore has implemented EHR systems in more than 300 facilities and offers staffing and remote support services for major EHR acute.

Funding

Current Stage
Late Stage

Leadership Team

Curtis Watkins
President & CEO
Company data provided by Crunchbase