Anyscale · 3 days ago
Engineering Manager, Observability (TLM)
Anyscale is on a mission to democratize distributed computing and make it accessible to software developers of all skill levels. The Engineering Manager for Observability will lead a team focused on building user-facing features for the Anyscale AI platform, ensuring robust monitoring tools that enhance the development lifecycle for AI applications.
Artificial Intelligence (AI) · Developer Platform · Information Technology · Machine Learning · Open Source
Responsibilities
Interacting with users, understanding their requirements, designing and implementing features, and finally maintaining and improving these features over time
The Ray Dashboard observability tool, which gives users insight into their Ray application, including what code is running on which machine, how much data is moving between machines, and the hardware utilization of each machine
Library-specific observability tools, such as the Ray Train dashboard or Ray Serve dashboard, which accelerate our users' ability to develop distributed training or model serving applications
Unified log viewer, a tool that ingests logs from across a Ray cluster and lets users query those logs in meaningful ways, such as by function name, log level, timestamp, or machine
Anomaly detection: the ability for the Anyscale platform to automatically detect performance bottlenecks or bugs in our users' workloads and to suggest fixes or apply them automatically
Work with a team of leading distributed systems and machine learning experts
Communicate your work to a broader audience through talks, tutorials, and blog posts
Help us build and shape a world-class company
Qualifications
Required
Proficiency in backend or full stack development, including experience with web API frameworks and databases
Proficiency in Python or an ability to quickly learn new programming languages
Good understanding of AI and machine learning concepts
Experience with observability tools and monitoring solutions (e.g., Datadog, Splunk, AWS CloudWatch)
Familiarity with Ray or similar distributed systems frameworks
Solid background in debugging, architecture design, and coding
Excellent problem-solving skills and a collaborative mindset
Passion for building tools that enhance user experience and optimize workflows
Benefits
Stock Options
Healthcare plans, with premiums covered by Anyscale at 99% for both employees and dependents
401k Retirement Plan
Education & Wellbeing Stipend
Paid Parental Leave
Fertility Benefits
Paid Time Off
Commute reimbursement
100% of in-office meals covered
Company
Anyscale
Anyscale accelerates the development and productionization of any AI app, on any cloud, at any scale.
H1B Sponsorship
Anyscale has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart omitted)
Trends of Total Sponsorships
2025 (33)
2024 (14)
2023 (10)
2022 (10)
2021 (4)
2020 (1)
Funding
Current Stage: Growth Stage
Total Funding: $259M
Key Investors: New Enterprise Associates, Andreessen Horowitz
2022-08-23 · Series C · $99M
2021-12-07 · Series C · $100M
2020-10-21 · Series B · $40M
Recent News
Foundation Capital
2026-01-02
2025-11-04
The New Stack
2025-10-23
Company data provided by Crunchbase