Alldus · 8 hours ago
Staff Software Engineer (Data)
Alldus builds autonomous AI agents for regulated healthcare environments, supporting enterprise and managed-service customers with systems that prioritize clinical safety and compliance. As a Staff Software Engineer (Data), you will own the technical direction of a large-scale data platform, architecting and operating the streaming and batch infrastructure that processes clinical events and outcome data.
Responsibilities
Own the technical architecture of the data platform across Databricks, Delta Lake, and supporting infrastructure
Establish engineering standards for pipeline reliability, data quality, observability, and operational excellence
Architect streaming and CDC pipelines that enable real-time analytics and agent feedback loops
Design the backend data architecture for internal research and analytics platforms, including natural-language query capabilities
Build data mining systems for persona discovery, scenario extraction, and edge-case detection
Design anonymization and secure data-sharing infrastructure for research partnerships
Own multi-region data architecture and compliance requirements
Make build-versus-buy decisions for data tooling and evaluate technical tradeoffs
Mentor engineers and define patterns that raise the technical bar across the data team
Collaborate closely with data scientists, agent engineers, and domain experts to align data capabilities with business and product needs
Qualifications
Required
7+ years of production data engineering experience, including time at high-caliber engineering organizations
Deep expertise with Databricks, Spark, and Delta Lake operating at scale
Strong Python and SQL skills with a solid understanding of distributed data systems
Proven experience designing data architectures that scale reliably
Extensive experience with streaming systems, CDC patterns, and real-time data processing
Strong command of data modeling, medallion architecture, and query optimization
Track record of setting engineering standards and mentoring other engineers
Extremely high bar for data quality, reliability, and operational rigor
Execution-oriented yet defensive-minded: able to ship infrastructure while anticipating failure modes
Clear communication across engineering, data science, and executive stakeholders
Preferred
Experience working with healthcare data platforms or regulated data environments
Background designing multi-tenant data systems with strict isolation requirements
Experience building natural-language query interfaces or LLM-powered data tools
Familiarity with ML infrastructure such as feature stores, training pipelines, or model serving
Experience with cross-organization data sharing technologies
Knowledge of vector search systems and embedding infrastructure at scale
Benefits
Comprehensive medical, dental, and vision coverage
Mental health support and wellness coaching
Flexible wellness stipend for fitness, therapy, or personal development
Daily catered lunch and dinner
Annual learning budget for courses, books, or conferences
Support for professional conference attendance
Development environment of your choice
Opportunities for academic and research collaboration