Unstructured · 2 weeks ago
Principal + Staff Engineers
Unstructured is defining the standard for enterprise data transformation in the age of LLMs and generative AI. They are seeking Staff and Principal Engineers to define the architectural foundation for processing and transforming unstructured data for LLM applications, focusing on building scalable distributed systems and leading technical direction.
Artificial Intelligence (AI) · Machine Learning · Natural Language Processing · Open Source
Responsibilities
Define and evolve the end-to-end architecture for Unstructured’s data transformation and retrieval platform
Build and scale distributed systems that process massive volumes of unstructured data across diverse formats and sources
Serve as the company-wide authority on Kubernetes orchestration, cluster design, performance tuning, and reliability
Lead Python architecture and best practices—ensuring performance, modularity, and maintainability across services
Design and optimize Postgres schemas, queries, and indexing strategies to support large-scale metadata and retrieval pipelines
Mentor senior engineers through design reviews and code guidance, raising the bar for technical excellence across the org
Partner with the infrastructure and product teams to translate research prototypes into production-grade systems
Evaluate emerging technologies and open-source tools in LLM infrastructure, retrieval, and orchestration—deciding where and how to integrate them
Qualifications
Required
Have 10+ years of software engineering experience with a focus on distributed systems, infrastructure, or data architecture
Are a Python expert—capable of building frameworks and performance-critical services from scratch
Have deep Kubernetes expertise; you can design, deploy, and debug at scale and could teach others how to productionize it securely
Are fluent in Postgres—you understand query planning, partitioning, and tuning for high-throughput environments
Are obsessed with clean, scalable architecture and can lead design reviews that shape how entire systems evolve
Have experience in high-performance data or AI/ML systems—especially those involving retrieval pipelines, embeddings, or hybrid workloads
Thrive in fast-moving, ambiguous environments where technical depth and judgment matter more than process
Preferred
Experience building or scaling LLM-powered or RAG systems in production
Familiarity with open-source orchestration frameworks, vector databases, or hybrid cloud infrastructure
Contributions to open-source projects in Python, Kubernetes, or distributed systems
Benefits
Day-one medical, dental, and vision coverage
Life and disability insurance
Unlimited PTO
Flexible parental leave
401(k) options
Competitive referral incentives
Company
Unstructured
At Unstructured, we're on a mission to give organizations access to all their data.
H1B Sponsorship
Unstructured has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship]
Trends of Total Sponsorships: 2024 (1)
Funding
Current Stage: Growth Stage
Total Funding: $65M
Key Investors: Menlo Ventures, Bain Capital Ventures
2024-03-14 · Series B · $40M
2023-07-19 · Series A · $25M
2023-07-19 · Seed
Recent News
Dynamic Business · 2026-01-22
Company data provided by Crunchbase