Unstructured · 1 day ago
Principal Engineer
Unstructured is looking for a Principal Engineer to define the architectural foundation for how enterprises process and transform unstructured data for large-language-model applications. This high-impact role involves owning the technical direction for the core platform and collaborating with various teams to ensure system performance and resilience.
Artificial Intelligence (AI) · Machine Learning · Natural Language Processing · Open Source
Responsibilities
Define and evolve the end-to-end architecture for Unstructured’s data transformation and retrieval platform
Build and scale distributed systems that process massive volumes of unstructured data across diverse formats and sources
Serve as the company-wide authority on Kubernetes orchestration, cluster design, performance tuning, and reliability
Lead Python architecture and best practices—ensuring performance, modularity, and maintainability across services
Design and optimize Postgres schemas, queries, and indexing strategies to support large-scale metadata and retrieval pipelines
Mentor senior engineers through design reviews and code guidance, raising the bar for technical excellence across the org
Partner with the infrastructure and product teams to translate research prototypes into production-grade systems
Evaluate emerging technologies and open-source tools in LLM infrastructure, retrieval, and orchestration—deciding where and how to integrate them
Qualifications
Required
15+ years of software engineering experience with a focus on distributed systems, infrastructure, or data architecture
Expertise in Python—capable of building frameworks and performance-critical services from scratch
Deep Kubernetes expertise; able to design, deploy, and debug at scale and teach others how to productionize it securely
Fluency in Postgres—understanding query planning, partitioning, and tuning for high-throughput environments
Obsessed with clean, scalable architecture and able to lead design reviews that shape how entire systems evolve
Experience in high-performance data or AI/ML systems—especially those involving retrieval pipelines, embeddings, or hybrid workloads
Ability to thrive in fast-moving, ambiguous environments where technical depth and judgment matter more than process
Preferred
Experience building or scaling LLM-powered or RAG systems in production
Familiarity with open-source orchestration frameworks, vector databases, or hybrid cloud infrastructure
Contributions to open-source projects in Python, Kubernetes, or distributed systems
Benefits
Competitive salary
Equity
Full benefits package
Company
Unstructured
At Unstructured, we're on a mission to give organizations access to all their data.
H1B Sponsorship
Unstructured has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information follows for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; legend: represents job field similar to this job]
Trends of Total Sponsorships: 2024 (1)
Funding
Current Stage: Growth Stage
Total Funding: $65M
Key Investors: Menlo Ventures, Bain Capital Ventures
2024-03-14 · Series B · $40M
2023-07-19 · Series A · $25M
2023-07-19 · Seed
Company data provided by Crunchbase