GitLab · 2 hours ago
Distinguished Data Systems Architect, Data Engineering
GitLab is an open-core software company that develops an AI-powered DevSecOps Platform used by over 100,000 organizations. The Distinguished Data Systems Architect will drive the strategic evolution of the data platform by architecting scalable solutions and establishing governance frameworks across SaaS and self-managed deployments.
Cloud Security · Developer Tools · DevOps · Open Source · SaaS
Responsibilities
Drive architectural vision for scalable, distributed data systems across SaaS and self-managed deployments, designing database solutions that balance OLTP/OLAP performance, scalability, and cost-efficiency
Establish enterprise data governance frameworks including lineage, quality controls, versioning, and compliance practices that meet regulatory requirements across global markets
Architect monetizable data services and APIs with semantic models serving internal analytics and external product offerings, enabling new revenue streams while maintaining security and performance SLAs
Create a cohesive architectural blueprint of GitLab's data ecosystem, identifying gaps against modern platforms and establishing opinionated design principles grounded in proven cloud-native patterns
Design event-driven architectures and end-to-end data lifecycle systems spanning ingestion, orchestration (Argo, Airflow, Kubernetes), transformation workflows, and unified metadata management with comprehensive observability
Partner with product and engineering leadership to embed AI-driven patterns into data infrastructure and align senior engineering leaders on common design tenets and platform standards
Transform ambiguous business challenges into strategic technical roadmaps, leading high-stakes architectural engagements where data platforms create measurable competitive differentiation
Qualifications
Required
Experience architecting large-scale distributed data systems in complex, regulated domains with unified platforms integrating cloud-native compute, orchestration, and semantic modeling
Demonstrated leadership building multi-modal data services with strong developer experience principles, focusing on monetization, governance, and data product lifecycle management
Hands-on expertise with modern data stack technologies including Python, Docker, Airflow, Trino, Postgres, distributed query engines, and graph-based metadata systems
Advanced knowledge of bridging cloud and on-premises deployments, with a focus on automation, developer self-service, and data integration through connector marketplaces
Deep understanding of data processing paradigms and standards, including synchronous vs. asynchronous processing, schema management, logical data modeling, and open standards such as OpenTelemetry, OpenMetadata, and OpenLineage
Experience with AI-driven architectures and emerging technologies including model orchestration, agentic patterns, and standards like MCP (Model Context Protocol)
Strong architectural opinions on cost-aware, resilient solutions that optimize decisions across the entire data lifecycle, with a focus on scalability and performance trade-offs
Passion for open source platforms, team mentorship, and collaborative values, with the ability to build scalable solutions that align with organizational culture and technical excellence
Ability to design and implement a Model Driven Architecture (MDA) framework that clearly separates logical/conceptual data models from platform-specific physical implementations, enabling agility and reducing technical debt across enterprise data systems
Benefits
Flexible Paid Time Off
Team Member Resource Groups
Equity Compensation & Employee Stock Purchase Plan
Growth and Development Fund
Parental leave
Home office support
Company
GitLab
GitLab is a web-based Git repository manager that offers a variety of features for software development teams.
Funding
Current Stage: Public Company
Total Funding: $413.5M
Key Investors: ICONIQ Growth, Google Ventures, August Capital
2021-10-14 · IPO
2019-09-17 · Series E · $268M
2018-09-19 · Series D · $100M
Company data provided by Crunchbase