Maze · 2 months ago
Backend Engineer (Data Engineering)
Maze is a well-funded startup focused on building data infrastructure at the intersection of generative AI and cybersecurity. As a Backend Engineer (Data Engineering), you will design and implement data pipelines and systems that process security data at scale, playing a crucial role in enhancing the company's AI capabilities.
Artificial Intelligence (AI) · Machine Learning · Network Security
Responsibilities
Build Production Data Pipelines: Design, implement, and maintain scalable data pipelines that ingest gigabytes to terabytes of security data daily, processing millions of records in single-digit minutes while maintaining reliability and data quality
Architect Distributed Data Systems: Build and evolve our S3-based data lake infrastructure using Apache Iceberg, creating self-managed, distributed systems that enable rapid data transformations and efficient storage at massive scale
Own the Complete Data Lifecycle: Take end-to-end ownership, from data ingestion via Kafka streams through transformation with Spark/EMR, ensuring seamless data flow from customer environments to our AI-powered analysis platform
Enable Platform Scalability: Build data infrastructure with platform thinking, creating systems that support current product needs while laying the foundation for future products and exponential data growth
Optimize for Enterprise Scale: Continuously improve data processing performance and cost efficiency as we scale from current volumes to supporting the world's largest enterprise security environments
Drive Technical Excellence: Establish data engineering best practices, participate in code reviews as a software engineer, and mentor team members on building robust, maintainable data systems
Collaborate Cross-Functionally: Work closely with infrastructure engineers, backend engineers, and product teams to ensure data systems seamlessly integrate with our AI agents and security analysis capabilities
Qualifications
Required
7+ years of software engineering experience, with at least 4 years focused specifically on data engineering
Proven track record building and scaling data ingestion systems that handle gigabytes to terabytes daily
Deep, hands-on production experience with Python, Apache Kafka, and Apache Spark
Strong expertise with AWS data services, including S3 and EMR, and with building data lakes at scale
Proven experience with Apache Iceberg, data lakehouse concepts, and building distributed data systems
Exceptional care and precision in data ingestion and transformation work
Direct experience working at companies that deal with serious data scale
Currently active as a developer, writing production code regularly
Preferred
Experience with Temporal workflow orchestration (very important for our architecture)
Knowledge of Apache Hudi, or of the Parquet or ORC file formats, for optimized data storage
Background with RDS, PostgreSQL optimization, or other database performance tuning
Previous experience at technical security product companies or handling security-related data
Track record of building self-service data platforms that enable other teams to operate independently
Benefits
Significant equity upside
Company
Maze
Maze is a platform that uses AI agents to investigate and resolve cloud security vulnerabilities.