Senior Staff Engineer – Data Lakehouse Platform
GEICO is a leading insurance company that offers quality coverage and innovative solutions to its customers. It is seeking a Senior Staff Engineer to build high-performance data infrastructure and lead the implementation of a core Data Lakehouse, driving the company's transformation into a tech organization focused on engineering excellence.
Auto Insurance · Financial Services · Government · Insurance · Internet · Mobile
Responsibilities
Scope, design, and build scalable, resilient Data Lakehouse components
Lead architecture sessions and reviews with peers and leadership
Spearhead new software evaluations and innovate with new tooling
Design and lead the development and implementation of compute-efficiency projects such as the Smart Spark Auto-Tuning feature
Drive performance regression testing, benchmarking, and continuous performance profiling
Be accountable for the quality, usability, and performance of the solutions
Determine and support resource requirements, evaluate operational processes, measure outcomes to ensure desired results, demonstrate adaptability, and sponsor continuous learning
Collaborate with customers, team members, and other engineering teams to solve our toughest problems
Be a role model and mentor, helping to coach and strengthen the technical expertise and know-how of our engineering community
Consistently share best practices and improve processes within and across teams
Share your passion for staying on top of the latest open-source projects, experimenting with and learning new technologies, participating in internal and external OSS technology communities, and mentoring other members of the engineering community
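A "Smart Spark Auto-Tuning" feature of the kind described above typically closes a feedback loop over job metrics: observe a run, derive config overrides, apply them on the next run. Purely as an illustrative sketch (the heuristic, function names, and 128 MiB target below are hypothetical examples, not GEICO's implementation; the Spark property names are standard Apache Spark settings):

```python
# Hypothetical sketch of a config auto-tuner feedback loop: given the shuffle
# volume observed on a previous run, recommend a partition count so that each
# shuffle partition lands near a target size.

TARGET_PARTITION_BYTES = 128 * 1024 * 1024  # assumed target: ~128 MiB per partition

def recommend_shuffle_partitions(observed_shuffle_bytes: int,
                                 min_partitions: int = 200,
                                 max_partitions: int = 20_000) -> int:
    """Suggest a value for spark.sql.shuffle.partitions from observed metrics."""
    ideal = observed_shuffle_bytes // TARGET_PARTITION_BYTES + 1
    return max(min_partitions, min(max_partitions, ideal))

def next_run_conf(observed_shuffle_bytes: int) -> dict:
    """Build the config overrides to apply on the job's next run."""
    return {
        "spark.sql.shuffle.partitions": str(recommend_shuffle_partitions(observed_shuffle_bytes)),
        # AQE can further coalesce small partitions at runtime
        "spark.sql.adaptive.enabled": "true",
        "spark.sql.adaptive.coalescePartitions.enabled": "true",
    }

print(next_run_conf(512 * 1024**3))  # e.g. a job that shuffled ~512 GiB
```

A production version would draw `observed_shuffle_bytes` from the Spark event log or history server rather than taking it as an argument, and would tune far more than one parameter.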
Qualifications
Required
Deep knowledge of Spark internals, including Catalyst, Tungsten, AQE, CBO, scheduling, shuffle management, and memory tuning
Proven experience in tuning and optimizing Spark jobs on Hyper-Scale Spark Compute Platforms
Mastery of Spark configuration parameters, resource tuning, partitioning strategies, and job execution behaviors
Experience building automated optimization systems – from config auto-tuners to feedback loops and adaptive pipelines
Strong software engineering skills in Scala, Java, and Python
Ability to build tooling to surface meaningful performance insights at scale
Deep understanding of auto-scaling and cost-efficiency strategies in cloud-based Spark environments
Exemplary ability to design and develop, perform experiments, and influence engineering direction and product roadmap
Advanced experience developing new and enhancing existing open-source based Data Lakehouse platform components
Experience cultivating relationships with and contributing to open-source software projects
Experience with open-source table formats (Apache Iceberg, Delta, Hudi or equivalent)
Advanced experience with open-source compute engines (Apache Spark, Apache Flink, Trino/Presto, or equivalent)
Experience with cloud computing (AWS, Microsoft Azure, Google Cloud, Hybrid Cloud, or equivalent)
Expertise in developing distributed systems that are scalable, resilient, and highly available
Experience in container technology like Docker and Kubernetes platform development
Experience with continuous delivery and infrastructure as code
In-depth knowledge of DevOps concepts and cloud architecture
Experience in Azure Network (Subscription, Security zoning, etc.) or equivalent
10+ years of professional experience in data software development, programming languages, and big data technologies
8+ years of experience with architecture and design
6+ years of experience with distributed systems, with at least 3 years focused on Apache Spark
6+ years of experience in open-source frameworks
4+ years of experience with AWS, GCP, Azure, or another cloud service
Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field such as physics or mathematics
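The configuration-level mastery asked for above (AQE, CBO, shuffle management, memory tuning) commonly surfaces in settings like the following. This is an illustrative spark-defaults.conf fragment using standard Apache Spark properties; the numeric values are examples only and are heavily workload-dependent, not recommendations:

```properties
# Adaptive Query Execution (AQE): re-optimize plans using runtime statistics
spark.sql.adaptive.enabled                     true
spark.sql.adaptive.coalescePartitions.enabled  true
spark.sql.adaptive.skewJoin.enabled            true

# Cost-Based Optimizer (CBO): use table/column statistics for join planning
spark.sql.cbo.enabled                          true

# Shuffle and memory tuning (example values only)
spark.sql.shuffle.partitions                   2000
spark.executor.memory                          16g
spark.executor.memoryOverhead                  4g
spark.memory.fraction                          0.6
```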
Preferred
Active or past Apache Spark Committer (or significant code contributions to OSS Apache Spark)
Experience with ML-based optimization techniques (e.g., reinforcement learning, Bayesian tuning, predictive models)
Contributions to other big data/open-source projects (e.g., Delta Lake, Iceberg, Flink, Presto, Trino)
Background in designing performance regression frameworks and benchmarking suites
Deep understanding of Spark accelerators (Spark RAPIDS, Apache Gluten, Apache Comet, Apache Auron, etc.); committer status in one or more of these projects is a plus
Skilled in documenting methodologies and producing publication-style papers, whitepapers, and internal research briefs
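The ML-based optimization techniques listed above (Bayesian tuning, reinforcement learning) are sample-efficient replacements for the naive baseline below: exhaustively benchmarking a parameter grid and keeping the cheapest configuration. This toy sketch uses a made-up cost model in place of real benchmark runs, purely to illustrate the search loop such techniques improve on:

```python
# Toy illustration of automated Spark-parameter search: brute-force search over
# a small grid with a synthetic cost model. Bayesian or RL tuners replace this
# exhaustive loop with sample-efficient search; the cost model is invented for
# the example.
from itertools import product

def synthetic_runtime(shuffle_partitions: int, executor_cores: int) -> float:
    """Stand-in for a real benchmark run; returns a fake runtime in seconds."""
    return 3600 / executor_cores + abs(shuffle_partitions - 2000) * 0.05

def grid_search(partition_grid, core_grid):
    """Return the (partitions, cores) pair minimizing the synthetic runtime."""
    return min(product(partition_grid, core_grid),
               key=lambda cfg: synthetic_runtime(*cfg))

best = grid_search([500, 1000, 2000, 4000], [2, 4, 8])
print(best)  # -> (2000, 8)
```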
Benefits
Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being.
Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance.
Access to additional benefits like mental healthcare as well as fertility and adoption assistance.
Supports flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year.
Company
GEICO
GEICO, Government Employees Insurance Company, has been providing affordable auto insurance since 1936. It is a sub-organization of Berkshire Hathaway.
Funding
Current Stage: Late Stage; Total Funding: unknown; Acquired: 1996-01-01
Company data provided by crunchbase