Spark developer (San Jose/CA) jobs in United States

IBM · 2 weeks ago

Spark developer (San Jose/CA)

IBM is a leader in AI-powered, cloud-native software, dedicated to turning customer challenges into innovative solutions. The role involves developing and maintaining high-quality software products, with a focus on big data applications built with Apache Spark and Scala, and collaborating with various teams to optimize data strategies and meet compliance standards.

Business Development, Business Information Systems, CRM, Data Management, Foundational AI, Software
Growth Opportunities
H1B Sponsor Likely

Responsibilities

Design, develop, and optimize big data applications using Apache Spark and Scala
Architect and implement scalable data pipelines for both batch and real-time processing
Collaborate with data engineers, analysts, and architects to define data strategies
Optimize Spark jobs for performance and cost-effectiveness on distributed clusters
Build and maintain reusable code and libraries for future use
Work with various data storage systems like HDFS, Hive, HBase, Cassandra, Kafka, and Parquet
Implement data quality checks, logging, monitoring, and alerting for ETL jobs
Mentor junior developers and lead code reviews to ensure best practices
Ensure security, governance, and compliance standards are adhered to in all data processes
Troubleshoot and resolve performance issues and bugs in big data solutions

Qualifications

Apache Spark, Scala, Distributed computing, Cloud platforms, Data pipelines, Problem-solving, Communication, Leadership

Required

12+ years of total software development experience
At least 5 years of hands-on experience with Apache Spark and Scala
Strong experience with distributed computing, parallel data processing, and cluster computing frameworks
Proficiency in Scala with deep knowledge of functional programming
Solid understanding of Spark tuning, partitions, joins, broadcast variables, and performance optimization techniques
Experience with cloud platforms such as AWS, Azure, or GCP (especially EMR, Databricks, or HDInsight)
Hands-on experience with Kafka, Hive, HBase, NoSQL databases, and data lake architectures
Familiarity with CI/CD pipelines, Git, Jenkins, and automated testing
Strong problem-solving skills and the ability to work independently or as part of a team

Preferred

Bachelor's Degree
Exposure to machine learning pipelines using Spark MLlib or integration with ML frameworks
Experience with data governance tools (e.g., Apache Atlas, Collibra)
Contributions to open-source big data projects are a plus
Excellent communication and leadership skills

Company

IBM is an information technology and consulting firm providing computer hardware, software, infrastructure, and hosting services.

H1B Sponsorship

IBM has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted field is similar to this job]
Trends of Total Sponsorships
2025 (3032)
2024 (3301)
2023 (2160)
2022 (1809)
2021 (1157)
2020 (2669)

Funding

Current Stage
Public Company
Total Funding
unknown
2011-01-14 IPO

Leadership Team

Alain Bénichou
Chief Executive Officer, IBM Greater China Group
Alex Yang
CTO and Chief Architect
Company data provided by Crunchbase