Data Engineer jobs in United States

Abaka AI · 2 months ago

Data Engineer

Abaka AI is a company focused on data engineering and artificial intelligence solutions. It is seeking a Data Engineer to collaborate with clients on data requirements, develop scalable data pipelines, and address technical challenges in multimodal data processing.

Data Collection and Labeling · Machine Learning · Natural Language Processing
H1B Sponsor Likely

Responsibilities

Collaborate closely with foundation model clients to understand their data requirements; coordinate internal teams to develop tailored delivery plans and ensure on-time, high-quality data delivery (e.g., meeting format, precision, and volume expectations)
Lead the development of mid- to long-term plans for the data engineering function. Build scalable, end-to-end pipelines for multimodal data (text, image, audio, video, 3D point cloud, etc.), including data sourcing, cleaning, annotation, QA, storage, and iterative optimization for training, fine-tuning, and evaluation
Drive solutions to core technical challenges in multimodal data processing, such as cross-modal alignment (e.g., image-text semantic matching), large-scale data cleaning (e.g., deduplication, denoising, format normalization), annotation efficiency, and data encryption/security
Work cross-functionally with algorithm, product, and business teams: for example, providing feedback to model teams on data bottlenecks, helping refine internal tooling and services, or supporting client-facing teams with technical documentation and pre-sales support
Assess and optimize the cost structure of data processing operations, including headcount, infrastructure, and tooling—striking a balance between quality, efficiency, and scalability

Qualifications

Data engineering · Multimodal data workflows · Technical architecture design · Large-scale data systems · Data privacy · Security · Resilience · Team management · Communication skills · Ownership

Required

Strong background in computer science, data engineering, artificial intelligence, or related fields, with hands-on experience in large-scale data systems
1+ years of experience in data engineering or data operations; leadership experience is highly valued. Prior involvement in LLM or multimodal dataset preparation is a strong plus
Deep understanding of end-to-end multimodal data workflows, with hands-on experience in at least two modalities (e.g., text, images, audio, video)
Proficient in designing technical architectures for large-scale data pipelines (e.g., distributed processing, automation frameworks). Familiarity with data privacy and security best practices (e.g., access control, data anonymization)
Strong execution and team management skills—able to translate high-level objectives into actionable plans and drive team outcomes
Excellent communication and cross-functional collaboration skills—able to clearly convey technical and operational requirements, resolve conflicts, and manage stakeholder expectations
High sense of ownership and resilience—comfortable working in a fast-paced, evolving AI landscape and capable of navigating urgent delivery timelines

Company

Abaka AI

Abaka AI is a leading AI company committed to becoming the data partner of the artificial intelligence industry.

H1B Sponsorship

Abaka AI has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data provided by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships: 2025 (2)

Funding

Current Stage: Growth Stage

Leadership Team

Minyi (Crystal) Chen
North American Business Partner
Company data provided by Crunchbase