Boson AI · 1 day ago
Member of Technical Staff, Multimodal
Boson AI is an early-stage startup building large language model tools for everyone to use. The company is seeking research scientists and engineers to join its team full-time in its Santa Clara office, where you will design model architectures and improve generative multimodal models.
Artificial Intelligence (AI) · Information Technology · Market Research
Responsibilities
Design model architectures and loss objectives to handle combinations of images, video, text, speech, and audio data
Build diverse datasets to support multimodality learning, including data collection and processing
Develop new evaluation pipelines to adapt to various forms of generative outputs
Qualifications
Required
Master's or doctoral degree in computer science or an equivalent field
Experience in writing clean and efficient code
Proficiency in at least one deep learning framework, such as PyTorch or JAX
Participation in at least one research project related to multimodality learning
Preferred
Experience in general multimodality learning research (e.g., multimodal joint embedding, text-to-image generation, text-to-video generation)
Experience in document understanding (e.g., layout analysis, structured data extraction, OCR)
Experience in audio transcription, diarization, audio generation, etc.
Active GitHub contributions are a big plus
Experience in handling data at the scale of billions of samples
Company
Boson AI
Boson AI is an AI company that develops large language model tools.
H1B Sponsorship
Boson AI has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is presented below
for your reference. (Data powered by the US Department of Labor)
Distribution of Different Job Fields Receiving Sponsorship (chart not shown)
Trends of Total Sponsorships
2025 (4)
2024 (7)
2023 (2)
Funding
Current Stage
Early Stage
Company data provided by Crunchbase