Together AI · 5 months ago
Senior Software Engineer - Together Cloud Infrastructure
Together AI is building the AI Acceleration Cloud, an end-to-end platform for the full generative AI lifecycle. As a Senior AI Infrastructure Engineer, you will be responsible for designing and maintaining backend services that run in data centers, automating hardware management, and building a global high-performance object store for massive datasets.
AI Infrastructure, Artificial Intelligence (AI), Generative AI, Internet, IT Infrastructure, Open Source
Responsibilities
Perform architecture and research work for decentralized AI workloads
Work on the core, open-source Together AI platform
Create services, tools, and developer documentation
Create testing frameworks for robustness and fault-tolerance
Qualifications
Required
5+ years of professional software development experience and proficiency in at least one backend programming language (Golang desired)
5+ years of experience writing high-performance, well-tested, production-quality code
Demonstrated experience with building and operating high-performance and/or globally distributed micro-service architectures across one or more cloud providers (AWS, Azure, GCP)
Excellent communication skills – able to write clear design docs and work effectively with both technical and non-technical team members
Deep experience with Kubernetes internals a big plus, such as implementing non-trivial Kubernetes operators, device/storage/network plugins, custom schedulers, or patches to any of these or to Kubernetes itself
Deep experience with VMs/hypervisors a big plus, such as QEMU/KVM, cloud-hypervisor, VFIO, virtio, PCIe passthrough, KubeVirt, SR-IOV
Deep experience with data center networking technologies and solutions a big plus, such as VLAN, VXLAN, VPN, VPC, OVS/OVN
Experience with Cluster API or similar a big plus
Experience working on high-performance compute, networking, and/or storage a big plus
Experience virtualizing GPUs and/or InfiniBand a big plus
Strong systems knowledge across compute, networking, and storage, including concurrency, memory management, performant I/O, and scale
Experience with infrastructure automation tools (Terraform, Ansible), monitoring/observability stacks (Prometheus, Grafana), and CI/CD pipelines (GitHub Actions, ArgoCD)
Experience building IaaS or PaaS systems at scale a plus
Experience with DPUs/SmartNICs a plus
GPU programming, NCCL, CUDA knowledge a plus
Benefits
Competitive compensation
Startup equity
Health insurance
Other benefits
Flexibility in terms of remote work
Company
Together AI
Together AI is a cloud-based platform for building open-source generative AI and the infrastructure for developing AI models.
H1B Sponsorship
Together AI has a track record of offering H1B sponsorships. Please note that this does not
guarantee sponsorship for this specific role. Additional information is provided below for
reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted field represents job fields similar to this job]
Trends of Total Sponsorships
2025 (19)
2024 (6)
2023 (3)
Funding
Current Stage: Growth Stage
Total Funding: $533.5M
Key Investors: Salesforce Ventures, Lux Capital
2025-02-20 · Series B · $305M
2024-03-13 · Series A · $106M
2023-11-29 · Series A · $102.5M
Company data provided by Crunchbase