Arc.dev · 1 day ago
Tech Lead - Software Engineer (AI Infrastructure & Model Serving)
Arc.dev is building a high-performance AI platform focused on fast inference and scalable model serving. They are seeking a Tech Lead Software Engineer to architect and scale their core AI infrastructure, leading engineering decisions and building a team as the company grows.
Career Planning · Human Resources · Recruiting
Responsibilities
Own the architecture, implementation, and evolution of our core platform
Lead engineering decisions, work closely with founders on product direction, and build a team around you as we scale
Architect the LLM inference stack (load-balancing, batching, token streaming)
Optimize GPU utilization (tensor parallelism, quantization, batching, KV cache)
Design distributed systems for high throughput and low latency
Lead model hosting: LLMs, diffusion models, multimodal, embeddings
Build APIs and SDKs (Python/JS) for developers
Implement observability tools: token logs, latency traces, request analytics
Build model routing layers based on cost/latency/performance tradeoffs
Integrate evals (benchmarks, datasets, scoring) into platform surfaces
Create highly available serving clusters with autoscaling
Implement CI/CD, container orchestration, and deployment tooling
Improve system performance, throughput, and cost efficiency
Drive engineering best practices and code quality
Make key architectural decisions and own the tech roadmap
Mentor engineers and help build out the initial engineering team
Work cross-functionally with product/design to define user-facing features
Qualifications
Required
5+ years in backend, infra, or systems engineering
Strong experience with Python, Go, or Rust
Cloud infrastructure (AWS/GCP/Azure)
Containers + orchestration (Docker, Kubernetes, Ray, or similar)
Ability to design and implement low-latency, high-scale services
Experience owning architecture from 0 → 1 or leading major systems
Strong debugging skills with a performance-oriented mindset
Preferred
Experience with LLM inference (vLLM, TensorRT-LLM, DeepSpeed, HuggingFace TGI)
Model quantization / LoRA / speculative decoding / paged attention
Distributed training or fine-tuning pipelines
CUDA/PyTorch, model inference kernels, or GPU programming
Distributed systems (microservices, RPC, autoscaling, scheduling)
GPU cluster management (NVLink, MIG, scheduling, multi-node topology)
Building developer tools or API-based platforms
Startup or early-stage company experience
Strong communication + leadership instincts
Benefits
Competitive salary + meaningful early equity
Company
Arc.dev
Arc is a global marketplace of top remote talent, vetted and ready to interview.
H1B Sponsorship
Arc.dev has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships: 2025 (2), 2022 (3)
Funding
Current Stage: Early Stage
Total Funding: $1M
Key Investors: Hyphen Capital
2021-04-28: Equity Crowdfunding · $1M
2021-03-01: Seed
Company data provided by Crunchbase