fal · 7 hours ago

Sr Linux Networking Engineer

Fal is a company that orchestrates AI inference workloads across thousands of GPUs spread over multiple data centers and cloud providers. They are seeking a seasoned networking engineer to own the network layer, ensuring fast, reliable, and secure communication for model traffic and storage I/O.
Artificial Intelligence (AI) · Software · Information Technology · AI Infrastructure · Developer Platform · Machine Learning
H1B Sponsored

Responsibilities

Design, build, and operate the network fabric that interconnects our GPU fleet, including spine-leaf architectures, RDMA/RoCEv2 networks for distributed inference, and overlay networks for tenant isolation
Own L2/L3 network design across bare-metal and cloud environments, including BGP peering, ECMP, VXLAN/EVPN, and high-bandwidth interconnects between data centers
Develop and maintain network automation using Ansible, Terraform, and custom tooling to provision, configure, and validate switches, routers, DPUs, and SmartNICs at scale
Instrument deep network observability—build dashboards, alerting, and anomaly detection across our fabric using Prometheus, Grafana, and packet-level telemetry
Partner with the Compute and ML Performance teams to tune network paths for AI workloads, minimizing latency for model serving and maximizing throughput for large tensor transfers
Drive incident response and root-cause analysis for network-related production issues and build automation to prevent recurrence
Evaluate and qualify new networking hardware and software—NICs, switches, DPUs, SONiC, Cumulus, and similar—as we scale to next-generation GPU clusters

Qualifications

Linux networking internals, Routing, Switching protocols, High-performance networking, Network automation, Network observability stacks, Network operating systems, Python, Go, Git, Ansible

Required

8+ years of experience building and operating large-scale networks, ideally in GPU cloud, HPC, or hyperscale environments
Deep expertise in Linux networking internals: kernel networking stack, iptables/nftables, tc, eBPF, network namespaces, bonding/teaming, and SR-IOV
Strong command of routing and switching protocols: BGP, OSPF, ECMP, VXLAN, EVPN, MPLS, and segment routing
Hands-on experience with high-performance networking for AI/ML: RDMA, RoCEv2, InfiniBand, GPUDirect, and NCCL tuning
Proficiency automating network infrastructure with Ansible, Python, Go, and Git
Experience with network-as-code workflows
Familiarity with modern network operating systems such as SONiC, Cumulus Linux, Arista EOS, or Nokia SR Linux
Experience with network observability stacks: Prometheus, Grafana, sFlow/NetFlow, and packet capture tools

Preferred

Experience with DPU/SmartNIC programming (NVIDIA BlueField, AMD Pensando) and SDN/NFV architectures
Contributions to open-source networking projects (SONiC, FRR, DPDK, eBPF/XDP)
Experience operating networks that support Kubernetes and container-native workloads (Calico, Cilium, MetalLB)
Familiarity with data center physical layer design, optics, and cabling at scale

Benefits

Competitive salary and equity
Health, dental, and vision insurance (US)
Regular team events and offsites

Company

fal

Fal is a generative media platform that helps developers create applications using AI models.

Funding

Current Stage: Late Stage
Total Funding: $337M
Key Investors: Sequoia Capital, Meritech Capital Partners, Andreessen Horowitz, Notable Capital
2025-12-09 · Series D · $140M
2025-07-31 · Series C · $125M
2025-02-12 · Series B · $49M

Leadership Team

Burkay Gur
Co-Founder
Gorkem Yurtseven
Co-Founder
Company data provided by Crunchbase