DAT Freight & Analytics · 1 day ago
Principal Machine Learning Engineer
DAT Freight & Analytics is an award-winning employer of choice and a next-generation SaaS technology company in transportation supply chain logistics. They are seeking a Principal Machine Learning Engineer to scale and evolve their critical Data and ML Platform capabilities, focusing on building foundational infrastructure for ML and AI systems.
Logistics · Transportation
Responsibilities
Deliver lower-latency data to models, unlocking online learning, adaptive policies, and improved real-time decision-making for Convoy’s auction mechanism, fraud detection apparatus, and carrier engagement campaigns
Evolve our ML platform to support generative AI, including orchestration, retrieval, standardized service patterns, and scalable model serving needed for foundation model applications in document digitization and voice-based features
Experiment faster and more safely through robust causal inference tooling, richer randomized experimentation, and reliable evaluation infrastructure to help us learn more about the unique spatio-temporal dynamics of a trucking marketplace
Drive the evolution of Convoy’s experimentation and model-evaluation foundations
Enable rigorous causal measurement, reliable online experimentation, scalable model iteration, and adaptive learning systems that continuously improve marketplace and policy decisions
Evolve Convoy’s experimentation stack (TestDrive): add richer randomized experiments, causal inference tooling, exposure/assignment logging, and metric pipelines; evaluate third-party solutions where beneficial
Enable adaptive learning approaches (RL, contextual bandits, online learning) for dynamic marketplace and policy decisions (e.g., inferring the best timing, cohort, or communication channel to maximize carrier engagement)
Harden our evaluation infrastructure, including offline/online pipelines, drift detection mechanisms, and structured feedback loops that ensure reliable model behavior over time
Implement orchestration layers that combine inference, retrieval, business logic, guardrails, and human-in-the-loop flows into reliable, auditable multi-step AI agents
Iterate on and expand Convoy Platform’s low-latency Feature Store and real-time streaming platform (on RisingWave) to deliver signals such as app analytics, carrier behavior, and digital fingerprints to support marketplace optimization, fraud detection, and other decision systems
Ensure unified online/offline semantics to improve online decision-making, support real-time optimization, and enable future reinforcement-learning and online-learning workflows
Build high-throughput streaming pipelines for carrier engagement, risk indicators, and fraud signals that power sub-minute marketplace and policy decisions
Develop platform-level trucking knowledge systems, including RAG indexes, domain adapters, structured benchmarks, and retrieval strategies that ground AI systems in operational realities
Design and scale the platform ecosystem, leveraging Kafka, Snowflake, Kubernetes, and modern data formats (Avro, JSON, Iceberg), and use Python/Go to build the “connective tissue” that ensures platform reliability and scale
Build low-latency, production-grade Python services and contribute to TypeScript/Node where needed (e.g., emitting high-quality data signals, wiring model calls into product flows, enabling experimentation and feature-flag pathways)
Partner with scientists to define durable service patterns (API design, serving workflows, monitoring) and uplift the platform that enables fast, safe iteration on ML-backed services
Mature platform infrastructure, including Terraform/IaC, CI/CD, observability, logging/tracing, incident readiness, and cost/performance optimization
Improve SQL/dbt workflows and batch/streaming pipelines to increase reliability, correctness, and scalability
Extend model-serving infrastructure to support more advanced ML workloads (managed inference → self-hosted GPU), with standardized versioning, canary/A/B rollouts, and granular monitoring
Qualifications
Required
8–12+ years of experience in ML engineering, data infrastructure, platform engineering, or closely related production engineering roles
Deep hands-on experience with real-time ML platforms, including feature stores, stream processing, low-latency data services, and online inference systems
Strong proficiency in Python, with the ability to work across non-Python stacks including TypeScript/Node, gRPC services, and Kubernetes-based microservice ecosystems
Expertise in modern data and ML infrastructure, including Kafka, Kubernetes, Postgres-like OLTP systems, cloud platforms, and production observability tooling
Experience building and operating robust data and ML pipelines (both batch and streaming), ideally in high-scale environments such as marketplaces, fraud detection systems, pricing, personalization, or real-time decision platforms
Strong DevOps and MLOps fundamentals, including CI/CD, containerization, infrastructure-as-code (Terraform/Helm), automated monitoring, and cloud cost and performance optimization
Collaborative platform mindset, with a track record of partnering with scientists and product engineers to co-design durable service patterns for model serving, deployment, monitoring, and API design, enabling fast and safe iteration on ML-backed systems
Ability to operate at Principal scope, setting technical direction, identifying and retiring platform risk, mentoring engineers, and delivering solutions whose impact scales across teams and the broader organization
Preferred
Experience building ML systems in two-sided marketplaces, financial markets, or other economically complex environments, with intuition for incentives, pricing, and market dynamics
Deep experience with data reliability and correctness at scale, including schema evolution, data quality enforcement, backfills, late data handling, and incident response for production data systems
Experience applying advanced ML techniques such as reinforcement learning, bandits, or optimization to unlock real-world business impact, ideally within freight, logistics, or transportation technology
Benefits
Medical, Dental, Vision, Life, and AD&D insurance
Parental Leave
Flexible Vacation Time (FVT)
An additional 10 holidays of paid time off per calendar year
401k matching (immediately vested)
Employee Stock Purchase Plan
Short- and Long-term disability sick leave
Flexible Spending Accounts
Health Savings Accounts
Employee Assistance Program
Additional programs - Employee Referral, Internal Recognition, and Wellness
Free TriMet transit pass (Beaverton Office)
Competitive salary and benefits package
Work on impactful projects in a cutting-edge environment
Collaborative and supportive team culture
Opportunity to make a real difference in the trucking industry
Employee Resource Groups
Company
DAT Freight & Analytics
DAT Freight & Analytics maintains a market for truckload freight.
Funding
Current Stage: Late Stage
Company data provided by Crunchbase