Flock Safety · 1 month ago
Staff AI Systems Engineer
Flock Safety is seeking a Sr. AI Systems Engineer to support their emerging product, Night Shift, an AI research assistant designed to enhance investigative processes. The role involves working closely with engineering teams to develop and implement AI systems that improve the efficiency and accuracy of case investigations.
Manufacturing · Public Safety · Security · Sensor
Responsibilities
Immerse yourself in the current system design and agent/tooling landscape, and understand the core customer use cases and data flows
Support the team by shipping a few quick wins (e.g., refining tool APIs, prompt engineering, fixing bugs)
Stand up the foundational eval and observability scaffolding (datasets, metrics, KPIs, reporting)
Propose a technical architecture and implementation plan for an agent evaluation framework
Deliver the MVP evaluation harness to produce initial metrics, enable debugging and perform regression testing
Take on a system feature that offers demonstrated improvement against your MVP evaluation suite
Productionize the evaluation and observability platform and make it the source of truth for quality and safety (e.g. online/offline tracing, alerting, dashboards, evaluations, and a PR-gated regression suite)
Own the roadmap for evolving the agent evaluation platform
Lead deeper R&D threads (e.g., lightweight fine-tuned projection layers, specialized embeddings, multimodal understanding) that can improve system performance on core metrics
Qualifications
Required
6+ years of experience
Hands-on experience with LLM agents including LLM API use (e.g. LangChain/LangGraph, vLLM, OpenAI/Gemini/Anthropic APIs)
Agent Design: tool use (e.g. via MCP), retrieval, memory, grounding/attribution for claims, and guardrails
Architectural patterns: planning and hand-off for multi-agent systems, context management
RAG: vector/hybrid search (e.g. pgvector, turbopuffer, rerankers)
5+ years building and shipping ML systems to production
Backend Python and JS familiarity required; TypeScript/Golang familiarity welcome
Web services (e.g. Express/FastAPI, REST, SSE, JWTs)
Cloud Infrastructure (e.g. AWS, Terraform, VPC, Networking)
Backend databases/stores (e.g. Postgres, Redis)
Observability (e.g. Prometheus, Grafana, OpenTelemetry, LangSmith/Langfuse)
Experience with LLM Evaluations at scale
Have built offline/online eval harnesses and are familiar with the methodologies and metrics used to measure search, retrieval, and recommendation performance
Safety & robustness (security, compliance, red-teaming, regression testing)
Cost, performance and latency trade-offs
Preferred
Durable execution (e.g. Temporal, Hatchet)
OLAP (e.g. ClickHouse, BigQuery)
ML Inference (e.g. PyTorch, TensorRT, NVIDIA Triton), ideally in multimodal domains (text/image/video)
Compute orchestration (e.g. Kubernetes, Prefect, Ray)
Agentic task success, trajectory quality, preference learning (e.g. SFT, DPO, RLHF, LLM-as-judge)
Benefits
Flock Safety Stock Options
Company
Flock Safety
Flock Safety provides end-to-end surveillance solutions to support law enforcement and communities.
H1B Sponsorship
Flock Safety has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for reference. (Data powered by US Department of Labor)
Trends of Total Sponsorships
2024 (1)
2022 (2)
2021 (2)
Funding
Current Stage: Late Stage
Total Funding: $655.58M
Key Investors: Andreessen Horowitz, Tiger Global Management, Meritech Capital Partners
2025-03-13 · Series F · $275M
2022-02-15 · Series E · $150M
2021-07-13 · Series D · $150M
Company data provided by Crunchbase