
Guidehouse · 1 day ago

Software Developer (Backend – Integration)

Guidehouse is seeking a Software Developer to join their Technology / AI and Data team, supporting mission-critical initiatives for Defense and Security clients. In this role, you will lead the design and implementation of secure, scalable ingestion and data processing workflows that power advanced AI-driven platforms.

Advice · Consulting · Management Consulting
No H1B · Security Clearance Required · U.S. Citizen Only

Responsibilities

Serve as the lead backend integration engineer responsible for architecting and implementing ingestion, preprocessing, normalization, and transformation workflows for the FBI adjudication AI platform
Design ingestion frameworks supporting SF-86 forms, investigative attachments, summaries, financial/criminal records, and continuous vetting alerts using both traditional OCR and VLM/LLM-based document understanding
Ensure ingestion workflows comply with FedRAMP High, RMF, CJIS, and FBI ATO requirements, including logging, auditability, encryption, and secure processing of PII and sensitive investigative information
Collaborate with AI/ML engineers, backend API developers, cloud engineers, and security engineers to ensure ingestion outputs are optimized for RAG workflows, SEAD-4 scoring, anomaly detection, and adjudicator review

Data Ingestion, Parsing & ETL Architecture

Design ingestion pipelines supporting LLMs and VLMs for OCR, document understanding, multimodal extraction, and parsing of complex investigative materials including forms, tables, handwritten elements, and embedded imagery
Build scalable ingestion and ETL workflows capable of processing hundreds of pages per case using OCR engines (Textract, Tesseract) and VLM-based parsing models such as LayoutLM, Qwen-VL, Donut, or LLaVA
Implement normalization and transformation workflows including deduplication, schema harmonization, field mapping, classification labeling, chunking, segmentation, and tokenization optimized for downstream LLM/RAG operations
Develop fault-tolerant ingestion systems with checkpointing, idempotency, retry frameworks, ingestion-state tracking, and structured error reporting

Backend Integration & System Connectivity

Build secure, compliant integrations with FBI systems, case repositories, identity/HR systems, and continuous vetting alert sources using APIs, ETL endpoints, SFTP, and message queues
Develop backend microservices that assemble case packages, correlate evidence across disparate sources, and produce structured adjudication-ready datasets
Integrate ingestion outputs with vector databases, embedding pipelines, and LLM inference services, ensuring data is structured, enriched, and optimized for reasoning workflows
Ensure all integrations enforce strict authentication, authorization, validation, and data-handling policies

RAG / LLM Data Preparation

Create ingestion workflows that prepare documents and extracted content for embeddings, retrieval indexing, semantic search, and long-context reasoning
Implement chunking, segmentation, labeling, and evidence-tagging strategies designed to maximize retrieval precision and reduce hallucination risk in LLM inference
Develop heuristics for filtering, prioritizing, and contextualizing extracted information to enable fact-grounded SEAD-4 scoring and memo generation
Support preparation of vector representations, metadata fields, and retrieval keys for large-scale evidence collections

Security, Compliance & Logging

Implement secure ingestion pipelines aligned with FedRAMP High, RMF, CJIS, and FBI security requirements including encryption, access control, PII-handling rules, and secure logging
Apply advanced PII-safe processing techniques including automated redaction, VLM-aided sensitive field detection, classification tagging, and compliance-driven filtering
Ensure ingestion systems generate detailed logs, lineage metadata, provenance trails, and audit events supporting adjudication oversight and accreditation documentation
Collaborate with Security Engineers to ensure ingestion controls map to SSP requirements and POA&M items are remediated promptly

Performance Optimization & Reliability

Optimize ingestion pipelines for parallelization, concurrency, batching, memory efficiency, and large-scale document processing throughput
Implement distributed ETL frameworks such as Step Functions, Airflow, Dagster, Glue, or Spark depending on workload and security constraints
Develop monitoring dashboards capturing ingestion throughput, VLM/LLM OCR accuracy metrics, error frequencies, latency patterns, and retry trends
Implement resilience features including dead-letter queues, backoff retry mechanisms, fault isolation, and disaster-recovery patterns

Collaboration, Leadership & Mission Enablement

Align ingestion outputs directly with AI/ML engineer requirements for long-context LLM inference, retrieval indexing, and SEAD-4 scoring workflows
Work with backend API developers to ensure ingestion flows integrate seamlessly with scoring engines, entity explorers, memo builders, and anomaly detection pipelines
Participate in sprint ceremonies, architecture reviews, backlog refinement, and cross-functional coordination with mission stakeholders
Mentor mid-level engineers in ETL design, multimodal OCR techniques, distributed system patterns, and secure ingestion best practices

Qualifications

Python, ETL workflows, OCR/VLM document understanding, Airflow, Data processing workflows, Distributed processing frameworks, Java, Scala, Collaboration, Leadership

Required

An ACTIVE and MAINTAINED 'TOP SECRET' Federal or DoD security clearance
A university degree and a minimum of 4-6 years of prior relevant experience (relevant experience may be substituted for formal education or an advanced degree)
5 years of backend/integration engineering experience, including 3 years in large-scale ETL or ingestion workflows
Deep experience with Python, Java, or Scala, and with ingestion frameworks such as Airflow, Step Functions, Dagster, or Glue
Experience with ETL pipelines, large-scale document ingestion, OCR/VLM document understanding, and unstructured data parsing
Experience developing secure data processing/normalization workflows
Experience with distributed processing frameworks

Preferred

Once onboard with Guidehouse, new hire MUST be able to OBTAIN and MAINTAIN a Federal or DoD 'TOP SECRET/SCI (TS/SCI)' security clearance
8+ years of backend/integration engineering experience, including 4+ years in large-scale ETL or ingestion workflows
Experience integrating FBI, DCSA, or NBIB systems or adjudication-related data sources
Experience designing ingestion workflows for RAG, embeddings, vector databases, or long-context LLM pipelines
Experience training or applying VLMs such as LayoutLM, Donut, Qwen-VL, or LLaVA for OCR replacement or enhancement
Experience with knowledge graphs, entity resolution, and evidence-linking workflow development
Familiarity with SEAD-4, continuous vetting, or investigative case analysis processes
Airflow vs. Dagster vs. Step Functions
Textract, Tesseract, LayoutLM, Donut, Qwen-VL, LLaVA
Specific AWS ingestion tools (Glue, Batch, S3 eventing)

Benefits

Medical, Rx, Dental & Vision Insurance
Personal and Family Sick Time & Company Paid Holidays
Position may be eligible for a discretionary variable incentive bonus
Parental Leave and Adoption Assistance
401(k) Retirement Plan
Basic Life & Supplemental Life
Health Savings Account, Dental/Vision & Dependent Care Flexible Spending Accounts
Short-Term & Long-Term Disability
Student Loan PayDown
Tuition Reimbursement, Personal Development & Learning Opportunities
Skills Development & Certifications
Employee Referral Program
Corporate Sponsored Events & Community Outreach
Emergency Back-Up Childcare Program
Mobility Stipend

Company

Guidehouse

Guidehouse offers consulting services for public and commercial markets with expertise in management, technology, and risk consulting.

Funding

Current Stage: Late Stage
Total Funding: $0.75M
Key Investors: Mission Daybreak
2023-11-06: Acquired
2023-02-16: Grant · $0.75M

Leadership Team

Scott McIntyre, Chairman and CEO
Alicia Harkness, Partner
Company data provided by Crunchbase