Semantic Backend Engineer (Contract, Remote) jobs in United States

INFUSE · 1 month ago

Semantic Backend Engineer (Contract, Remote)

Infuse is focused on building a rich catalog of marketing-grade B2B content. They are seeking an Applied ML Engineer to own the semantic ingestion pipeline, transforming raw PDFs into structured resources and ensuring high-quality content through various filtering and classification processes.

Advertising, B2B, Content Marketing, Digital Marketing, Lead Generation, Lead Management, Publishing
H1B Sponsor Likely

Responsibilities

Own the ETL pipeline from raw PDFs (S3-ingested) to structured resources
Finalize our summarization + classification flow using open-source models with GPT-4o fallback
Apply filtering logic (≤3 years old, ≤100 pages, etc) to enforce resource quality
Map each asset to the specific topic taxonomy (10+ per topic across ~9,000 topics)
Generate dense embeddings using sentence-transformers
Load and query embeddings using Milvus or pgvector
Implement “freshness” logic to identify and index only new or updated content based on file diffing, crawl timestamp, or document hash
Build a QA/eval harness: format compliance, recall@5, drift monitoring
Expose /v1/semantic-search via FastAPI, with filtering and rank fusion
Collaborate closely with our Tech Lead on UX integration and snippet generation
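
As a rough illustration of the filtering and "freshness" responsibilities above, a minimal Python sketch might look like the following. The names `PdfMeta`, `passes_quality_filter`, `content_hash`, and `select_for_indexing` are hypothetical and are not taken from INFUSE's codebase. The sketch assumes each document's publication date, page count, and raw bytes are already available, treats "≤3 years" as 1,095 days, and uses a SHA-256 document hash as the freshness signal (the listing also mentions file diffing and crawl timestamps as alternatives).

```python
import hashlib
from dataclasses import dataclass
from datetime import date, timedelta

# Thresholds from the listing; "3 years" approximated as 1,095 days (assumption).
MAX_AGE = timedelta(days=3 * 365)
MAX_PAGES = 100


@dataclass
class PdfMeta:
    """Hypothetical record for one ingested PDF."""
    key: str          # e.g. the object key in storage
    published: date   # publication date extracted upstream
    pages: int        # page count extracted upstream
    content: bytes    # raw file bytes


def passes_quality_filter(doc: PdfMeta, today: date) -> bool:
    """Keep documents that are at most 3 years old and at most 100 pages."""
    return (today - doc.published) <= MAX_AGE and doc.pages <= MAX_PAGES


def content_hash(doc: PdfMeta) -> str:
    """SHA-256 digest of the raw bytes, used to detect new or changed files."""
    return hashlib.sha256(doc.content).hexdigest()


def select_for_indexing(docs, seen_hashes: dict, today: date) -> list:
    """Return documents that pass the filter and are new or updated.

    `seen_hashes` maps document key to the last indexed digest and is
    updated in place for every document selected.
    """
    selected = []
    for doc in docs:
        if not passes_quality_filter(doc, today):
            continue
        digest = content_hash(doc)
        if seen_hashes.get(doc.key) == digest:
            continue  # unchanged since the last run
        seen_hashes[doc.key] = digest
        selected.append(doc)
    return selected
```

In a real pipeline the `seen_hashes` mapping would presumably live in a persistent store rather than in memory; that choice is left open here.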

Qualifications

ETL pipeline, Semantic search, Python, FastAPI, Sentence-transformers, OpenAI APIs, Milvus, Pgvector, Collaboration, Problem-solving

Required

Python, PyTorch, sentence-transformers, OpenAI APIs, or similar pretrained LLMs
FastAPI, Milvus or pgvector, PyPDF/Tika, Airflow or Lambda for orchestration
Docker, GPU scheduling, Athena/Redshift SQL
You've built ML pipelines that touched real users, not just notebooks
You've worked on semantic search, embeddings, or large-scale tagging
You've wrestled with unstructured data and love turning chaos into clarity
You like working fast, iterating with feedback, and tracking metrics that matter

Company

INFUSE

DEMAND EXCELLENCE DELIVERED

H1B Sponsorship

INFUSE has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (6)
2024 (2)

Funding

Current Stage
Late Stage

Leadership Team

Alexander Kesler
Founder & CEO
Company data provided by Crunchbase