INFUSE · 1 month ago
Semantic Backend Engineer (Contract, Remote)
Infuse is building a rich catalog of marketing-grade B2B content. They are seeking an Applied ML Engineer to own the semantic ingestion pipeline: transforming raw PDFs into structured resources and enforcing content quality through filtering and classification.
Advertising · B2B · Content Marketing · Digital Marketing · Lead Generation · Lead Management · Publishing
Responsibilities
Own the ETL pipeline from raw PDFs (S3-ingested) to structured resources
Finalize our summarization + classification flow using open-source models with GPT-4o fallback
Apply filtering logic (≤3 years old, ≤100 pages, etc.) to enforce resource quality
Map each asset to the specific topic taxonomy (10+ assets per topic across ~9,000 topics)
Generate dense embeddings using sentence-transformers
Load and query embeddings using Milvus or pgvector
Implement “freshness” logic to identify and index only new or updated content based on file diffing, crawl timestamp, or document hash
Build a QA/eval harness: format compliance, recall@5, drift monitoring
Expose /v1/semantic-search via FastAPI, with filtering and rank fusion
Collaborate closely with our Tech Lead on UX integration and snippet generation
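For illustration only, a few of the responsibilities above (the age and page-count filter, hash-based freshness, and recall@5) can be sketched in Python. This is a minimal sketch under stated assumptions, not Infuse's implementation: the `Resource` record, its field names, and the in-memory `seen_hashes` store are all hypothetical, and SHA-256 is just one reasonable choice of document hash.

```python
import hashlib
from dataclasses import dataclass
from datetime import date

MAX_AGE_YEARS = 3   # "≤3 years old" from the posting
MAX_PAGES = 100     # "≤100 pages" from the posting


@dataclass
class Resource:
    # Hypothetical record; field names are illustrative assumptions.
    key: str          # e.g. an S3 object key
    published: date
    page_count: int
    content: bytes


def passes_quality_filter(res: Resource, today: date) -> bool:
    """Keep resources that are at most 3 years old and at most 100 pages."""
    try:
        cutoff = today.replace(year=today.year - MAX_AGE_YEARS)
    except ValueError:
        # today is Feb 29 and the cutoff year is not a leap year
        cutoff = today.replace(year=today.year - MAX_AGE_YEARS, day=28)
    return res.published >= cutoff and res.page_count <= MAX_PAGES


def needs_indexing(res: Resource, seen_hashes: dict[str, str]) -> bool:
    """Document-hash freshness: True only for new or changed content."""
    digest = hashlib.sha256(res.content).hexdigest()
    if seen_hashes.get(res.key) == digest:
        return False  # unchanged since the last crawl
    seen_hashes[res.key] = digest  # record the new or updated version
    return True


def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)
```

In a real pipeline the hash store would presumably live in a database rather than a dict, and the filter thresholds would be configurable; both are kept inline here to keep the sketch short.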
Qualifications
Required
Python, PyTorch, sentence-transformers, OpenAI APIs, or similar pretrained LLMs
FastAPI, Milvus or pgvector, PyPDF/Tika, Airflow or Lambda for orchestration
Docker, GPU scheduling, Athena/Redshift SQL
You've built ML pipelines that touched real users, not just notebooks
You've worked on semantic search, embeddings, or large-scale tagging
You've wrestled with unstructured data and love turning chaos into clarity
You like working fast, iterating with feedback, and tracking metrics that matter
Company
INFUSE
DEMAND EXCELLENCE DELIVERED
H1B Sponsorship
INFUSE has a track record of offering H1B sponsorship. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (6)
2024 (2)
Funding
Current Stage
Late Stage
Recent News
Newswire
2026-01-09
2025-11-15
Company data provided by Crunchbase