
Photon · 1 day ago

Data Engineer - Dallas, TX

Photon is seeking a Data Engineer to build and scale the data infrastructure for its Agentic AI products. The role involves designing data pipelines, optimizing vector databases, and ensuring data quality for AI applications.

E-Commerce, Information Technology, Mobile Apps, Web Design, Web Development
H1B Sponsor Likely

Responsibilities

Design and implement scalable ETL/ELT pipelines that process both structured (SQL, logs) and unstructured (PDFs, emails, docs) data specifically for LLM consumption
Architect and optimize Vector Databases (e.g., Pinecone, Weaviate, Milvus, or Qdrant) to ensure high-speed, relevant similarity searches for agentic retrieval
Collaborate with AI Engineers to optimize data chunking strategies and embedding models to improve the "recall" and "precision" of the agent's knowledge retrieval
Develop automated "Data Cleaning" workflows to remove noise, PII (Personally Identifiable Information), and toxicity from training/context datasets
Enrich raw data with advanced metadata tagging to help agents filter and prioritize information during multi-step reasoning tasks
Build low-latency data streams (using Kafka or Flink) to provide agents with "fresh" data, enabling them to act on real-time market or operational changes
Construct "Gold Datasets" and versioned data snapshots to help the team benchmark agent performance over time

Qualifications

Data Engineering, Python, Vector Database, Data Tooling, Cloud Infrastructure, Search Knowledge, Data-Centric AI, Graph Databases

Required

4+ years in Data Engineering, with at least 1 year focusing on data for LLMs or AI/ML applications
Deep expertise in Python (Pandas, Pydantic, FastAPI) for data manipulation and API integration
Strong experience with modern data stack tools (e.g., dbt, Airflow, Dagster, Snowflake, or Databricks)
Hands-on experience with at least one major Vector Database and knowledge of similarity search techniques (HNSW indexing, cosine similarity)
Familiarity with hybrid search techniques (combining semantic search with traditional keyword search like Elasticsearch/BM25)
Proficiency in managing data workloads on AWS, Azure, or GCP

Preferred

Experience with LlamaIndex or LangChain for data ingestion
Knowledge of Graph Databases (e.g., Neo4j) to help agents understand complex relationships between data points
Familiarity with 'Data-Centric AI' principles—prioritizing data quality over model size

Benefits

Medical, vision, and dental benefits
401k retirement plan
Variable pay/incentives
Paid time off
Paid holidays

Company

Photon is a technology corporation that provides Strategy Consulting, Creative Design, and Technology Services to global enterprises.

H1B Sponsorship

Photon has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for reference. (Data provided by the US Department of Labor.)
Distribution of Different Job Fields Receiving Sponsorship (chart not reproduced)
Trends of Total Sponsorships
2025 (233)
2024 (168)
2023 (236)
2022 (184)
2021 (157)
2020 (249)

Funding

Current Stage
Late Stage

Leadership Team

Mukund Balasubramanian
Chief Technology Officer
Sanjiv C Lochan
CFO
Company data provided by Crunchbase