Edgesource Corporation · 16 hours ago
Data Engineer
Edgesource Corporation has been an innovative technology service provider for over 25 years, focusing on delivering solutions for various federal and state clients. They are seeking a data engineer to support data structuring for a custom-developed enterprise system, involving responsibilities such as designing ETL pipelines, integrating data from multiple sources, and collaborating with data scientists and stakeholders.
Information Services · Information Technology
Responsibilities
Design, develop, and maintain ETL/ELT pipelines for batch and real-time processing using Python and SQL
Integrate data from multiple sources, including databases, APIs, streaming platforms, PDFs, and MS Office files
Build scalable data architectures to support analytics and machine learning workloads
Optimize data processing and queries for performance and cost efficiency in AWS S3
Exposure to PySpark or other big data frameworks is a plus for future pipeline scalability
Develop and implement web scraping and data ingestion workflows to collect open-source data, integrating content and producing structured datasets and visualizations for analytics and stakeholder consumption
Collect, clean, and validate large volumes of structured and unstructured data
Track data versions, implement data quality checks, and ensure data reliability
Design and optimize data storage in AWS S3, including raw, intermediate, and final datasets
Implement data governance practices, including documentation, cataloging, lineage, and security
Ensure compliance with security standards
Work closely with Data Scientists, Analysts, and stakeholders to understand data requirements
Prepare clean, structured, and feature-ready datasets for analytics and machine learning
Support feature engineering, aggregations, and transformations at scale
Assist in deploying ML models to production, ensuring monitoring, versioning, and performance optimization
Integrate with REST APIs
Utilize Docker, Kubernetes, Git, and CI/CD pipelines to deploy and manage workflows
Document pipelines, data schemas, and transformations clearly
Communicate technical concepts effectively with cross-functional teams
Participate in code reviews and promote best practices across the team
Qualifications
Required
3–5+ years of professional experience in data engineering or related roles
Strong collaboration skills to work effectively with Data Scientists, Analysts, and Engineering teams
Ability to communicate complex technical concepts to non-technical stakeholders
Detail-oriented, curious, and committed to data quality
Capable of managing multiple priorities in a fast-paced environment
Python, SQL, and PySpark (highly desired) for data processing and pipeline development
Elastic/OpenSearch for search and analytics solutions
Experience with AWS cloud services and Linux environments
Git for version control and collaborative development
Understanding of machine learning workflows and MLOps concepts
Preferred
Hands-on experience modeling, querying, and optimizing graph databases; Neo4j experience is highly desired
Benefits
Flexible PTO Policy + 11 Paid Holidays
Flexible Work Schedules (Remote / Hybrid)
Medical / Dental / Vision / Flexible Spending Account (FSA)
401k Plan with Match
Tuition & Professional Development Support
Commuter Benefits
Bonus & Employee Referral Programs
Career Growth Opportunities
Company
Edgesource Corporation
NATIONAL SECURITY EXPERTISE THAT POWERS MISSION SUCCESS
Edgesource doesn't observe the mission from afar; we advance it from within.
Funding
Current Stage
Growth Stage