SteerBridge · 4 hours ago

Senior Data Architect

SteerBridge Strategies is a CVE-Verified Service-Disabled, Veteran-Owned Small Business delivering professional services to the U.S. Government and private sector. They are seeking a highly skilled Sr. Data Architect to support operations and sustainment of the F-35 and C-130 aircraft, focusing on designing, implementing, and managing data systems for maintenance and logistics.

Government · Information Technology · Software
No H1B · Security Clearance Required · U.S. Citizen Only
Hiring Manager
Brendan DiBari

Responsibilities

Designing and implementing scalable data architectures that support business intelligence, analytics, and machine learning workflows
Leading the development of highly available, fault-tolerant, and scalable data pipelines, integrating multiple data sources, and ensuring data quality
Migrating legacy data infrastructure to the cloud or developing new data platforms using cloud services, with a focus on cost efficiency and scalability
Building and managing big data platforms to enable large-scale analytics, often incorporating structured and unstructured data
Performance tuning for complex queries, implementing database replication and sharding strategies to support high availability and scalability
Developing and implementing data governance policies and security controls across the organization’s data assets, ensuring compliance with industry standards
Supporting data scientists with feature engineering, data wrangling, and model deployment, and designing architectures that support AI/ML initiatives

Qualifications

Data management · Data warehousing · Cloud platforms · Big data technologies · Data governance · Python · SQL · ETL processes · Data modeling · Database optimization · Leadership · Mentorship · Project management · Collaboration

Required

Must be a U.S. Citizen
Master's Degree or above in Systems Engineering, Computer Science, or a related field
An active security clearance or the ability to obtain one is required
A minimum of 10 years of experience, to include:
Experience in data management using advanced analytics tools and platforms, including Python
Experience with Data Warehousing consulting/engineering or related technologies (Redshift, Databricks, BigQuery, OADW, Apache Hive, Apache Lucene)
Experience in scripting, tooling, and automating large-scale computing environments
Extensive experience with major tools such as Python, Pandas, PySpark, NumPy, SciPy, SQL, and Git; some experience with TensorFlow, PyTorch, and Scikit-learn
Data modeling (conceptual, logical, and physical)
Database schema design
Understanding of different database paradigms (relational, NoSQL, graph databases, etc.)
ETL (Extract, Transform, Load) processes and tools
Experience with modern data warehousing solutions (e.g., Redshift, Snowflake, BigQuery)
Understanding of dimensional modeling (star/snowflake schemas) and data vault techniques
Experience designing for both OLTP and OLAP workloads
Familiarity with metadata-driven design and schema evolution in data systems
Experience defining data SLAs and lifecycle management policies
Project Experience: Designing and implementing scalable data architectures that support business intelligence, analytics, and machine learning workflows
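
To make the dimensional-modeling items above concrete, here is a minimal star-schema sketch using Python's built-in sqlite3. The maintenance-themed table and column names are hypothetical, chosen only to echo the F-35/C-130 sustainment context, not taken from the posting.

```python
import sqlite3

# Hypothetical star schema: one maintenance fact table plus two dimensions.
# Table and column names are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_aircraft (
    aircraft_key INTEGER PRIMARY KEY,
    tail_number  TEXT NOT NULL,
    model        TEXT NOT NULL           -- e.g. 'F-35' or 'C-130'
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,        -- surrogate key, e.g. 20240115
    iso_date TEXT NOT NULL,
    year     INTEGER NOT NULL
);
CREATE TABLE fact_maintenance (
    aircraft_key INTEGER REFERENCES dim_aircraft(aircraft_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    labor_hours  REAL NOT NULL
);
""")
# The classic star-schema query shape: aggregate facts, slice by dimensions.
rows = conn.execute("""
    SELECT a.model, d.year, SUM(f.labor_hours)
    FROM fact_maintenance f
    JOIN dim_aircraft a USING (aircraft_key)
    JOIN dim_date d USING (date_key)
    GROUP BY a.model, d.year
""").fetchall()
```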
Proficiency in tools like Apache Kafka, Airflow, Spark, Flink, or NiFi
Experience with cloud-based data services (AWS Glue, Google Cloud Dataflow, Azure Data Factory)
Real-time and batch data processing
Automation and monitoring of data pipelines
Strong understanding of incremental processing, idempotency, and backfill strategies
Knowledge of workflow dependency management, retries, and alerting
Experience writing modular, testable, and reusable Python-based ETL code
Project Experience: Leading the development of highly available, fault-tolerant, and scalable data pipelines, integrating multiple data sources, and ensuring data quality
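
A minimal sketch of the modular, testable, idempotent Python ETL code these items describe, assuming a delete-then-insert partition strategy so that re-runs and backfills converge to the same state. The function names, schema, and sqlite3 sink are illustrative only.

```python
import sqlite3
from datetime import date

def extract(run_date: date) -> list[tuple]:
    """Stub extract; a real pipeline would pull from source systems."""
    return [("TAIL-001", run_date.isoformat(), 4.5)]

def transform(rows: list[tuple]) -> list[tuple]:
    """Pure function: easy to unit-test and reuse across pipelines."""
    return [(tail, day, hours) for tail, day, hours in rows if hours >= 0]

def load(conn: sqlite3.Connection, run_date: date, rows: list[tuple]) -> None:
    """Idempotent load: delete-then-insert one day's partition so re-runs
    and backfills converge instead of duplicating rows."""
    with conn:  # a single transaction
        conn.execute("DELETE FROM maintenance WHERE day = ?",
                     (run_date.isoformat(),))
        conn.executemany("INSERT INTO maintenance VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE maintenance (tail TEXT, day TEXT, hours REAL)")
for _ in range(2):  # running twice yields the same result: idempotent
    load(conn, date(2024, 1, 15), transform(extract(date(2024, 1, 15))))
```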
Expertise in cloud environments (AWS, GCP, Azure)
Understanding of cloud-based storage (S3, Blob Storage), databases (RDS, DynamoDB), and compute resources
Implementing cloud-native data solutions (Data Lake, Data Warehouse, Data Mesh)
Experience with cost monitoring and optimization for data workloads
Familiarity with hybrid and multi-cloud architectures
Understanding of serverless data patterns (e.g., Lambda + S3 + Athena, Cloud Functions + BigQuery)
Project Experience: Migrating legacy data infrastructure to the cloud or developing new data platforms using cloud services, with a focus on cost efficiency and scalability
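
One hedged sketch of the serverless pattern named above (Lambda + S3 + Athena): a Lambda handler submits an Athena query over data in S3 and results land back in S3. The boto3 call (start_query_execution) is real; the bucket, database, and table names are placeholders.

```python
import boto3

# Hypothetical Lambda handler illustrating the Lambda + S3 + Athena pattern.
athena = boto3.client("athena")

def handler(event, context):
    # Database, table, and output bucket below are placeholder names.
    resp = athena.start_query_execution(
        QueryString="SELECT model, COUNT(*) FROM maintenance GROUP BY model",
        QueryExecutionContext={"Database": "sustainment"},
        ResultConfiguration={
            "OutputLocation": "s3://example-results-bucket/athena/"
        },
    )
    return {"query_execution_id": resp["QueryExecutionId"]}
```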
Experience with big data ecosystems (Hadoop, HDFS, Hive, Spark)
Distributed computing, parallel processing, and handling petabyte-scale data
Tools for querying large datasets (Presto, Athena)
Understanding of lakehouse frameworks (Delta Lake, Iceberg, Hudi)
Familiarity with data compaction, schema evolution, and ACID guarantees in distributed storage
Project Experience: Building and managing big data platforms to enable large-scale analytics, often incorporating structured and unstructured data
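
A brief illustration of the kind of Spark batch job implied above: read a large partitioned Parquet dataset, aggregate, and write partitioned output. Paths and column names are placeholders, and lakehouse specifics (Delta/Iceberg/Hudi) are omitted for brevity.

```python
from pyspark.sql import SparkSession, functions as F

# Illustrative batch rollup over a Parquet dataset; all names are invented.
spark = SparkSession.builder.appName("maintenance-rollup").getOrCreate()

events = spark.read.parquet("s3://example-lake/maintenance/")
rollup = (
    events
    .where(F.col("labor_hours") > 0)
    .groupBy("model", "year")
    .agg(F.sum("labor_hours").alias("total_hours"))
)
rollup.write.mode("overwrite").partitionBy("year").parquet(
    "s3://example-lake/rollups/"
)
```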
Expertise in database technologies (SQL, NoSQL, GraphDBs)
Query optimization, indexing, and partitioning strategies
Backup, replication, and disaster recovery planning
Understanding of query execution plans, cost-based optimization, and caching strategies
Experience performing index and partition design based on query patterns
Familiarity with data versioning and temporal tables
Experience profiling and optimizing application code interacting with databases
Project Experience: Performance tuning for complex queries, implementing database replication and sharding strategies to support high availability and scalability
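
To make the query-plan and indexing points concrete, a small self-contained example using sqlite3's EXPLAIN QUERY PLAN, showing a full table scan turn into an index search once an index matches the query's predicate. Table and index names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (tail TEXT, day TEXT, hours REAL)")

def plan(sql: str) -> str:
    # The last column of each EXPLAIN QUERY PLAN row is the detail string.
    return "\n".join(r[-1] for r in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT SUM(hours) FROM events WHERE tail = 'TAIL-001'"
print(plan(query))   # SCAN events  (full table scan)
conn.execute("CREATE INDEX idx_events_tail ON events(tail)")
print(plan(query))   # SEARCH events USING INDEX idx_events_tail (tail=?)
```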
Data privacy, encryption, and compliance with regulations (GDPR, CCPA)
Implementing data governance frameworks (data lineage, cataloging, metadata management)
Role-based access control and user management for sensitive data
Experience with automated policy enforcement and data lineage visualization tools (e.g., DataHub, Collibra, Alation)
Knowledge of data quality frameworks integrated into CI/CD pipelines
Familiarity with data contract testing between producer and consumer teams
Project Experience: Developing and implementing data governance policies and security controls across the organization's data assets, ensuring compliance with industry standards
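
A toy sketch of the producer/consumer data-contract testing mentioned above, in plain Python: the producer team declares the schema its consumers depend on, and a CI check fails on breaking changes. The CONTRACT fields are hypothetical; a real setup would typically enforce this in the CI/CD pipeline.

```python
# Hypothetical data contract between a producer and its consumers.
CONTRACT = {"tail_number": str, "event_date": str, "labor_hours": float}

def validate(record: dict) -> list[str]:
    """Return a list of contract violations for one record."""
    errors = [f"missing field: {f}" for f in CONTRACT if f not in record]
    errors += [
        f"{f}: expected {t.__name__}, got {type(record[f]).__name__}"
        for f, t in CONTRACT.items()
        if f in record and not isinstance(record[f], t)
    ]
    return errors

assert validate({"tail_number": "TAIL-001", "event_date": "2024-01-15",
                 "labor_hours": 4.5}) == []
assert validate({"tail_number": "TAIL-001"})  # violations are reported
```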
Proficiency in Python and SQL
Experience with version control (Git) and CI/CD for data engineering (Gitlab, Jenkins, CircleCI)
API design and integration (Postman)
Strong understanding of object-oriented programming (OOP) principles and design patterns in Python
Familiarity with software engineering best practices (modularity, testing, documentation, linting)
Understanding of algorithmic complexity (Big O notation) and ability to optimize code for scale
Experience with parallel and distributed computation frameworks (Spark, Dask, Ray)
Ability to profile and debug performance bottlenecks in data workflows
Use of type hinting, logging frameworks, and automated testing frameworks (pytest, unittest); a small combined example follows
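
The sketch below combines several of the practices listed above (type hints, logging, and a pytest-style test) in one hypothetical helper; the function and its data are invented for illustration.

```python
import logging
from typing import Iterable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

def dedupe(records: Iterable[tuple[str, str]]) -> list[tuple[str, str]]:
    """Order-preserving de-duplication; typed so mypy can check call sites."""
    seen: set[tuple[str, str]] = set()
    out: list[tuple[str, str]] = []
    total = 0
    for rec in records:
        total += 1
        if rec not in seen:
            seen.add(rec)
            out.append(rec)
    log.info("deduped %d records to %d", total, len(out))
    return out

def test_dedupe() -> None:  # discovered and run by pytest
    assert dedupe([("a", "1"), ("a", "1"), ("b", "2")]) == [("a", "1"),
                                                            ("b", "2")]
```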
Experience in supporting data scientists with feature engineering, data wrangling, and model deployment
Knowledge of ML orchestration tools (MLflow, Kubeflow)
Hands-on experience with analytics tools (e.g., Tableau, Power BI)
Familiarity with feature store design and model feature lineage tracking
Understanding of data versioning and reproducibility for ML workflows
Experience supporting real-time model inference pipelines
Project Experience: Designing architectures that support AI/ML initiatives, enabling scalable data pipelines for training models, and supporting experimentation in the production environment
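
An illustrative Pandas feature-engineering snippet of the sort these items describe, rolling raw events up into per-entity model features; the columns and aggregates are invented for the example.

```python
import pandas as pd

# Invented raw events; a real pipeline would read these from a data store.
events = pd.DataFrame({
    "tail_number": ["TAIL-001", "TAIL-001", "TAIL-002"],
    "event_date": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-01-15"]),
    "labor_hours": [4.5, 3.0, 6.2],
})

# One row of features per aircraft, suitable as model input.
features = (
    events.sort_values("event_date")
          .groupby("tail_number")
          .agg(total_hours=("labor_hours", "sum"),
               event_count=("labor_hours", "size"),
               last_event=("event_date", "max"))
          .reset_index()
)
features["avg_hours_per_event"] = (
    features["total_hours"] / features["event_count"]
)
```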
Leading data engineering teams, cross-functional collaboration with data scientists, analysts, and business units
Project management (Agile, Scrum, Kanban) and stakeholder communication
Experience with mentorship and growing junior data engineers
Experience establishing data architecture standards and best practices
Ability to review and approve technical designs for consistency and scalability
Proven success in mentoring engineers in code quality, modeling, and system design
Project Experience: Leading the technical direction for large-scale data initiatives, such as enterprise data lake implementations or the creation of a unified data platform

Benefits

Health insurance
Dental insurance
Vision insurance
Life insurance
401(k) Retirement Plan with matching
Paid Time Off
Paid Federal Holidays

Company

SteerBridge

SteerBridge specializes in providing professional services and cutting-edge solutions to the U.S. Government and corporate clients.

Funding

Current Stage
Growth Stage

Leadership Team

Doug Lee
Partner
Rob Schroder
Founder and Managing Partner