Cloud Systems Engineer jobs in United States
This job has closed.
ExpediteInfoTech, Inc. · 6 hours ago

Cloud Systems Engineer

ExpediteInfoTech, Inc. is looking for a Cloud Systems Engineer to manage its Databricks platform. The role covers platform operations, security, and governance configuration in support of secure analytics and AI/ML workloads.
Apps · Consulting · Information Services · Information Technology · Outsourcing
No H1B · U.S. Citizen Only

Responsibilities

Administer Databricks (workspace administration, clusters/compute policies, jobs, SQL warehouses, repos, runtime management), with expert use of the Databricks CLI
Configure and maintain clusters/compute, job compute, SQL warehouses, runtime versions, libraries, repos, and workspace settings
Implement platform monitoring/alerting, operational dashboards, and health checks; maintain runbooks and operational procedures
Manage change control for upgrades, feature rollouts, configuration changes, and integration changes; document impacts and rollback plans
Enforce least privilege across platform resources (workspaces, jobs, clusters, SQL warehouses, repos, secrets) using role/group-based access patterns
Enable and maintain audit logging and access/event visibility; support security reviews and evidence requests
Administer Unity Catalog governance: metastores, catalogs/schemas/tables, ownership, grants, and environment/domain patterns
Configure and manage external locations, storage credentials, and governed access to cloud object storage
Coordinate secure connectivity and guardrails with cloud/network teams: private connectivity patterns, egress controls, firewall/proxy needs
Implement cost guardrails: cluster policies, auto-termination, scheduling, workload sizing standards, and capacity planning
Produce usage/cost insights and optimization recommendations; address waste drivers (idle compute, oversized clusters, inefficient jobs)
Automate administration and configuration using APIs/CLI/IaC (e.g., Terraform) to reduce manual drift and improve repeatability
Maintain platform documentation: configuration baselines, security/governance standards, onboarding guides, and troubleshooting references
Monitor and optimize platform performance, including SQL warehouse query tuning, cluster autoscaling configuration, Photon enablement, and Delta Lake optimization guidance (OPTIMIZE, VACUUM, Z-ordering strategies)
Administer Delta Live Tables (DLT) pipelines and coordinate with data engineering teams on pipeline health, data quality monitoring, failed job remediation, and pipeline configuration best practices
Manage third-party integrations and ecosystem connectivity, including BI tool integrations (e.g., Power BI), and external metadata catalog integrations
Implement Databricks Asset Bundles (DABs) for standardized deployment patterns; automate workspace resource deployment (jobs, pipelines, dashboards) across SDLC environments using bundle-based CI/CD workflows
Conduct capacity planning and scalability analysis, including forecasting concurrent user/workload growth, platform scaling strategies, and proactive resource allocation during peak usage periods
Facilitate user onboarding and enablement, including new user/team onboarding procedures, training coordination, workspace access provisioning, and creation of self-service documentation/guides

Qualifications

Databricks administration · Unity Catalog administration · Identity & Access Management · Automation skills · Cloud platform expertise · CI/CD practices · Data governance controls · Security fundamentals · SQL proficiency · Troubleshooting skills · Communication skills · Team collaboration · Problem-solving skills

Required

Bachelor's degree in Engineering, Computer Science, Information Systems, or IT related discipline
Minimum of 10 years' experience in Information Technology in an Enterprise environment
Intermediate-level qualifications and capabilities per Section 3.5.6.1
Experience analyzing cloud requirements, concept of operations documents, and high-level system architectures to develop cloud system specifications
Hands-on experience administering Databricks (workspace administration, clusters/compute policies, jobs, SQL warehouses, repos, runtime management) and expertise using Databricks CLI
Strong Unity Catalog administration: metastores; catalogs/schemas; grants; service principals; external locations; storage credentials; governed storage access
Identity & Access Management proficiency: SSO concepts, SCIM provisioning, group-based RBAC, service principals, and least-privilege patterns
Security fundamentals: secrets management, secure connectivity, audit logging, access monitoring, and evidence-ready operations
Automation skills: scripting and/or IaC using Terraform/CLI/REST APIs for repeatable configuration and environment promotion
Experience implementing data governance controls (classification/tagging, lineage/metadata integrations) in partnership with governance teams
CI/CD practices for jobs/notebooks/config promotion across SDLC environments
Understanding of lakehouse concepts (e.g., Delta, table lifecycle management, separation of storage/compute)
Strong troubleshooting and problem-solving; communicate clearly during incidents and changes
Experience administering Databricks serverless compute, Workspace Git integrations (GitLab), Databricks Asset Bundles (DABs) for deployment automation, and modern workspace features supporting DevOps workflows
Bachelor's degree in a related field or equivalent practical experience
7+ years in cloud/data platform administration and operations, including 3+ years administering Databricks

Preferred

Cloud platform expertise (AWS): IAM roles/policies, object storage security patterns, networking basics (VPC concepts), logging/monitoring integration
SQL proficiency and data engineering fundamentals for troubleshooting query performance issues, understanding ETL/ELT workflow patterns, and debugging data pipeline failures; basic Python/Scala familiarity for notebook/code issue diagnosis
Experience with compliance and regulatory frameworks (FedRAMP, HIPAA, SOC2, or similar) including implementation of data residency requirements, retention policies, and audit-ready evidence collection
Hands-on experience with AWS security and networking services, including PrivateLink, Secrets Manager/Systems Manager integration, CloudWatch/CloudTrail integration, S3 bucket policies, cross-account access patterns, and KMS encryption key management
Demonstrated experience in Databricks and cloud FinOps, budget management, SLA/SLO management, incident management, and stakeholder communication; ability to define platform service levels, produce operational reports, translate technical issues for business stakeholders, and manage vendor relationships (Databricks account teams)
5+ years of demonstrated experience administering Databricks
Databricks Platform Administrator/Databricks AWS Platform Architect
Databricks Certified Data Engineer Associate/Professional
AWS Certified Solutions Architect Associate or Professional

Company

ExpediteInfoTech, Inc.

At ExpediteInfoTech, we harness advanced technologies like Artificial Intelligence (AI), Machine Learning (ML), Internet of Things (IoT), and immersive solutions to drive transformation across businesses and government operations.

Funding

Current Stage
Growth Stage

Leadership Team

Nageswara Tripuramallu
President & CEO
Company data provided by crunchbase