VARITE INC · 1 month ago
USA_Database Administrator
VARITE INC is seeking a Database Administrator with expertise in CockroachDB. The role involves designing, deploying, and maintaining multi-region CockroachDB clusters while ensuring high availability and data consistency in production environments.
Information Technology & Services
Responsibilities
Design, deploy, operate, and scale multi-region CockroachDB clusters in production environments
Ensure high availability, fault tolerance, and data consistency for globally distributed clusters
Monitor cluster health, latency, replication status, and resource utilization using observability tools
Perform capacity planning and proactive scaling for future growth
Troubleshoot complex database and infrastructure issues including:
Node failures
Network partitions
Leaseholder and range imbalance
Replication lag
Hotspotting
High latency / throughput bottlenecks
Design disaster recovery strategies (multi-region, backup/restore, failover/failback)
Implement and test backup, restore, and point-in-time recovery processes
Automate provisioning, scaling, patching, and upgrades of CRDB clusters
Perform rolling upgrades with zero or near-zero downtime
Optimize SQL query performance and database schema efficiency
Create operational runbooks, SOPs, and on-call playbooks for CRDB
Participate in on-call rotations and incident response for production clusters