Bright Vision Technologies · 21 hours ago
AI Infrastructure Engineer
Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. They are seeking a skilled AI Infrastructure Engineer to design and manage AI/ML infrastructure, develop scalable environments, and optimize AI workloads across cloud platforms.
Artificial Intelligence (AI) · Cyber Security · Information Technology · Software
Responsibilities
Design and manage AI/ML Infrastructure optimized for GPU Computing using NVIDIA CUDA, enabling high-throughput training and inference workloads
Develop and automate scalable environments with Python scripting on Linux, leveraging Docker for containerization and Kubernetes for orchestration
Deploy and optimize AI workloads across Cloud Platforms (AWS, Azure, GCP), configuring GPU clusters for cost-effective scaling
Implement AI Workload Orchestration tools to schedule, manage, and monitor distributed training jobs across multi-node setups
Build High-Performance Computing (HPC) systems with Distributed Systems expertise, focusing on low-latency Storage & Networking for AI (e.g., NVMe, InfiniBand)
Provision infrastructure using Infrastructure as Code (Terraform), ensuring reproducible and version-controlled deployments
Establish CI/CD pipelines with Git integration for automated building, testing, and rollout of AI infrastructure components
Set up Monitoring & Observability stacks (e.g., Prometheus, Grafana) to track GPU utilization, cluster health, and performance bottlenecks
Collaborate within Agile teams, delivering iterative improvements to AI infrastructure through sprints and cross-functional teamwork
Optimize resource allocation for AI pipelines, reducing costs while maximizing throughput for large-scale model training and serving
Qualifications
Required
Experience designing and managing AI/ML infrastructure optimized for GPU computing with NVIDIA CUDA, supporting high-throughput training and inference workloads
Experience with Python scripting on Linux, using Docker for containerization and Kubernetes for orchestration
Experience deploying and optimizing AI workloads on cloud platforms (AWS, Azure, GCP), including configuring GPU clusters for cost-effective scaling
Experience with AI workload orchestration tools for scheduling, managing, and monitoring distributed training jobs across multi-node setups
High-Performance Computing (HPC) and distributed systems expertise, with a focus on low-latency storage and networking for AI (e.g., NVMe, InfiniBand)
Experience provisioning infrastructure with Infrastructure as Code (Terraform) for reproducible, version-controlled deployments
Experience establishing CI/CD pipelines with Git integration for automated building, testing, and rollout of AI infrastructure components
Experience with monitoring and observability stacks (e.g., Prometheus, Grafana) for tracking GPU utilization, cluster health, and performance bottlenecks
Experience working in Agile teams, delivering iterative improvements through sprints and cross-functional teamwork
Experience optimizing resource allocation for AI pipelines, reducing costs while maximizing throughput for large-scale model training and serving
Company
Bright Vision Technologies
Bright Vision Technologies is an information technology company that offers software development, AI, and cybersecurity services.
H1B Sponsorship
Bright Vision Technologies has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is provided below for your reference. (Data powered by the US Department of Labor)
[Chart: Distribution of Different Job Fields Receiving Sponsorship; the highlighted segment represents job fields similar to this job]
Trends of Total Sponsorships
2025 (41)
2024 (14)
2023 (7)
2022 (12)
2021 (1)
Funding
Current Stage
Growth Stage
Company data provided by Crunchbase