AIM Inc. · 1 month ago
Senior AI/ML Engineer (Remote)
AIM Inc. is seeking a skilled AI/ML Engineer to design, build, and deploy generative AI solutions using Python and Azure cloud services. The role involves owning the full lifecycle of AI/ML model development, ensuring operational excellence, and collaborating with cross-functional teams on AI initiatives.
Consulting, Information Technology, Training
Responsibilities
Design and implement preprocessing pipelines, inference services, and evaluation frameworks using Python and Hugging Face transformers
Fine-tune, optimize, and deploy large language models and generative AI models to production environments on Azure
Build and maintain retrieval-augmented generation (RAG) systems integrating vector databases such as Azure AI Search or Azure Cosmos DB with LLM workflows
Develop robust unit and integration tests for ML pipelines ensuring reliability before production deployment
Implement prompt engineering strategies and optimize context management for production LLM applications
Containerize ML models using Docker and deploy to Azure Container Instances, Azure App Service, or Azure Kubernetes Service
Implement health check endpoints, structured logging, and telemetry instrumentation for all deployed services
Configure and monitor Azure Application Insights dashboards to track model performance, latency, error rates, and resource utilization
Troubleshoot production incidents by analyzing Application Insights logs, distributed traces, and request telemetry to identify root causes and implement fixes
Optimize inference latency and cloud costs through caching strategies, model quantization, and efficient resource allocation
Build and maintain CI/CD pipelines using Azure DevOps or GitHub Actions for automated testing, model validation, and deployment
Implement model versioning, A/B testing infrastructure, and rollback procedures for production ML services
Manage containerized deployments and orchestration for scalable inference workloads
Work with the existing .NET API layer to integrate AI services using Azure SDKs
Collaborate with .NET developers to ensure seamless integration between Python ML services and .NET applications
Follow established patterns for dependency injection and service registration in .NET codebases
Partner with product, engineering, and DevOps teams to align AI initiatives with business objectives
Document architectures, runbooks, and troubleshooting guides to enable team knowledge sharing
Communicate technical concepts clearly to both technical and non-technical stakeholders
Qualifications
Required
3+ years of production Python development with demonstrated ability to write clean, testable, and maintainable code
Hands-on experience with Hugging Face transformers including model loading, tokenization, fine-tuning, and inference optimization
Ability to implement a data preprocessing pipeline and wrap a Hugging Face model for inference with unit tests
Strong proficiency with PyTorch, scikit-learn, pandas, and NumPy
Experience building REST APIs using FastAPI or Flask for ML model serving
While we build on Azure, we value strong conceptual knowledge of cloud operations. Experience with equivalent services in AWS (ECS, CloudWatch, Lambda) or GCP is fully acceptable, provided you are willing to cross-train.
2+ years deploying and operating workloads on Azure including Container Instances, App Service, or AKS
Demonstrated proficiency in Azure Application Insights including custom metrics, log queries using KQL, distributed tracing, and alert configuration
Ability to containerize a Hugging Face model, deploy it to Azure, expose health and logging endpoints, and demonstrate telemetry in Application Insights
Proven ability to analyze failing request traces and Application Insights logs in a production environment, quickly identify root causes, and propose fixes
Experience with Azure DevOps or GitHub Actions for CI/CD pipelines
Hands-on experience integrating Azure OpenAI Service, OpenAI APIs, or open-source LLMs into production applications
Practical experience implementing RAG architectures with vector databases and embedding models
Understanding of prompt engineering techniques, context management, token optimization, and LLM output parsing
Understanding of semantic contextual relevance for improving search accuracy and response quality in AI applications
Familiarity with chunking strategies, semantic search, and hybrid retrieval approaches
Dependency Injection: Register and consume services using Microsoft.Extensions.DependencyInjection (conceptually similar to dependency injection in FastAPI)
Azure SDKs: Use Azure.AI.OpenAI, Azure.Search.Documents, Azure.Storage.Blobs, and related packages (the client patterns closely mirror those of the Python Azure SDKs)
Polly: Configure retry policies and resilience patterns for HTTP calls
LINQ: Write query expressions for data manipulation (similar to Python list comprehensions)
Standard Libraries: Use System.Text.Json, HttpClient, and common .NET packages for API development
Preferred
Experience with LangChain, LlamaIndex, or similar LLM orchestration frameworks
Familiarity with Semantic Kernel for .NET and Python
Experience with model quantization (ONNX, TensorRT) for inference optimization
Knowledge of evaluation frameworks for LLM applications (RAGAS, custom metrics)
Prior experience working with distributed teams across time zones
Company
AIM Inc.
AIM is a specialized company providing a wide range of API management solutions and services to ensure optimal performance and security for businesses.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase