AIM Inc. · 1 month ago

Senior AI/ML Engineer (Remote)

United States

Full-time

Remote

Mid, Senior Level

3+ years exp

AIM Inc. is seeking a skilled AI/ML Engineer to design, build, and deploy generative AI solutions using Python and Azure cloud services. The role involves owning the full lifecycle of AI/ML model development, ensuring operational excellence, and collaborating with cross-functional teams on AI initiatives.

Consulting · Information Technology · Training

Responsibilities

Design and implement preprocessing pipelines, inference services, and evaluation frameworks using Python and Hugging Face transformers

Fine-tune, optimize, and deploy large language models and generative AI models to production environments on Azure

Build and maintain retrieval-augmented generation (RAG) systems that integrate vector databases such as Azure AI Search or Azure Cosmos DB with LLM workflows

Develop robust unit and integration tests for ML pipelines to ensure reliability before production deployment

Implement prompt engineering strategies and optimize context management for production LLM applications

Containerize ML models using Docker and deploy to Azure Container Instances, Azure App Service, or Azure Kubernetes Service

Implement health check endpoints, structured logging, and telemetry instrumentation for all deployed services

Configure and monitor Azure Application Insights dashboards to track model performance, latency, error rates, and resource utilization

Troubleshoot production incidents by analyzing Application Insights logs, distributed traces, and request telemetry to identify root causes and implement fixes

Optimize inference latency and cloud costs through caching strategies, model quantization, and efficient resource allocation

Build and maintain CI/CD pipelines using Azure DevOps or GitHub Actions for automated testing, model validation, and deployment

Implement model versioning, A/B testing infrastructure, and rollback procedures for production ML services

Manage containerized deployments and orchestration for scalable inference workloads

Work with the existing .NET API layer to integrate AI services using Azure SDKs

Collaborate with .NET developers to ensure seamless integration between Python ML services and .NET applications

Follow established patterns for dependency injection and service registration in .NET codebases

Partner with product, engineering, and DevOps teams to align AI initiatives with business objectives

Document architectures, runbooks, and troubleshooting guides to enable team knowledge sharing

Communicate technical concepts clearly to both technical and non-technical stakeholders

Qualifications

Python · Hugging Face · Azure · MLOps · CI/CD · PyTorch · scikit-learn · FastAPI · Docker · C# · Dependency Injection · Communication · Collaboration

Required

3+ years of production Python development with demonstrated ability to write clean, testable, and maintainable code

Hands-on experience with Hugging Face transformers, including model loading, tokenization, fine-tuning, and inference optimization

Ability to implement a data preprocessing pipeline and wrap a Hugging Face model for inference with unit tests

Strong proficiency with PyTorch, scikit-learn, pandas, and NumPy

Experience building REST APIs using FastAPI or Flask for ML model serving

While we build on Azure, we value strong conceptual knowledge of cloud operations. Experience with equivalent services in AWS (ECS, CloudWatch, Lambda) or GCP is fully acceptable, provided you are willing to cross-train.

2+ years deploying and operating workloads on Azure including Container Instances, App Service, or AKS

Demonstrated proficiency in Azure Application Insights including custom metrics, log queries using KQL, distributed tracing, and alert configuration

Ability to containerize a Hugging Face model, deploy it to Azure, expose health and logging endpoints, and demonstrate telemetry in Application Insights

Proven ability to analyze failing request traces and Application Insights logs in a production environment, quickly identify root causes, and propose fixes

Experience with Azure DevOps or GitHub Actions for CI/CD pipelines

Hands-on experience integrating Azure OpenAI Service, OpenAI APIs, or open-source LLMs into production applications

Practical experience implementing RAG architectures with vector databases and embedding models

Understanding of prompt engineering techniques, context management, token optimization, and LLM output parsing

Understanding of semantic and contextual relevance for improving search accuracy and response quality in AI applications

Familiarity with chunking strategies, semantic search, and hybrid retrieval approaches

Dependency Injection: Register and consume services using Microsoft.Extensions.DependencyInjection (conceptually similar to dependency injection in Python web frameworks such as FastAPI)

Azure SDKs: Use Azure.AI.OpenAI, Azure.Search.Documents, Azure.Storage.Blobs, and related packages (patterns closely parallel those of the Python Azure SDKs)

Polly: Configure retry policies and resilience patterns for HTTP calls

LINQ: Write query expressions for data manipulation (similar to Python list comprehensions)

Standard Libraries: Use System.Text.Json, HttpClient, and common .NET packages for API development

Preferred

Experience with LangChain, LlamaIndex, or similar LLM orchestration frameworks

Familiarity with Semantic Kernel for .NET and Python

Experience with model quantization and optimized inference runtimes (ONNX, TensorRT)

Knowledge of evaluation frameworks for LLM applications (RAGAS, custom metrics)

Prior experience working with distributed teams across time zones

Company

AIM Inc.

AIM is a specialized company providing a wide range of API management solutions and services to ensure optimal performance and security for businesses.

Founded in 2006

Oakville, Ontario, CAN

11-50 employees

https://www.architectureinmotion.ca/

Funding

Current Stage

Early Stage

Company data provided by Crunchbase