AvidXchange, Inc. · 16 hours ago
Senior Site Reliability Engineer
AvidXchange is a leading provider of accounts payable automation software and payment solutions for middle-market businesses. The Senior Site Reliability Engineer is responsible for contributing to site health, reliability, and user experience of Avid products while collaborating with product teams to enhance customer satisfaction and product availability.
FinTech · Payments · SaaS · Transaction Processing
Responsibilities
Responsible for meaningfully contributing and providing continuous feedback on the site health, reliability, availability, and user experience of Avid products
Understand the product in depth, collect and analyze meaningful measurements and provide feedback to the business, Software Engineering and Product teams
Work closely with key stakeholders to help drive changes to increase customer satisfaction, product availability, reliability, and the completion of strategic technical initiatives
Focus on automation opportunities and automating operational processes to maintain high availability of the product
Perform application specific SRE support, RCAs, and service restoration as needed to quickly respond to and resolve production issues
Plan and achieve high availability and performance of the product service
Ensure proactive monitoring of all core services and processes to prevent unplanned service disruption
Implement self-healing and scalability of technical services to avoid unplanned disruptions
Identify significant projects that result in substantial improvements in reliability, cost savings and/or revenue
Identify changes to the product architecture from the reliability, performance, and availability perspectives with a data-driven approach
Influence the product roadmap and work with engineering and product counterparts to improve the resiliency and reliability of the product
Proactively work on efficiency and capacity planning to set clear requirements and optimize system resource usage
Identify Service Level Indicators (SLIs) that will align the team to meet the availability and latency objectives
Provide detailed analysis and troubleshooting for system outages, providing feedback to product and software engineering
Lead initiatives, including problem definition, scoping, design, and planning, through epics and blueprints
Build deep domain knowledge and share that knowledge through recorded demos, technical presentations, and discussions
Perform and run blameless RCAs on incidents and outages, aggressively looking for answers that will prevent the incident from ever happening again
Maintain awareness and actively influence stage group plans and priorities through participation in stage group meetings and async discussions. Act as a champion for reliability
Set an example for the team of SREs with positive and inclusive leadership and discussion on work
Qualifications
Required
7+ years of Software Engineering or Site Reliability Engineering experience
Bachelor's degree in Computer Science, Information Technology, or equivalent experience plus certifications
Understanding of web hosting infrastructure and architecture in highly available environments
Working knowledge and experience with C#, JavaScript, and HTML
Experience with one of the Public Cloud architectures (Azure experience highly desired)
Familiarity with RESTful APIs and .NET applications
Strong hands-on experience with Azure services (e.g., APIM, Function Apps, Key Vault, App Services) or AWS
Expertise with Kubernetes, Docker, and Helm in production environments
Experience applying Chaos Engineering concepts or tools
Strong scripting and automation capabilities in PowerShell, Python, SQL, and use of workflow tools like Power Automate
Familiarity with CI/CD, DevSecOps practices, Linux system administration, and cloud security principles
General knowledge of most technical expertise areas, with deep knowledge in at least two areas: advanced Terraform syntax; Ansible (syntax, tasks, playbooks); and CI/CD configuration, pipelines, and jobs
Advanced knowledge of cloud services (preferably Azure)
Monitoring with Dynatrace, Azure Application Insights, Prometheus, and Grafana: service catalog metrics and recording rules for alerts
Log shipping pipelines and incident debugging visualizations
Ability to understand the codebase and contribute improvements to resolve issues
Proven ability to develop internal applications/tools and contribute code to infrastructure or production systems
Preferred
Microsoft Azure Administrator Associate (AZ-104)
Certified Kubernetes Administrator (CKA)
HashiCorp Terraform Associate
Experience in FinTech or payment systems environments
Experience with containerizing legacy applications for Kubernetes-based deployment
Benefits
18 days PTO*
11 Holidays (8 company recognized & 3 floating holidays)
16 hours per year of paid Volunteer Time Off (VTO)
Competitive Healthcare
High Deductible Health Plan Option that has $0 monthly premium for teammate-only coverage
100% AvidXchange paid Dental Base Plan Coverage
100% AvidXchange paid Life Insurance
100% AvidXchange paid Long-Term Disability
100% AvidXchange paid Short-Term Disability
Employee Assistance Program (EAP) - Provides counseling services, legal and financial consultations and health advocacy for Teammates and their eligible dependents
Onsite Health Clinic with Atrium Health - available to Teammates and their eligible dependents
401(k) Match: 100% match on the first 3% of your salary, plus 50% match on the next 2%
Parental Leave: 8 weeks 100% paid by AvidXchange**
Discounts on Pet, Home, and Auto insurance
WeeCare Childcare Service: helps teammates find affordable daycare, childcare, and tutors that are 40% less expensive than traditional daycare centers
Perks at Work: free discount program that provides teammates the opportunity to save on items from electronics, movie tickets, car buying, vacations, and more
Onsite gym fitness center, yoga studio, and basketball court
Tuition Reimbursement up to the federal maximum of $5,250***
Hybrid Workplace Flexibility
Free parking
Company
AvidXchange, Inc.
We are a leading provider of accounts payable automation software and payment solutions for mid-market businesses and their suppliers.
H1B Sponsorship
AvidXchange, Inc. has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by the US Department of Labor)
Trends of Total Sponsorships
2025 (56)
2024 (36)
2023 (32)
2022 (45)
2021 (16)
2020 (24)
Funding
Current Stage: Public Company
Total Funding: $1.13B
Key Investors: Corpay, Sixth Street, Fifth Third Bank
2025-05-06: Corporate Round
2025-05-06: Acquired
2021-10-13: IPO
Company data provided by crunchbase