Senior Observability & Monitoring Engineer (remote) @ First American Financial Group, Inc. | Jobright.ai
Senior Observability & Monitoring Engineer (remote) jobs in Santa Ana, CA

First American Financial Group, Inc. · 1 day ago

Senior Observability & Monitoring Engineer (remote)

Customer Service · Financial Services

Responsibilities

Build solutions to provide monitoring patterns for various in-house and off-the-shelf applications across the company.
Measure and monitor all production systems with an eye toward availability, latency, and overall system health.
Engage with application teams to improve and evolve systems by lobbying for changes that enhance reliability, resilience, and observability.
Contribute to continuous improvement initiatives for the team and customers, with a goal of providing automation and enhancing client service, efficiency, and profitability.
Fine-tune existing tools, or research, develop, and implement new tools, to deliver additional monitoring capabilities.
Work on complex problems where analysis of situations or data requires an in-depth evaluation of multiple factors.

Qualifications

Elasticsearch, Terraform, AWS Native tools, Azure DevOps, APM tools, Observability tools, Cloud infrastructure, Software development, System operations, Monitoring technologies, Log aggregation, Microservices architectures, Cloud-native environments, Hybrid infrastructure, Automation, Coding for automation, Problem-solving techniques

Required

A proactive approach to designing telemetry strategies, implementing comprehensive monitoring systems, and leveraging advanced tools to gain real-time insights and identify potential issues before they escalate.
In-depth knowledge of and expertise in telemetry data collection, analysis, and implementation, including how to derive meaningful insights from different telemetry sources such as metrics, events, logs, and traces.
Expertise in identifying patterns, detecting anomalies, and building a holistic understanding of system behavior beyond the limitations of traditional monitoring approaches.
Experience in software engineering, software development, and/or system operations.
Experience with APM and Observability using tools such as ELK Stack, AWS CloudWatch, Azure Monitor, New Relic, Splunk, Prometheus, Grafana, Sentry, etc.
Extensive understanding of the complexities native to modern distributed systems.
Well-versed in the challenges posed by microservices architectures, cloud-native environments, and hybrid infrastructure setups.
Proven ability to lead complex initiatives/projects from inception to completion.
Ability to analyze metrics and logs, using problem-solving techniques to provide guidance on monitoring, alerting, dashboarding, and visualization.
Ability to work with a high level of autonomy and with a globally distributed team.
Excellent communication skills, both verbal and written; able to explain complex technical topics with ease to both internal and external stakeholders, including in remote/distributed environments.

Preferred

Hands-on experience with Elasticsearch, including deployment and management of the Elastic Stack, Beats and/or Fleet Agents, APM, Dashboarding, and Reporting.
Hands-on experience with DevOps practices, including using Git and developing CI/CD pipelines.
Hands-on experience with Infrastructure as Code (Terraform preferred).
Hands-on experience with monitoring and log aggregation technologies.
Hands-on experience with cloud infrastructure such as AWS, Azure, or Oracle Cloud Infrastructure.
Opinions about dashboards, metrics, and SLOs.
Strong knowledge of cloud design patterns for observability, monitoring, resiliency, etc.
Ability to understand and write code to perform various tasks related to automation & monitoring.

Benefits

Medical
Dental
Vision
401k
PTO/paid sick leave
Employee stock purchase plan

Company

First American Financial Group, Inc.

Purchasing a new home or refinancing? We have a solution.

Funding

Current Stage
Early Stage
Company data provided by Crunchbase