Falcon Smart IT (FalconSmartIT) · 1 day ago
Site Reliability Engineer (SRE) with strong Middleware expertise
Falcon Smart IT is seeking a Site Reliability Engineer (SRE) with strong Middleware expertise to design, operate, and continuously improve highly available, secure, and scalable enterprise platforms. The role blends deep middleware operations with SRE principles and works across teams to ensure platform reliability while driving automation and operational excellence.
Responsibilities
Define, implement, and track SLIs, SLOs, and error budgets for middleware and platform services
Drive MTTR reduction, availability improvements, and operational resilience
Lead incident response, root cause analysis (RCA), and post-incident reviews
Implement proactive monitoring and alerting to reduce noise and prevent outages
Administer and support enterprise middleware platforms including:
Oracle WebLogic, Apache, NGINX
API Gateways (Apigee Edge / X)
Java application servers and JVM-based services
Perform patching, upgrades, configuration tuning, and capacity planning
Manage certificates, keystores, truststores, and TLS configurations
Ensure platform security, compliance, and performance standards
Design and maintain end-to-end observability using tools such as:
Dynatrace, ELK/Kibana, Splunk (or equivalent)
Build executive and operational dashboards for real-time health visibility
Reduce alert fatigue through smart alerting, thresholds, and suppression
Monitor JVM metrics, GC behavior, thread utilization, and API performance
Develop automation and self-healing solutions using:
Shell scripting, Python, Ansible, Terraform, or similar tools
Automate routine operational tasks (restarts, validations, health checks)
Enable CI/CD-friendly middleware deployments and configuration management
Standardize environments across DEV / QA / UAT / PROD
Support middleware workloads on:
Kubernetes / OpenShift
Public or hybrid cloud environments (AWS, Azure, GCP)
Integrate platform reliability into containerized and microservices architectures
Collaborate with DevOps teams on deployment pipelines and release strategies
Act as a reliability advisor to application and development teams
Partner with Unix/Linux, Database, Network, and Security teams
Provide mentoring, documentation, and best-practice guidance
Participate in on-call rotations and production support leadership
Qualifications
Required
6+ years of experience in Middleware / Platform Operations / SRE
Strong expertise in WebLogic, Java middleware, Apache/NGINX
Hands-on experience with observability platforms (Dynatrace, ELK, Splunk)
Solid understanding of Linux/Unix systems and networking fundamentals
Experience with API platforms (Apigee preferred)
Automation and scripting skills (Shell, Python, Ansible, Terraform)
Experience with Kubernetes/OpenShift and containerized workloads
Practical experience implementing SRE principles in production
Strong troubleshooting skills (thread dumps, heap analysis, GC logs)
Experience with incident management, RCA, and change management
Ability to balance reliability against delivery velocity
Preferred
Experience with cloud-native architectures and service meshes
Knowledge of IAM / Security integrations (OAuth, SAML, mTLS)
Exposure to CI/CD tools (Jenkins, GitHub Actions, GitLab CI)
Experience supporting 24x7 enterprise environments
ITIL or SRE certifications
Company
Falcon Smart IT (FalconSmartIT)
Global specialist IT recruitment agency and leader in digital transformation, business IT solutions, and IT recruitment. Falcon Smart IT is a specialist recruitment agency that can quickly find you highly skilled IT professionals who are the best fit for your project or full-time hiring needs.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase