Jobs via Dice · 12 hours ago
Site Reliability Engineer
Jobs via Dice is seeking a Site Reliability Engineer for a W2 role. The role involves leading a team of infrastructure and application support engineers, focusing on the design, engineering, monitoring, and building of infrastructure that supports financial transactions in a regulated environment.
Computer Software
Responsibilities
Create and execute a technology-focused plan covering project delivery, sustainment of the production environments, and talent development for the Infrastructure and Application engineering team members
Lead engineering efforts to enhance availability capabilities and perform hardware and software lifecycle management
Responsible for developing direct relationships with various infrastructure technology and business leaders across the Enterprise to accomplish shared goals
Participate in selecting next-generation technologies and strategies, and lead a team of engineers charged with implementing, maintaining, and reporting on the infrastructure and application technologies that are directly consumed by customers
Lead complex designs with a footprint across large-scale data centers processing mission-critical, high-volume transactions
Research & development within the project life cycle; technical analysis and design; and support of operations staff in executing, testing, and rolling out the solutions
Participation on projects is focused on smoothing the transition of projects from development staff to production staff by performing operations activities within the project life cycle
Lead a team of infrastructure and application engineers, and plan, design, implement, and support the company's infrastructure and application measures across servers, software, appliances, load balancers, and HSM
Select, develop, and evaluate personnel to ensure the efficient operation of the function
Responsible for system security and data integrity
Assign passwords, monitor use of resources, back up files as required, and respond to management requests for information
Build, configure and maintain physical and virtual servers for a large-scale Linux platform
Collaborate with team members on a regular basis to maintain consistency in design and implementation
Collaborate with members of the lab team supporting our system to keep the environment stable, and provide advance notification of any concerns or outages
Create and modify scripts as needed to automate repetitive tasks
Provide operational support to development and test community
Monitor system performance and system/security logs, and identify potential issues with systems
Qualifications
Required
Bachelor's degree in a computer-related field or equivalent
10+ years relevant work experience
Preferred
UNIX, Sun Solaris, Linux, Shell Scripting
Excellent troubleshooting capabilities on H/W and OS
Design and maintain server build configuration for WebLogic servers
Perform patching, hardening, and automated deployments
Analyze, identify, and resolve incidents and problems
Oversee the day-to-day operation of hosted infrastructure and applications
Ensuring plans, policies and procedures are understood, administered, and adhered to by staff
Working with vendors as necessary to bring global scale, efficiency, and expertise to overall operations
Research and recommend innovative and, where possible, automated approaches for system administration tasks. Identify approaches that leverage our resources and provide economies of scale
Ensuring timely resolution of customer issues and communication of issue status. Proactive customer and senior management communication of potential issues and reporting of current capabilities
Working closely with Systems and Release Management engineering and operations to provide guidance on deployments, security, and process governance, and to ensure that systems adhere to all policies and regulations
Interfacing with Internal Audit and Compliance to provide responses to audit questions and implement remediation plans for gaps or findings
Planning security measures to meet best practices and industry standards. This includes Application Security, system security, and access control
Engineering the implementation of security measures, ensuring services meet or exceed policies and industry standards. Reviewing current measures to determine their efficiency and inform the long-term plan. Audit response and remediation
Supporting applications, including incident response, middleware upgrades, certificate management, documentation, and metrics reporting. Performing periodic performance reporting to support capacity planning
Company
Jobs via Dice
Welcome to Jobs via Dice, the go-to destination for discovering the tech jobs you want.
Funding
Current Stage
Early Stage
Company data provided by Crunchbase