Walmart Canada · 1 month ago

(USA) Staff, Software Engineer

Walmart Inc., a leading retail company, is seeking a Staff Software Engineer to contribute to its cloud-based e-commerce and enterprise platform. The role involves system design, architecture, and cross-functional work with various teams to enhance the reliability and scalability of Walmart's technology infrastructure.

Delivery · Retail · Shopping

Responsibilities

On-call responsibilities to help minimize MTTD and MTTR of the SRE product
Experience with containerization and container platforms (e.g., Docker, Kubernetes, Docker EE, OpenShift, Mesosphere)
Should have the skills to understand debugging info, "drain" traffic away from a cluster, roll back a bad software push, block or rate-limit unwanted traffic, bring up additional serving capacity through autoscaling features, and use the monitoring systems (for alerting and dashboards)
Engage with enterprise and business/infrastructure functions to establish, track, and optimize operational metrics and targets in line with SRE principles (SLO/SLI, latency percentiles, error budgets, tech debt, and setting up alert guidelines)
Programming/tooling and automation experience in one or more of the following languages: Golang, Java, Python, TypeScript, Node, and Shell
Good understanding of Kafka internals, SQL/NoSQL databases like Cassandra, Elasticsearch, and Postgres, and in-memory caching frameworks like Memcached
Influence, design and create new architectures, standards, and methods for large-scale enterprise systems
Design, write and build tools to improve the reliability, latency, availability and scalability of Walmart e-commerce/Retail and Enterprise products
Engender reliability and availability starting with metrics and measurements
Enable scaling by providing tools, developing training and/or augmenting processes
Build tools and automation to prevent recurrence of problems in mission-critical products/services
Augment existing instrumentation to build a cohesive picture of the characteristics of our systems with special attention to points of failure
Participate in capacity planning, demand forecasting, software performance analysis and system tuning
Develop a deep understanding of the numerous services and applications that come together to deliver Walmart e-commerce/Retail and Enterprise products
Working knowledge of observability tools and enterprise monitoring solutions such as Dynatrace, AppDynamics, New Relic, and Prometheus
Perform root-cause analysis of complex problems involving multiple parties, networks, hardware, and software that relate to scaling and performance
Secure the system from issues, be they real, perceived, or notional

Qualifications

System design · Linux · Networking · Distributed architectures · Containerization · Golang · Java · Python · TypeScript · Node · Shell scripting · Kafka · SQL databases · NoSQL databases · Observability tools · Soft skills

Required

Comfortable with system design, architecture, deep technical Linux and networking topics, and distributed architectures
Experience with containerization and container platforms (e.g., Docker, Kubernetes, Docker EE, OpenShift, Mesosphere)
Skills to understand debugging info, "drain" traffic away from a cluster, roll back a bad software push, block or rate-limit unwanted traffic, bring up additional serving capacity through autoscaling features, and use the monitoring systems (for alerting and dashboards)
Engage with enterprise and business/infrastructure functions to establish, track, and optimize operational metrics and targets in line with SRE principles (SLO/SLI, latency percentiles, error budgets, tech debt, and setting up alert guidelines)
Programming/tooling and automation experience in one or more of the following languages: Golang, Java, Python, TypeScript, Node, and Shell
Good understanding of Kafka internals, SQL/NoSQL databases like Cassandra, Elasticsearch, and Postgres, and in-memory caching frameworks like Memcached
Influence, design and create new architectures, standards, and methods for large-scale enterprise systems
Design, write and build tools to improve the reliability, latency, availability and scalability of Walmart e-commerce/Retail and Enterprise products
Engender reliability and availability starting with metrics and measurements
Enable scaling by providing tools, developing training and/or augmenting processes
Build tools and automation to prevent recurrence of problems in mission-critical products/services
Augment existing instrumentation to build a cohesive picture of the characteristics of our systems with special attention to points of failure
Participate in capacity planning, demand forecasting, software performance analysis and system tuning
Develop a deep understanding of the numerous services and applications that come together to deliver Walmart e-commerce/Retail and Enterprise products
Working knowledge of observability tools and enterprise monitoring solutions such as Dynatrace, AppDynamics, New Relic, and Prometheus
Perform root-cause analysis of complex problems involving multiple parties, networks, hardware, and software that relate to scaling and performance
Secure the system from issues, be they real, perceived, or notional

Company

Walmart Canada

Walmart Canada is a subsidiary of Walmart that operates a chain of more than 400 stores nationwide.