Tata Consultancy Services · 14 hours ago

Azure Architect

Grand Rapids, MI

Full-time

Onsite

Senior Level

Tata Consultancy Services is seeking an Azure Architect to lead the design and implementation of event-driven systems and disaster recovery strategies. The role involves ensuring real-time event pipeline stability, managing cross-region failover strategies, and overseeing cost governance and reporting for Azure services.

Business Information Systems · Consulting · Information Technology · IT Management

H1B Sponsor Likely

Responsibilities

Ensure Real-Time Event Pipeline Stability

Design and enforce idempotent event processing, at-least-once or exactly-once delivery semantics

Currently, duplicate detection is enabled at the Azure Service Bus infrastructure level, and messages are also validated against duplicate processing at the application level

Manage dead-letter queues (DLQ) and retry policies

Define event ordering, deduplication, schema evolution

Maintain SLOs for latency, throughput, and event loss. Define the same SLOs for new designs for upcoming Transaction Groups

Event backlog monitoring

Consumer lag thresholds

Circuit breaker & backpressure strategies
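As an illustration of the idempotent processing, retry, and dead-letter items above, here is a minimal in-memory sketch. It does not use the Azure Service Bus SDK; the names Event, IdempotentConsumer, processed_ids, and dead_letters are hypothetical, and a real system would presumably keep the deduplication record in a durable store.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    message_id: str  # hypothetical unique ID used for deduplication
    payload: dict


@dataclass
class IdempotentConsumer:
    handler: Callable[[Event], None]
    max_attempts: int = 3
    processed_ids: set = field(default_factory=set)    # stand-in for a durable dedup store
    dead_letters: list = field(default_factory=list)   # stand-in for a dead-letter queue

    def consume(self, event: Event) -> str:
        # Under at-least-once delivery the same event may arrive again,
        # so skip anything that was already handled successfully.
        if event.message_id in self.processed_ids:
            return "duplicate"
        for _ in range(self.max_attempts):
            try:
                self.handler(event)
            except Exception:
                continue  # a real consumer would back off before retrying
            self.processed_ids.add(event.message_id)
            return "processed"
        # Retries exhausted: park the event rather than dropping it silently.
        self.dead_letters.append(event)
        return "dead-lettered"
```

Recording the ID only after the handler succeeds means a crash mid-handler leads to a redelivery rather than a lost event, which is why the handler itself should also be safe to run twice.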

Disaster Recovery & Multi-Region Maturity

Own the phased maturity roadmap

Move Toward Active-Active

Front Door/APIM routing

Stateless compute enablement

Geo-replicated messaging (Azure Service Bus)

Cosmos DB multi-region writes

Azure SQL Hyperscale read replicas / failover groups

Design cross-region failover strategy (Example: EA2 → Central US)

DB2 connection resiliency validated through DR drills

Architect geo-redundant Function Apps

Ensure cross-region deployment pipelines

Storage replication (GRS / RA-GRS)

Mainframe failover connectivity strategy

Document file ingestion failover workflow

Maintain Reprocessing Framework

Event replay pipelines

Audit trail for rejected events

Data consistency & reconciliation logic

Replay throttling & idempotency safeguards
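The replay throttling and idempotency safeguards listed above could look roughly like the following sketch. The function replay_events, its parameters, and the fixed-interval pacing are assumptions made for illustration, not a description of the employer's framework.

```python
import time
from typing import Callable, Iterable


def replay_events(
    events: Iterable[tuple[str, dict]],
    apply: Callable[[dict], None],
    already_applied: set[str],
    max_per_second: float,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Replay (event_id, payload) pairs at a capped rate, skipping applied IDs."""
    interval = 1.0 / max_per_second
    stats = {"replayed": 0, "skipped": 0}
    for event_id, payload in events:
        # Idempotency safeguard: never re-apply an event that already landed.
        if event_id in already_applied:
            stats["skipped"] += 1
            continue
        apply(payload)
        already_applied.add(event_id)
        stats["replayed"] += 1
        # Throttle: fixed spacing keeps downstream load bounded during replay.
        sleep(interval)
    return stats
```

The returned counts give a simple audit figure for each replay run. Passing a custom sleep function makes the pacing testable without real delays.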

Operational Oversight

No silent event loss

Business-approved replay rules

SLA for rejected event recovery

Runbooks for reprocessing

Own cost visibility, forecasting, optimization, and executive reporting

Right-size Premium plans

Optimize cold start, scaling rules, memory allocation

Reduce idle instance costs

Use scale controllers & concurrency tuning

RU/s right-sizing & autoscale governance

Partition strategy optimization

TTL, indexing, and data archival strategy

Avoid hot partitions & cross-partition scans

Monitor Query Store & wait stats

Right-size compute replicas

Optimize IO & read replicas usage

Manage long-term storage & data lifecycle

Evaluate serverless where feasible

Own monthly FinOps reporting

Define budget thresholds & alerts

Track cost per event / cost per transaction

Recommend architectural cost trade-offs

Full-Stack Telemetry Ownership

Event throughput

Rejected-event reprocessing volume

Function execution failures

ADF pipeline failures

DB2 & mainframe latency

Cosmos RU consumption

SQL wait stats

Incident Response Leadership

Own RCA for major incidents

Maintain runbooks & playbooks

Lead DR drills & chaos testing

Maintain MTTR reduction roadmap

Managed Identity adoption

Secret elimination (Key Vault)

Network isolation (Private Endpoints)

Data encryption & classification

Mainframe secure gateway patterns

Zero-trust enforcement

Translate risk into business impact

Present DR readiness score

Present cost optimization roadmap

Provide resiliency maturity heatmaps

Define platform architecture standards

Review solution designs

Mentor dev & SRE teams

Prevent architecture drift

Qualification

Azure Service Bus · Disaster Recovery · Event Processing · Cost Management · Observability · Geo-replication · Function Apps · Data Consistency · Azure SQL Hyperscale · Stakeholder Communication · Incident Response · Leadership · Mentoring

Company

Tata Consultancy Services

Glassdoor 3.8

Tata Consultancy Services is a business solutions company that specializes in information technology services and consulting.

Founded in 1968

Mumbai, Maharashtra, IND

10001+ employees

http://www.tcs.com

H1B Sponsorship

Tata Consultancy Services has a track record of offering H1B sponsorships. Please note that this does not guarantee sponsorship for this specific role. Additional information is presented below for your reference. (Data powered by US Department of Labor)

Trends of Total Sponsorships

2025 (7880)

2024 (9690)

2023 (8537)

2022 (11159)

2021 (9813)

2020 (11984)

Funding

Current Stage

Public Company

Total Funding

unknown

2004-08-25 IPO

Leadership Team

K. Krithivasan

Chief Executive Officer & Managing Director

Aarthi Subramanian

President and Chief Operating Officer

Recent News

The Hindu

Nifty IT drops up to 6% on Anthropic tool launch

2026-02-05

Investing.com

Anthropic’s AI push raises analyst concerns over IT services revenues

2026-02-05

The Hindu

IT stocks slump; Infosys dives over 7%

2026-02-05

Company data provided by Crunchbase