SRE Lead

Tech Mahindra (formerly Mahindra Satyam)

Location

Secunderabad | India

Job description

Shift timing: General

Experience: 8+ years

Band: U4

Location: Hyderabad (hybrid)

Primary Responsibilities

Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SREs ensure that managed service offerings and customer deployments have reliability and uptime appropriate to users' needs, along with a fast rate of improvement, while monitoring and validating capacity and performance. The role focuses on reliability, scalability, and the development of automation to manage repetitive tasks at scale.

Knowledge & Skills

In-depth knowledge of SRE practices and concepts such as SLAs, SLOs, SLIs, error budgets, toil elimination, and post-mortems.
Experience with Terraform is mandatory.
Should have experience with monitoring and observability tools: Prometheus, Grafana, Elasticsearch, Logstash, and Kibana (ELK), Splunk, Dynatrace, GCP operations suite, Azure Application Insights, or any log analytics tool.
Should have an understanding of APM tools such as AppDynamics or Datadog, preferably AppDynamics.
Should have experience with Infrastructure as Code (IaC) tools such as Terraform and Ansible.
Should have experience managing cloud-native applications effectively on GCP or Azure.
Should have experience creating pipelines in CI/CD tools such as GitHub Actions, Azure DevOps, and Jenkins, preferably Scripted Pipelines.
Should have knowledge of version control tools such as Git and Bitbucket.
Knowledge of a scripting language such as PowerShell, Python, or Bash is good to have.
Responsible for ensuring the availability, performance, and scalability of a website or application.
Knowledge of containerization and orchestration: Docker, Docker Compose, Kubernetes, and writing Dockerfiles.
Involved in capacity planning and performance tuning to ensure that the site can handle increased traffic without issue.

Work closely with developers to identify and fix potential issues before they cause problems for users.

Deep understanding of how distributed systems work to be able to troubleshoot and optimize them.
Deep understanding of how different types of databases work to be able to effectively troubleshoot any issues that may arise.
Ability to communicate clearly and concisely about system alerts or outages to other members of your team.
Please note: Beyond the job description, the customer is looking for a candidate who can mature their SRE practice across the division, someone who is comfortable being a champion and leader in the SRE space.


JobNob HQ Address

1 E Broad St
Ste 130 - 1252
Bethlehem, PA 18018-5934
United States
