Senior Site Reliability Engineer
Location
Bangalore | India
Job description
- The Cloud Enablement team is responsible for accelerating the delivery and improving the operation of our cloud-based software by providing and supporting tools and patterns which reduce the cognitive load on our development teams
- We free up our developers to focus on solving problems for our customers rather than spending time on extraneous tasks
- Drawing on the shared experience and expertise of our organization and industry, we create, support and evolve the paved path for teams to build, deploy and run secure and reliable software
What will you do
- Design, build, advocate for and support the common tools and delivery platform used by Flexera developers.
- Improve developer experience and operational excellence.
- Foster collaboration and knowledge sharing across Flexera.
- Select and roll out supported defaults and standards for CI/CD tooling, Observability, Security and Runtime Environment.
- Work with teams across several continents, building relationships with our engineers by listening to and understanding their needs and balancing these with the needs of our business.
- Research new tools and patterns and continuously measure and evolve our ways of doing things.
- Drive Cloud Cost Optimization, using a combination of strategies, techniques, best practices and tools to help manage and reduce cloud costs.
You have
- Developer/DevOps/SRE/Platform experience and a strong interest in software delivery and ongoing operation.
- Rolled out automation, tools, technologies, patterns and guardrails across an organization.
- Experience working in a globally distributed team.
- Deep & extensive public cloud (preferably AWS) knowledge & experience.
- Deep knowledge of containers (Docker) and orchestration (Kubernetes).
- Knowledge of tools and patterns around CI/CD (familiar with Travis CI, CircleCI, Buildkite or similar).
- Observability knowledge: logs, tracing and metrics, with experience in a few of Elastic Stack, X-Ray, Jaeger, Zipkin, Prometheus, Honeycomb or LightStep, or enterprise observability tools such as New Relic and Datadog.
- Cloud cost optimization: using automation to keep cloud costs under control and within budget, and enabling individual Engineering teams to optimize their own cloud costs.
- Knowledge of operations, including incident management, immutable infrastructure as code (esp. Terraform or CloudFormation), and problem-solving.
- Produced robust well-tested code preferably in Golang; however, we will also consider Python, JavaScript, Ruby, Java or C# if you are happy to learn Go.
- Excellent communication skills, including experience in writing good documentation and running workshops.
- Vendor selection and/or management experience.
Critical Skills / Competencies
- Agile software delivery methodologies
- Experience managing cloud-based services (e.g. AWS, Azure) at scale
- Experience with DevOps
- Experience with Docker containers, Kubernetes, EKS, ECS
- Infrastructure as code e.g. Terraform, CloudFormation
- CI/CD pipelines using Jenkins, Travis CI, TeamCity, pipeline as code
- Automation / Configuration Management at scale e.g. Puppet, Chef, Ansible, Salt, Packer etc.
- Service mesh such as Istio, Consul or similar
- Expertise in one or more of the following languages: Python / Go / Java / C# / C / C++
- Experience with IaaS and Serverless services from a cloud provider
- A strong understanding of TCP/IP and DNS, and experience designing networks
- Linux & Windows system administration experience
- Experience implementing fault detection, and automating fixes
- Experience designing scalable services
- Experience designing distributed, fault-tolerant systems
- A good understanding of SQL and NoSQL databases
- A solid understanding of data structures and algorithms
- A positive attitude and willingness to learn
- Strong conflict resolution competence
- Excellent written and verbal communication skills
- Detail oriented. The ideal candidate is one who naturally digs as deep as needed to understand the why
Minimum Qualifications
- Bachelor's or higher degree in Computer Science, Information Technology, or a related field.
- At least 4 years of hands-on job experience managing services in a public cloud
- At least 2 years of experience working as a senior member of a centralized Cloud enablement / Platform or a similar team
Bonus Skills
- Python / Golang / Java / C# / C / C++ / Bash experience
- Big Data, Machine Learning, AI platforms (Databricks, Snowflake etc.)
- Experience with monitoring systems such as New Relic, ELK, Prometheus, Datadog, X-Ray etc.
- Security background
- SQL, NoSQL and Graph databases
- Relevant certification e.g. AWS, GCP, Azure
- Experience of Disciplined Agile Delivery (DAD)