Production Engineer - Site Reliability Engineer (CLOUD AWS)

Pivotree

Location

Mumbai | India

Job description

Our goal at Pivotree is to help accelerate the future of frictionless commerce. We will help lead this change over the next decade because we believe a future where technology is embedded intimately into all aspects of our everyday lives can benefit everyone and will shape the interactions with the brands we love. We will help shape the future of frictionless commerce by working together with some of the best brands in the world and some of the best people in the industry to leverage converging technologies that will make it possible to accelerate frictionless commerce faster than ever.

This is a journey of technology acceleration combined with consumer readiness and adoption. We are looking for people capable of adapting relentlessly to the rapidly evolving world around us.

Position Summary
We are currently seeking a Site Reliability Engineer to join our team. In this role you will contribute to the reliability and enhancement of the technology engine that powers multiple Pivotree solutions. The primary function of this role is direct responsibility for the availability of platform solutions, focusing on several key areas, including availability, performance, change management, monitoring, and emergency response. You will work with other members of the platform, solutions, operations, and application teams to understand and ultimately address changing and evolving requirements by extending and exposing capabilities in a simple and consistent fashion. You will be a member of a team that maintains expertise in Utility Computing services and will advise management and the organization as a whole on this mode of computing.

You will

Contribute to ensuring pooled and independent utility services are highly available
Actively take part in and initiate continuous improvement: measure and reduce manual tasks and overhead
Be a subject matter expert for Utility Computing providers and respective services both existing and emerging - with particular focus on AWS
Complete systems development, administration, and engineering tasks including integration, documentation and testing
Develop and maintain tools, processes, and workflows for automated infrastructure resource(s) and application deployment, configuration management & maintenance
Own the responsibility for platform management, supporting services, and all related tooling and automation
Investigate and troubleshoot relevant platform-based issues and incidents (high availability, performance, security, etc.)
Participate in recurring stand-ups with other team members located in different locations and time zones
Participate in on-call rotation, escalations, and shift work (generally Monday to Friday or Wednesday to Sunday)
Work with other team members to improve processes and advance relevant and related competencies

You are

Super comfortable with Linux (RHEL-based / Debian-based)
Experienced with supporting software development teams and workflows
A team player who recognizes the power of Agile and team-based delivery
Well versed in infrastructure & application monitoring, logging, and tracing
Able to effectively decompose problems into workable chunks
Experienced at working on large projects with deadlines
Committed to high quality and attention to detail
Focused and committed to delivering high quality services
A strategic thinker who is able to link business and technical objectives
Someone who can go wide and deep: who works with several disparate systems and services, ultimately acquires expert knowledge, and can navigate accordingly

You have (MUST HAVE)

Minimum one Associate-level Amazon AWS certification, or will achieve this within 3 months
A mature understanding of and lots of experience with infrastructure-as-code concepts and practices
1+ years - working with tools to support version control, build automation and automated testing (e.g. the usual suspects... Git, Jenkins, TravisCI, Selenium, etc.)
1+ years - production experience operating container and container orchestration technologies (ideally Docker and Kubernetes / managed Kubernetes service)
2+ years - infrastructure lifecycle management with tooling such as AWS CloudFormation, HashiCorp Terraform, or similar
2+ years - monitoring system performance
2+ years - implementing and maintaining security and compliance for all aspects of systems and components where possible
2+ years - implementation and operating experience in respectable scale API-driven production environments on AWS
3+ years - system administration experience (OS, network, storage, virtualization management, etc.) in challenging production environments, with the associated war stories
3+ years - Debian-based and RHEL-based Linux
3+ years - web service, application, middleware, and database support
Exceptional communication skills and the ability to convey decisions and ideas in a clear and concise manner
The ability to work independently as well as collaboratively
The ability to learn and adapt to new and overlapping technologies quickly and independently, and to formulate and implement standards, procedures and best practices
The ability to think in systems
Experience with the likes of Python, Bash, or similar to extend and increase efficiencies

NICE TO HAVE

Experience and/or exposure to the Serverless Framework
Experience with APM tools such as AppDynamics, New Relic, Dynatrace, or AWS X-Ray
Experience with the following Amazon AWS services in a production environment (API Gateway, Cognito, DynamoDB, ECS, EMR, Lambda)
AWS Certified Developer
AWS Certified SysOps Administrator

Pivotree is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive and accessible workplace.
