Solutions Reliability Engineer - Datadog
Location
Hyderabad | India
Job description
Solugenix is an information technology services firm that has a rich history of providing comprehensive technology services and solutions for more than five decades.
As a pioneer in IT services, we’ve partnered with some of the biggest global corporations across many industries. Our history was built on a foundation of partnerships with global brands like McDonald’s, Microsoft, CIT Group, Johnson & Johnson, Herbalife, Sony Pictures Entertainment, and many others. Whether it’s providing dedicated support centers, staffing quality teams, or delivering business service solutions, clients can always count on Solugenix.
Job Location: Hyderabad Office
Position Title: Solutions Reliability Engineer – Datadog
Job Summary:
As a Solutions Reliability Engineer (SRE), the candidate is expected to contribute technically and work closely with the team to design, implement, and maintain our monitoring and observability infrastructure using Datadog.
Responsibilities:
- The candidate will be working with the client teams directly.
- The candidate must ensure that the service is delivered to the highest quality.
- Configure and maintain Datadog alerts and integrations to meet the organization's monitoring requirements.
- Participate in code reviews and provide constructive feedback to peers.
- Write clean, efficient, and maintainable code to meet performance and scalability requirements.
- Collaborate with clients to understand their monitoring needs and provide customized Datadog solutions.
- Work with users and customers to document and plan process improvements.
Qualifications:
- At least 5 years’ experience working with development teams.
- Hands-on knowledge of Datadog and good knowledge of at least one other monitoring tool.
- Sound knowledge of at least one automation tool, such as Terraform, Ansible, or Python.
- Proficient in Datadog configurations, dashboards, and alerting.
- Experience working with Bitbucket or GitHub would be a big plus.
- Good understanding of SDLC, CI/CD, and building deployment pipelines on the cloud.
- Knowledge of Windows and Linux, infrastructure and platform management, networking, AWS, IT troubleshooting, automation, and microservices, along with data analysis skills, would be a value add.
- Utilize Datadog metrics and analytics to perform capacity planning and resource optimization.
- Proactively identify potential performance bottlenecks and areas for improvement.
- Understanding of application and web security controls, TCP/IP concepts, firewalls, and databases is preferred.
- Work closely with other IT teams to troubleshoot and resolve incidents using Datadog insights.
- Design and implement effective monitoring and alerting strategies using Datadog.
- Configure and customize Datadog dashboards to provide real-time visibility into system and application performance.
- Experience with tools such as JIRA, Confluence, ServiceNow, and other monitoring tools.
- Flexible with working hours based on the service requirement.
- Excellent communication (written and spoken) skills.
- Good active listening skills, with the ability to interpret customers’ communications and understand the gist of the problem.
Education & Certifications:
- Degree in Computer Science, Information Technology, or a related field.
- Datadog and Terraform certifications are a plus.