JobNob

Your Career. Our Passion.

Incident Manager


hackajob


Location

Hyderabad | India


Job description

hackajob is a matching platform partnering with Verisk to help them hire the best talent and build the future. For the chance to be matched to this role and other similar roles, click Apply to set up your free profile.

About the company:

Verisk helps the world see new possibilities and inspire change for better tomorrows. Their analytic solutions bridge content, data, and analytics to help business, people, and society become stronger, more resilient, and sustainable.

Job Overview:

As an Incident Manager (IM) in their dynamic IT team, you will play a crucial role in ensuring the reliability, availability, and performance of their applications and systems. You will be responsible for managing incidents, addressing problems, and handling changes to maintain a robust and scalable infrastructure. This position requires a seasoned professional with a deep understanding of IT operations and system architecture, and a proactive approach to problem-solving.

Job Duties:

Lead and coordinate the response to IT incidents, ensuring swift resolution and minimal disruption to operations.
Take ownership of incidents, perform root cause analysis, and implement corrective actions to prevent recurrence.
Establish and maintain an effective incident response process, including clear escalation paths and communication procedures.
Collaborate with cross-functional teams to address incidents promptly and efficiently.
Work closely with internal teams and external vendors to coordinate incident resolution efforts.
Identify trends and patterns in incidents to proactively address potential issues before they escalate.
Develop and maintain a library of incident response documentation and playbooks.
Maintain detailed records of incidents, including timelines, actions taken, and resolution details.
Generate incident reports for management review, outlining key metrics, lessons learned, and recommendations for improvement.
Participate in an on-call rotation during day hours for critical incidents.
Develop and implement comprehensive change management plans for all IT changes, ensuring alignment with organizational goals.
Conduct meetings to evaluate proposed changes, assessing their impacts, potential risks, and benefits for existing systems and processes.
Participate in the planning and execution of major changes or migrations to the production environment, coordinating with relevant teams for seamless deployment.
Communicate effectively with stakeholders, providing clear and timely information on upcoming changes, progress, and post-implementation status.
Proactively identify and address potential issues before they escalate into incidents.
Work towards preventing the recurrence of incidents by analyzing trends and patterns in historical data.
Maintain comprehensive documentation for known problems, including detailed information on their root causes and resolution strategies.
Possess knowledge of monitoring solutions for the prompt detection of and response to issues.
Collaborate with cross-functional teams to monitor and alert on critical application functionalities and systems for early identification and swift resolution.
Conduct periodic alert discussions to validate the effectiveness of alerting thresholds, avoid false positives, and refine response procedures.
Integrate monitoring and alerting solutions with incident management tools to streamline the incident response process.
Ensure that highly available and scalable infrastructure is maintained to meet business requirements.
Develop and enforce best practices for system reliability, including fault tolerance, monitoring, and disaster recovery.

Qualifications:

Bachelor’s degree in Computer Science, Information Technology, or a related field.
Proven experience as an incident manager or in a similar role, demonstrating a track record of successful incident resolution.
Certification in relevant processes and technologies (e.g., ITIL Foundation, AWS Cloud Practitioner).
Proficiency in Amazon Web Services (AWS).
Knowledge of CI/CD pipelines and version control systems.
Knowledge of microservices architecture and distributed systems.
Good understanding of networking principles and storage solutions.
Familiarity with monitoring and alerting tools (Nagios, CloudWatch, Dynatrace).
Proven experience in IT change management, showcasing a strong understanding of change control processes.
Familiarity with incident management and change management tools and methodologies.
Strong analytical, problem-solving, communication, and interpersonal skills.
Ability to work in a fast-paced and collaborative environment.

