Kezan Consulting - A Unit of Kezan India Private Limited
Location
Mumbai | India
Job description
Role Overview
As a Machine Learning Operations Engineer, you will play a pivotal role in developing, deploying, and managing machine learning models and large language model (LLM) systems across our diverse portfolio of companies. This position calls for a blend of technical expertise, a passion for innovation, and the ability to work alongside entrepreneurs to drive growth and transform industries.
Responsibilities
- Design, build, and maintain efficient, reliable, and scalable ML and LLM operations infrastructure.
- Implement robust ML model lifecycle management practices, including development, testing, deployment, and monitoring.
- Work closely with data scientists and ML engineers to facilitate the seamless transition of models from experimentation to production.
- Ensure the highest levels of security and compliance are maintained in all ML and LLM operations.
- Optimize model performance and resource utilization to meet the demands of rapidly scaling ventures.
- Stay abreast of the latest developments in ML and LLM technologies and methodologies, integrating these innovations to enhance operational efficiency and model effectiveness.
Must have
- Proven experience in ML and LLM operations, with a strong understanding of ML model lifecycle management.
- Proficiency in Python, and experience with ML frameworks like TensorFlow or PyTorch.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration abilities, with a knack for working effectively in a dynamic, team-oriented environment.
- Familiarity with CI/CD pipelines, automation tools, and ML monitoring solutions.
- Knowledge of data engineering principles and practices is highly desirable.
- A Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related field.
- Minimum of 3 years of relevant experience in machine learning operations.
Nice to have
- 5 or more years of relevant experience in machine learning operations, preferably including experience managing large language models.
- Experience with cloud computing platforms (AWS, GCP, or Azure) and containerization technologies (Docker, Kubernetes).
Skills: machine learning, operations, analytical skills, python, pytorch, tensorflow, cloud, machine learning models, docker, kubernetes