Location
Mumbai | India
Job description
Role Overview
As a Machine Learning Operations Engineer, you will play a pivotal role in developing, deploying, and managing machine learning models and large language model (LLM) systems across our diverse portfolio of companies. This position calls for a blend of technical expertise, a passion for innovation, and the ability to work alongside entrepreneurs to drive growth and transform industries.
Responsibilities
- Design, build, and maintain efficient, reliable, and scalable ML and LLM operations infrastructure.
- Implement robust ML model lifecycle management practices, including development, testing, deployment, and monitoring.
- Work closely with data scientists and ML engineers to facilitate the seamless transition of models from experimentation to production.
- Ensure the highest levels of security and compliance are maintained in all ML and LLM operations.
- Optimize model performance and resource utilization to meet the demands of rapidly scaling ventures.
- Stay abreast of the latest developments in ML and LLM technologies and methodologies, integrating these innovations to enhance operational efficiency and model effectiveness.
Must have
- Proven experience in ML and LLM operations with a strong understanding of ML model lifecycle management.
- Proficiency in Python and experience with ML frameworks like TensorFlow or PyTorch.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration abilities, with a knack for working effectively in a dynamic, team-oriented environment.
- Familiarity with CI/CD pipelines, automation tools, and ML monitoring solutions.
- Knowledge of data engineering principles and practices is highly desirable.
- A Bachelor's or Master's degree in Computer Science, Engineering, Data Science, or a related field.
- Minimum of 3 years of relevant experience in machine learning operations.
Nice to have
- Minimum of 5 years of relevant experience in machine learning operations, with a preference for candidates who have experience managing large language models.
- Experience with cloud computing platforms (AWS, GCP, or Azure) and containerization technologies (Docker, Kubernetes).
Job tags
machine learning, operations, analytical skills, python, pytorch, tensorflow, cloud, machine learning models, docker, kubernetes