HubbleHox
Location
Mumbai | India
Job description
Job Title: Data Scientist
We are seeking a highly analytical and creative Data Scientist to join our data team. The Data Scientist will be responsible for analyzing and interpreting complex data sets to inform business decision-making. The ideal candidate has a strong background in statistical analysis, Machine Learning, Artificial Intelligence, NLP, LLMs, and Text-Image models.
Responsibilities:
1) Data Collection and Analysis:
● Collect and preprocess large datasets from various sources, ensuring data quality and reliability.
● Perform exploratory data analysis (EDA) to discover trends, patterns, and anomalies in the data.
2) Statistical Analysis:
● Apply statistical methods and hypothesis testing to draw meaningful insights from data.
● Build predictive models and conduct statistical analysis to support business objectives.
3) Machine Learning:
● Develop and implement machine learning/AI models for applications such as recommendation systems, personalization, chatbots, OCR/ICR, image segmentation, proctoring, and customer segmentation.
● Evaluate model performance and fine-tune models for optimal results.
● Implement new model architectures in PyTorch and TensorFlow to solve business use cases.
● Establish an end-to-end link between model performance and business KPIs.
4) MLOps:
● Develop and maintain automated pipelines for data preprocessing, feature engineering, model training, and deployment.
● Integrate machine learning models into CI/CD pipelines to automate testing, validation, and deployment processes, ensuring fast and reliable model updates.
● Implement version control for ML artifacts (such as models, datasets, and code) to track changes and manage model versions effectively.
● Set up monitoring and logging systems to track model performance, data drift, and any anomalies. This allows for proactive issue detection and resolution.
5) Data Visualization:
● Create clear and compelling data visualizations to communicate insights to stakeholders.
● Use data visualization tools such as Tableau and Power BI, or Python libraries such as Seaborn and Matplotlib.
6) Collaboration:
● Collaborate with cross-functional teams to understand business objectives and translate them into data-driven solutions.
● Communicate complex technical findings clearly to non-technical stakeholders.
7) Continuous Learning:
● Stay up to date with the latest trends and technologies in data science and machine learning, including LLMs and text-to-text, text-to-image, text-to-audio, and text-to-video models.
Qualifications:
● Bachelor's or Master's degree in Data Science, Computer Science, Statistics, or a related field.
● At least 2 years of experience as a Data Scientist or in a similar role.
● Strong programming skills in languages such as Python.
● Proficiency in data manipulation libraries (e.g., Pandas, NumPy) and machine learning libraries (e.g., Scikit-Learn, TensorFlow, or PyTorch).
● Knowledge of LLM models and tooling such as Llama 2, LangChain, and Hugging Face is a big plus.
● Solid understanding of statistical analysis, hypothesis testing, and experimental design.
● Experience with data visualization tools (e.g., Tableau, Power BI, Matplotlib, Seaborn).
● Strong problem-solving skills and the ability to work independently or as part of a team.
● Excellent communication skills and the ability to present complex findings to both technical and non-technical audiences.
● Knowledge of cloud platforms (e.g., AWS, Azure, Google Cloud) and big data technologies (e.g., Hadoop, Spark) is a plus.