Senior Data Research Engineer
Location
Mumbai | India
Job description
- Collaborate with editors, writers, and cross-functional teams to understand data visualization and manipulation needs.
- Develop methods and processes for data quality assurance (QA) to ensure accuracy, completeness, and integrity.
- Define and implement data validation rules and automated data quality checks.
- Perform data profiling and analysis to identify anomalies, outliers, and inconsistencies.
- Contribute to the design and implementation of data models and schemas.
- Collaborate with the team's database engineer/developer to optimize database performance.
- Assist in acquiring and integrating data from various sources, including web crawling and API integration.
- Develop and maintain scripts in Python for data extraction, transformation, and loading (ETL) processes.
- Assist in building and maintaining data processing workflows using tools like KNIME.
- Stay updated with emerging data visualization tools, techniques, and best practices.
- Stay updated with emerging technologies and industry trends to contribute innovative ideas for research and development enhancements.
- Explore third-party technologies as alternatives to legacy approaches for efficient data pipelines.
- Help cross-functional teams understand data requirements, and participate in data strategy and governance initiatives.
- Collaborate with the Data Research Engineer Team Lead to estimate development efforts and meet project deadlines.
- Assume accountability for achieving development milestones.
- Prioritize tasks to ensure timely delivery in a fast-paced environment with rapidly changing priorities.
- Collaborate with and assist fellow members of the Data Research Engineering Team as required.
- Make effective use of online resources such as Stack Overflow, ChatGPT, and Bard, while considering their capabilities and limitations.
Skills and Experience
- Bachelor's degree in Computer Science, Data Science, or a related field.
- Strong proficiency in Python programming for data extraction, transformation, and loading.
- Proficiency in SQL and data querying.
- Knowledge of Python modules such as Pandas, SQLAlchemy, gspread, PyDrive, BeautifulSoup, Selenium, scikit-learn (sklearn), and Plotly.
- Knowledge of web crawling techniques and API integration.
- Knowledge of database concepts and data modeling principles.
- Knowledge of data quality assurance methodologies and techniques.
- Knowledge of data visualization tools and libraries (e.g., D3.js, Plotly, Tableau).
- Knowledge of cloud platforms such as AWS (RDS, S3, EC2, and ECS) and Google Cloud Platform, as well as big data technologies.
- Familiarity with KNIME or similar tools for data integration and analysis is a plus.
- Familiarity with machine learning concepts and techniques.
- Familiarity with front-end technologies in web development such as HTML, CSS, JavaScript, and Angular for implementing data visualizations.
- Familiarity with Docker containers or similar technologies is a plus.
- Familiarity with Agile development methodologies is a plus.
- Strong problem-solving and analytical skills with attention to detail.
- Creative and critical thinking.
- Ability to work collaboratively in a team environment.
- Effective communication skills.
- Eagerness to learn and expand knowledge in data engineering.
- Experience with version control systems, such as Git, for collaborative development.
- Ability to thrive in a fast-paced environment with rapidly changing priorities.
- Comfort with autonomy and the ability to work independently.
Perks
- Day off on the 3rd Friday of every month (one long weekend each month).
- Monthly Wellness Reimbursement Program to promote health and well-being.
- Monthly Office Commutation Reimbursement Program.
- Paid paternity and maternity leave.
- Group Medical Insurance.
- Group Term Life Insurance (2.5x the CTC).
- Group Personal Accident Insurance (3x the CTC).
Qualifications
Bachelor's degree in Computer Science, Data Science, or a related field