
Data Engineer


EXL


Location

Gurgaon | India


Job description

Job Location: Bangalore/Gurgaon

Shift Timing: 12:00 PM IST – 10:30 PM IST

Experience: 3+ years

Job Summary:

The Data Engineer (DE) is responsible for designing, developing, and maintaining data assets and data-related products by liaising with multiple stakeholders.

Responsibilities:

· Collaborate with project stakeholders (the client) to identify product and technical requirements, and conduct analysis to determine integration needs.
· Apply data warehousing concepts to build a data warehouse for reporting purposes.
· Build data pipelines to ingest and transform data into the data platform.
· Apply best approaches for large-scale data movement, capture data changes, and apply incremental data-load strategies (a minimal sketch of this pattern follows the must-have qualifications below).
· Develop, implement, and tune large-scale distributed systems and pipelines that process large volumes of data.
· Assist Data Science/Modelling teams in setting up data pipelines and monitoring daily jobs.
· Develop and test ETL components to high standards of data quality, and act as hands-on development lead.
· Oversee and contribute to the creation and maintenance of relevant data artifacts (data lineages, source-to-target mappings, high-level designs, interface agreements, etc.).
· Ensure that developer responsibilities are met by mentoring, reviewing code and test plans, and verifying adherence to design best practices and to coding and architectural guidelines, standards, and frameworks.
· Work with stakeholders to understand data requirements in order to design, develop, and maintain complex ETL processes.
· Create data integration and data diagram documentation.

Qualifications (Must have):

· 3+ years as a Data Engineer, with proficiency in SQL, Python, and PySpark programming.
· Strong knowledge of Databricks and its related services and functionality, and of how to apply them across the data engineering and analytics spectrum.
· Strong knowledge of Hadoop, Hive, Databricks, and RDBMSs such as Oracle, Teradata, and SQL Server.
· Ability to write SQL to query metadata and tables in different data management systems such as Oracle, Hive, Databricks, and Greenplum (see the metadata-query sketch below).
· Familiarity with big data technologies such as Hadoop, Spark, and distributed computing frameworks.
· Ability to use Hue to run Hive SQL queries and to schedule Apache Oozie jobs that automate data workflows.
· Degree in Data Science, Statistics, Computer Science, or another related field, or an equivalent combination of education and experience.
· Proficiency in at least one cloud platform (AWS, Azure, GCP), and in developing ETL processes using ETL tools, big data processing, and analytics with Databricks.
· Expertise in building data pipelines on big data platforms; good understanding of data warehousing concepts.
· Good working experience communicating with stakeholders and collaborating effectively with the business team on data testing.
· Strong problem-solving and troubleshooting skills.
· Ability to establish comprehensive data quality test cases and procedures, and to implement automated data validation processes.
· Strong communication, problem-solving, and analytical skills, with the ability to manage time and multi-task with attention to detail and accuracy.
· Strong business acumen and a demonstrated aptitude for analytics that drive action.
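The responsibilities above lean heavily on change capture and incremental loads into a Databricks/Delta platform. As a rough illustration of that pattern only, here is a minimal PySpark sketch; every table, schema, and column name (etl_control.watermarks, staging.orders_raw, warehouse.orders_target, order_id, updated_at) is a hypothetical placeholder, and a Delta-enabled Spark environment is assumed.

```python
# A minimal incremental-load sketch. All table and column names are
# hypothetical placeholders; a Delta-enabled Spark environment is assumed.
from pyspark.sql import SparkSession, functions as F
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("incremental-load-sketch").getOrCreate()

# 1. Read the high-water mark recorded by the last successful run.
last_loaded_at = (
    spark.table("etl_control.watermarks")          # hypothetical control table
    .filter(F.col("table_name") == "orders_target")
    .agg(F.max("last_loaded_at"))
    .collect()[0][0]
)

# 2. Capture only the rows that changed since that watermark.
changes = spark.table("staging.orders_raw").filter(
    F.col("updated_at") > F.lit(last_loaded_at)
)

# 3. Upsert the changed rows into the target Delta table.
(
    DeltaTable.forName(spark, "warehouse.orders_target")
    .alias("t")
    .merge(changes.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```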

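Likewise, for the expectation of querying metadata across engines and running Hive SQL as one would from Hue, a small sketch via Spark's Hive support may help set the bar. The database and table names (sales_db.orders) are hypothetical; Oozie scheduling is omitted because it is configured outside the query itself.

```python
# Metadata and Hive SQL query sketch. Database and table names are
# hypothetical; enableHiveSupport() assumes a reachable Hive metastore.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("metadata-query-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Inspect metadata from the Hive metastore, much as one would in Hue.
spark.sql("SHOW TABLES IN sales_db").show()
spark.sql("DESCRIBE FORMATTED sales_db.orders").show(truncate=False)

# An ad-hoc Hive SQL query of the kind a daily Oozie job might run.
spark.sql("""
    SELECT order_date, COUNT(*) AS order_count
    FROM sales_db.orders
    GROUP BY order_date
    ORDER BY order_date
""").show()
```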
Qualifications (Preferred):

· Good experience building real-time streaming data pipelines with Kafka, Kinesis, etc. (see the streaming sketch after this list).
· Knowledge of Jinja/YAML templating in Python is a plus.
· Knowledge of and experience in designing and developing RESTful services.
· Working knowledge of DevOps methodologies, including designing CI/CD pipelines.
· Experience building distributed-architecture systems, especially systems handling large data volumes and real-time distribution.
· Initiative and problem-solving skills when working independently.
· Familiarity with big data design patterns, modelling, and architecture.
· Exposure to NoSQL databases and cloud-based data transformation technologies.
· Understanding of object-oriented design principles.
· Knowledge of enterprise integration patterns.
· Experience with messaging middleware, including queues, pub/sub channels, and streaming technologies.
· Expertise in building high-performance, highly scalable, cloud-based applications.
· Experience with SQL and NoSQL databases.
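Since the preferred list highlights real-time streaming pipelines, here is a minimal PySpark Structured Streaming sketch reading from Kafka into a Delta sink. The broker address, topic, event schema, and paths are all hypothetical placeholders; Kinesis or another message bus would follow the same shape.

```python
# Kafka-to-Delta streaming sketch. Broker, topic, schema, and paths are
# hypothetical placeholders; the spark-sql-kafka and Delta packages are assumed.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import (
    StructType, StructField, StringType, DoubleType, TimestampType,
)

spark = SparkSession.builder.appName("kafka-streaming-sketch").getOrCreate()

event_schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the topic as an unbounded stream of Kafka records.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
    .option("subscribe", "orders")                      # hypothetical topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka values arrive as bytes; parse the JSON payload into typed columns.
events = (
    raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Append to a Delta sink; the checkpoint enables restart without data loss.
query = (
    events.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/orders")  # hypothetical path
    .outputMode("append")
    .start("/tmp/delta/orders")                               # hypothetical path
)
query.awaitTermination()
```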

