Design and develop SQL queries, stored procedures, functions, and views to meet data processing and analytics requirements.
Collaborate closely with data engineers and data scientists to make SQL code for data extraction, transformation, and loading operations as efficient as possible.
Troubleshoot and optimize database performance to ensure efficient data retrieval and processing.
Design and maintain data models in conjunction with the data engineering team to ensure data consistency and integrity.
Create and maintain documentation for database schemas, data models, and SQL code.
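As a minimal sketch of the kind of SQL artifact this role covers (a query plus a view), using Python's built-in sqlite3 as a stand-in for the production database; the table, view, and column names are illustrative:

```python
import sqlite3

# In-memory database stands in for the production SQL instance.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        region   TEXT NOT NULL,
        amount   REAL NOT NULL
    );

    -- A view that aggregates raw orders for analytics consumers.
    CREATE VIEW region_totals AS
    SELECT region, COUNT(*) AS order_count, SUM(amount) AS total_amount
    FROM orders
    GROUP BY region;
""")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("EMEA", 120.0), ("EMEA", 80.0), ("APAC", 50.0)],
)
for row in conn.execute("SELECT * FROM region_totals ORDER BY region"):
    print(row)
```

Encapsulating the aggregation in a view, rather than repeating the query in every report, keeps the analytics-facing schema stable even when the underlying tables change.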
2. Data Engineer Sub-Role
Data Modeling: Understand and implement data modeling for the data lake.
Data Curation and Transformation: Translate business requirements into technical solutions for data curation and transformation, using Python.
Azure Data Engineering: Develop data pipelines and orchestration using Azure Data Factory and other Azure Modern Data Warehouse platform components.
Azure Services: Work with Azure services such as ADLS (Azure Data Lake Storage), ADX (Azure Data Explorer), and Databricks.
Data Migration: Experience migrating SQL Server to the cloud and handling structured and unstructured datasets.
DevOps and CI/CD: Good knowledge of DevOps tools and processes, and experience in automating data pipelines through a CI/CD delivery methodology.
Databricks Expertise: In-depth, hands-on implementation experience with Azure Databricks, Databricks Delta Lake, and managing Delta Tables.
Effort Estimation: Provide effort estimation for new projects related to data engineering.
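The "business requirements into Python" responsibility above can be sketched as a small curation step that standardises raw records before they land in the curated zone of the lake; the field names and rules here are illustrative, not a specific business requirement:

```python
from datetime import datetime


def curate_record(raw: dict) -> dict:
    """Apply illustrative curation rules to one raw record:
    coerce types, trim and upper-case the country code, parse the
    event date, and flag records above a hypothetical value threshold."""
    amount = float(raw["amount"])
    return {
        "customer_id": int(raw["customer_id"]),
        "country": raw["country"].strip().upper(),
        "event_date": datetime.strptime(raw["event_date"], "%Y-%m-%d").date(),
        "amount": amount,
        "high_value": amount >= 1000.0,  # assumed threshold for illustration
    }


raw = {"customer_id": "42", "country": " gb ",
       "event_date": "2024-01-31", "amount": "1500"}
print(curate_record(raw))
```

In practice a function like this would run per record (or per partition) inside a Databricks or Data Factory transformation, but the rule-to-code translation is the same.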
Design, implement, and maintain data pipelines for data ingestion, processing, and transformation in Azure.
Collaborate with data scientists and analysts to understand data requirements and create efficient data workflows.
Create and maintain data storage solutions including Azure SQL Database, Azure Data Lake, and Azure Blob Storage.
Create and maintain ETL (Extract, Transform, Load) processes using Azure Data Factory or comparable technologies.
Implement data validation and cleansing procedures to ensure the quality, integrity, and reliability of the data.
Monitor and resolve data pipeline issues to guarantee data consistency and availability.
Automate tasks and deploy production-standard code (with unit testing, continuous integration, versioning, etc.).
Load transformed data into storage and reporting structures in destinations including data warehouses, high-speed indexes, real-time reporting systems, and analytics applications.
Other responsibilities include extracting data, troubleshooting, and maintaining the data warehouse.
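The pipeline responsibilities above can be sketched end-to-end with Python's standard library: extract, validate and cleanse, then load, with the transform kept as a pure function so it is unit-testable. The data, rejection rule, and sqlite3 target are illustrative stand-ins; in Azure the endpoints would be services such as ADLS and Azure SQL Database:

```python
import sqlite3


def extract() -> list:
    # Stand-in for reading from a source system (e.g. blob storage).
    return [
        {"id": "1", "value": "10.5"},
        {"id": "2", "value": "not-a-number"},  # fails validation below
        {"id": "3", "value": " 7 "},
    ]


def validate_and_transform(rows):
    """Cleansing step: coerce types and separate rows that fail validation."""
    clean, rejected = [], []
    for row in rows:
        try:
            clean.append((int(row["id"]), float(row["value"])))
        except ValueError:
            rejected.append(row)
    return clean, rejected


def load(clean, conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facts (id INTEGER PRIMARY KEY, value REAL)"
    )
    conn.executemany("INSERT INTO facts VALUES (?, ?)", clean)


conn = sqlite3.connect(":memory:")
clean, rejected = validate_and_transform(extract())
load(clean, conn)
print(len(clean), len(rejected))  # 2 clean rows, 1 rejected
```

Because `validate_and_transform` has no I/O, it can be covered directly by unit tests in a CI pipeline, while the extract and load ends are exercised separately against test environments.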