Location
Bangalore | India
Job description
Responsibilities:
1. Data Architecture and Design:
- Design, implement, and maintain scalable and robust data architecture.
- Work closely with data architects to ensure that data solutions align with the overall architecture strategy.
- Proficiency in at least one major cloud platform, such as Azure or AWS.
2. Schema Design:
- Design and manage data schemas, especially in distributed environments, with experience in schema evolution and compatibility.
3. Data Modelling:
- Develop and maintain data models for efficient storage and retrieval of large datasets.
- Collaborate with data scientists and analysts to understand data requirements for analytics and reporting.
4. ETL / Data Pipeline Development:
- Build and optimize ETL processes to ingest, transform, and load data from various sources into our data warehouse.
- Implement best practices for data extraction, transformation, and loading to ensure data quality and integrity.
- Expertise in Python or another programming language.
5. Data Integration:
- Integrate data from different sources to provide a unified view for analytics and reporting purposes.
- Collaborate with software engineers to implement real-time data integration solutions.
6. Monitoring and Logging for Data Pipelines:
- Discuss your approach to monitoring and logging within data pipelines. Highlight your use of cloud monitoring services like AWS CloudWatch, Azure Monitor, or Google Cloud Monitoring to ensure pipeline reliability.
7. Error Handling and Retry Mechanisms:
- Showcase your skills in implementing robust error handling and retry mechanisms within data pipelines. Discuss how you ensured the resilience of the pipelines against transient failures.
8. Optimizing Data Throughput & Performance:
- Highlight instances where you optimized data throughput and performance within pipelines. Discuss how you leveraged cloud services and features to achieve optimal processing speeds.
- Identify and address performance bottlenecks in data pipelines and optimize for speed and efficiency.
- Monitor and tune database performance for optimal query execution.
9. Data Quality and Governance:
- Implement data quality checks and ensure data integrity throughout the data lifecycle.
- Collaborate with data stewards to enforce data governance policies and standards.
10. DataOps:
- End-to-end understanding of the data engineering life cycle, including CI/CD and code repositories.
11. Collaboration:
- Work closely with cross-functional teams, including data scientists, analysts, and business stakeholders, to understand data needs and deliver solutions.
- Provide technical guidance and mentorship to junior data engineers.
12. Documentation:
- Document data engineering processes, data models, and ETL workflows.
- Create and maintain documentation for data-related policies and standards.
Experience - 8-10 years
Location - Bangalore
- Bachelor's or Master's degree in Computer Science, Information Technology, or related field.
- Proven experience as a Data Engineer for 8-10 years, with a focus on designing and implementing large-scale data solutions.
- Strong programming skills in languages such as Python, Java, or Scala.
- Experience with big data technologies (e.g., Hadoop, Spark) and cloud platforms (e.g., AWS, Azure, GCP).
- Proficient in SQL and database management systems (e.g., PostgreSQL, MySQL, Oracle).
- Comprehensive knowledge of data warehousing concepts and dimensional modelling.
- Familiarity with data governance and best practices for data quality.
- Excellent problem-solving and communication skills.
- Ability to work in a fast-paced, dynamic, and collaborative environment.