Datalake Engineer


Location

Bangalore | India


Job description

Position Description
As a Data Lake Engineer, your primary responsibility is to ensure the smooth operation and reliability of the data lake infrastructure. You will work closely with data engineers, administrators, and other stakeholders to address issues, provide technical support, and maintain the overall health of the data lake environment. This role requires a strong understanding of data lake technologies, troubleshooting skills, and the ability to collaborate effectively with cross-functional teams.

You will be responsible for designing, implementing, and optimizing big data and analytics solutions using the Databricks platform on the Amazon Web Services (AWS) cloud. This role requires expertise in both AWS services and Databricks capabilities, along with strong programming and data engineering skills. You will collaborate with data scientists, analysts, and other stakeholders to build scalable and efficient data processing workflows.

- System Monitoring and Maintenance:
Monitor the data lake infrastructure for performance, availability, and potential issues.
Conduct routine maintenance tasks, such as software updates, patches, and system optimizations.

- Incident Response and Troubleshooting:
Respond to incidents and service requests related to the data lake promptly.
Troubleshoot and resolve issues related to data ingestion, processing, and storage.

- Collaboration with Data Engineers/Administrators:
Collaborate with data engineers and administrators to understand the data lake architecture and workflows.
Provide support for the development and deployment of data pipelines and processing jobs.

- User Support and Training:
Assist users in accessing and utilizing the data lake effectively.
Provide training and documentation to help users troubleshoot common issues independently.

- Data Security and Access Control:
Implement and maintain access controls and security measures to protect sensitive data.
Work with security teams to address and mitigate security vulnerabilities.

- Performance Optimization:
Monitor and analyze system performance metrics to identify areas for optimization.
Collaborate with the engineering team to implement improvements and enhancements.

- Documentation:
Maintain documentation for system configurations, troubleshooting procedures, and best practices.
Contribute to the knowledge base for common issues and solutions.

- Vendor and Stakeholder Collaboration:
Coordinate with technology vendors for issue resolution, updates, and escalations.
Evaluate and recommend improvements based on vendor recommendations and industry best practices.
Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and provide the necessary infrastructure and tools.
Document data lake design, processes, and configurations for knowledge sharing and future reference.

- Data Ingestion and Integration:
Develop robust and scalable processes for ingesting and integrating diverse datasets into the data lake.
Implement data transformation and cleansing processes to ensure high-quality data within the lake.

- Data Lake Infrastructure Management:
Manage the underlying infrastructure of the data lake, including storage, compute resources, and data processing engines.
Implement and optimize data partitioning, indexing, and compression strategies for performance and cost efficiency.

- Data Security and Governance:
Establish and enforce data governance policies to ensure compliance with regulatory requirements and internal standards.
Implement security measures to protect sensitive data stored in the data lake.

- Data Cataloging and Metadata Management:
Develop and maintain a comprehensive data catalog to facilitate data discovery and understanding.
Implement metadata management processes to capture and maintain information about data lineage, quality, and usage.

- Stay Current with Industry Trends:
Keep abreast of emerging technologies and trends in big data, data lakes, and related fields.
Evaluate and recommend new tools and technologies that can enhance the data lake infrastructure.

Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- Proven experience as a Data Engineer or similar role, with a focus on AWS and Databricks.
- Strong proficiency in programming languages such as Python, Scala, or SQL.
- Experience in designing and implementing scalable and efficient data processing workflows.
- Knowledge of big data technologies, ETL processes, and data modeling.
- AWS certifications (e.g., AWS Certified Big Data - Specialty) are a plus.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills.

Insights you can act on

While technology is at the heart of our clients’ digital transformation, we understand that people are at the heart of business success.

When you join CGI, you become a trusted advisor, collaborating with colleagues and clients to bring forward actionable insights that deliver meaningful and sustainable outcomes. We call our employees “members” because they are CGI shareholders and owners, and, as owners, we enjoy working and growing together to build a company we are proud of. This has been our Dream since 1976, and it has brought us to where we are today—one of the world’s largest independent providers of IT and business consulting services.

At CGI, we recognize the richness that diversity brings. We strive to create a work culture where everyone belongs, and we collaborate with clients in building more inclusive communities. As an equal opportunity employer, we empower all our members to succeed and grow. If you require an accommodation at any point during the recruitment process, please let us know. We will be happy to assist.

Ready to become part of our success story? Join CGI—where your ideas and actions make a difference.

