Location
Bangalore North | India
Job description
At Thoucentric, we work on a variety of problem statements. The most common ones are:
- Building capabilities that address a market need, based on our ongoing research efforts
- Solving a specific use case for a current or potential client, based on on-the-ground challenges
- Developing new systems that help us be a better employer and a better partner to clients
All of these need the best minds working on them day to day, and that is exactly what we do! Your contribution to organization development is as important as outward-facing consulting. We are invested in both employee growth and client success!
As a Data Engineer, you will play a pivotal role in designing, implementing, and maintaining our data infrastructure. You will work with a wide variety of data sources, process and transform data, and orchestrate workflows to support data-driven decision-making. Your expertise in Delta Lake, Apache Spark, and Apache Airflow will be crucial in ensuring the reliability and performance of our data systems.
Key Responsibilities:
- Data Ingestion and Integration:
  - Develop and maintain data pipelines to ingest data from diverse sources.
  - Implement data integration solutions to harmonize data from various platforms.
- Data Transformation:
  - Use Delta Lake to process and transform data efficiently.
  - Build and optimize data processing workflows using Apache Spark.
- Data Orchestration:
  - Create, schedule, and manage data pipelines with Apache Airflow.
  - Monitor and troubleshoot workflow execution, ensuring data quality.
- Data Modeling and Optimization:
  - Design and maintain data models to support efficient querying and reporting.
  - Implement performance tuning and optimization strategies for Spark and Delta Lake.
- Data Governance:
  - Establish data security and access controls to protect sensitive information.
  - Ensure compliance with data governance policies and best practices.
- Collaboration:
  - Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and provide data support.
Requirements
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 3+ years of experience in data engineering, focusing on Delta Lake, Apache Spark, and Apache Airflow.
- Strong proficiency in Delta Lake concepts and best practices.
- Proven expertise in Apache Spark for large-scale data processing.
- Experience with Apache Airflow for workflow orchestration.
- Strong SQL and NoSQL database knowledge.
- Familiarity with data warehousing and data modeling.
- Excellent problem-solving and troubleshooting skills.
- Strong communication and teamwork skills.
Preferred Qualifications:
- Experience with cloud-based data platforms (e.g., AWS, Azure, Google Cloud).
- Knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes).
- Familiarity with data visualization tools (e.g., Tableau, Power BI).
- Certification in Delta Lake, Apache Spark, or Apache Airflow.
Benefits
What is in it for you:
- Be part of the exciting growth story of Thoucentric!
- Work on projects that help you stay ahead of the curve.
- Beyond exciting projects, if you are a self-starter, you will also get multiple opportunities to design, drive, and contribute to organizational and practice initiatives.
- Challenge yourself in an environment with higher expectations, ensuring a constant learning curve and steep growth opportunities.
- Be part of one extended family. We bond beyond work through sports, get-togethers, common interests, etc.
- Work in a very enriching environment with an open culture, a flat organization, and an excellent peer group.