DANAHER CORPORATION
Location
Bangalore | India
Job description
Key Responsibilities:
1. Data Pipeline Development: Develop, maintain, and optimize robust ETL (Extract, Transform, Load) processes and data pipelines to ensure efficient data flow from various sources into our data warehouse.
2. Data Architecture: Design and implement data architectures that support the storage, retrieval, and analysis of large datasets, while ensuring data security and compliance.
3. Data Integration: Integrate data from diverse sources, including databases, APIs, streaming data, and more, to create a unified and accessible data ecosystem.
4. Performance Optimization: Continuously improve the performance and scalability of data systems, including query optimization and infrastructure enhancements.
5. Data Quality: Implement data quality checks and data cleansing processes to maintain high-quality, gold-standard data in the data warehouse.
6. Monitoring and Maintenance: Proactively monitor data pipelines and systems, troubleshoot and resolve issues, and ensure data availability and reliability.
7. Documentation: Maintain thorough documentation of data infrastructure, processes, and best practices for knowledge sharing.
8. Collaboration: Collaborate with cross-functional teams, including data scientists, analysts, and embedded and software engineers, to understand data requirements and deliver data solutions.
9. Data Governance: Implement data governance policies and standards to ensure data security, privacy, and compliance with regulatory requirements and audits in the highly regulated Life Sciences field.
10. Technology Evaluation: Stay up-to-date with the latest data engineering technologies and evaluate their suitability for our infrastructure.
Qualifications:
- Bachelor's degree with 7+ years of experience, or Master's degree with 5+ years of experience, in Computer Science, Information Technology, Engineering, or a related field.
- Proven experience as a Data Engineer in a complex data environment.
- Strong proficiency in data warehousing and ETL tools (e.g., Informatica, SQL, Apache Spark, Hadoop).
- Proficiency in programming languages such as Python, Java, or Scala.
- Expertise in database management systems (e.g., relational/SQL, NoSQL, and columnar databases).
- Experience with cloud platforms (e.g., AWS, Azure, GCP) and cloud data warehouses (e.g., BigQuery, Redshift, Snowflake).
- Excellent problem-solving and analytical skills.
- Strong communication and teamwork skills.
- Knowledge of data governance, security, and compliance best practices.
Preferred Qualifications:
- Data engineering certifications.
- Experience with real-time data processing and streaming platforms (e.g., Apache Kafka).
- Familiarity with data orchestration and workflow management tools.
- Experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
- Experience with operations and MES (Manufacturing Execution System) platforms is a strong plus.