Proven experience as a Data Engineer with a focus on Spark and Scala.
Strong programming skills in Scala and experience with Apache Spark for large-scale data processing.
Knowledge of distributed computing concepts and experience optimizing Spark jobs for performance (a common optimization is sketched after this list).
Experience with data modeling, ETL processes, and data warehousing concepts.
Familiarity with big data technologies such as Hadoop, Hive, and HDFS.
Scala development and design experience with Scala 2.10+, or Java development and design experience with Java 1.8+.
Experience with most of the following technologies: Apache Hadoop, Scala, Apache Spark, PySpark, Spark Streaming, YARN, Kafka, Hive, Python, ETL frameworks, MapReduce, SQL, and RESTful services (a Kafka streaming read is sketched after this list).
Sound working knowledge of the Unix/Linux platform.
Hands-on experience building data pipelines using Hadoop components: Hive, Spark, Spark SQL, and PySpark (a Hive-backed pipeline is sketched after this list).
Experience with industry-standard version control tools (Git, GitHub), automated deployment tools (Ansible and Jenkins), and requirement management in JIRA.
Understanding of big data modelling using both relational and non-relational techniques.
Experience debugging code issues and communicating the highlighted differences to the development team and architects.
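
For illustration, a minimal sketch of one common Spark performance optimization referenced above: broadcasting a small dimension table to avoid a shuffle during a join. The paths, table names, and join key (orders, countries, country_code) are hypothetical placeholders, not part of this role description.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    object BroadcastJoinSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("broadcast-join-sketch")
          .getOrCreate()

        val orders    = spark.read.parquet("/data/orders")     // large fact table
        val countries = spark.read.parquet("/data/countries")  // small lookup table

        // broadcast() ships the small table to every executor, replacing a
        // shuffle-heavy sort-merge join with a map-side broadcast hash join.
        val enriched = orders.join(broadcast(countries), Seq("country_code"))

        enriched.write.mode("overwrite").parquet("/data/orders_enriched")
        spark.stop()
      }
    }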
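
A minimal sketch of reading from Kafka with Spark Structured Streaming, as mentioned in the technology list above. It assumes the spark-sql-kafka-0-10 connector is on the classpath; the broker address and topic name ("events") are placeholders.

    import org.apache.spark.sql.SparkSession

    object KafkaStreamSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("kafka-stream-sketch")
          .getOrCreate()

        // Read a live stream from a Kafka topic.
        val stream = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "events")
          .load()

        // Kafka delivers keys and values as binary, so cast before processing.
        val values = stream.selectExpr("CAST(value AS STRING) AS raw_value")

        // Print each micro-batch to the console, purely for demonstration.
        val query = values.writeStream
          .format("console")
          .outputMode("append")
          .start()

        query.awaitTermination()
      }
    }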
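
A minimal sketch of the kind of Hive-backed ETL pipeline referenced above, assuming Hive support is configured on the cluster. The database and table names (raw.events, analytics.daily_counts) and columns are hypothetical.

    import org.apache.spark.sql.SparkSession

    object DailyCountsPipeline {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("daily-counts-sketch")
          .enableHiveSupport() // lets Spark SQL read and write Hive tables
          .getOrCreate()

        // Extract: pull raw events from a Hive table.
        val events = spark.sql("SELECT event_type, event_date FROM raw.events")

        // Transform: count events per type per day.
        val daily = events.groupBy("event_date", "event_type").count()

        // Load: persist the aggregate back to Hive.
        daily.write.mode("overwrite").saveAsTable("analytics.daily_counts")
        spark.stop()
      }
    }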