The developer must have sound knowledge of Apache Spark and Python programming.
Deep experience in developing data processing tasks using PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations.
Ability to design, build, and unit test applications in Spark/PySpark.
In-depth knowledge of Hadoop, Spark, and similar frameworks.
Ability to understand existing ETL logic and convert it into Spark/PySpark/Spark SQL.