PETADATA
Location
McLean, VA | United States
Job description
POSITION: ETL / Spark Developer
LOCATION: McLean, VA (Hybrid)
Experience: 10+ years
Work type: Full-time (W2 / C2C)
PETADATA is currently looking to hire for the ETL / Spark Developer role for one of their clients.
Roles & Responsibilities:
Experience with both streaming and batch workflows is essential to ensure the efficient flow and processing of data to support our clients.
Collaborate with cross-functional teams to understand data requirements and design robust data architecture solutions.
Design, develop, and implement scalable data processing solutions using Apache Spark.
Keep projects well organized and structured.
Ensure data quality, integrity, and consistency throughout the ETL pipeline.
Integrate data from different systems and sources to provide a unified view for analytical purposes.
Collaborate with data analysts to implement solutions that meet their data integration needs.
Design and implement streaming workflows using PySpark Streaming or other relevant technologies.
Develop batch processing workflows for large-scale data processing and analysis.
Analyze business requirements to determine the volume of data extracted from different sources and the data models involved, and to ensure the quality of that data.
Determine the most suitable storage medium for the data warehouse.
Identify data storage needs and the amount of data required to meet the company's requirements.
Ensure data quality at the transformation stage by eliminating errors and fixing unstructured or unorganized extracted data.
Ensure that data is loaded into the warehouse system and meets business needs and standards.
Validate data flows and build a secure data warehouse that meets the company's needs and standards.
Determine the storage needs of the business and the volume of data involved.
Required skills:
Should have 10+ years of experience in implementing ETL processes to extract, transform, and load data from various sources to ensure data quality, integrity, and consistency throughout the ETL pipeline.
Must have expertise in Python, PySpark, ETL processes, and CI/CD (Jenkins or GitHub).
Experienced in Python and PySpark to develop efficient data processing and analysis scripts.
Able to optimize code for performance and scalability and to keep up to date with the latest industry best practices.
Must be able to load data and be proficient in technical skills such as SQL, Java, XML, and DOM, among others.
Extensive knowledge and experience with Spark and its technologies.
Hands-on experience with the Apache Spark framework, including the Spark SQL module for querying databases.
Good knowledge of data analysis and design, with programming skills such as JavaScript, SQL, XML, and DOM.
Familiar with coding languages such as HTML, CSS, JavaScript, Python, Java, Scala, or R.
Should have experience writing clean, bug-free code that other developers can reproduce.
Experience in managing SQL databases and organizing big data.
Hands-on experience with ETL tools and platforms such as MS SQL, SSIS (SQL Server Integration Services), Python/Perl, Oracle, and SQL Server/MySQL.
Solid understanding of data warehousing schemas, dimensional modeling, and implementing data storage solutions to support efficient data retrieval and analysis.
Must be an expert in debugging ETL processes, optimizing data flows, and ensuring that the data pipeline is robust and error-free.
Good verbal and written communication skills.
Educational Qualification:
Bachelor's/Master's degree in Computer Science, Engineering, or a related field.
We offer a professional work environment, and employees are given every opportunity to grow in the information technology world.
Note:
Candidates are required to attend phone, video call, or in-person interviews. After selection, the candidate must go through all background checks on education and experience.
Please email your resume to: [email protected]
After carefully reviewing your experience and skills, one of our HR team members will contact you about the next steps.