Design, build, test, and deploy cutting-edge solutions at scale, impacting millions of customers worldwide, and drive value from data at Walmart scale.
Develop high-performance, scalable solutions that extract, transform, and load big data using Spark.
Perform root cause analysis on data and processes to answer specific business questions and identify opportunities for improvement.
Build and optimize Spark/big data pipelines, architectures, and data sets involving terabytes to petabytes of data.
Interact with Walmart engineering teams across geographies to leverage expertise and contribute to the tech community.
Engage with Product Management and Business to drive the agenda, set priorities, and deliver awesome product features that keep the platform ahead of the market.
Interact closely with Data Engineers across Walmart to identify the right open-source tools for delivering product features through research and POCs/pilots.
Engage with Product Management and Business to support and build data solutions, developing deep expertise with the data and becoming the go-to data analyst.
You also get to collaborate with team members to develop best practices and client requirements for the software.
You will analyze and test programs/products before formal launch to ensure flawless performance.
Software security is of prime importance; by developing programs that monitor the sharing of private information, you will add tremendous credibility to your work.
You will also be required to seek ways to improve the software and its effectiveness.
You will be called upon to support the coaching and training of other team members to ensure all employees are confident in the use of software applications.
What you'll bring:
Minimum qualifications:
Bachelor's or master's degree in computer science or a related technical field.
6-9 years of overall object-oriented programming experience in Python/Scala.
6-9 years of experience building large-scale data pipelines using big data technologies (e.g., Spark, Kafka, Cassandra, Hadoop, Hive, Presto, Airflow).
6+ years of experience in systems design, algorithms, and distributed systems.
Good knowledge of PySpark, Parquet, and Python framework development.
Good knowledge of Azure DevOps and its integration with Databricks and Data Factory.
Good understanding of on-premises big data technology.
Good knowledge of GBQ or Azure Synapse.
Strong understanding of ETL and data warehousing concepts.
Strong experience connecting to and ingesting data from multiple source types into Azure ADLS or Google Cloud Storage.
Additional Qualifications:
Large scale distributed systems experience, including scalability and fault tolerance.
Exposure to cloud infrastructure such as OpenStack, Azure, GCP, or AWS.
A continuous drive to explore, improve, enhance, automate and optimize systems and tools.
Strong computer science fundamentals in data structures and algorithms.
Exposure to information retrieval, statistics, and machine learning.
Excellent oral and written communication skills.
Good understanding of metadata-driven development.
Excellent problem-solving, critical thinking, and analytical skills.