GCP Data Architect (Remote)
SPAR Information Systems LLC
Location
Texas | United States
Job description
Role: Senior Data Architect (GCP)
Location: Remote
Duration: 12+ Months
Key responsibilities:
- Lead the requirement gathering process and create comprehensive technical designs (high-level and detailed).
- Develop a robust data ingestion framework for diverse data sources.
- Participate actively in architectural discussions, perform system analysis by reviewing existing systems and operational methodologies, and evaluate emerging technologies to propose optimal solutions that address current needs and simplify future modifications.
- Design data models suitable for transactional and big data environments that can serve as inputs to machine learning processing.
- Design and build the necessary infrastructure to enable efficient ETL from various data sources, leveraging GCP services.
- Develop data and semantic interoperability specifications.
- Collaborate with the business to define and scope project requirements.
- Partner with external vendors to facilitate data acquisition.
- Analyze existing systems to identify suitable data sources.
- Implement and continuously improve data automation processes.
- Champion continuous improvement in DevOps automation.
- Provide design expertise in Master Data Management, Data Quality, and Meta Data Management.
Required Skills:
- Active Google Cloud Data Engineer or Google Professional Cloud Architect Certification.
- Minimum 8 years of experience designing, building, and operationalizing large-scale enterprise data solutions and applications using GCP data and analytics services alongside third-party tools (Spark, Hive, Cloud Dataproc, Cloud Dataflow, Apache Beam, Cloud Composer, Cloud Bigtable, BigQuery, Cloud Pub/Sub, Cloud Storage, Cloud Functions, and GitHub).
- Minimum 5 years of experience performing detailed assessments of current data platforms and crafting strategic plans for migrating them to GCP.
- Strong Python development experience (mandatory).
- 2+ years of data engineering experience with distributed architectures, ETL, EDW, and big data technologies.
- Demonstrated knowledge of and experience with Google BigQuery (mandatory).
- Experience with Cloud Dataproc and Cloud Dataflow using Java on GCP.
- Experience with serverless data warehousing concepts on Google Cloud.
- Experience with data warehouse/business intelligence (DW/BI) modelling frameworks.
- Strong understanding of Oracle databases and familiarity with GoldenGate is highly desired.
- Expertise in Debezium and Apache Flink for change data capture and processing.
- Experience working with both structured and unstructured data sources using cloud analytics platforms (e.g., Cloudera, Hadoop).
- Experience with Data Mapping and Modelling.
- Experience with Data Analytics tools.
- Proven proficiency in one or more programming/scripting languages: Python, JavaScript, Java, R, UNIX shell, PHP, or Ruby.
- Experience with Google Cloud services: streaming and batch processing, Cloud Storage, Cloud Dataflow, Cloud Dataproc, Cloud Functions, BigQuery, and Bigtable.
- Knowledge and proven use of contemporary data mining, cloud computing, and data management tools: Microsoft Azure, AWS, Google Cloud, Hadoop, HDFS, MapR, and Spark.
- Bachelor's degree, or equivalent work experience (minimum 12 years).
Arvind Kumar Bind
|| SPAR Information Systems || Phone: 469-750-0607 || Email: [email protected] ||