Koantek
Location
Hyderabad | India
Job description
This is a remote position.
Job Description Summary
The Databricks AWS/Azure/GCP Architect at Koantek builds secure, highly scalable big data solutions that achieve tangible, data-driven outcomes while keeping simplicity and operational effectiveness in mind. This role collaborates with teammates, product teams, and cross-functional project teams to lead the adoption and integration of the Databricks Lakehouse Platform into the enterprise ecosystem and AWS/Azure/GCP architecture. This role is responsible for implementing securely architected big data solutions that are operationally reliable and performant and that deliver on strategic initiatives.
Specific requirements for the role include:
Expert-level knowledge of data frameworks, data lakes, and open-source projects such as Apache Spark, MLflow, and Delta Lake
Expert-level, hands-on coding experience in Spark/Scala, Python, or PySpark
Expert proficiency in Python, C, Java, R, and SQL
Mid-level knowledge of code versioning tools such as Git, Bitbucket, or SVN
In-depth understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, RDD caching, and Spark MLlib
Experience with IoT, event-driven, and microservices architectures in the cloud, including private and public cloud architectures, their pros and cons, and migration considerations
Extensive hands-on experience implementing data migration and data processing using AWS/Azure/GCP services
Expertise in using Spark SQL with various data sources such as JSON, Parquet, and key-value pair formats
Extensive hands-on experience with the industry technology stack for data management, data ingestion, capture, processing, and curation: Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.
Experience using Azure DevOps and CI/CD, as well as Agile tools and processes, including Git, Jenkins, Jira, and Confluence
Experience in creating tables, partitioning, bucketing, loading, and aggregating data using Spark SQL/Scala
Ability to build ingestion to ADLS and enable the BI layer for analytics
Experience with Machine Learning Studio, Stream Analytics, Event/IoT Hubs, and Cosmos DB
Strong understanding of data modeling and of defining conceptual, logical, and physical data models
Proficient-level experience with the architecture, design, build, and optimization of big data collection, ingestion, storage, processing, and visualization
Working knowledge of RESTful APIs, the OAuth2 authorization framework, and security best practices for API gateways
Familiarity with unstructured data sets (e.g., voice, image, log files, social media posts, email)
Experience handling escalations of customers' operational issues
Responsibilities:
Work closely with team members to lead and drive enterprise solutions, advising on key decision points regarding trade-offs, best practices, and risk mitigation
Guide customers in transforming big data projects, including the development and deployment of big data and AI applications
Educate clients on cloud technologies and influence the direction of the solution
Promote, emphasize, and leverage big data solutions to deploy performant systems that appropriately auto-scale and are highly available, fault-tolerant, self-monitoring, and serviceable
Use a defense-in-depth approach in designing data solutions and AWS/Azure/GCP infrastructure
Assist and advise data engineers in the preparation and delivery of raw data for prescriptive and predictive modeling
Aid developers in identifying, designing, and implementing process improvements with automation tools to optimize data delivery
Build the infrastructure required for optimal extraction, loading, and transformation of data from a wide variety of data sources
Work with developers to maintain and monitor scalable data pipelines
Perform root cause analysis to answer specific business questions and identify opportunities for process improvement
Build out new API integrations to support continuing increases in data volume and complexity
Implement processes and systems to monitor data quality and security, ensuring that production data is accurate and available for key stakeholders and the business processes that depend on it
Employ change management best practices to ensure that data remains readily accessible to the business
Maintain tools, processes, and associated documentation to manage API gateways and the underlying infrastructure
Implement reusable design templates and solutions to integrate, automate, and orchestrate cloud operational needs
Experience with master data management (MDM) using data governance solutions
Qualifications:
Overall experience of 12 years in the IT field.
2 years of hands-on experience designing and implementing multi-tenant solutions using Azure Databricks for data governance, data pipelines for a near real-time data warehouse, and machine learning solutions.
3 years of design and development experience with scalable and cost-effective Microsoft Azure/AWS/GCP data architecture and related solutions
5 years of experience in a software development, data engineering, or data analytics field using Python, Scala, Spark, Java, or equivalent technologies
Bachelor's or Master's degree in Big Data, Computer Science, Engineering, Mathematics, or a similar area of study, or equivalent work experience
Nice to have:
Advanced technical certifications: Azure Solutions Architect Expert
AWS Certified Data Analytics, DASCA Big Data Engineering and Analytics
AWS Certified Cloud Practitioner, Solutions Architect
Google Cloud Certified Professional