This position is for a Technical Leader on the Data Platform team who has in-depth knowledge of Data Lakes and Data Warehousing using Dimensional Modelling and has worked with large datasets in newer data technologies.
Briefly, the responsibilities of this role are as follows:
- Collaborate with Product Management to drive/understand business requirements and design appropriate solutions
- Implement and enhance data ingestion frameworks and pipelines
- Own end-to-end implementation for building and enhancing the Data Lake and its related ecosystem
- Envision and influence technical standards for project deliverables
- Periodically review architecture, data models, and ETL design strategy
- Define and maintain the integrity of the overall Data Platform
- Continuously assess current business processes and technologies and influence change wherever appropriate
- Work with cross-functional teams to ensure the necessary data and analysis environments are available
- Execute technical user stories within each sprint
- Mentor software engineers in data technologies
Must-Have:
- 7+ years' professional experience in Data Warehousing and Data Lakes; knowledge of Big Data pipelines using Airflow or similar tools is a plus
- Advanced proficiency in SQL and experience with semi-structured data
- ETL/data processing technology for Data Warehouses/Data Lakes: Apache Spark, Talend, or any other open-source ETL tool (at least 4 years of experience with any of these tools)
- Expertise in data modeling techniques, especially dimensional modeling
- Knowledge of and experience with datastores such as SQL Server and PostgreSQL, and cloud datastores like Snowflake Data Warehouse and Big Data stores
- Understanding of continuous integration to drive the DevOps implementation on respective projects using tools such as GitLab, Airflow, and Jenkins
- Must have a high degree of initiative to implement solutions in a fast-paced, dynamic environment.
- Must be able to understand business requirements and translate them into technical deliverables.
- Experience integrating large datasets from a variety of sources, each with varying degrees of data quality and cleanliness
- Requires strong analytical, conceptual and problem-solving abilities
- Experience working with remote teams.
- Must have excellent communication skills.
- Familiarity with Agile concepts and experience practicing Scrum.
Nice-to-Have:
- Knowledge of real-time data streaming technologies such as Apache Kafka, AWS Kinesis, and Spark Streaming
- Proficiency in Python and frameworks such as PySpark
- Knowledge of dbt
- Knowledge of data cataloging