Data Architect - IT

9 - 15 Years

Any Location/Delhi NCR/Noida/Greater Noida/Gurgaon/Gurugram/Bangalore/Mumbai/Pune/Ghaziabad/Faridabad/Navi Mumbai

Posted 1 year ago

#Data Modeling #Data Management #Big Data #Python #Data Governance

Job Description

As Data Architect, you will be responsible for the data platform and solutions for Indian and international SaaS products. You will architect, design, estimate, develop and deploy cutting-edge software products and services that leverage large-scale data ingestion, processing, storage and querying, as well as in-stream and batch analytics, for cloud and on-prem environments.

Your role will focus on delivering high-quality solutions while driving design discussions across Data Engineering topics (ingestion, consumption, storage, computation, data models, performance, DevOps, DataOps, Data Mesh, test automation and security).

As a hands-on technologist and problem solver, you will lead a highly talented team of Data Engineers who are passionate about building the best possible solutions for our clients, and you will promote a culture of lifelong learning and collaboration.

Requirements

- Degree or foreign equivalent in Computer Science, Applied Computer Science or Computer Information Systems.

- Overall, ten to twelve years of experience working on Data Platforms, Data Engineering, Database management/design, or Systems engineering.

- Four or more years of experience as a Data Architect, leading medium to large teams to deliver data solutions

- Extensive experience with Data related technologies, including knowledge of Big Data Architecture Patterns, Data Mesh and Cloud services (AWS - mandatory / Azure / GCP)

- Experience delivering end-to-end Big Data solutions on Cloud

- Knowledge of the pros and cons of various database technologies like Relational, NoSQL, MPP, and Columnar databases

- Expertise in the Hadoop ecosystem with one or more distributions, such as Cloudera or cloud-specific distributions

- Expertise in one or more NoSQL databases (MongoDB, Cassandra, HBase, DynamoDB, Bigtable etc.)

- Experience with Data Lake/Hadoop platform implementation.

- Experience with one or more big data ingestion tools (Sqoop, Flume, NiFi etc.) and distributed messaging and ingestion frameworks (Kafka, Pulsar, Pub/Sub etc.)

- Knowledge of flexible, scalable data models that address a wide variety of consumption patterns, including random access and sequential access, along with necessary optimisations like bucketing, aggregating, and sharding.

- Knowledge of performance tuning, optimisation, and scaling solutions from a storage/processing standpoint

- Experience building DevOps pipelines for data solutions, including automated testing

- Experience supporting and working with cross-functional teams in a dynamic environment

- Experience working in Agile Scrum Methodology.

- Passionate about learning new technologies and encouraging the teams to do the same

- Excellent verbal and written communication skills in English

- Effective communication with team members and stakeholders

Good to have

- Knowledge of Domain Driven Design

- Knowledge of containerisation, orchestration, and Kubernetes engine

- An understanding of how to set up big-data cluster security (authorization/authentication, security for data at rest and data in transit)

- A basic understanding of how to set up and manage monitoring and alerting for big-data clusters

- Experience with orchestration tools - Oozie, Apache Airflow, Control-M or similar

- Experience with MPP-style query engines like Impala, Presto, Athena etc.

- Hands-on experience implementing and performance-tuning Hadoop/Spark deployments.

- Experience with Apache Hadoop and the Hadoop ecosystem

- Experience with one or more relevant tools (Sqoop, Flume, Kafka, Oozie, Hue, Zookeeper, HCatalog, Solr, Avro)

- Knowledge of multi-dimensional modelling, such as star schema, snowflake schema, and normalised and denormalised models

- Expertise with at least one distributed data processing framework, e.g., Spark (Core, Streaming, SQL), Storm, Flink etc.

- Proficiency in Java and Scala programming languages (Python a plus)

- Exposure to data governance, catalog, lineage, and associated tools would be an added advantage

- Exposure to implementing data security and regulations, e.g., GDPR, PCI DSS, ISO 27001 etc.

- A certification in one or more cloud platforms or big data technologies

- Any active participation in the Data Engineering thought community (e.g., blogs, keynote sessions, POV/POC, hackathon)

Location

- Remote (preferably Delhi NCR)