Job Description
As Data Architect, you will be responsible for the data platform and solutions for Indian and international SaaS products. You will architect, design, estimate, develop and deploy cutting-edge software products and services that leverage large-scale data ingestion, processing, storage and querying, and in-stream and batch analytics for Cloud and on-prem environments.
Your role will be focused on delivering high-quality solutions while driving design discussions across Data Engineering topics (ingestion, consumption, storage, computation, data models, performance, DevOps, DataOps, Data mesh, Test automation & Security).
As a hands-on technologist and problem solver, you will lead a highly talented team of Data Engineers who are passionate about building the best possible solutions for our clients, and you will foster a culture of lifelong learning and collaboration.
Requirements
- Degree or foreign equivalent in Computer Science, Applied Computer Science or Computer Information Systems.
- Overall, ten to twelve years of experience working on Data Platforms, Data Engineering, Database management/design, or Systems engineering.
- Four or more years of experience as a Data Architect, leading medium to large teams to deliver data solutions
- Extensive experience with data-related technologies, including knowledge of Big Data Architecture Patterns, Data Mesh and Cloud services (AWS - mandatory / Azure / GCP)
- Experience delivering end-to-end Big Data solutions on Cloud
- Knowledge of the pros and cons of various database technologies like Relational, NoSQL, MPP, and Columnar databases
- Expertise in the Hadoop ecosystem with one or more distributions, such as Cloudera or cloud-specific distributions
- Expertise in one or more NoSQL databases (MongoDB, Cassandra, HBase, DynamoDB, Bigtable etc.)
- Experience with Data Lake/Hadoop platform implementation.
- Experience with one or more big data ingestion tools (Sqoop, Flume, NiFi etc.), distributed messaging and ingestion frameworks (Kafka, Pulsar, Pub/Sub etc.)
- Knowledge of flexible, scalable data models addressing a wide variety of consumption patterns, such as random access and sequential access, along with necessary optimisations like bucketing, aggregating, and sharding.
- Knowledge of performance tuning, optimisation, and scaling solutions from a storage/processing standpoint
- Experience building DevOps pipelines for data solutions, including automated testing
- Experience supporting and working with cross-functional teams in a dynamic environment
- Experience working in Agile Scrum Methodology.
- Passionate about learning new technologies and encouraging the teams to do the same
- Excellent verbal and written communication skills in English
- Effective communication with team members and stakeholders
Good to have
- Knowledge of Domain Driven Design
- Knowledge of containerisation, orchestration, and Kubernetes engine
- An understanding of how to set up big-data cluster security (Authorization/Authentication, Security for data at rest, data in transit)
- A basic understanding of how to set up and manage monitoring and alerting for big-data clusters
- Experience with orchestration tools - Oozie, Apache Airflow, Control-M or similar
- Experience with MPP-style query engines like Impala, Presto, Athena etc.
- Hands-on experience in the implementation and performance tuning of Hadoop/Spark solutions.
- Experience with Apache Hadoop and the Hadoop ecosystem
- Experience with one or more relevant tools (Sqoop, Flume, Kafka, Oozie, Hue, Zookeeper, HCatalog, Solr, Avro)
- Knowledge of multi-dimensional modelling, such as star schema, snowflake schema, and normalised and denormalised models
- Expertise with at least one distributed data processing framework, e.g., Spark (Core, Streaming, SQL), Storm, Flink etc.
- Proficiency in Java and Scala programming languages (Python a plus)
- Exposure to data governance, catalog, lineage, and associated tools would be an added advantage
- Exposure to implementing data security and regulations, e.g., GDPR, PCI DSS, ISO 27001 etc.
- A certification in one or more cloud platforms or big data technologies
- Any active participation in the Data Engineering thought community (e.g., blogs, keynote sessions, POV/POC, hackathon)
Location
- Remote (preferably Delhi NCR)