Saras Analytics - Principal Data Engineer (7-10 yrs)
Job Description :
Principal Data Engineer
About Saras Analytics :
Saras Analytics is a fast-growing data management and predictive analytics company based in Hyderabad, India. We are a group of engineers and analysts focused on improving business performance through data analysis. We are laser-focused on providing the best ROI for our clients and leave no stone unturned in our quest to deliver the best results for our customers. We are an employee-centric organization and are looking for individuals who share our passion to make a difference and want to be an integral part of our growth.
As a Principal Data Engineer at Saras Analytics, you will be responsible for architecting, building, and maintaining scalable data pipelines that process large volumes of data.
Architect : 80%
- Formulate and recommend standards for achieving maximum performance and efficiency of the DW ecosystem.
- Participate in planning for the retirement of systems and programs and the migration of systems infrastructure.
- Develop business cases and ROI for the department to get buy-in from senior management.
- Interview stakeholders and develop BI roadmap for success given project prioritization and budget.
- Evangelize self-service BI and visual discovery while helping to change an Excel-based culture.
- Work closely with the DW manager to ensure prioritization of corporate and departmental objectives and projects.
- Champion data quality, integrity, and reliability throughout the department by designing and promoting best practices.
- Experience in database programming using multiple aspects of SQL and Python
- Understand and translate data requirements, analytical requirements, and functional needs into technical requirements
- Build, maintain, and deploy scalable data pipelines to support large-scale data management projects
- Ensure alignment with data strategy and standards of data processing
- Experience in Big Data ecosystem - on-prem (Hortonworks/MapR) or Cloud (Dataproc/EMR/HDInsight)
- Experience in Hadoop, Pig, SQL, Hive, Sqoop and SparkSQL
- Experience in any orchestration/workflow tool such as Airflow/Oozie for scheduling pipelines
- Exposure to latest cloud ETL tools such as Glue/ADF/Dataflow
- Understand and work with in-memory distributed computing frameworks such as Spark (and/or Databricks), including parameter tuning and writing optimized Spark queries
- Hands-on experience with Spark Streaming, Kafka, and HBase
- BE/BS/MTech/MS in computer science or equivalent work experience.
Team Mentoring: 20%
- Assist DW team members with issues that require specialized technical expertise or knowledge of complex systems and/or programming. Provide on-the-job training for new or less experienced team members.
- Provide technical training to external team members to foster stronger cross-departmental relations.
- Expertise in data structures, distributed computing, and manipulating and analyzing complex, high-volume data from a variety of internal and external sources
- Experience in building structured and unstructured data pipelines
- Proficiency in a programming language such as Python or Scala
- Good understanding of data analysis techniques
- Strong presentation and collaboration skills; able to communicate all aspects of the job requirements, including the creation of formal documentation
- Strong problem-solving, time-management, and organizational skills
- Knowledge of the e-commerce industry
- Good understanding of relational/dimensional modelling and ETL concepts
- Familiarity with reporting tools such as Looker, Tableau, QlikView, or Power BI
Qualifications preferred :
- 4 - 7 years of experience
- Category: Bachelor's Degree, Master's Degree
- Field specialization: Computer Science / IT
- Degree: Bachelor of Engineering - BE, Bachelor of Science - BS, Master of Engineering - MEng, Master of Science - MS