AM/Manager - Data Scientist - BFSI/Healthcare

6 - 12 Years

Bangalore

Posted 9 years ago

#Analytics #SAS #Statistics #Data Management #Big Data

Role Overview: Explore big data to surface useful trends, signals, and segments.

- The role drives business and industry solutions focused on Big Data and Advanced Analytics, in diverse domains such as product development, pricing, marketing research, public policy, optimization and risk management.

- The role uses analytics to provide predictive, prescriptive, and decisive insight.

Role Summary Description:

- Analyze and model structured data using advanced statistical methods

- Implement algorithms and software needed to perform analyses.

- Build recommendation engines, spam classifiers, sentiment analyzers, and other classifiers for unstructured and semi-structured data

- Analyze data using R, Python, Java, open-source packages, and commercial/enterprise applications.

- Cluster large amounts of user-generated content

- Process data in large-scale environments such as Amazon EC2, Storm, Hadoop, and Spark

- Interface with databases (SQL, NoSQL, HDFS) to extract, transform, and load data

- Apply machine learning, natural language processing, and statistical analysis methods, such as classification, collaborative filtering, association rules, sentiment analysis, topic modeling, time-series analysis, regression, statistical inference, and validation methods.

- Drive client engagements focused on Big Data and Advanced Business Analytics, in diverse domains such as product development, marketing research, public policy, optimization, and risk management.

- Communicate results and educate others through reports and presentations.

- Perform exploratory data analyses, generate and test working hypotheses, prepare and analyze historical data, and identify patterns.

Functional/Technical Skills: (across most levels)

- Ability to break down complex problems and develop strategies

- Master's degree or PhD in Computer Science, Statistics, Mathematics, Engineering, Bioinformatics, Physics, Operations Research, or related fields, with 2+ years of relevant experience

- Expertise in at least one of the following fields: machine learning, data visualization, statistical modeling, data mining, or information retrieval

- Ability to develop and apply machine learning and statistical analysis methods, such as classification, collaborative filtering, association rules, time-series analysis, advanced regression methods, and hypothesis testing

Experience working with large datasets and problems:

- Strong data extraction and processing skills; experience with MapReduce, Pig, and/or Hive preferred

- Experience with command-line scripting, data structures and algorithms

- Knowledge of search engines, spam detection, recommendation systems, and/or social networks

- Ability to work in a Linux environment, and process large amounts of data in a cloud environment

- Proficiency in a modern programming language such as Ruby, Python, Java, or C++.

- Strong mathematical background, with the ability to understand algorithms and methods from both a mathematical and an intuitive viewpoint.

- Proficiency in analysis packages (e.g. R, SAS, MATLAB) and programming languages (e.g. Java, Python, Ruby).

- Ability to implement, maintain, and troubleshoot big data infrastructure, including distributed processing, stream processing, and databases (e.g. Hadoop, Storm, SQL, Solr).

- Broad understanding of the various commercial distributions of the Apache Hadoop framework, e.g. MapR, Cloudera, Hortonworks.