Role Overview: Explore big data to surface useful trends, signals, and segments.
- The role drives business and industry solutions focused on Big Data and Advanced Analytics, in diverse domains such as product development, pricing, marketing research, public policy, optimization and risk management.
- The role uses analytics to provide predictive, prescriptive, and decisive insight.
Role Summary Description:
- Analyze and model structured data using advanced statistical methods
- Implement algorithms and software needed to perform analyses.
- Build recommendation engines, spam classifiers, sentiment analyzers, and classifiers for unstructured and semi-structured data
- Analyze data using R, Python, Java, open source packages and commercial/enterprise applications.
- Cluster large amounts of user-generated content
- Process data in large-scale environments such as Amazon EC2, Storm, Hadoop, and Spark
- Interface with data stores (SQL, NoSQL, HDFS) to extract, transform, and load data
- Apply machine learning, natural language processing, and statistical analysis methods, such as classification, collaborative filtering, association rules, sentiment analysis, topic modeling, time-series analysis, regression, statistical inference, and validation methods.
- Drive client engagements focused on Big Data and Advanced Business Analytics, in diverse domains such as product development, marketing research, public policy, optimization, and risk management.
- Communicate results and educate others through reports and presentations.
- Perform exploratory data analyses, generate and test working hypotheses, prepare and analyze historical data, and identify patterns.
Functional/Technical Skills: (across most levels)
- Ability to break down complex problems and develop strategies to solve them
- Master's degree or PhD in Computer Science, Statistics, Mathematics, Engineering, Bioinformatics, Physics, Operations Research, or a related field, with 2+ years of relevant experience
- Expertise in at least one of the following fields: machine learning, data visualization, statistical modeling, data mining, or information retrieval
- Ability to develop and apply machine learning and statistical analysis methods, such as classification, collaborative filtering, association rules, time-series analysis, advanced regression methods, and hypothesis testing
Experience working with large datasets and problems:
- Strong data extraction and processing skills; experience with MapReduce, Pig, and/or Hive preferred
- Experience with command-line scripting, data structures and algorithms
- Knowledge of search engines, spam detection, recommendation systems, and/or social networks
- Ability to work in a Linux environment, and process large amounts of data in a cloud environment
- Proficiency in a modern programming language such as Ruby, Python, Java, or C++
- Strong mathematical background, with the ability to understand algorithms and methods from both a mathematical and an intuitive viewpoint.
- Proficiency in analysis packages (e.g., R, SAS, MATLAB) and programming languages (e.g., Java, Python, Ruby).
- Ability to implement, maintain, and troubleshoot big data infrastructure, including distributed processing, stream processing, and databases (e.g., Hadoop, Storm, SQL, Solr).
- Broad understanding of the various commercial distributions of the Apache Hadoop framework (e.g., MapR, Cloudera, Hortonworks).