About Data Science

Data Science: Data is a core element for any business to grow and for any kind of research. That data may arrive as raw information, stream in continuously, or be stored in an enterprise data warehouse. Data science comprises various automated techniques and methods for analyzing huge amounts of data and extracting knowledge from it. Learning data science requires knowledge of several fields, including mathematics, statistics, and information science, along with computer science topics such as data mining, databases, visualization, cluster analysis, machine learning, and classification.

Data Scientist Course Content:

Basic Concepts of Statistics:
  1. Descriptive Statistics and Probability Distributions:
  • Introduction to Statistics
  • Different Types of Variables
  • Measures of Central Tendency with examples
  • Mean
  • Mode
  • Median
  • Measures of Dispersion
  • Range
  • Variance
  • Standard Deviation
  • Probability & Distributions
  • Probability Basics
  • Binomial Distribution and its properties
  • Poisson distribution and its properties
  • Normal distribution and its properties
  2. Inferential Statistics and Testing of Hypothesis
  • Sampling methods
  • Sampling and types of sampling
  • Definitions of Sample and Population
  • Importance of sampling in real time
  • Different methods of sampling
  • Simple Random Sampling with replacement and without replacement
  • Stratified Random Sampling
  • Different methods of estimation
  • Testing of Hypothesis & Tests
  • Null Hypothesis and Alternate Hypothesis
  • Level of Significance and P value
  • t-test and its properties
  • Chi-square test and its properties
  • Z-test
  • Analysis of Variance
  • F-test
  • One and Two way ANOVA
  3. Covariance & Correlation
  • Importance and Properties of Correlation
  • Types of Correlation with examples
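
The descriptive measures and the correlation topic listed above can be illustrated with a minimal Python sketch. The data values and variable names (hours_studied, exam_scores) are hypothetical and chosen only for illustration; the course itself may use different tools.

```python
import math
import statistics

# Hypothetical sample data, not taken from the course material.
hours_studied = [2, 4, 4, 5, 7, 8]
exam_scores = [50, 60, 62, 65, 80, 85]


def pearson_correlation(xs, ys):
    """Pearson correlation: co-variation divided by the product of the spreads."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(spread_x * spread_y)


# Measures of central tendency and dispersion for one variable.
summary = {
    "mean": statistics.mean(hours_studied),
    "median": statistics.median(hours_studied),
    "mode": statistics.mode(hours_studied),
    "range": max(hours_studied) - min(hours_studied),
    "variance": statistics.variance(hours_studied),  # sample variance (n - 1)
    "std_dev": statistics.stdev(hours_studied),      # sample standard deviation
}

# Strength of the linear relationship between the two variables.
r = pearson_correlation(hours_studied, exam_scores)

print(summary)
print(round(r, 4))
```

A value of r close to +1 indicates a strong positive linear relationship between the two variables.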
Predictive Modeling Steps and Methodology with Live example:
  • Data Preparation
  • Variable Selection
  • Transformation of the variables
  • Normalization of the variables
  • Exploratory Data analysis
  • Summary Statistics
  • Understanding the patterns of the data at single and multiple dimensions
  • Missing data treatment using different methods
  • Outlier identification and treatment
  • Visualization of the data using one-dimensional, two-dimensional, and multi-dimensional graphs: bar chart, histogram, box plot, scatter plot, bubble chart, word cloud, etc.
  • Model Development
  • Selection of the sample data
  • Selecting the appropriate model based on the requirement and data availability
  • Model Validation
  • Checking key statistical parameters
  • Validating the model results with the actual result
  • Model Implementation
  • Implementing the model for future prediction
  • Real-time telecom business use case with detailed explanation
  • Introduction to a couple of real-time use cases and solutions from the banking and retail domains using different statistical methods
Supervised Techniques:
  • Multiple Linear Regression
  • Linear Regression - Introduction - Applications
  • Assumptions of Linear Regression
  • Building Linear Regression Model
  • Understanding standard metrics (Variable significance, R-square/Adjusted R-Square, Global hypothesis, etc.)
  • Validation of Linear Regression Models (Re-running vs. Scoring)
  • Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers, etc.)
  • Interpretation of Results - Business Validation - Implementation on new data
  • Real-time case study from the manufacturing and telecom industries estimating future revenue using the models
  • Logistic Regression
  • Logistic Regression - Introduction - Applications
  • Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
  • Building Logistic Regression Model
  • Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow test, Gini, KS, Misclassification, etc.)
  • Validation of Logistic Regression Models (Re-running vs. Scoring)
  • Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, drivers, etc.)
  • Interpretation of Results - Business Validation - Implementation on new data
  • Real-time case study to predict customer churn in the banking and retail industries
  • Partial Least Squares Regression
  • Partial Least Squares Regression - Introduction - Applications
  • Difference between Linear Regression and Partial Least Squares Regression
  • Building PLS Model
  • Understanding standard metrics (Variable significance, R-square/Adjusted R-Square, Global hypothesis, etc.)
  • Interpretation of Results - Business Validation - Implementation on new data
  • Real-time example identifying the key factors that drive revenue
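
As a small illustration of the "Building Linear Regression Model" and R-square topics listed above, the sketch below fits a single-predictor least-squares line in plain Python. It is a simplification: the course covers multiple linear regression, and the data (ad_spend, revenue) and function names here are hypothetical.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data: advertising spend versus revenue.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_simple_linear_regression(ad_spend, revenue)
r2 = r_squared(ad_spend, revenue, slope, intercept)

# "Implementation on new data": score a spend value not seen during fitting.
predicted_revenue = intercept + slope * 6.0

print(slope, intercept, r2, predicted_revenue)
```

The fitted slope and intercept form the model equation, and scoring a new input with that equation corresponds to the "Implementation on new data" step.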
Variable Reduction Techniques
  • Factor Analysis
  • Principal Component Analysis
  • Assumptions of PCA
  • Working Mechanism of PCA
  • Types of Rotations
  • Standardization
  • Positives and Negatives of PCA
Supervised Techniques Classification:
  • CART
  • Difference between CHAID and CART
  • Random Forest
  • Decision tree vs. Random Forest
  • Data Preparation
  • Missing data imputation
  • Outlier detection
  • Handling imbalanced data
  • Random Record selection
  • Random Forest R parameters
  • Random Variable selection
  • Optimal number of variables selection
  • Calculating Out Of Bag (OOB) error rate
  • Calculating Out of Bag Predictions
  • A couple of real-time use cases from the telecom and retail industries: identification of churn
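
The "Random Record selection" and Out Of Bag (OOB) topics listed above rest on bootstrap sampling. The sketch below, in plain Python, shows only how records are drawn with replacement for one tree and which records are left out of bag. It does not train a forest or compute an OOB error rate, and the function name bootstrap_split and the record count are hypothetical.

```python
import random


def bootstrap_split(n_records, rng):
    """Draw n_records indices with replacement; the rest are out of bag."""
    in_bag = [rng.randrange(n_records) for _ in range(n_records)]
    out_of_bag = sorted(set(range(n_records)) - set(in_bag))
    return in_bag, out_of_bag


# Fixed seed so the sketch is repeatable.
rng = random.Random(42)
n_records = 1000
in_bag, out_of_bag = bootstrap_split(n_records, rng)

# Fraction of records never selected for this tree's training sample.
oob_fraction = len(out_of_bag) / n_records
print(oob_fraction)
```

For large samples the out-of-bag fraction is expected to be close to 1/e, about 37%. In a random forest, these left-out records are what each tree is evaluated on when the OOB error is calculated.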
Unsupervised Techniques:
  • Segmentation for Marketing Analysis
  • Need for segmentation
  • Criteria for segmentation
  • Types of distances
  • Clustering algorithms
  • Hierarchical clustering
  • K-means clustering
  • Deciding number of clusters
  • Case study
  • Business Rules Criteria
  • Real time use case to identify the Most Valuable revenue generating Customers.
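
The K-means clustering topic listed above can be sketched in plain Python on a single variable. This is a minimal illustration, assuming one-dimensional data and hand-picked starting centroids; the customer revenue figures and the function name k_means_1d are hypothetical.

```python
def k_means_1d(values, initial_centroids, max_iterations=100):
    """Basic k-means on one variable, using absolute distance."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters


# Hypothetical monthly revenue per customer.
customer_revenue = [10, 12, 11, 95, 100, 98, 50, 52]
centroids, clusters = k_means_1d(customer_revenue, initial_centroids=[10, 50, 100])

print(centroids)
print(clusters)
```

With these starting centroids the customers fall into low, medium, and high revenue segments. Results can differ with other starting centroids, which is one reason "Deciding number of clusters" and initialization are treated as separate topics.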
Time Series Analysis:
  • Forecasting - Introduction - Applications
  • Time Series Components (Trend, Seasonality, Cyclicity, and Level) and Decomposition
  • Basic Techniques
  • Averages
  • Smoothing, etc.
  • Advanced Techniques
  • AR Models
  • UCM
  • Hybrid Model
  • Understanding Forecasting Accuracy - MAPE, MAD, MSE, etc.
  • A couple of use cases forecasting the future sales of products
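
The averaging technique and the accuracy measures listed above (MAPE, MAD, MSE) can be illustrated with a short plain-Python sketch. The sales figures, the window size, and the function names are hypothetical, and MAD is taken here to mean the mean absolute deviation of the forecast errors.

```python
def moving_average_forecast(series, window):
    """Forecast each point as the mean of the previous `window` observations."""
    return [sum(series[i - window:i]) / window
            for i in range(window, len(series))]


def mad(actual, forecast):
    """Mean absolute deviation of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean squared error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error; actual values must be non-zero."""
    return 100 * sum(abs(a - f) / abs(a)
                     for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical monthly sales with a steady upward trend.
sales = [100, 110, 120, 130, 140, 150]
window = 3

forecasts = moving_average_forecast(sales, window)
actuals = sales[window:]  # the observations that the forecasts correspond to

print(forecasts)
print(mad(actuals, forecasts), mse(actuals, forecasts), mape(actuals, forecasts))
```

Because the series trends upward, the moving average lags behind the actual values, which shows up as a constant error in all three measures.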
Text Analytics:
  • Gathering text data from web and other sources
  • Processing raw web data
  • Collecting Twitter data with the Twitter API
  • Naïve Bayes Algorithm
  • Assumptions of Naïve Bayes
  • Processing of Text data
  • Handling Standard and Text data
  • Building Naïve Bayes Model
  • Understanding standard model metrics
  • Validation of the Models (Re-running vs. Scoring)
  • Sentiment analysis
  • Goal Setting
  • Text Preprocessing
  • Parsing the content
  • Text refinement
  • Analysis and Scoring
  • Healthcare industry use case: identifying patient sentiment about a specified hospital by extracting data from Twitter
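
The "Building Naïve Bayes Model" and sentiment analysis topics listed above can be sketched as a tiny word-count classifier in plain Python. This is a minimal sketch, assuming whitespace tokenization and add-one (Laplace) smoothing; the training sentences, labels, and function names are hypothetical and do not come from Twitter or from the course material.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """documents: list of (text, label) pairs. Returns the counts the model needs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability score."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = math.log(count / total_docs)  # log prior of the class
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # words never seen in training are ignored
            # Add-one smoothing avoids zero probabilities for unseen pairs.
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled comments about a hospital.
training_data = [
    ("staff were friendly and helpful", "positive"),
    ("great care and clean rooms", "positive"),
    ("long wait and rude staff", "negative"),
    ("dirty rooms and poor care", "negative"),
]
model = train_naive_bayes(training_data)

print(predict(model, "friendly staff and great care"))
print(predict(model, "rude staff and long wait"))
```

The classifier treats every word as independent of the others given the class, which is the central assumption of Naïve Bayes.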
Visualization Using Tableau:
  • Live connectivity from R to Tableau
  • Generating the Reports and Charts

Why SkillVidya

  • Live project oriented training
  • 24/7 Student support
  • Quality material and training
  • Certification and job support


  • All our trainers are working professionals from the industry and have at least 10-12 years of relevant experience in various departments.
  • Once you have enrolled in the course, we will provide our contact details along with all information about the course and schedules.
  • An internet speed of 1 Mbps is required to attend the live sessions.
  • You can go through the sample class recordings, because attending a live session is not possible without enrollment.
  • These classes are completely online, live, instructor-led classes. You will have a chat option available to discuss your queries with the trainer during a class.
  • Yes, you will get the recorded videos of the sessions you missed, and you can also attend the missed class in another live session.
  • Yes, you will get the course certification once you have completed the course.
  • After you enroll, we will provide software in which you can practice hands-on.
  • Yes, real-time experience will be provided. By the end of the course, you will work on a live project.
  • We will train you well enough to attend interviews and get placed in a company on the strength of your course knowledge, and we will help you build your resume. Please note that we do not provide job placement.
  • New batches: every week
  • Duration: 150 hrs / week
  • Certification: yes
  • Mode of training: online/classroom
  • Language: English
