
Email: contact@psntrainings.com
India: WhatsApp No. +91 86860 29160  USA: +1 228 586 3560

Data Science Training in Hyderabad



PSN Trainings is the best Data Science online training institute in Hyderabad, India.

Data Science Course Description

Data Scientist Content: Basic Concepts of Statistics


    1. Descriptive Statistics and Probability Distributions:
  • Introduction to Statistics
  • Different Types of Variables
  • Measures of Central Tendency with examples
         Mean
         Mode
         Median
  • Measures of Dispersion
         Range
         Variance
         Standard Deviation
  • Probability & Distributions
  • Probability Basics
  • Binomial distribution and its properties
  • Poisson distribution and its properties
  • Normal distribution and its properties
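To illustrate the measures of central tendency and dispersion listed above, here is a minimal sketch in Python. The choice of Python and the sample data are assumptions made for this example only; the syllabus does not name a language for this module.

```python
import statistics

# Hypothetical sample (made-up data, for illustration only)
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency
mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # middle value of the sorted data
mode = statistics.mode(data)      # most frequent value

# Measures of dispersion
value_range = max(data) - min(data)     # distance between the extremes
variance = statistics.pvariance(data)   # population variance
std_dev = statistics.pstdev(data)       # population standard deviation

print("mean:", mean, "median:", median, "mode:", mode)
print("range:", value_range, "variance:", variance, "std dev:", std_dev)
```

Here the population formulas (`pvariance`, `pstdev`) are used; for a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.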

    2. Inferential Statistics and Testing of Hypothesis
  • Sampling methods
         Sampling and types of sampling
         Definitions of Sample and Population
         Importance of sampling in real time
         Different methods of sampling
         Simple Random Sampling with replacement and without replacement
         Stratified Random Sampling
  • Different methods of estimation
  • Testing of Hypothesis & Tests
         Null Hypothesis and Alternate Hypothesis
         Level of Significance and P value
         t-test and its properties
         Chi-square test and its properties
         Z test
  • Analysis of Variance
         F-test
         One and Two way ANOVA
    3. Covariance & Correlation
  • Importance and Properties of Correlation
  • Types of Correlation with examples
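As a rough illustration of covariance and correlation, the sketch below computes the sample covariance and the Pearson correlation coefficient by hand. The function names and the data are hypothetical, chosen only for this example, and Python is an assumed language.

```python
import math

def covariance(x, y):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def pearson_correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

# Made-up data: y_pos rises with x, y_neg falls as x rises
x = [1, 2, 3, 4, 5]
y_pos = [2, 4, 6, 8, 10]
y_neg = [10, 8, 6, 4, 2]

r_pos = pearson_correlation(x, y_pos)  # perfect positive correlation
r_neg = pearson_correlation(x, y_neg)  # perfect negative correlation
print(r_pos, r_neg)
```

Covariance depends on the units of the variables, while the correlation coefficient always lies between -1 and +1, which makes it easier to compare across datasets.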
    Predictive Modeling Steps and Methodology with Live Example:
  • Data Preparation
         Variable Selection
         Transformation of the variables
         Normalization of the variables
  • Exploratory Data Analysis
         Summary Statistics
         Understanding the patterns of the data at single and multiple dimensions
         Missing data treatment using different methods
         Outlier identification and treatment
         Visualization of the data using one-dimensional, two-dimensional and multi-dimensional graphs: bar chart, histogram, box plot, scatter plot, bubble chart, word cloud, etc.
  • Model Development
         Selection of the sample data
         Selecting the appropriate model based on the requirement and data availability
  • Model Validation
         Model Implementation
         Key statistical parameters checking
         Validating the model results with the actual results
  • Model Implementation
         Implementing the model for future prediction
  • Real-time telecom business use case with detailed explanation
  • Introduction to a couple of real-time use cases and solutions from the Banking and Retail domains using different statistical methods

    Supervised Techniques:
  • Multiple Linear Regression
         Linear Regression - Introduction - Applications
         Assumptions of Linear Regression
         Building Linear Regression Model
         Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
         Validation of Linear Regression Models (Re-running vs. Scoring)
         Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers, etc.)
         Interpretation of Results - Business Validation - Implementation on new data
         Real-time case study from the Manufacturing and Telecom industries to estimate future revenue using the models
  • Logistic Regression
         Logistic Regression - Introduction - Applications
         Linear Regression vs. Logistic Regression vs. Generalized Linear Models
         Building Logistic Regression Model
         Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow test, Gini, KS, Misclassification, etc.)
         Validation of Logistic Regression Models (Re-running vs. Scoring)
         Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, drivers, etc.)
         Interpretation of Results - Business Validation - Implementation on new data
         Real-time case study to predict customer churn in the Banking and Retail industries
  • Partial Least Squares Regression
         Partial Least Squares Regression - Introduction - Applications
         Difference between Linear Regression and Partial Least Squares Regression
         Building PLS Model
         Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
         Interpretation of Results - Business Validation - Implementation on new data
         Real-time example to identify the key factors driving revenue
    Variable Reduction Techniques
  • Factor Analysis
  • Principal Component Analysis
         Assumptions of PCA
         Working Mechanism of PCA
         Types of Rotations
         Standardization
         Positives and Negatives of PCA
    Supervised Techniques Classification:
  • CHAID
  • CART
  • Difference between CHAID and CART
  • Random Forest
         Decision Tree vs. Random Forest
         Data Preparation
         Missing data imputation
         Outlier detection
         Handling imbalanced data
         Random Record selection
         Random Forest R parameters
         Random Variable selection
         Optimal number of variables selection
         Calculating Out-of-Bag (OOB) error rate
         Calculating Out-of-Bag predictions
  • A couple of real-time use cases from the Telecom and Retail industries: identification of churn

    Unsupervised Techniques:
  • Segmentation for Marketing Analysis
         Need for segmentation
         Criterion of segmentation
         Types of distances
         Clustering algorithms
         Hierarchical clustering
         K-means clustering
         Deciding number of clusters
         Case study
  • Business Rules Criteria
  • Real-time use case to identify the most valuable revenue-generating customers
    Time Series Analysis:
  • Forecasting - Introduction - Applications
  • Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
  • Basic Techniques
         Averages
         Smoothing, etc.
  • Advanced Techniques
         AR Models
         ARIMA
         UCM
         Hybrid Model
  • Understanding Forecasting Accuracy - MAPE, MAD, MSE, etc.
  • A couple of use cases to forecast future product sales
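The forecasting accuracy measures named above can be sketched in a few lines of Python. This is an illustrative sketch, not course material: the function names and the sales figures are made up, and MAD is taken here to mean the mean absolute deviation of the forecast errors (also called MAE).

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

def mad(actual, forecast):
    """Mean Absolute Deviation of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean Squared Error: penalizes large errors more heavily than MAD."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical monthly sales and the corresponding forecasts
actual = [100, 200, 400]
forecast = [110, 190, 400]

print("MAPE (%):", mape(actual, forecast))
print("MAD:", mad(actual, forecast))
print("MSE:", mse(actual, forecast))
```

MAPE is scale-free and easy to communicate, but it is undefined when an actual value is zero; MAD and MSE are expressed in the units (or squared units) of the series.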

    Text Analytics:
  • Gathering text data from web and other sources
  • Processing raw web data
  • Collecting Twitter data with the Twitter API
  • Naive Bayes Algorithm
         Assumptions of Naive Bayes
         Processing of Text data
         Handling Standard and Text data
         Building Naive Bayes Model
         Understanding standard model metrics
         Validation of the Models (Re-running vs. Scoring)
  • Sentiment Analysis
         Goal Setting
         Text Preprocessing
         Parsing the content
         Text refinement
         Analysis and Scoring
  • Healthcare industry use case: identifying patient sentiment about a specified hospital by extracting data from Twitter
    Visualization Using Tableau:
  • Live connectivity from R to Tableau
  • Generating the Reports and Charts
We also offer DevOps online training.

