Data Science and Machine Learning using Python

Course duration:

75 hours over 3 months (live online/classroom training + projects + assignments/case studies + interview preparation)

Training Mode:

Online & Classroom

Course Fees:
  • Group training – INR 25000 | USD 650 (Other countries)
  • Individual training – INR 35000 | USD 950 (Other countries)
Pre-requisites for Data Science and Machine Learning course:

To attend this course, candidates must have a good understanding of basic and advanced Python, NumPy, Pandas, Matplotlib, and basic statistics.

Certification:

At the end of the course, you will work on various projects and assignments. Once you have completed your assigned projects with the expected results, we will issue a Data Science and Machine Learning using Python certificate.

Syllabus - program of study:

  • Introduction to Data Analytics and Statistical Techniques
  • Types of Variables, Measures of Central Tendency and Dispersion
  • Variable Distributions and Probability Distributions
  • Normal Distribution and Properties
  • Central Limit Theorem and Application
  • Parametric vs. Non-Parametric Methods
  • Null Hypothesis
  • Alternative Hypothesis
  • P Value Interpretation
  • Z Test
  • T test
  • One Sample t test
  • Paired Sample t test
  • Two sample (Independent) t test
  • Analysis of Variance (ANOVA)
  • Chi Square Test
  • Correlation Analysis
  • What is predictive modelling
  • Importance of predictive modelling
  • Types of business problems- mapping of techniques
  • Different phases of predictive modelling

 

  • Machine Learning Languages, Types, and Examples
  • Applications of Machine Learning
  • Machine Learning vs. Statistical Modelling
  • Supervised vs. Unsupervised Learning
  • Concept of Overfitting and Underfitting (Bias-Variance Trade-off)
  • Types of Cross-Validation (Train & Test, Bootstrapping, K-Fold Validation, etc.)

  • Python libraries suitable for Machine Learning

 

Unsupervised Machine Learning:

  • Dealing with Duplicates
  • Outlier treatment
  • Missing values
  • Dummy creation
  • Variable Reduction
  • Introduction to variable reduction techniques
  • Introduction to Factor Analysis
  • Introduction to PCA Analysis
  • Scree plot
  • Eigenvalue, Eigenvector
  • Factor Rotation and Extraction
  • Result Interpretation
  • Segmentation
  • Introduction to Segmentation
  • Types of Segmentation (Subjective vs. Objective, Heuristic vs. Statistical)
  • Heuristic Segmentation Techniques
  • Value Based
  • RFM Segmentation
  • Life Stage Segmentation
  • Behavioral Segmentation Techniques (K-Means Cluster Analysis)
  • Introduction to Cluster Techniques
  • Cluster evaluation and profiling
  • Interpretation of results - Implementation on new data.

Supervised Machine Learning:

  • Introduction to Linear Regression
  • Applications and Assumptions of Linear Regression
  • Create training and test samples
  • Building Linear Regression Model
  • Understanding standard metrics
  • Variable significance
  • R-square
  • Adjusted R-square
  • Global hypothesis, etc.
  • Validation of Models: Training-Validation approach
  • Standard Business Outputs
  • Decile Analysis
  • Error distribution (histogram)
  • Model equation
  • Drivers, etc.
  • Interpretation of Results - Business Validation - Implementation on new data
  • Interpretation of model parameters
  • Introduction to Logistic Regression
  • Applications and Assumptions of Logistic Regression
  • Create training and test samples
  • Building Logistic Regression Model
  • Understanding standard model metrics
  • Concordance
  • Hosmer-Lemeshow Test
  • Gini, KS, Somers' D
  • Misclassifications, etc.
  • Confusion Matrix
  • Validation of Logistic Regression Models
  • Standard Business Outputs (ROC Curve, AUC, Decile Analysis, etc.)
  • Interpretation of Results - Business Validation
  • Implementation of Logistic Regression on new data
  • What are Decision Trees?
  • Applications of Decision Trees
  • Types of Decision Tree Algorithms
  • Decision Trees Construction
  • Entropy
  • Information Gain
  • Gini Index
  • Chi Square
  • Regression Trees
  • Pruning a Decision Tree
  • Decision Trees – Validation
  • Overfitting - Best practices to avoid it
  • What is Ensemble Learning?
  • Methods of Ensemble
  • Bagging
  • Random forest
  • Boosting
  • AdaBoost
  • Gradient Boosting Machines (GBM)
  • XGBoost
  • About Support Vector Machine
  • Applications of SVM
  • Support vector classifier
  • Support Vector Regression
  • Mathematical Intuition
  • Outputs Interpretation
  • Fine-tuning of models with hyperparameters
  • Validating SVM models
  • Bayes Theorem and Its Applications
  • Naïve Bayes for classification
  • Applications of Naïve Bayes in Classifications
  • Introduction to KNN
  • Application of KNN
  • KNN for Regression
  • KNN for classification
  • Treatment of missing values using KNN
  • Validating KNN model
  • Model fine-tuning with hyperparameters
  • Introduction to Neural Networks
  • Applications of ANN
  • Single- and Multi-Layer Neural Networks
  • Backpropagation and Conjugate Gradient Techniques
  • Neural Networks for Regression
  • Neural Networks for Classification
  • Model fine-tuning with hyperparameters
  • Validating ANN models
a. Introduction and Exponential Smoothing
  • Introduction to Time Series Data and Analysis
  • Decomposition of Time Series
  • Trend and Seasonality detection and forecasting
  • Exponential Smoothing (Single, double and triple)
b. ARIMA Modeling
  • Box - Jenkins Methodology
  • Auto Regression and Moving Averages, ACF, PACF
  • Understanding Forecasting Accuracy - MAPE, MAD, MSE, etc.
  • Seasonal ARIMA Models (P,D,Q)
  • Introduction to Multivariate ARIMA
  • Understanding Unstructured vs. Semi-structured Data
  • Text Mining
  • Natural Language Processing (NLP)
  • Sentiment Analysis
  • Case studies
  • Assignments
  • Projects with industry data