Data Science and Machine Learning using Python
Course duration:
75 hours, 3 months (Live online/classroom training + Projects + Assignments/case studies + Interview preparation)
Training Mode:
Online & Classroom
Course Fees:
- Group training – INR 25000 | USD 650 (Other countries)
- Individual training – INR 35000 | USD 950 (Other countries)
Pre-requisites for Data Science and Machine Learning course:
To attend this course, candidates must have a good understanding of basic and advanced Python, NumPy, Pandas, Matplotlib and basic statistics.
Certification:
At the end of our course, you will work on various projects and assignments. Once you complete your assigned projects with the expected results, we will issue a Data Science and Machine Learning using Python certificate.
Syllabus - program of study:
- Introduction to Data Analytics and Statistical Techniques
- Types of Variables, Measures of Central Tendency and Dispersion
- Variable Distributions and Probability Distributions
- Normal Distribution and Properties
- Central Limit Theorem and Application
- Parametric vs. Non-Parametric methods
- Null Hypothesis
- Alternative Hypothesis
- P Value Interpretation
- Z Test
- T test
- One Sample t test
- Paired Sample t test
- Two sample (Independent) t test
- Analysis of Variance (ANOVA)
- Chi Square Test
- Correlation Analysis
- What is predictive modelling
- Importance of predictive modelling
- Types of business problems- mapping of techniques
- Different phases of predictive modelling
- Machine Learning Languages, Types, and Examples
- Applications of Machine Learning
- Machine Learning vs. Statistical Modelling
- Supervised vs. Unsupervised Learning
- Concept of Overfitting and Underfitting (Bias-Variance Trade-off)
- Types of Cross-validation (Train & Test, Bootstrapping, K-Fold validation, etc.)
- Python libraries suitable for Machine Learning
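As a small illustration of the K-Fold validation topic listed above, here is a minimal sketch in plain Python. The helper names `k_fold_indices` and `k_fold_splits` are hypothetical and chosen only for this example; in practice a library routine such as scikit-learn's `KFold` would normally be used, and it typically adds shuffling, which this sketch omits.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # the first `extra` folds get one more item
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    folds = k_fold_indices(n_samples, k)
    for i, test_idx in enumerate(folds):
        # Everything outside the held-out fold is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_splits(n_samples=10, k=3))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Each observation lands in the test fold exactly once, so every row is used for both training and validation across the k rounds.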
Unsupervised Machine Learning:
- Dealing with Duplicates
- Outlier treatment
- Missing values
- Dummy creation
- Variable Reduction
- Introduction to variable reduction techniques
- Introduction to Factor Analysis
- Introduction to PCA Analysis
- Scree plot
- Eigenvalue, Eigenvector
- Factor Rotation and Extraction
- Result Interpretation
- Segmentation
- Introduction to Segmentation
- Types of Segmentation (Subjective vs. Objective, Heuristic vs. Statistical)
- Heuristic Segmentation Techniques
- Value Based
- RFM Segmentation
- Life Stage Segmentation
- Behavioral Segmentation Techniques (K-Means Cluster Analysis)
- Introduction to Cluster Techniques
- Cluster evaluation and profiling
- Interpretation of results - Implementation on new data.
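The K-Means cluster analysis topic in this section can be sketched roughly as follows. This is a minimal plain-Python version of Lloyd's algorithm on made-up 2-D points, with the starting centroids supplied by hand; the function name `k_means` and the sample data are assumptions for illustration only. A course project would more likely rely on a library implementation such as scikit-learn's `KMeans`.

```python
def k_means(points, centroids, n_iter=10):
    """Plain Lloyd's algorithm on 2-D points with given starting centroids."""
    labels = []
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(range(len(centroids)),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                              + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append((sum(m[0] for m in members) / len(members),
                                      sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(centroids)
```

The returned centroids are the cluster means, which is the starting point for the cluster profiling step listed above.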
Supervised Machine Learning:
- Introduction to Linear Regression
- Applications and Assumptions of Linear Regression
- Create training and test samples
- Building Linear Regression Model
- Understanding standard metrics
- Variable significance
- R-square
- Adjusted R-square
- Global hypothesis etc.
- Validation of Models: Training-Validation approach
- Standard Business Outputs
- Decile Analysis
- Error distribution (histogram)
- Model equation
- Drivers etc.
- Interpretation of Results - Business Validation - Implementation on new data
- Interpretation of model parameters
- Introduction to Logistic Regression
- Applications and Assumptions of Logistic Regression
- Create training and test samples
- Building Logistic Regression Model
- Understanding standard model metrics
- Concordance
- Hosmer-Lemeshow Test
- Gini, KS, Somers' D
- Misclassifications, etc.
- Confusion Matrix
- Validation of Logistic Regression Models
- Standard Business Outputs (ROC Curve, AUC, Decile Analysis, etc.)
- Interpretation of Results - Business Validation
- Implementation of Logistic Regression on new data
- What are Decision Trees?
- Applications of Decision Trees
- Types of Decision Tree Algorithms
- Decision Trees Construction
- Entropy
- Information Gain
- Gini Index
- Chi Square
- Regression Trees
- Pruning a Decision Tree
- Decision Trees – Validation
- Overfitting: best practices to avoid it
- What does Ensemble mean?
- Methods of Ensemble
- Bagging
- Random forest
- Boosting
- AdaBoost
- Gradient Boosting Machines (GBM)
- XGBoost
- About Support Vector Machine
- Applications of SVM
- Support vector classifier
- Support Vector Regression
- Mathematical Intuition
- Outputs Interpretation
- Fine-tuning of models with hyperparameters
- Validating SVM models
- Bayes Theorem and Its Applications
- Naïve Bayes for classification
- Applications of Naïve Bayes in Classifications
- Introduction to KNN
- Application of KNN
- KNN for Regression
- KNN for classification
- Treatment of missing values using KNN
- Validating KNN model
- Model fine-tuning with hyperparameters
- Introduction to Neural Networks
- Applications of ANN
- Single and Multi Layered Neural Network
- Backpropagation and Conjugate Gradient Techniques
- Neural Networks for Regression
- Neural Networks for Classification
- Model fine-tuning with hyperparameters
- Validating ANN models
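Several of the classification metrics named in this section (Confusion Matrix, Concordance, AUC, Gini) can be computed by hand from true labels and predicted probabilities. The following is a minimal sketch in plain Python; the labels and scores are made up, the 0.5 cut-off is an assumption, and the function names are hypothetical. Here concordance is computed with ties counted as half, which makes it equal to the AUC, and Gini is taken as 2 * AUC - 1. Library functions such as scikit-learn's `confusion_matrix` and `roc_auc_score` would normally be used instead.

```python
def confusion_matrix(y_true, scores, threshold=0.5):
    """Return (tn, fp, fn, tp) for binary labels and predicted probabilities."""
    tn = fp = fn = tp = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def concordance(y_true, scores):
    """Share of (event, non-event) pairs where the event scores higher; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model output: true classes and predicted probabilities.
y_true = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

tn, fp, fn, tp = confusion_matrix(y_true, scores)
auc = concordance(y_true, scores)
gini = 2 * auc - 1
print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("AUC:", round(auc, 4), "Gini:", round(gini, 4))
```

The confusion matrix depends on the chosen threshold, while concordance and Gini summarise ranking quality across all thresholds.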
a. Introduction and Exponential Smoothing
- Introduction to Time Series Data and Analysis
- Decomposition of Time Series
- Trend and Seasonality detection and forecasting
- Exponential Smoothing (Single, double and triple)
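Single exponential smoothing, the first of the three variants listed above, can be sketched in a few lines of plain Python. This assumes the common recursion s_t = alpha * x_t + (1 - alpha) * s_{t-1} with the first smoothed value initialised to the first observation; other initialisations exist. The function name and the sample series are hypothetical, and a library such as statsmodels (`SimpleExpSmoothing`) would usually be preferred for real work.

```python
def simple_exponential_smoothing(series, alpha):
    """Return smoothed values s_t = alpha * x_t + (1 - alpha) * s_{t-1}, with s_0 = x_0."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical short series, e.g. four periods of sales.
sales = [10.0, 12.0, 14.0, 13.0]
smoothed = simple_exponential_smoothing(sales, alpha=0.5)
forecast = smoothed[-1]  # the one-step-ahead forecast is the last smoothed level
print(smoothed, forecast)
```

A larger alpha makes the smoothed series react faster to recent observations; a smaller alpha gives more weight to history.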
b. ARIMA Modeling
- Box-Jenkins Methodology
- Auto Regression and Moving Averages, ACF, PACF
- Understanding Forecasting Accuracy - MAPE, MAD, MSE, etc.
- Seasonal ARIMA Models (P,D,Q)
- Introduction to Multivariate ARIMA
- Understanding Unstructured vs. Semi-structured Data
- Text Mining
- Natural Language Processing (NLP)
- Sentiment Analysis
- Case studies
- Assignments
- Projects with industry data