Master Certification Program in
Analytics, Machine Learning and AI
16 Data Science Modules, Live Projects and Doubt Sessions, Assignments & Case Studies
India’s Only Data Science Training Program was created to help you build a successful career in data science from scratch. Over the course you will complete 75+ projects and assignments covering statistics, Advanced Excel, SQL, Python libraries, Tableau, and advanced machine learning and deep learning algorithms. You will solve day-to-day industry data problems from the healthcare, manufacturing, sales, media, marketing, and education sectors, preparing you for 30+ job roles.
Course Fee: INR 30k-90k
EMI Options: Up to 18 EMIs
Duration: 3-11 Months
Salary: Rs 5 to 25 LPA
Assignments: 75
Batches: Live-Online
Module 1: Introduction to Data Science
- Introduction to the Industry & Buzzwords
- Industrial application of data science
- Introduction to different Data Science Techniques
- Important Software & Tools
- Career paths & growth in data science
Module 2: Introduction to Excel
- Introduction to Excel: Interface, Sorting & Filtering
- Excel Reporting- Basic & Conditional Formatting
- Essential Excel Formulae
- Layouts, Printing and Securing Files
Module 3: Introduction to Stats
- Introduction to Statistics & Its Applications
- Different types of Data
- Population vs Sample
- Sampling Techniques
- Intro: Inferential vs. descriptive statistics
Module 4: Descriptive Stats Using Excel Datasets
- Categorical Variables Visualization Using Excel Charts- FDT, Pie Charts, Bar Charts & Pareto
- Numerical Variables Visualization of Frequency & Absolute Frequency- Using Histogram, Cross Table & Scatter Plot
- Measures of Central Tendency (Mean, Median, Mode)
- Measures of Spread and Shape (Skewness, Standard Deviation, Variance, Range, Coefficient of Variation)
- Bivariate Analysis: Covariance & Correlation
Module 5: Inferential Stats Using Excel Datasets
- Introduction to Probability
- Permutation & Combinations
- Types of events
- Normal distribution
- Standard Normal distribution
- Normal vs. Standard Normal distribution
- Confidence Intervals & Z-Score
- Hypothesis Testing & Its Types
Module 6: Database Design & MySQL
- Relational Database theory & Introduction to SQL
- MySQL Installation
- Database Creation in the MySQL Workbench
- Querying in MySQL
- Joins and Set Operations
- SQL Practice Case Study
- Window Functions
- Case Statements, Stored Routines and Cursors
- Query Optimisation and Best Practices
- Problem-Solving Using SQL
Module 7: Data Visualization Using Advanced Excel
- Introduction
- LOOKUP functions
- Pivot Tables
- What-If Analysis
- Dashboard Creation
- Recording Macros
- Advanced Visualizations- PIVOT Charts, Sparklines, Waterfall Charts
- Data Analysis ToolPak – Regression in Excel
Module 8: Data Visualization Using Tableau
Introduction to Tableau
- Introduction
- What is Data Analytics?
- Why Data Visualisation?
- What is Tableau?
- Why Tableau?
- Tableau vs Excel and PowerBI
- Exploratory and Explanatory Analysis
- Getting started with Tableau
- Summary
Visualizing and Analyzing data with Tableau – I
- Introduction
- Bar Charts
- Line Charts and Filters
- Area Charts
- Box plots and Pivoting
- Maps and Hierarchies
- Pie Charts
- Treemaps and Grouping
- Dashboards
- Summary
Visualizing and Analyzing Data with Tableau – II
- Introduction
- Joins and Splits
- Numeric and String functions
- Logical and Date functions
- Histograms and parameters
- Scatter Plots
- Dual Axis Charts
- Top N Parameters and Calculated Fields
- Stacked bar Charts
- Dashboards – II and Filter Actions
- Storytelling
- Summary
Module 9: Python Programming
Installing Anaconda & Basics of Python
- Introduction to programming languages
- Compiler vs Interpreter
- Getting Started With Python
- Introduction to Jupyter Notebooks
- Identifiers, Keywords
- Print function
- Comment, Indentation
- Data Types
Data Structures in Python
- Introduction
- Lists
- Tuples
- Sets
- Dictionaries
- Practice Exercise
- Summary
Operators, Input and Output
- Arithmetic, logical, and relational operators
- Input, Output function
- Eval function
- Format Function
Control Flow
- if, elif, and else statements
- for and while loops
- break, continue, and pass statements
- List and dictionary comprehensions
Functions
- Understanding what functions are
- Defining and calling functions
- Local and global variables
- Different types of arguments
- Map, reduce, filter, lambda, and recursive functions
File Handling
- Purpose of file handling
- File-handling functions (open, read, write, close)
- File modes (r, w, a, r+, w+, a+)
- With block
Exception Handling, OOP & Regex
- What is exception handling
- Try, except, else and finally block
- Different types of Exception
- Concepts of OOP
- Different functions in Regex
- Metacharacters in Regex
Module 10: Python For Data Science
NumPy
- Introduction to NumPy
- Basics of NumPy
- Operations Over 1-D Arrays
- Practice Exercise I
- Multidimensional Arrays
- Creating NumPy Arrays
- Mathematical Operations on NumPy
- Mathematical Operations on NumPy II
- Computation Times in NumPy vs Python Lists
- Practice Exercise II
Pandas
- Introduction to Pandas
- Basics of Pandas
- Pandas – Rows and Columns
- Describing Data
- Indexing and Slicing
- Operations on Dataframes
- Groupby and Aggregate Functions
- Merging DataFrames
- Pivot Tables
- Practice Exercise
Module 11: Data Visualization Using Python- Matplotlib & Seaborn
Introduction to Data Visualisation with Matplotlib
- Introduction to Matplotlib
- The Necessity of Data Visualisation
- Visualisations – Some Examples
- Facts and Dimensions
- Bar Graph
- Scatter Plot
- Line Graph and Histogram
- Subplots
- Choosing Plot Types
- Summary
Data Visualisation: Case Study
- Introduction
- Case Study: Mind Map
- Case Study Overview
- Data Handling and Cleaning: I
- Data Handling and Cleaning: II
- Sanity Checks
- Outliers Analysis with Boxplots
- Histograms
- Summary
- Practice Questions
Data Visualization with Seaborn
- Introduction
- Distribution Plots
- Styling Options
- Pie Chart and Bar Chart
- Scatter Plots
- Pair Plots
- Revisiting Bar Graphs and Box Plots
- Heatmaps
- Line Charts
- Stacked Bar Charts
- Case Study Summary
- Plotly
- Practice Questions
Module 12: Exploratory Data Analysis
Data Sourcing
- Module Introduction
- Introduction to EDA
- Public and Private Data
- Private Data
- Public Data
- Web Scraping-I
- Web Scraping-II
- Summary
Data Cleaning
- Introduction
- Data Types
- Fixing the Rows and Columns
- Impute/Remove Missing Values
- Handling Outliers
- Standardising Values
- Fixing Invalid Values and Filter Data
- Practice Questions
- Summary
Univariate Analysis
- Introduction to Univariate Analysis
- Categorical Unordered Univariate Analysis
- Categorical Ordered Univariate Analysis
- Statistics on Numerical Features
- Graded Questions
- Summary
Bivariate and Multivariate Analysis
- Introduction
- Numeric – Numeric Analysis
- Correlation vs Causation
- Numerical – Categorical Analysis
- Categorical – Categorical Analysis
- Multivariate Analysis
- Graded Questions
- Summary
- Module Summary
Module 13: Supervised Learning Model - Regression
Introduction to Simple Linear Regression
- Introduction to Simple Linear Regression
- Introduction to machine learning
- Regression line
- Best fit line
- Strength of simple linear regression
Simple linear regression in python
- Assumptions of simple linear regression
- Reading and understanding the data
- Hypothesis testing in linear regression
- Building a linear model
- Residual analysis and predictions
- Linear Regression using scikit-learn
Multiple Linear Regression
- Motivation: when one variable is not enough
- Moving from SLR to MLR: new considerations
- Multicollinearity
- Dealing with categorical variables
- Model assessment and comparison
- Feature selection
Multiple Linear Regression in Python
- Reading and understanding the data
- Data preparation
- Initial steps
- Building the model I & II
- Residual analysis and predictions
- Variable selection using RFE
Industry Relevance of Linear Regression
- Linear regression revision
- Prediction versus projection
- Media company case study
- Exploratory data analysis
- Model building – I, II & III
- Assessing the model
- Interpreting the results
Module 14: Supervised Learning Model - Classification
Univariate Logistic Regression
- Binary classification
- Sigmoid curve
- Finding the best fit sigmoid curve – I
- Finding the best fit sigmoid curve – II
- Odds and log Odds
Multivariate Logistic Regression – Model Building
- Multivariate Logistic Regression – Model Building
- Multivariate logistic regression with telecom churn example
- Data cleaning and preparation – I & II
- Building your first model
- Feature elimination using RFE
- Confusion matrix and accuracy
- Manual feature elimination
Multivariate Logistic Regression – Model Evaluation
- Multivariate Logistic Regression – Model Evaluation
- Metrics beyond accuracy: sensitivity and specificity
- Sensitivity and specificity in Python
- Understanding ROC curve
- ROC curve in python
- Finding the optimal threshold
- Model evaluation metrics – exercise
- Precision and recall
- Making predictions
Logistic Regression – Industry Applications – Part I
- Getting familiar with logistic regression
- Nuances of logistic regression: sample selection
- Nuances of logistic regression: segmentation
- Nuances of logistic regression: variable transformation – I, II & III
Logistic Regression – Industry Applications – Part II
- Commonly faced challenges in implementing logistic regression
- Model evaluation – A second look
- Model validation and importance of stability
- Tracking of model performance over time
Module 15: Advanced Machine Learning
Unsupervised Learning: Clustering
- Introduction to Clustering
- K Means Clustering
- Executing K Means in Python
- Hierarchical Clustering
Business Problem Solving
- Introduction to Business Problem Solving
- Case Study Demonstration: Churn Example
- Practice Questions
Tree Models
- Introduction to Decision Trees
- Algorithms for Decision Tree Construction
- Hyperparameter Tuning in Decision Trees
- Ensembles and Random Forests
Model Selection
- Principles of Model Selection
- Model Building and Evaluation
Time Series Forecasting – I (BA)
- Introduction to Time Series
- Smoothing Techniques
Time Series Forecasting – II (BA)
- Introduction to AR Models
- Building AR Models
Module 16: AI- NLP, Neural Networks & Deep Learning
Introduction to NLP
- What is NLP?
- History and evolution of NLP
- Applications of NLP
- Challenges in NLP
- Overview of NLP pipeline
- Corpus and Corpus Linguistics
NLTK Toolkit
- Introduction to the NLTK toolkit
- Preprocessing text data with NLTK
- Basic NLP tasks using NLTK (e.g., Part-of-Speech Tagging, Named Entity Recognition)
- Stemming and Lemmatization
- WordNet in NLTK
- Chunking and Chinking
- Sentiment Analysis with NLTK
Tokenization and Topic Modeling
- Tokenization in NLP
- Bag-of-Words representation
- Topic Modeling with LDA
- Latent Semantic Analysis
- Word Embeddings
Sentiment Analysis Project
- Introduction to Sentiment Analysis
- Sentiment Analysis using supervised and unsupervised methods
- Building a Sentiment Analysis model with Python
- Evaluating Sentiment Analysis models
AI vs Deep Learning vs ML
- Introduction to Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL)
- Applications of AI, ML, and DL
- Differences between AI, ML and DL
The Concept of Neural Networks
- Introduction to Neural Networks
- Types of Neural Networks
- Layers in Neural Networks
- Activation Functions
Neural Networks – Feed-forward, Convolutional, Recurrent
- Feed-forward Neural Networks
- Convolutional Neural Networks
- Recurrent Neural Networks
- Applications of Neural Networks
Deep Learning Project
- Building a Deep Learning model with Python
- Image Classification with Convolutional Neural Networks
- Natural Language Processing with Recurrent Neural Networks
- Hyperparameter Tuning in Deep Learning
Who Can Attend?
The program is open to IT and business professionals, statisticians and mathematicians, graduates and post-graduates, and professionals looking to switch careers. With millions of job openings worldwide, data scientist has become the hottest job of the decade.
1. Python:
2. Data Visualization Tools:
3. Database Management Systems:
4. Version Control Tools:
5. Text Editors/IDEs:
- Healthcare Customer Feedback Analysis
- Management Teams Dashboard Creation
- Retail Store Sales Report Analysis
- Software Firm Employee Data Analysis
- Industrial Data Sets Classification & Comparison
- Charts & Graphs: Frequency Distribution Table, Pie-charts, Pareto Diagram, Histogram, Scatter Plots, Heatmaps, Bar Graphs & Many More
- Patient Disease Probability Analysis Using Healthcare Data
- Car Model & Menu Item Data Combination & Configuration Probability Analysis
- Manufacturing & Product Launch Data Classification & Analysis
- Customer Complaint Resolution Analysis Using Normal Distribution Curves
- Product Rating & Employee Productivity Analysis Using Z-Score
- New Product Need Analysis Using Hypothesis Testing
- Inventory Management & Customer Segmentation Systems Using VLOOKUP & HLOOKUP
- Sales Trend & Staffing Plan Creation using Pivot Tables
- Pricing Strategy & Financial Model Creation Using What if Analysis
- Sales & Operations Dashboard Creation
- Healthcare & Construction Reporting Automation Using Macros
- Retail Sales Opportunity Analysis Using PIVOT Charts
- Accounting Firm Statement Analysis Using Sparklines & Waterfall Chart
- FMCG Marketing Spend to Sales Revenue Impact Analysis Using Regression Analysis
- Transportation Pricing Model Using Regression Analysis
1. Customer Lifetime Value Calculation:
The project involves calculating the customer lifetime value using SQL to understand the revenue generated by a customer over their lifetime.
2. Customer Churn Prediction:
This project involves building a predictive model using SQL to identify customers who are likely to churn based on their behavior and transaction history.
3. Interactive Dashboard for E-Commerce Sales:
The project involves creating an interactive dashboard using Tableau & SQL to analyze retail sales data, identify trends, and make data-driven decisions.
4. Customer Segmentation Dashboard:
This project involves creating a customer segmentation dashboard using Tableau to identify customer groups based on demographics, behavior, and purchasing patterns.
5. Movie Recommendation System:
The project involves building a movie recommendation system using Python and its libraries such as Pandas, NumPy, and Scikit-Learn. The recommendation system will suggest movies based on user preferences and ratings.
6. Sentiment Analysis on Twitter Data:
This project involves analyzing Twitter data using Python and its libraries such as NLTK and TextBlob to perform sentiment analysis and understand the overall sentiment of a particular topic.
7. Visualizing COVID-19 Data:
The project involves visualizing COVID-19 data using Python and its libraries such as Matplotlib, Seaborn, and Plotly to understand the impact of the pandemic on different countries and regions.
8. Visualizing Stock Market Data:
This project involves visualizing stock market data using Python and its libraries such as Pandas, Matplotlib, and Bokeh to understand the trends and patterns in stock prices over time.
9. Airbnb Data Analysis:
The project involves performing exploratory data analysis on Airbnb data to understand the patterns in the pricing, availability, and quality of Airbnb listings in different cities.
10. Bike Sharing Data Analysis:
This project involves performing exploratory data analysis on bike sharing data to understand the usage patterns of bikes in different cities and identify factors that influence bike usage.
11. House Price Prediction:
The project involves building a regression model using Python and its libraries such as Scikit-Learn to predict the prices of houses based on their features such as location, size, and amenities.
12. Credit Risk Prediction:
This project involves building a classification model using Python and its libraries such as Scikit-Learn to predict the credit risk of loan applicants based on their credit history and other factors.
13. Time Series Forecasting for Sales Data:
The project involves building a time series forecasting model using advanced machine learning algorithms such as ARIMA and LSTM to predict future sales trends and identify factors that influence sales.
14. Sentiment Analysis on Product Reviews:
The project involves building a sentiment analysis model using NLP techniques such as Word Embeddings and Recurrent Neural Networks (RNN) to analyze product reviews and understand the sentiment of customers towards different products.
15. Image Segmentation using Deep Learning:
This project involves using advanced deep learning techniques such as Fully Convolutional Networks (FCN) and U-Net to perform image segmentation and identify objects in images.
16. Machine Translation using Transformers:
This project involves building a machine translation model using advanced deep learning techniques such as Transformers to translate text from one language to another.
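As a rough illustration of the first project in the list above (Customer Lifetime Value Calculation), the sketch below computes a simple historical lifetime value per customer with a single SQL aggregate query. It is not the program's own solution: it runs against an in-memory SQLite database through Python's standard `sqlite3` module rather than MySQL, and the `orders` table, its columns, the `customer_lifetime_value` helper, and the sample rows are all hypothetical. Lifetime value is taken here as the total revenue a customer has generated so far, which is only one of several common definitions.

```python
import sqlite3


def customer_lifetime_value(rows):
    """Return (customer_id, order_count, total_revenue) tuples, highest revenue first.

    `rows` is an iterable of (customer_id, order_date, amount) tuples.
    The table layout is an assumption made for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE orders (customer_id INTEGER, order_date TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
        # One row per customer: number of orders and summed revenue.
        query = """
            SELECT customer_id,
                   COUNT(*)    AS order_count,
                   SUM(amount) AS lifetime_value
            FROM orders
            GROUP BY customer_id
            ORDER BY lifetime_value DESC, customer_id
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# Hypothetical sample data, invented for illustration only.
sample_orders = [
    (1, "2023-01-05", 120.0),
    (1, "2023-03-17", 80.0),
    (2, "2023-02-11", 50.0),
    (3, "2023-02-20", 200.0),
    (3, "2023-04-02", 25.0),
    (3, "2023-06-30", 75.0),
]

for customer_id, order_count, lifetime_value in customer_lifetime_value(sample_orders):
    print(customer_id, order_count, lifetime_value)
```

The same `GROUP BY` query should carry over to MySQL with little or no change. A forward-looking lifetime value would additionally need assumptions about retention and margin that this sketch deliberately leaves out.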