Section 1
##### Introduction to Analytics

Section 2
##### Understanding Probability and Probability Distribution

1. Introduction to Probability theory
2. Types of probability distribution: Discrete Distribution and Continuous Distribution
3. Understanding Probability Mass Function and Probability Density Function
4. Normal Distribution and Standard Normal Distribution
5. Understanding Binomial Distribution and Poisson Distribution
6. Application on Binomial Distribution
7. Application on Normal Distribution

Section 3
##### Introduction to Sampling Theory and Estimation

1. Concept of Population and Sample
2. Introduction to some important terminology
3. Parameter and Statistic
4. Properties of a good estimator
5. Standard Deviation and Standard Error
6. Point and Interval Estimation
7. Confidence level and level of Significance
8. Constructing Confidence Intervals
9. Formulation of Null and Alternative hypotheses and performing a simple test of Hypothesis

Section 4
##### Introduction to Segmentation Techniques: Factor Analysis

1. Introduction to Factor Analysis and various techniques
2. Principal Component Analysis (PCA) and Exploratory Factor Analysis (EFA)
3. KMO MSA test, Bartlett’s Test of Sphericity
4. The Mineigen Criterion, Scree plot
5. Introduction to Factor Loading Matrix and various rotation techniques like Varimax
6. Application of the technique on a case study
7. Interpretation of the result

Section 5
##### Introduction to Segmentation Techniques: Cluster Analysis

1. Introduction to Cluster Analysis and various techniques
2. Hierarchical and Non-Hierarchical Clustering techniques
3. Using Hierarchical Clustering in R
4. Performing K-means Clustering in R
5. Divisive Clustering, Agglomerative Clustering
6. Application of Cluster Analysis in Analytics with examples, including profiling and interpretation of the clusters
7. Application of the techniques on a case study
8. Interpretation of the result

Section 6
##### Correlation and Linear Regression

1. Introduction to Pearson’s Correlation coefficient
2. Correlation and Causation; fitting a simple linear regression model
3. Introduction to CLRM
4. Assumptions of CLRM
5. Understanding the MLRM technique
6. Understanding the statistics related to linear regression
7. Goodness of fit test for linear regression
8. Importing a dataset in R to apply linear regression
9. Splitting the dataset: training and testing
10. Conducting several tests to understand the results obtained
11. Checking for the accuracy of the linear regression model
12. Assessing Collinearity, Heteroskedasticity and Autocorrelation

Section 7
##### Introduction to categorical data analysis and Logistic Regression

1. Comparison between Linear Regression and Logistic Regression
2. Performing Goodness of fit test of the model
3. Introduction to Percent Concordant, AIC, SC, and Hosmer-Lemeshow
4. Receiver Operating Characteristics (ROC) Curve and Area under Curve (AUC)
5. Interpretation of the model: overall fit of the model and finding out the influential variables using Odds ratio criteria
6. Understanding the ROC testing
7. Checking for the accuracy of the model
8. Application and interpretation using case study

Section 8
##### Introduction to Time Series Analysis

1. What is Time Series Analysis; Objectives and Assumptions of Time Series
2. Identifying patterns in Time series data: Decomposition of the time series data
3. Introduction to Various Smoothing techniques: Simple Moving Average, Weighted Moving Average
4. Exponential Smoothing and Holt’s Linear Exponential Smoothing; examples of Seasonality and detecting Seasonality in Time series data
5. Autoregressive and Moving Average models and Introduction to Box Jenkins Methodology
6. Introduction to Autoregressive Moving Average (ARMA) model and Autoregressive Integrated Moving Average (ARIMA) model
7. Building an ARIMA Model
8. Detection of Stationarity, Seasonality in ARIMA Model
9. Detecting the order of AR and MA of ARIMA model
10. Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF)
11. Detecting the order by using AIC and BIC criteria
12. Estimation and forecast using R

Section 9
##### Text Mining

1. Introduction to text mining
2. Importance of applying this technique
3. Packages required in R to do text mining
4. Understanding WordCloud methodology
5. Performing text mining analysis using a dataset
6. Understanding the Sentiment Analysis
7. Application of the technique on a dataset
8. Interpretation of the result

Section 10
##### Market Basket Analysis

Section 11
##### Statistical Significance T Test Chi Square Tests and Analysis of Variance

1. Performing test of one sample mean
2. Difference between two group means (independent sample)
3. Difference between two group means (paired sample)
4. Performing Chi-square tests: Test of Independence
5. Descriptive statistics and inferential statistics
6. T-tests and their application on case studies
7. ANOVA testing and its application on case studies
8. Interpretation of the test results
9. Chi-square test of independence
10. Test for correlation and partial-correlation test
11. Performing post-hoc multiple comparisons tests in R using Tukey HSD
12. Performing two-way ANOVA with and without interactions

Two distributions may have the same mean and variance and still differ widely in their overall appearance. One such difference is skewness, the ‘lack of symmetry’; another is kurtosis, the degree of ‘peakedness’. So, in addition to location and spread, we try to characterize distributions according to their shape.

**Skewness**

A distribution is known as a skewed distribution if it is asymmetrical. According to Simpson and Kafka, “Skewness or asymmetry is the attribute of a frequency distribution that extends further on one side of the class with the highest frequency than on the other”. Skewness indicates the nature and extent of the concentration of the observations towards the higher or the lower values of the variable.

A distribution is said to be skewed if:

- The frequency curve of the distribution is not a symmetric bell-shaped curve but it is stretched more to one side than to the other. In other words, it has a longer tail to one side (left or right) than to the other. A frequency distribution which has a longer tail towards the right is said to be positively skewed and if the longer tail lies to the left, it is said to be negatively skewed.
- The mean, median and mode fall at different points, i.e. they do not coincide.
- Quartiles Q1 and Q3 are not equidistant from the median.
- The sum of the positive deviations about the median is not equal to the sum of the negative deviations from the median.
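
The checks listed above can be tried on a small sample. The following is a minimal sketch, assuming a made-up data set with a longer right tail and Python's standard `statistics` module; the quartiles use that module's default ("exclusive") method, and other quartile conventions would give slightly different values.

```python
import statistics

# Hypothetical sample with a longer tail to the right (positively skewed).
data = [1, 2, 2, 2, 3, 3, 4, 5, 7, 11]

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)

# Quartiles; statistics.quantiles uses the "exclusive" method by default.
q1, _, q3 = statistics.quantiles(data, n=4)

# Distances of Q1 and Q3 from the median.
lower_gap = median - q1
upper_gap = q3 - median

# Sums of the positive and negative deviations about the median.
pos_dev = sum(x - median for x in data if x > median)
neg_dev = sum(median - x for x in data if x < median)

print("mean, median, mode:", mean, median, mode)
print("Md - Q1, Q3 - Md:", lower_gap, upper_gap)
print("positive vs negative deviations:", pos_dev, neg_dev)
```

For this sample the mean, median and mode all differ, Q3 lies further from the median than Q1, and the positive deviations outweigh the negative ones, which is the pattern expected of a positively skewed distribution.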

**Measures of Skewness**

Some of the absolute measures of skewness are:

- Skewness = Mean − Mode = M − M0
- Skewness = 3 (Mean − Median) = 3 (M − Md)
- Skewness = (Q3 − Md) − (Md − Q1)
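
As an illustration, the three absolute measures can be computed directly from their definitions. This is a sketch under stated assumptions: the sample and the function name `absolute_skewness` are made up for the example, and the quartiles come from `statistics.quantiles` with its default method.

```python
import statistics

def absolute_skewness(data):
    """Return the three absolute measures of skewness for a sample.

    Hypothetical helper: it simply evaluates the three formulas above.
    """
    mean = statistics.mean(data)
    median = statistics.median(data)
    mode = statistics.mode(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mean_minus_mode": mean - mode,
        "three_times_mean_minus_median": 3 * (mean - median),
        "quartile_based": (q3 - median) - (median - q1),
    }

# Hypothetical sample with a longer right tail.
data = [1, 2, 2, 2, 3, 3, 4, 5, 7, 11]
measures = absolute_skewness(data)
for name, value in measures.items():
    print(name, "=", value)
```

All three values are positive for this sample, which agrees with its longer right tail. Each value is expressed in the same units as the data.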

The absolute measures of skewness are of little practical use for the following reasons:

- Since the absolute measures of skewness are expressed in the units of measurement, they cannot be used to compare two distributions measured in different units.
- Even when the distributions have the same unit of measurement, the absolute measures are not recommended, because we may come across distributions which have more or less identical skewness but which vary widely in their measures of central tendency.

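Therefore, to compare two or more distributions for skewness we compute relative measures of skewness, also known as coefficients of skewness, which are pure numbers independent of units. The most commonly used coefficients of skewness are Karl Pearson’s coefficient, which divides Mean − Mode (or 3 (Mean − Median)) by the standard deviation, and Bowley’s coefficient, which divides (Q3 − Md) − (Md − Q1) by Q3 − Q1.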
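
The unit-free property can be checked numerically. The sketch below assumes the standard textbook forms of Karl Pearson's coefficient (absolute measure divided by the standard deviation) and Bowley's coefficient (quartile-based measure divided by Q3 − Q1). The function names, the sample, and the choice of the population standard deviation (`pstdev`) are assumptions made for illustration.

```python
import math
import statistics

def pearson_mode_skewness(data):
    # (Mean - Mode) / standard deviation
    return (statistics.mean(data) - statistics.mode(data)) / statistics.pstdev(data)

def pearson_median_skewness(data):
    # 3 (Mean - Median) / standard deviation
    return 3 * (statistics.mean(data) - statistics.median(data)) / statistics.pstdev(data)

def bowley_skewness(data):
    # ((Q3 - Md) - (Md - Q1)) / (Q3 - Q1)
    q1, _, q3 = statistics.quantiles(data, n=4)
    median = statistics.median(data)
    return ((q3 - median) - (median - q1)) / (q3 - q1)

# The same hypothetical measurements expressed in two different units.
metres = [1, 2, 2, 2, 3, 3, 4, 5, 7, 11]
centimetres = [100 * x for x in metres]

for coefficient in (pearson_mode_skewness, pearson_median_skewness, bowley_skewness):
    in_m = coefficient(metres)
    in_cm = coefficient(centimetres)
    # Changing the unit rescales numerator and denominator alike.
    print(coefficient.__name__, round(in_m, 4), round(in_cm, 4),
          math.isclose(in_m, in_cm))
```

Each coefficient comes out the same in metres and in centimetres, whereas the absolute measures would be multiplied by 100.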

The measures of skewness tell us how far, and in which direction, a frequency distribution departs from symmetry. The shape of a distribution also depends on how peaked it is. This peakedness of a frequency distribution is discussed under Kurtosis.

**Kurtosis**

Kurtosis is the ‘peakedness’ of a frequency distribution. If the frequency curve has long tails and a high peak, we call the distribution ‘leptokurtic’. If, on the other hand, the frequency curve has short tails and is flat-topped, we call it ‘platykurtic’. A ‘mesokurtic’ distribution lies in between a leptokurtic and a platykurtic distribution; the normal curve is the usual example.
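
The text above does not give a formula for kurtosis. One common choice, assumed here, is the moment-based coefficient β2 = m4 / m2², where m2 and m4 are the second and fourth central moments; the normal curve has β2 = 3. The sketch below uses made-up samples, and the `classify` helper with its tolerance `tol` is hypothetical, since a sample value will rarely equal 3 exactly.

```python
import statistics

def moment_kurtosis(data):
    """Return beta_2 = m4 / m2**2 using central moments about the mean."""
    mean = statistics.fmean(data)
    n = len(data)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2

def classify(beta2, tol=0.1):
    """Label a distribution by comparing beta_2 with the normal value 3."""
    if beta2 > 3 + tol:
        return "leptokurtic"
    if beta2 < 3 - tol:
        return "platykurtic"
    return "mesokurtic"

# Hypothetical samples: one flat-topped, one sharply peaked with long tails.
flat = list(range(1, 10))
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 10]

for name, sample in (("flat", flat), ("peaked", peaked)):
    beta2 = moment_kurtosis(sample)
    print(name, round(beta2, 2), classify(beta2))
```

The evenly spread sample gives a value below 3 and the sharply peaked sample a value above 3, matching the platykurtic and leptokurtic descriptions.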