Section 1
##### Introduction to Analytics

1. Introduction to Excel
2. Conditional Formatting
3. Data Summarization techniques
4. Graphical summary using SAS/GRAPH: Introduction to Bar graph
5. Graphical summary using SAS/GRAPH: Introduction to Pie graph
6. Graphical summary using SAS/GRAPH: Introduction to Histogram, Box plots, Scatter diagram
7. Descriptive Statistics: Introduction to various measures of Central Tendency
8. Introduction to the measures of Dispersion: Range, Mean Deviation, Standard Deviation

Section 2
##### Understanding Probability and Probability Distribution

1. Introduction to Probability theory
2. Types of probability distribution: Discrete distribution and Continuous distribution
3. Understanding Probability Mass Function and Probability Density Function
4. Normal Distribution and Standard Normal Distribution
5. Normal plot using PROC GPLOT procedure in SAS
6. Application of Normal distribution in Analytics with real-life examples
7. Binomial Distribution and Binomial plot using PROC GPLOT procedure in SAS
8. Poisson distribution and Poisson plot using PROC GPLOT procedure in SAS
9. Application of Binomial and Poisson distribution in Analytics with real-life examples

Section 3
##### Introduction to Sampling Theory and Estimation

1. Concept of Population and Sample
2. Use of PROC SURVEYSELECT procedure in SAS
3. Introduction to some important terminologies
4. Parameter and Statistic
5. Properties of a good estimator
6. Standard Deviation and Standard Error
7. Point and Interval Estimation
8. Confidence level and level of Significance
9. Constructing Confidence Intervals
10. Formulation of Null and Alternative hypothesis
11. Performing simple test of Hypothesis

Section 4

Section 5
##### Statistical Significance of T-Tests, Chi-Square Tests and Analysis of Variance

1. Performing test of one sample mean using PROC TTEST
2. Difference between two group means (independent sample) using PROC TTEST
3. Difference between two group means (paired sample) using PROC TTEST
4. Performing Chi-square tests: Test of Independence
5. Performing one-way ANOVA with PROC ANOVA and PROC GLM procedure
6. Performing post-hoc multiple comparisons tests in PROC GLM using Tukey’s mean test

Section 6
##### Introduction to Segmentation Techniques: Factor Analysis

1. Introduction to Factor Analysis and various techniques
2. Principal Component Analysis (PCA) and Exploratory Factor Analysis (EFA)
3. Application of Factor Analysis using PROC FACTOR procedure
4. KMO MSA test, Bartlett’s Test of Sphericity
5. The Mineigen Criterion, Scree plot
6. Introduction to Factor Loading Matrix
7. Various rotation techniques like Varimax

Section 7
##### Introduction to Segmentation Techniques: Cluster Analysis

1. Introduction to Cluster Analysis and various techniques
2. Hierarchical and Non-Hierarchical Clustering techniques
3. Using Hierarchical Clustering by PROC TREE procedure in SAS
4. Performing K-means Clustering in SAS
5. Divisive Clustering, Agglomerative Clustering
6. Application of Cluster Analysis in Analytics with profiling and interpretation of the clusters

Section 8
##### Correlation and Linear Regression

1. Introduction to Pearson’s Correlation coefficient using PROC CORR procedure
2. Correlation and Causation; fitting a simple linear regression model with the PROC REG procedure
3. Understanding the concepts of Multiple Regression
4. Using automated model selection techniques in PROC REG to choose the best model
5. Interpretation of the model: overall fit of the model and finding out the influential variables
6. Linear Regression diagnostics
7. Examining Residuals
8. Assessing Collinearity, Heteroskedasticity and Autocorrelation

Section 9
##### Introduction to Categorical Data Analysis and Logistic Regression

1. Comparison between Linear Regression and Logistic Regression
2. Performing Logistic regression using PROC LOGISTIC procedure in SAS
3. Performing Goodness of fit of the model
4. Introduction to Percent Concordant, AIC, SC, and Hosmer-Lemeshow
5. Receiver Operating Characteristics (ROC) Curve and Area under Curve (AUC)
6. Interpretation of the model: overall fit of the model and finding out the influential variables using Odds ratio criteria
7. Using automated model selection techniques in PROC LOGISTIC to choose the best model using AIC criteria

Section 10
##### Introduction to Time Series Analysis

1. What is Time Series Analysis, Objectives and Assumptions of Time Series
2. Identifying patterns in Time series data: Decomposition of the time series data and general aspects of the analysis
3. Introduction to various Smoothing techniques: Simple Moving Average, Weighted Moving Average, Exponential Smoothing, Holt’s Linear Exponential Smoothing
4. Examples of Seasonality and detecting Seasonality in Time series data
5. Introduction to PROC FORECAST to generate forecasts for time series data
6. Autoregressive models and Stepwise Autoregression (STEPAR) procedure
7. Autoregressive and Moving Average models and Introduction to Box-Jenkins Methodology
8. Introduction to Autoregressive Moving Average (ARMA) model
9. Autoregressive Integrated Moving Average (ARIMA) model
10. Building an ARIMA Model
11. Detection of Stationarity, Seasonality in ARIMA Model
12. Detecting the order of AR and MA of ARIMA model by Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF)
13. Detecting the order by using AIC and BIC criteria
14. Estimation and forecast using PROC ARIMA in SAS

**What is a Pie Chart?**

A pie chart is a circular chart divided into wedge-like sectors, illustrating proportion. Each wedge represents a proportionate part of the whole, and the total value of the pie is always 100 percent. Pie charts can make the size of portions easy to understand at a glance. They’re widely used in business presentations and education to show the proportions among a large variety of categories including expenses, segments of a population, or answers to a survey.

**Pie Chart vs Bar Chart**

Critics of pie charts point out that wedges are hard to compare across separate pie charts. When a single pie has too many wedges, even those wedges are harder to contrast visually than the heights of bars in a bar graph. Bar charts are easier to read when you’re comparing categories or looking at change over time. What bar charts lack is the part-to-whole relationship that makes pie charts unique: a pie chart implies that if one wedge gets bigger, the others must get smaller. That is not true of the bars on a bar chart.

Let’s now see how we can create different types of pie charts in SAS.

**PROC GCHART** DATA=mylib.CANDY_SALES_SUMMARY;

PIE3D SUBCATEGORY;

**RUN;**

This code generates a 3-dimensional pie chart using the PIE3D statement. PROC GCHART is the procedure used to produce the chart. Each subcategory is drawn as a slice of the pie, sized as a percentage of the full 360 degrees. Here we chart the different subcategories of candies present in the data set called “Candy_Sales_Summary”. “mylib” is the name of the library that stores the SAS data sets.
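To make the "percentage of 360 degrees" idea concrete, the slice sizes can be computed by hand. The following is a minimal sketch, assuming the same `mylib.CANDY_SALES_SUMMARY` data set and `SUBCATEGORY` variable used above; the output data set names `work.slice_pct` and `work.slice_angle` and the variable `angle_degrees` are hypothetical names chosen for illustration.

```sas
/* Count observations per subcategory; OUT= stores COUNT and PERCENT */
proc freq data=mylib.CANDY_SALES_SUMMARY noprint;
    tables SUBCATEGORY / out=work.slice_pct;
run;

/* Convert each share of the whole into degrees of the circle */
data work.slice_angle;              /* hypothetical output data set */
    set work.slice_pct;
    angle_degrees = percent * 3.6;  /* 100 percent corresponds to 360 degrees */
run;

proc print data=work.slice_angle noobs;
run;
```

A subcategory that accounts for 25 percent of the observations would therefore occupy a 90-degree slice.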

If we modify the PROC GCHART step as follows:

**PROC GCHART** DATA=mylib.CANDY_SALES_SUMMARY;

PIE SUBCATEGORY / VALUE=INSIDE;

**RUN;**

We get a variation of the previous pie chart: VALUE=INSIDE places the frequency value inside each slice, in addition to the subcategory name. Each subcategory is shown as a slice of a different color. Note: we have not used the PIE3D statement here, so we get a 2-dimensional pie chart, which is what the plain PIE statement draws.

The code below draws a 3-dimensional pie chart in which FREQ=SALE_AMOUNT weights each observation by its sale amount, so the slices reflect sales per subcategory rather than row counts. The value and the percentage for each subcategory are shown inside the slice (VALUE=INSIDE, PERCENT=INSIDE), and the subcategory name is shown outside it (SLICE=OUTSIDE).

**PROC GCHART** DATA=mylib.CANDY_SALES_SUMMARY;

PIE3D SUBCATEGORY / VALUE=INSIDE

PERCENT=INSIDE

SLICE=OUTSIDE

FREQ=SALE_AMOUNT;

**RUN;**

On running the above code, we will get a graph like the one shown below.
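For readers who do not have the `mylib` library set up, the same options can be tried on a small stand-in data set. This is only a sketch: the data set `work.candy_demo` and every value in it are invented for illustration and do not come from the original Candy_Sales_Summary data.

```sas
/* Hypothetical sample data, invented purely for illustration */
data work.candy_demo;
    length subcategory $ 12;
    input subcategory $ sale_amount;
    datalines;
Chocolate 120
Gummies 80
HardCandy 45
Lollipop 30
;
run;

/* Same PIE3D options as above, pointed at the stand-in data set */
proc gchart data=work.candy_demo;
    pie3d subcategory / value=inside
                        percent=inside
                        slice=outside
                        freq=sale_amount;
run;
quit;
```

Because FREQ= treats `sale_amount` as a frequency count for each row, the slice for each subcategory is sized by its sale amount rather than by the single row it occupies in the data set.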