Section 1
##### Getting Started with the most popular Data Science Library: Pandas

Section 2
##### Working with Pandas

1. Imputing missing values
2. Pivot tables
3. Crosstabs
4. Merge DataFrames
5. Sorting data frames
6. Plotting with Pandas
7. Analysis broken down into various steps
8. Exploratory Analysis
9. Basic Descriptive Statistic Analysis
10. Distribution Analysis
11. Categorical Variable Analysis
12. Data Munging
13. Treating Missing Values

Section 3
##### Building Predictive Models

1. Linear Regression Theory
2. Understanding Regression
3. Training and Validation
4. Goodness of Fit
5. Practical Application
6. Exploratory Analysis
7. Case study
8. Logistic Regression Theory
9. Linear Probability Model
10. Concept of Classification
11. Comparison with Linear Regression
12. Odds Ratio
13. Classification Table
14. Practical Application
15. Getting the data
16. Reading the data
17. Algorithms
18. Decision Tree
19. Random Forest

Section 4
##### Time Series Analysis

**DataFrame**

This section gives a brief look at the basic operations that can be performed on a Pandas DataFrame:

• Creating a DataFrame

• Dealing with Rows and Columns

• Indexing and Selecting Data

• Working with Missing Data

• Iterating over rows and columns

**Creating a Pandas DataFrame**
In practice, a Pandas DataFrame is usually created by loading a dataset from existing storage, such as a SQL database, a CSV file, or an Excel file. A DataFrame can also be created from a list, a dictionary, a list of dictionaries, and so on. Here are some of the ways to create a DataFrame:

A DataFrame can be created using a single list or a list of lists.

    # import pandas as pd
    import pandas as pd

    # list of strings
    words = ['OrangeTree', 'Global', 'institute', 'is', 'centre', 'for', 'Geeks']

    # Calling DataFrame constructor on the list
    df = pd.DataFrame(words)
    print(df)
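
The text above also mentions a list of lists and a list of dictionaries as inputs. The following is a minimal sketch of both, not taken from the original material; the variable names and the sample values are assumptions chosen for illustration.

```python
import pandas as pd

# A list of lists: each inner list becomes one row.
# Column names are supplied separately (illustrative names).
rows = [["Jai", 27], ["Princi", 24], ["Gaurav", 22]]
df_from_rows = pd.DataFrame(rows, columns=["Name", "Age"])

# A list of dictionaries: each dictionary becomes one row,
# and the dictionary keys become the column names.
records = [{"Name": "Jai", "Age": 27}, {"Name": "Princi", "Age": 24}]
df_from_records = pd.DataFrame(records)

print(df_from_rows)
print(df_from_records)
```

With a list of lists the column names have to be passed explicitly, otherwise pandas labels the columns 0, 1, and so on. With a list of dictionaries the keys supply the names.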

**Dealing with Rows and Columns**
A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns. We can perform basic operations on rows and columns, such as selecting, deleting, adding, and renaming.

    # Import pandas package
    import pandas as pd

    # Define a dictionary containing employee data
    data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
            'Age': [27, 24, 22, 32],
            'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
            'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}

    # Convert the dictionary into DataFrame
    df = pd.DataFrame(data)
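
The four operations named above (selecting, adding, renaming, deleting) can be sketched on the same employee data. This is one possible illustration rather than the original author's code; the `Height` values and the new column name `City` are hypothetical.

```python
import pandas as pd

data = {"Name": ["Jai", "Princi", "Gaurav", "Anuj"],
        "Age": [27, 24, 22, 32],
        "Address": ["Delhi", "Kanpur", "Allahabad", "Kannauj"],
        "Qualification": ["Msc", "MA", "MCA", "Phd"]}
df = pd.DataFrame(data)

# Selecting: a list of column names returns a DataFrame with those columns.
subset = df[["Name", "Qualification"]]

# Adding: assign a list whose length matches the number of rows.
df["Height"] = [5.1, 6.2, 5.1, 5.2]  # hypothetical values

# Renaming: rename() returns a new DataFrame and leaves df unchanged.
renamed = df.rename(columns={"Address": "City"})

# Deleting: drop() removes columns by name or rows by index label.
without_age = renamed.drop(columns=["Age"])
without_first_row = df.drop(index=0)

print(subset)
print(without_age)
```

Both `rename()` and `drop()` return new objects by default, so the original `df` still has its `Address` and `Age` columns afterwards.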



**Indexing and Selecting Data**
Indexing in pandas means simply selecting particular rows and columns of data from a DataFrame. Indexing could mean selecting all the rows and some of the columns, some of the rows and all of the columns, or some of each of the rows and columns. Indexing can also be known as

**Indexing a DataFrame using .loc[]:**
This function selects data by the label of the rows and columns.

    # importing pandas package
    import pandas as pd

    # making data frame from csv file
    data = pd.read_csv("nba.csv", index_col="Name")

    # retrieving row by loc method
    first = data.loc["Avery Bradley"]
    second = data.loc["R.J. Hunter"]

    print(first, "\n\n\n", second)

**Output:**
Two Series are returned, one per lookup, since .loc[] was given a single label each time.
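
The example above needs the `nba.csv` file. As a rough sketch that runs without it, the same label-based selection can be tried on a small in-memory DataFrame; the data and variable names below are assumptions for illustration only. It also covers the combinations described earlier: some rows with all columns, all rows with some columns, and some of each.

```python
import pandas as pd

# Small illustrative frame indexed by name (not the nba.csv data).
df = pd.DataFrame(
    {"Age": [27, 24, 22, 32],
     "Address": ["Delhi", "Kanpur", "Allahabad", "Kannauj"]},
    index=["Jai", "Princi", "Gaurav", "Anuj"],
)

# A single label returns one row as a Series.
one_row = df.loc["Jai"]

# A list of labels returns a DataFrame: some rows, all columns.
some_rows = df.loc[["Jai", "Anuj"]]

# A colon selects every row: all rows, some columns.
some_cols = df.loc[:, ["Age"]]

# Row labels and a column label together: some of each.
both = df.loc[["Princi", "Gaurav"], "Address"]

print(one_row)
print(some_rows)
print(some_cols)
print(both)
```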

**Selecting a single row**
To select a single row using .iloc[], we can pass a single integer position to .iloc[].

    import pandas as pd

    # making data frame from csv file
    data = pd.read_csv("nba.csv", index_col="Name")

    # retrieving a row by iloc method (position 3 is the fourth row,
    # since positions start at 0)
    row2 = data.iloc[3]

    print(row2)

**Output:**
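
Since `nba.csv` is not included here, the kind of result .iloc[] returns can be checked on a small in-memory DataFrame instead. This is a minimal sketch under that assumption; the data and variable names are illustrative and not part of the original example.

```python
import pandas as pd

# Small illustrative frame with the default integer index.
df = pd.DataFrame({"Name": ["Jai", "Princi", "Gaurav", "Anuj"],
                   "Age": [27, 24, 22, 32]})

# A single integer returns that row as a Series (position 3 = fourth row).
fourth_row = df.iloc[3]

# A slice returns a DataFrame; the end position is excluded.
first_two = df.iloc[0:2]

# Two integers select one cell: row position 1, column position 1.
cell = df.iloc[1, 1]

print(fourth_row)
print(first_two)
print(cell)
```

Unlike .loc[], .iloc[] ignores index labels entirely and works only with integer positions counted from 0.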