Section 1
##### Getting Started with the most popular Data Science Library: Pandas

Section 2
##### Working with Pandas

8. Imputing missing values
9. Pivot tables
10. Crosstabs
11. Merge DataFrames
12. Sorting data frames
13. Plotting with Pandas
14. Analysis broken down into various steps
15. Exploratory Analysis
16. Basic Descriptive Statistic Analysis
17. Distribution Analysis
18. Categorical Variable Analysis
19. Data Munging
20. Treating Missing Values

Section 3
##### Building Predictive Models

21. Linear Regression Theory
22. Understanding Regression
23. Training and Validation
24. Goodness of Fit
25. Practical Application
26. Exploratory Analysis
27. Case study
28. Logistic Regression Theory
29. Linear Probability Model
30. Concept of Classification
31. Comparison with Linear Regression
32. Odds Ratio
33. Classification Table
34. Practical Application
35. Getting the data
36. Reading the data
37. Algorithms
38. Decision Tree
39. Random Forest

Section 4
##### Time Series Analysis

**DataFrame**

This section gives a brief overview of the basic operations that can be performed on a Pandas DataFrame:

• Creating a DataFrame

• Dealing with Rows and Columns

• Indexing and Selecting Data

• Working with Missing Data

• Iterating over rows and columns

**Creating a Pandas DataFrame**
In practice, a Pandas DataFrame is usually created by loading a dataset from existing storage, such as a SQL database, a CSV file, or an Excel file. A DataFrame can also be created from a list, a dictionary, a list of dictionaries, and so on. Some of the ways to create one are described here.

A DataFrame can be created from a single list or a list of lists.

    # Import pandas
    import pandas as pd

    # List of strings (named "words" to avoid shadowing the built-in list)
    words = ['OrangeTree', 'Global', 'institute', 'is', 'centre', 'for', 'Geeks']

    # Call the DataFrame constructor on the list
    df = pd.DataFrame(words)
    print(df)
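
The other construction routes mentioned above (a list of lists, a dictionary, and a list of dictionaries) can be sketched in the same way. This is a minimal illustration, not the only way to do it; the names and ages are made-up sample values, and the variable names are chosen here for readability.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.

# 1. List of lists: each inner list becomes one row.
rows = [["Jai", 27], ["Princi", 24], ["Gaurav", 22]]
df_from_lists = pd.DataFrame(rows, columns=["Name", "Age"])

# 2. Dictionary of lists: each key becomes a column name.
columns = {"Name": ["Jai", "Princi", "Gaurav"], "Age": [27, 24, 22]}
df_from_dict = pd.DataFrame(columns)

# 3. List of dictionaries: each dictionary becomes one row.
records = [
    {"Name": "Jai", "Age": 27},
    {"Name": "Princi", "Age": 24},
    {"Name": "Gaurav", "Age": 22},
]
df_from_records = pd.DataFrame(records)

print(df_from_lists)
print(df_from_dict)
print(df_from_records)
```

All three calls describe the same three-row table with the columns Name and Age; which form is most convenient usually depends on how the raw data arrives.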

**Dealing with Rows and Columns**
A DataFrame is a two-dimensional data structure: data is arranged in a table of rows and columns. Basic operations on rows and columns include selecting, deleting, adding, and renaming.

    # Import pandas package
    import pandas as pd

    # Define a dictionary containing employee data
    data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
            'Age': [27, 24, 22, 32],
            'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
            'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}

    # Convert the dictionary into a DataFrame
    df = pd.DataFrame(data)
    print(df)


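
The four basic operations named in this section (selecting, adding, renaming, and deleting) can be sketched on the employee data as follows. This is a minimal illustration under stated assumptions: the Height values and the new column name City are hypothetical and are not part of the original data.

```python
import pandas as pd

data = {"Name": ["Jai", "Princi", "Gaurav", "Anuj"],
        "Age": [27, 24, 22, 32],
        "Address": ["Delhi", "Kanpur", "Allahabad", "Kannauj"],
        "Qualification": ["Msc", "MA", "MCA", "Phd"]}
df = pd.DataFrame(data)

# Selecting: a list of column names returns a DataFrame with those columns.
subset = df[["Name", "Qualification"]]

# Adding: assign one value per row to a new column name.
# The heights below are hypothetical values for illustration only.
df["Height"] = [5.1, 6.2, 5.1, 5.2]

# Renaming: rename() returns a new DataFrame; df itself is unchanged.
renamed = df.rename(columns={"Address": "City"})

# Deleting: drop() returns a new DataFrame without the given column or row.
without_age = df.drop(columns=["Age"])
without_first_row = df.drop(index=0)

print(subset)
print(renamed.columns.tolist())
print(without_age.columns.tolist())
print(len(without_first_row))
```

Note that rename() and drop() leave the original DataFrame untouched unless the result is assigned back, whereas the column assignment modifies df in place.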

**Indexing and Selecting Data**
Indexing in pandas simply means selecting particular rows and columns of data from a DataFrame. It can mean selecting all of the rows and some of the columns, some of the rows and all of the columns, or some of each. Indexing can also be known as

**Indexing a DataFrame using .loc[]:**
This indexer selects data by the label of the rows and columns.

    # Import pandas package
    import pandas as pd

    # Make a DataFrame from the CSV file, using the "Name" column as the index
    data = pd.read_csv("nba.csv", index_col="Name")

    # Retrieve rows by label with .loc[]
    first = data.loc["Avery Bradley"]
    second = data.loc["R.J. Hunter"]

    print(first, "\n\n\n", second)

**Output:**
As shown in the output image, two Series were returned, because only a single label was passed to .loc[] each time.
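
The combinations described earlier (some rows, some columns, or both) can be tried without the CSV file. The sketch below assumes a small made-up DataFrame standing in for nba.csv; the player names, team names, and column values are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical stand-in for nba.csv, indexed by a "Name" column.
df = pd.DataFrame(
    {
        "Team": ["Team A", "Team A", "Team B"],
        "Number": [0, 28, 12],
        "Position": ["PG", "SG", "C"],
    },
    index=pd.Index(["Player One", "Player Two", "Player Three"], name="Name"),
)

# A single label returns one row as a Series.
one_row = df.loc["Player One"]

# A list of labels returns a DataFrame, even if the list has one element.
two_rows = df.loc[["Player One", "Player Three"]]

# Row labels and column labels together select some of each.
block = df.loc[["Player One", "Player Two"], ["Team", "Position"]]

# A bare colon means "all rows"; here only one column is kept.
all_rows_one_col = df.loc[:, ["Team"]]

print(type(one_row).__name__)   # Series
print(type(two_rows).__name__)  # DataFrame
print(block)
print(all_rows_one_col)
```

Passing a list instead of a bare label is the usual way to keep a DataFrame as the result when only one row or column is needed.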

**Selecting a single row**
To select a single row by position, pass a single integer to .iloc[].
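
Positional selection can be checked on a small in-memory table before working with a file. This is a minimal sketch, assuming a made-up four-row DataFrame; the player and team names are hypothetical placeholders.

```python
import pandas as pd

# Hypothetical four-row table, indexed by name.
df = pd.DataFrame(
    {"Team": ["Team A", "Team A", "Team B", "Team B"], "Number": [0, 28, 12, 7]},
    index=["Player One", "Player Two", "Player Three", "Player Four"],
)

# Positions are zero-based, so 3 refers to the fourth row.
fourth_row = df.iloc[3]

# A single integer returns a Series; a one-element list returns a DataFrame.
fourth_row_as_frame = df.iloc[[3]]

# Negative integers count from the end, as with Python lists.
last_row = df.iloc[-1]

print(fourth_row.name)                     # Player Four
print(type(fourth_row).__name__)           # Series
print(type(fourth_row_as_frame).__name__)  # DataFrame
```

Unlike .loc[], the integer passed to .iloc[] ignores the index labels entirely and refers only to the row's position.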

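    import pandas as pd

    # Make a DataFrame from the CSV file, using the "Name" column as the index
    data = pd.read_csv("nba.csv", index_col="Name")

    # Retrieve a row by position with .iloc[]
    # (positions are zero-based, so 3 is the fourth row)
    row2 = data.iloc[3]

    print(row2)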

**Output:**