Section 1
##### What is R?

Section 2
##### Basic Operations in R

1. Expressions: Basic Idea
2. Constant Values: Numeric and Non-Numeric
3. Arithmetic: Operations and BODMAS
4. Conditions: Equality, Greater Than, Less Than, etc.
5. Function Calls: Introduction to R Functions
6. Symbols and Assignment
7. Keywords: NA, Inf, NaN, NULL, TRUE, FALSE
8. Naming a Variable: Generally accepted conventions

Section 3
##### Data Types and Data Structures

Section 4
##### Subsetting in R

1. Vector Subsetting
2. c() function: Creation of Vectors
3. Using rep() and seq() functions
4. Using factor() to convert vectors to factors
5. Using data.frame() to create data frames
6. Metadata access: dimnames(), rownames(), colnames()
7. Using matrix() to create matrices
8. Using array() to create arrays
9. Subsetting data frames: row subset, column subset, using subset() function
10. Assigning to a subset
11. Using is.na() to detect NA
12. Subsetting factors

Section 5
##### Additional Topics on Data structures

1. The recycling rule: Uneven arithmetic operation on vectors
2. Type coercion: Character to Numeric
3. Automatic Type coercion
4. Coercing factors: Using as.factor() function
5. Changing factor levels
6. Attributes: attributes(), attr(), names() functions
7. Classes: Idea of OOP in R
8. Dates: As a special class
9. Formulas: As a special class
10. Exploring Objects: summary(), str(), dim() functions
11. Generic functions

Section 6
##### Data Import and Export

1. Text formats: Reading Delimited Files
2. read.table() function
3. Using read.fwf() function for fixed width files
4. Using readLines() for reading lines
5. Using write.csv() function to store data as CSV files
6. Reading Excel file: Package XLConnect
7. Reading SPSS file: Package foreign
8. Reading SAS data file: Package sas7bdat
9. Database connection: The idea of ODBC connections in Windows
10. RODBC package: Create and Query database from R
11. Basic SQL

Section 7
##### Control Structures and User Defined Functions

1. Conditional Statements
2. If statement: The Structure
3. If Else statement: The Structure
4. ifelse() function
5. Iteration
6. The for loop
7. The while loop
8. The repeat statement
9. lapply() function
10. sapply() function
11. apply() function
12. User defined function
13. Variable scoping: Global and Local Variables
14. Using user defined functions inside function definition

Section 8
##### Data Visualisation: Charting with R

1. The plot function
2. plot.new() function: Generating new plot object
3. plot.window() function: Creating window
4. points() function: Plotting points
5. axis() function: Generating Axis
6. box() function: Creating enclosure
7. title() function: Assigning title
8. par() function: Fixing plotting parameters
9. lines() function: Adding connector lines
10. Multi figure layout: Creating multiple charts in the same window
11. hist() function: Plotting histograms
12. Kernel Density Plot: The non-parametric probability distribution
13. Comparing Groups via Kernel Density: Comparing two different probability distributions
14. Simple Bar Plot: Visualizing categorical data
15. Stacked Bar Plot: Understanding category composition
16. Grouped Bar Plot
17. Line Charts
18. Pie Charts
19. Boxplots: Understanding data distributions and outliers
20. Using Google Chart Tools with R (Package googleVis)
21. Geo Charts
22. Motion Charts

Section 9
##### Visualisation in R using googleVis

Section 10
##### Visualisation in R using ggplot2

Birth and Rise of R

R is a language and environment for statistical computing and graphics.

• R was initially written by Robert Gentleman and Ross Ihaka.

• The core group with write access to the R source comprises Douglas Bates, John Chambers, Peter Dalgaard, Seth Falcon, Robert Gentleman, Kurt Hornik, Stefano Iacus, Ross Ihaka, Friedrich Leisch, Uwe Ligges, Martin Maechler, Duncan Murdoch, Paul Murrell, Martyn Plummer, Brian Ripley, Deepayan Sarkar, Duncan Temple Lang, Luke Tierney, Simon Urbanek, and Thomas Lumley.

History of R:

The history of R is one of good fortune and good choices. In 1992, Gentleman, then a professor at the University of Waterloo in Canada, traveled 8600 miles to the University of Auckland to lecture for three months. One day, he found himself needing a manual for a particularly tricky piece of software, and Ihaka, still a professor of statistics in those days, was the only one in the department who had a copy. In time, they discovered a shared interest in what Ihaka calls “playing academic fun and games” with statistical computing languages.

They had questions about programming languages they wanted to answer. In particular, both Ihaka and Gentleman shared a common knowledge of the language called “Scheme”, and both found the language useful in a variety of ways. Scheme, however, was unwieldy to type and lacked desired functionality. Again, convenience brought good fortune. Each was familiar with another language, called “S”, and S provided the kind of syntax they wanted. With no blend of the two languages commercially available, Gentleman suggested building something themselves.

Around that time, the University of Auckland needed a language to use in its undergraduate statistics courses, as the school’s current tool had reached the end of its useful life. There was one major caveat: the program needed to run on Macintosh. According to Gentleman, the Department of Statistics took inventory and decided that “that thing Ross and Robert are working on”, which happened to run on Macintosh, was better than their current language. The professors called it R, both as a nod to S and in reference to their forenames.

Ihaka and Gentleman kept the project secret from the wider community until August 1993, when an email to the S-news mailing list drew it into the public eye. A Canadian professor had a familiar problem: he needed a Macintosh version of S. Ihaka decided it was time to let R see the light of day. Soon after, a usable version of R appeared on StatLib, an online system for distributing statistical software and data.

Though what we have today is free software, in the mid-1990s Ihaka and Gentleman were seriously considering turning R into a commercial product. Ultimately, though, the idea of selling it was more trouble than it was worth.

Ihaka and Gentleman agreed with the idea of making free software – meaning that people would be free to use, change, and distribute it as they like. In 1995, the duo made R’s source code available under a free software license.

Evolution of the software:

As the language improved, more users joined, and more users meant less room for bugs to hide. As fixes and functions poured in, the names of the submitters began to look familiar. The usual suspects contributed so often that Ihaka and Gentleman gave them the ability to edit the source code directly, because it was easier than managing all the changes themselves. By mid-1997, 11 people, including Ihaka, Gentleman, Martin Mächler, Peter Dalgaard, Kurt Hornik, Friedrich Leisch, and Thomas Lumley, had the keys to R’s source code. The group styled themselves the “R Core” team.

“The users were the developers in those days,” Ihaka says, and as more of them joined the community, they needed a way to show off what they had done and to download contributions they found useful. In March 1997, Hornik and Leisch, of the Vienna University of Economics and Business, made a Herculean contribution to the R Project by building the Comprehensive R Archive Network (CRAN). This network made the essential information and files of R available for download in one place. Most importantly, users could browse packages, R’s version of code libraries, and download the ones they needed.

CRAN makes R shine. Most of the functionality of R is contained in the packages stored in CRAN, which can be loaded and used when needed. This makes R more versatile than other statistical software. Closed-source software, such as SAS and SPSS, can only be updated by their official developers, whereas R has a community churning out updates all the time.

In 2000, the R Project released R version 1.0.0, the first version they felt was ready for public usage. The following year, several influential statisticians published papers on data science, and 2003 saw the first academic journal dedicated to this growing field. For those people now identified as data scientists, R, CRAN and the wider community provided the means to explore and familiarize themselves with statistical tools and techniques. In turn, those data scientists added packages to help with data types and models from fields as diverse as ecology, linguistics, bioinformatics and network science.

Future of R

We have now caught up with R’s story so far, but it is by no means the end of the tale. What might the future have in store?

Lumley was unsure if another computing language would be coming to bury R anytime soon, but he felt that any successor would have to absorb CRAN and its stockpile of code. Gentleman agreed, saying: “There are really good algorithms in R, and no one should be reimplementing them.”

The future of CRAN is a popular topic for speculation as the network is starting to creak under the weight of its own success. The archive now holds more than 12,000 packages and is growing near-exponentially. With CRAN growing unabated and – in Peter Dalgaard’s words – “the original Core team approaching pensionable age”, the maintenance of R and CRAN will at some point need to change.

Ultimately, of course, the future of R will be determined by its community – the people who, over the last quarter of a century, have donated years of their lives to the source code, crafting clever packages and helping new users get started. These donations of time and effort did not come with the promise of future monetary rewards.

R is free, open source software that was created for fun, reared by committee, and developed by the masses. That such software could survive and flourish for 25 years is a credit to its quality, to its creators and to its users. “People in the past would have said you couldn’t do something like this,” says Lumley. “Now it’s clear that you can.”