Dr. E. N. SATHISHKUMAR,
Guest Lecturer,
Department of Computer Science,
Periyar University,
Salem -11.
Introduction
 R (the language) was created in the early 1990s, by Ross Ihaka and
Robert Gentleman.
 It is based upon the S language that was developed at Bell
Laboratories in the 1970s.
 It is a high-level language, like C# and Java.
 R is an interpreted language (sometimes called a scripting language),
which means that your code doesn’t need to be compiled before you
run it.
 R supports a mixture of programming paradigms: at its core it is an
imperative language, but it also supports object-oriented and functional
programming.
Getting started
 Where to get R?
The newest version of R and its documentation can be downloaded
from http://www.r-project.org.
 Download, Packages: Select CRAN
 Set your Mirror: India (Indian Institute of Technology Madras)
Select http://ftp.iitm.ac.in/cran/
 Select Download R for Windows
 Select base.
 Select Download R 3.4.2 for Windows
 Run R-3.4.2-win.exe with administrator privileges. Once the
program is installed, start R by clicking on its icon.
Choosing an IDE
 If you use R under Windows or Mac OS X, then a graphical user
interface (GUI) is available to you.
 Some of the best GUIs are:
 Emacs + ESS
 Eclipse/Architect
 RStudio
 Revolution-R
 Live-R
 Tinn-R
A Scientific Calculator
 R is at heart a supercharged scientific calculator, so you can type
commands directly into the R Console.
> 5+5
[1] 10
> 4-7
[1] -3
> 7*3
[1] 21
> 16/31
[1] 0.516129
> log2(32)
[1] 5
Variable Assignment
 We assign values to variables with the assignment operator "=".
 Just typing the variable by itself at the prompt will print out the value.
 We should note that another form of assignment operator "<-" is also
in use.
> X = 2
> X
[1] 2
> X <- 5
> X
[1] 5
> X * X
[1] 25
Comments
 All text after the pound sign "#" within the same line is considered a
comment.
> X = 2 # this is a comment
> # 5 is assigned to variable X
> X <- 5
> X
[1] 5
Getting Help
 R provides extensive documentation. To get help on a
particular topic, call help() with the topic name.
 For example,
> help("if")
starting httpd help server ... done
 The help content then opens in the web browser.
Basic Data Types
 There are several basic R data types that are of frequent
occurrence in routine R calculations.
 Numeric
 Integer
 Complex
 Logical
 Character
 Factor
Numeric
 Decimal values are called numerics in R. It is the default
computational data type.
 If we assign a decimal value to a variable x as follows, x will be
of numeric type.
> x = 10.5 # assign a decimal value
> x # print the value of x
[1] 10.5
> class(x) # print the class name of x
[1] "numeric"
Numeric
 Furthermore, even if we assign an integer to a variable k, it is still being
saved as a numeric value.
> k = 1
> k # print the value of k
[1] 1
> class(k) # print the class name of k
[1] "numeric"
 The fact that k is not an integer can be confirmed with the is.integer
function.
> is.integer(k) # is k an integer?
[1] FALSE
Integer
 In order to create an integer variable in R, we invoke the as.integer
function.
 For example,
> y = as.integer(3)
> y # print the value of y
[1] 3
> class(y) # print the class name of y
[1] "integer"
> is.integer(y) # is y an integer?
[1] TRUE
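 Besides as.integer, an integer can be written directly with the L suffix. The sketch below (variable names are illustrative, not from the slides) also shows how as.integer coerces other types.

```r
# Ways to obtain integer values (variable names are illustrative)
a <- 3L                    # the L suffix marks an integer literal
b <- as.integer(3.14)      # coercion truncates the fractional part
c1 <- as.integer("5.27")   # a numeric string is parsed, then truncated
d <- as.integer(TRUE)      # TRUE becomes 1, FALSE becomes 0

print(class(a))            # "integer"
print(c(a, b, c1, d))      # 3 3 5 1
```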
Complex
 A complex value in R is defined via the pure imaginary value i.
 For example,
> z = 1 + 2i # create a complex number
> z # print the value of z
[1] 1+2i
> class(z) # print the class name of z
[1] "complex"
 The following returns NaN with a warning rather than a complex
result, because -1 is passed as a numeric value, not a complex one.
> sqrt(-1) # square root of -1
[1] NaN
Warning message:
In sqrt(-1) : NaNs produced
Complex
 Instead, we have to use the complex value -1 + 0i.
 For example,
> sqrt(-1+0i) # square root of -1+0i
[1] 0+1i
 An alternative is to coerce -1 into a complex value.
> sqrt(as.complex(-1))
[1] 0+1i
Logical
 A logical value is often created via comparison between variables.
 For example,
> x = 1; y = 2 # sample values
> z = x > y # is x larger than y?
> z # print the logical value
[1] FALSE
> class(z) # print the class name of z
[1] "logical"
Logical
 The standard logical operations are "&" (and), "|" (or) and "!" (negation).
 For example,
> u = TRUE; v = FALSE
> u & v # u AND v
[1] FALSE
> u | v # u OR v
[1] TRUE
> !u # negation of u
[1] FALSE
Character
 A character object is used to represent string values in R. We
convert objects into character values with the as.character() function.
For example,
> x = as.character(3.14)
> x # print the character string
[1] "3.14"
> class(x) # print the class name of x
[1] "character"
> x = as.character("hai")
> x # print the character string
[1] "hai"
> class(x) # print the class name of x
[1] "character"
Factor
 The factor data type is used to represent categorical data, i.e. data
whose values come from a fixed set of categories (levels).
 For example, to create a factor of length five, start from a character
vector:
> sex <- c("male","male","female","male","female")
 The object sex is a character vector. Transform it into a factor with
the factor function.
> sex <- factor(sex)
> sex
[1] male male female male female
Levels: female male
 Use the function levels to see the different levels a factor variable has.
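 As a minimal sketch of levels and a few related base R functions, applied to the same sex factor:

```r
sex <- factor(c("male", "male", "female", "male", "female"))

print(levels(sex))       # levels are sorted: "female" "male"
print(nlevels(sex))      # number of distinct levels: 2
print(as.integer(sex))   # internal codes indexing into levels(sex): 2 2 1 2 1
print(table(sex))        # number of observations per level
```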
Data structures
 Before you can perform statistical analysis in R, your data has to
be structured in some coherent way. To store your data R has the
following structures:
 Vector
 Matrix
 Array
 Data frame
 Time-series
 List
Vectors
 A vector is a sequence of data elements of the same basic type.
 Members in a vector are officially called components.
 For example, here is a vector containing three numeric values 2, 3, 5.
> c(2, 3, 5)
[1] 2 3 5
 Here is a vector of logical values.
> c(TRUE, FALSE, TRUE, FALSE, FALSE)
[1] TRUE FALSE TRUE FALSE FALSE
Combining Vectors
 Vectors can be combined via the function c.
 For example, the following combines a numeric vector n with a
character vector s. The numeric values are coerced to character
strings so that all members share one type.
> n = c(2, 3, 5)
> s = c("aa", "bb", "cc", "dd", "ee")
> c(n, s)
[1] "2" "3" "5" "aa" "bb" "cc" "dd" "ee"
Vector Arithmetics
 Arithmetic operations of vectors are performed member-by-member.
 For example, consider the following two vectors a and b.
> a = c(1, 3, 5, 7)
> b = c(1, 2, 4, 8)
 If we add a and b together, the sum is a vector whose members
are the sums of the corresponding members of a and b.
> a + b
[1]  2  5  9 15
 Similarly for subtraction, multiplication and division, we get new
vectors via member-wise operations.
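 The remaining member-wise operations can be sketched with the same vectors a and b; the scalar example on the last line is an addition for illustration.

```r
a <- c(1, 3, 5, 7)
b <- c(1, 2, 4, 8)

print(a - b)   # 0 1 1 -1
print(a * b)   # 1 6 20 56
print(a / b)   # 1.000 1.500 1.250 0.875
print(5 * a)   # a single number multiplies every member: 5 15 25 35
```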
Vector Recycling Rule
 If two vectors are of unequal length, the shorter one will be recycled in
order to match the longer vector.
 For example, the sum below is computed by recycling the values of the shorter vector u.
> u = c(10, 20, 30)
> v = c(1, 2, 3, 4, 5, 6, 7, 8, 9)
> u + v
[1] 11 22 33 14 25 36 17 28 39
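 When the longer length is not a multiple of the shorter one, R still recycles but issues a warning. A small sketch, where the vector w is an assumed example:

```r
u <- c(10, 20, 30)
w <- c(1, 2, 3, 4)                 # length 4 is not a multiple of length 3

# Capture the warning text instead of letting it print
msg <- tryCatch(u + w, warning = function(cond) conditionMessage(cond))
print(msg)

# The result is still computed; u is recycled as 10 20 30 10
total <- suppressWarnings(u + w)
print(total)                       # 11 22 33 14
```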
Vector Index
 We retrieve values in a vector by declaring an index inside a single
square bracket "[ ]" operator.
 For example,
> s = c("aa", "bb", "cc", "dd", "ee")
> s[3]
[1] "cc"
Vector Negative Index
 If the index is negative, R strips the member whose position
has the same absolute value as the negative index.
 For example,
> s = c("aa", "bb", "cc", "dd", "ee")
> s[-3]
[1] "aa" "bb" "dd" "ee"
Out-of-Range Index
 If an index is out-of-range, a missing value will be reported via the
symbol NA.
>s[10]
[1] NA
Numeric Index Vector
 A new vector can be sliced from a given vector with a numeric
index vector, which consists of member positions of the original
vector to be retrieved.
 For example,
> s = c("aa", "bb", "cc", "dd", "ee")
> s[c(2, 3)]
[1] "bb" "cc"
Vector Duplicate Indexes
 The index vector allows duplicate values. Hence the following
retrieves a member twice in one operation.
 For example,
> s = c("aa", "bb", "cc", "dd", "ee")
> s[c(2, 3, 3)]
[1] "bb" "cc" "cc"
Vector Out-of-Order Indexes
 The index vector can even be out-of-order. Here is a vector slice
with the order of first and second members reversed.
 For example,
> s = c("aa", "bb", "cc", "dd", "ee")
> s[c(2, 1, 3)]
[1] "bb" "aa" "cc"
Vector Range Index
 To produce a vector slice between two indexes, we can use the
colon operator ":".
 For example,
> s = c("aa", "bb", "cc", "dd", "ee")
> s[2:4]
[1] "bb" "cc" "dd"
Named Vector Members
 We can assign names to vector members.
 For example, the following variable v is a character string vector
with two members.
> v = c("Mary", "Sue")
> v
[1] "Mary" "Sue"
 We now name the first member as First, and the second as Last.
> names(v) = c("First", "Last")
> v
First Last
"Mary" "Sue"
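 Once members are named, they can be retrieved by name as well as by position. A minimal sketch using the same vector v:

```r
v <- c("Mary", "Sue")
names(v) <- c("First", "Last")

print(v["First"])               # retrieve a member by its name
print(v[c("Last", "First")])    # a character index vector can reorder members
print(unname(v["Last"]))        # drop the name and keep only the value
```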
Matrices
 A matrix is a collection of data elements arranged in a row-
column layout.
 A matrix can be regarded as a generalization of a vector.
 As with vectors, all the elements of a matrix must be of the same
data type.
 A matrix can be generated in several ways.
 Use the function dim
 Use the function matrix
Matrices
 Use the function dim
> x <- 1:8
> dim(x) <- c(2,4)
> x
     [,1] [,2] [,3] [,4]
[1,]    1    3    5    7
[2,]    2    4    6    8
 Use the function matrix
> A = matrix(c(2, 4, 3, 1, 5, 7), nrow=2, ncol=3, byrow=TRUE)
> A <- matrix(c(2, 4, 3, 1, 5, 7), 2, 3, byrow=TRUE) # the same, with positional arguments
> A
     [,1] [,2] [,3]
[1,]    2    4    3
[2,]    1    5    7
Accessing Matrices
 An element at the mth row, nth column of A can be accessed by the
expression A[m, n].
> A[2, 3]
[1] 7
 The entire mth row of A can be extracted as A[m, ].
> A[2, ]
[1] 1 5 7
 We can also extract more than one row or column at a time.
> A[ ,c(1,3)]
     [,1] [,2]
[1,]    2    3
[2,]    1    7
Calculations on matrices
 We construct the transpose of a matrix by interchanging its columns
and rows with the function t.
> t(A) # transpose of A
     [,1] [,2]
[1,]    2    1
[2,]    4    5
[3,]    3    7
 We can deconstruct a matrix by applying the c function, which
combines all column vectors into one.
> c(A)
[1] 2 1 4 5 3 7
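 R stores matrix elements column by column, which is why c flattens a matrix in column order. A short sketch with the matrix A defined earlier (the names flat and B are illustrative):

```r
A <- matrix(c(2, 4, 3, 1, 5, 7), nrow = 2, ncol = 3, byrow = TRUE)

print(dim(A))                  # 2 3
print(dim(t(A)))               # 3 2

flat <- c(A)                   # flattened column by column
print(flat)                    # 2 1 4 5 3 7

B <- matrix(flat, nrow = 2)    # the default byrow = FALSE fills by column
print(identical(A, B))         # TRUE: the round trip restores A
```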
Arrays
 In R, Arrays are generalizations of vectors and matrices.
 A vector is a one-dimensional array and a matrix is a two
dimensional array.
 As with vectors and matrices, all the elements of an array must be
of the same data type.
 A one-dimensional array of two elements may be constructed as
follows.
> x = array(c(TRUE,FALSE),dim=c(2))
> print(x)
[1]  TRUE FALSE
Arrays
 A three dimensional array - 3 by 3 by 3 - may
be created as follows.
> z = array(1:27,dim=c(3,3,3))
> dim(z)
[1] 3 3 3
> print(z)
, , 1
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
, , 2
     [,1] [,2] [,3]
[1,]   10   13   16
[2,]   11   14   17
[3,]   12   15   18
, , 3
     [,1] [,2] [,3]
[1,]   19   22   25
[2,]   20   23   26
[3,]   21   24   27
Accessing Arrays
 R arrays are accessed in a manner similar to arrays in other
languages: by integer index, starting at 1 (not 0).
 For example, fixing the third index at 3 selects a 3 by 3 matrix.
> z[,,3]
     [,1] [,2] [,3]
[1,]   19   22   25
[2,]   20   23   26
[3,]   21   24   27
 Specifying two of the three indices returns a one-dimensional
vector.
> z[,3,3]
[1] 25 26 27
Accessing Arrays
 Specifying all three indices returns a single element of the 3 by 3 by 3
array.
> z[3,3,3]
[1] 27
 More complex subsetting of an array is also possible.
> z[,c(2,3),c(2,3)]
, , 1
     [,1] [,2]
[1,]   13   16
[2,]   14   17
[3,]   15   18
, , 2
     [,1] [,2]
[1,]   22   25
[2,]   23   26
[3,]   24   27
Lists
 A list is a collection of R objects.
 list() creates a list. unlist() transforms a list into a vector.
 The objects in a list do not have to be of the same type or length.
> x <- c(1:4)
> y <- FALSE
> z <- matrix(c(1:4),nrow=2,ncol=2)
> myList <- list(x,y,z)
> myList
[[1]]
[1] 1 2 3 4
[[2]]
[1] FALSE
[[3]]
     [,1] [,2]
[1,]    1    3
[2,]    2    4
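 List members can be retrieved with double brackets, and unlist flattens a list into a vector. A minimal sketch, in which the member names x, y and z are an assumption added for illustration (the list on the slide is unnamed):

```r
myList <- list(x = 1:4, y = FALSE, z = matrix(1:4, nrow = 2, ncol = 2))

print(myList[[1]])        # double brackets return the member itself
print(myList$y)           # named members can also be reached with $
print(class(myList[1]))   # single brackets return a sub-list: "list"
print(unlist(myList))     # everything flattened into one named vector
```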
Data Frame
 A data frame is used for storing data in a spreadsheet-like table.
 It is a list of vectors of equal length.
 Most statistical modeling routines in R require a data frame as input.
 For example,
> weight = c(150, 135, 210, 140)
> height = c(65, 61, 70, 65)
> gender = c("Fe","Fe","Ma","Fe")
> study = data.frame(weight,height,gender) # make the data frame
> study
  weight height gender
1    150     65     Fe
2    135     61     Fe
3    210     70     Ma
4    140     65     Fe
Creating a data frame
 A data frame may be created directly using data.frame().
 For example, the data frame below is created by passing each of its
component vectors as an argument to data.frame().
> patientID <- c(1, 2, 3, 4)
> age <- c(25, 34, 28, 52)
> diabetes <- c("Type1", "Type2", "Type1", "Type1")
> status <- c("Poor", "Improved", "Excellent", "Poor")
> patientdata <- data.frame(patientID, age, diabetes, status)
> patientdata
  patientID age diabetes    status
1         1  25    Type1      Poor
2         2  34    Type2  Improved
3         3  28    Type1 Excellent
4         4  52    Type1      Poor
Accessing data frame elements
 Use subscript notation or column names to identify the elements
of the patientdata data frame.
> patientdata[1:2]
  patientID age
1         1  25
2         2  34
3         3  28
4         4  52
> patientdata[c("diabetes", "status")]
  diabetes    status
1    Type1      Poor
2    Type2  Improved
3    Type1 Excellent
4    Type1      Poor
> patientdata$age
[1] 25 34 28 52
> table(patientdata$diabetes, patientdata$status)
        Excellent Improved Poor
  Type1         1        0    2
  Type2         0        1    0
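 Rows can be selected in the same bracket notation by putting a condition before the comma. A minimal sketch that rebuilds the patientdata frame so it runs on its own; the name older is illustrative:

```r
patientdata <- data.frame(
  patientID = c(1, 2, 3, 4),
  age       = c(25, 34, 28, 52),
  diabetes  = c("Type1", "Type2", "Type1", "Type1"),
  status    = c("Poor", "Improved", "Excellent", "Poor")
)

older <- patientdata[patientdata$age > 30, ]   # rows where age exceeds 30
print(older)

print(patientdata[2, "status"])                # single cell: row 2, column status

# Number of Type1 patients
print(nrow(patientdata[patientdata$diabetes == "Type1", ]))   # 3
```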
Functions
 Most tasks are performed by calling a function in R. All R functions
have three parts:
 the body(), the code inside the function.
 the formals(), the list of arguments which controls how you
can call the function.
 the environment(), the “map” of the location of the function’s
variables.
 The general form of a function is given by:
functionname <- function(arg1, arg2, ...)
{
# body of function: a collection of valid statements
}
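 The three parts can be inspected on any function written in R. A minimal sketch, using a hypothetical function add_pair that is not part of the slides:

```r
# add_pair is a hypothetical example function
add_pair <- function(x, y = 1) {
  x + y
}

print(body(add_pair))             # the code inside the function
print(names(formals(add_pair)))   # the argument names: "x" "y"
print(environment(add_pair))      # the environment where it was defined
print(add_pair(3))                # y falls back to its default value: 4
```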
Functions
 Example 1: Creating a function, called f1, which adds a pair of numbers.
f1 <- function(x, y)
{
x+y
}
f1( 3, 4)
[1] 7
Functions
 Example 2: Creating a function, called readinteger.
readinteger <- function()
{
n <- readline(prompt="Enter an integer: ")
return(as.integer(n))
}
print(readinteger())
Enter an integer: 55
[1] 55
Functions
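 Example 3: Creating a function, called my.plot, which forwards its
arguments to plot() through "..." and sets a default plotting symbol.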
x <- rnorm(100)
y <- x + rnorm(100)
plot(x, y)
my.plot <- function(..., pch.new=15)
{
plot(..., pch=pch.new)
}
my.plot(x, y)
Control flow
 R provides a set of constructs for testing and looping.
 These allow you to control the flow of execution of a script, typically
inside a function. Common ones include:
 if, else
 switch
 for
 while
 repeat
 break
 next
 return
Simple if
 Syntax:
if (test_expression) {statement}
 Example:
x <- 5
if(x > 0)
{
print("Positive number")
}
 Output:
[1] "Positive number"
 Example:
x <- 4 == 3
if (x)
{
"4 equals 3"
}
 Output:
(nothing is printed, because x is FALSE and the body is skipped)
if...else
 Syntax:
if (test_expression)
{
statement1
}else
{
statement2
}
 Note that else must be on the
same line as the closing brace
of the if block.
 Example:
x <- -5
if(x > 0)
{
print("Non-negative number")
} else
{
print("Negative number")
}
 Output:
[1] "Negative number"
Nested if...else
 Syntax:
if ( test_expression1)
{
statement1
} else if ( test_expression2)
{
statement2
} else if ( test_expression3)
{
statement3
} else
statement4
 Only one statement will get
executed depending upon the
test_expressions.
 Example:
x <- 0
if (x < 0)
{
print("Negative number")
} else if (x > 0)
{
print("Positive number")
} else
print("Zero")
 Output:
[1] "Zero"
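 When the branches depend on a single value, switch can replace a chain of else if tests. A minimal sketch, assuming a hypothetical helper describe_sign that reproduces the example above:

```r
# describe_sign is a hypothetical helper, not part of the slides
describe_sign <- function(x) {
  # sign(x) is -1, 0 or 1; as.character() turns it into a label for switch
  switch(as.character(sign(x)),
         "-1" = "Negative number",
         "0"  = "Zero",
         "1"  = "Positive number")
}

print(describe_sign(-5))   # "Negative number"
print(describe_sign(0))    # "Zero"
print(describe_sign(12))   # "Positive number"
```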
ifelse()
 There is a vector equivalent form of the if...else statement in R, the
ifelse() function.
 Syntax:
ifelse(test_expression, x, y)
 Example:
> a = c(5,7,2,9)
> ifelse(a %% 2 == 0,"even","odd")
 Output:
[1] "odd" "odd" "even" "odd"
for
 A for loop is used to iterate over the members of a vector.
 Syntax:
for (val in sequence) {statement}
 Example:
v <- c("this", "is", "the", "R", "for", "loop")
for(i in v)
{
print(i)
}
 Output:
[1] "this"
[1] "is"
[1] "the"
[1] "R"
[1] "for"
[1] "loop"
Nested for loops
 We can use a for loop within another for loop to iterate over two things
at once (e.g., rows and columns of a matrix).
 Example:
for(i in 1:3)
{
for(j in 1:3)
{
print(paste(i,j))
}
}
 Output:
[1] "1 1"
[1] "1 2"
[1] "1 3"
[1] "2 1"
[1] "2 2"
[1] "2 3"
[1] "3 1"
[1] "3 2"
[1] "3 3"
while
 A while loop repeats a block of code as long as a specific condition remains TRUE.
 Syntax:
while (test_expression)
{
statement
}
 Example:
i <- 1
while (i < 6)
{
print(i)
i = i+1
}
 Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
repeat
 The easiest loop to master in R is repeat.
 All it does is execute the same code over and over until you tell it to
stop.
 Syntax:
repeat {statement}
 Example:
x <- 1
repeat {
print(x)
x = x+1
if (x == 6){
break
}
}
 Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
break
 A break statement is used inside a loop to stop the iterations and
transfer control to the first statement after the loop.
 Example:
x <- 1:5
for (val in x) {
if (val == 3){
break
}
print(val)
}
 Output:
[1] 1
[1] 2
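 The related next statement skips only the current iteration and continues with the following one. A minimal sketch on the same vector; the vector kept is added only to record which values were printed:

```r
x <- 1:5
kept <- c()
for (val in x) {
  if (val == 3) {
    next              # skip the rest of this iteration only
  }
  kept <- c(kept, val)
  print(val)          # prints 1, 2, 4 and 5
}
```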
Replication
 The rep() function repeats its input several times.
 A related function, replicate(), evaluates an expression several times.
 rep repeats the same random number several times, but replicate
gives a different number each time.
 Example:
>rep(runif(1), 5)
[1] 0.04573 0.04573 0.04573 0.04573 0.04573
>replicate(5, runif(1))
[1] 0.5839 0.3689 0.1601 0.9176 0.5388
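 The difference can be checked by counting distinct values. A minimal sketch; the seed value 42 is arbitrary and only makes the run reproducible:

```r
set.seed(42)                        # arbitrary seed, for reproducibility only

same   <- rep(runif(1), 5)          # runif(1) is evaluated once, then copied
varied <- replicate(5, runif(1))    # runif(1) is evaluated five times

print(length(unique(same)))         # 1: all five values are identical
print(length(unique(varied)))       # number of distinct draws
```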
Packages
 Packages are collections of R functions, compiled code, data,
documentation, and tests, in a well-defined format.
 The directory where packages are stored is called the library.
 R comes with a standard set of packages.
 Others are available for download and installation.
 Once installed, they have to be loaded into the session to be used.
>.libPaths() # get library location
>library() # see all packages installed
>search() # see packages currently loaded
Adding Packages
 You can expand the types of analyses you do by adding other packages.
 To add a package, download and install it.
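 From the console, a package can be installed with install.packages(). A minimal sketch, assuming a hypothetical helper install_if_missing that installs only when the package is absent; it is called here with "stats", which ships with R, so nothing is downloaded:

```r
# install_if_missing is a hypothetical helper, not a base R function
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # downloads the package from the selected CRAN mirror
  }
  # report whether the package is available afterwards
  invisible(requireNamespace(pkg, quietly = TRUE))
}

available <- install_if_missing("stats")
print(available)            # TRUE
```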
Loading Packages
 To load a package that is already installed on your machine, call the
library function with the name of the package you want to load.
 For example, the lattice package should be installed, but it won’t
automatically be loaded. We can load it with library() or require().
> library(lattice)
 Similarly, for a package named eda:
> library(eda) # load package "eda"
> require(eda) # the same
> library() # list all available packages
> library(lib.loc = .Library) # list all packages in the default library
> library(help = eda) # documentation on package "eda"
Importing and Exporting Data
There are many ways to get data in and out.
Most programs (e.g. Excel), as well as humans, know how to deal
with rectangular tables in the form of tab-delimited text files.
Normally, you would start your R session by reading in some data to
be analysed. This can be done with the read.table function. Download
the sample data to your local directory...
> x <- read.table("sample.txt", header = TRUE)
Also: read.delim, read.csv, scan
> write.csv(x, file = "samplenew.csv")
Also: write.matrix, write.table, write
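 A write-then-read round trip can be sketched without any sample file. The data frame and the temporary file path below are assumptions made so the example runs on its own:

```r
study <- data.frame(weight = c(150, 135, 210, 140),
                    height = c(65, 61, 70, 65))

path <- tempfile(fileext = ".csv")                  # temporary file for the sketch
write.csv(study, file = path, row.names = FALSE)    # export without row names

restored <- read.csv(path, header = TRUE)           # import the same file
print(restored)
print(all(restored == study))                       # TRUE: the values survive the round trip

unlink(path)                                        # remove the temporary file
```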
Frequently used Operators
<-    Assign
+     Sum
-     Difference
*     Multiplication
/     Division
^     Exponent
%%    Modulus (remainder)
%*%   Matrix product (dot product for two vectors)
%/%   Integer division
%in%  Membership test
|     Or
&     And
<     Less than
>     Greater than
<=    Less than or equal to
>=    Greater than or equal to
!     Not
!=    Not equal
==    Is equal
Frequently used Functions
c                  Concatenate
cbind, rbind       Combine vectors by column or by row
min                Minimum
max                Maximum
length             Number of values
dim                Number of rows and columns
floor              Largest integer not greater than the value
which              Indices of TRUE values
table              Counts
summary            Generic statistics
sort, order, rank  Sort, order, rank a vector
print              Show value
cat                Print as character
paste              Combine values into character strings
round              Round
apply              Repeat over rows or columns
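 As a minimal sketch of apply, the second argument selects whether the function runs once per row (1) or once per column (2). The matrix is the one used in the Matrices slides; row_sums and col_max are illustrative names:

```r
A <- matrix(c(2, 4, 3, 1, 5, 7), nrow = 2, ncol = 3, byrow = TRUE)

row_sums <- apply(A, 1, sum)   # MARGIN = 1: once per row
col_max  <- apply(A, 2, max)   # MARGIN = 2: once per column

print(row_sums)                # 9 13
print(col_max)                 # 2 5 7
```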
Statistical Functions
rnorm, dnorm, pnorm, qnorm   Normal distribution: random sample, density, cdf and quantiles
lm, glm, anova               Model fitting
loess, lowess                Smooth curve fitting
sample                       Resampling (bootstrap, permutation)
.Random.seed                 Random number generation
mean, median                 Location statistics
var, cor, cov, mad, range    Scale statistics
svd, qr, chol, eigen         Linear algebra
Graphical Functions
plot             Generic plot, e.g. scatter plot
points           Add points
lines, abline    Add lines
text, mtext      Add text
legend           Add a legend
axis             Add axes
box              Add box around all axes
par              Plotting parameters (lots!)
colors, palette  Use colors
Thank you!
Queries ???

R Basics

  • 1. Dr. E. N. SATHISHKUMAR, Guest Lecturer, Department of Computer Science, Periyar University, Salem -11.
  • 2. Introduction  R (the language) was created in the early 1990s by Ross Ihaka and Robert Gentleman.  It is based upon the S language that was developed at Bell Laboratories in the 1970s.  It is a high-level language, like C# and Java.  R is an interpreted language (sometimes called a scripting language), which means that your code doesn’t need to be compiled before you run it.  R supports a mixture of programming paradigms: at its core it is an imperative language, but it also supports object-oriented and functional programming.
  • 3. Getting started  Where to get R? The newest version of R and its documentation can be downloaded from http://www.r-project.org.  Download, Packages: Select CRAN  Set your Mirror: India (Indian Institute of Technology Madras) Select http://ftp.iitm.ac.in/cran/  Select Download R for Windows  Select base.  Select Download R 3.4.2 for Windows  Execute the R-3.4.2-win.exe with administrator privileges. Once the program is installed, run the R program by clicking on its icon
  • 4. Choosing an IDE  If you use R under Windows or Mac OS X, then a graphical user interface (GUI) is available to you.  Some of the best GUIs are:  Emacs + ESS  Eclipse/Architect  RStudio  Revolution-R  Live-R  Tinn-R
  • 5. A Scientific Calculator  R is at heart a supercharged scientific calculator, so you can type commands directly into the R Console. > 5+5 [1] 10 > 4-7 [1] -3 > 7*3 [1] 21 > 16/31 [1] 0.516129 > log2(32) [1] 5
  • 6. Variable Assignment  We assign values to variables with the assignment operator "=".  Typing the variable name by itself at the prompt prints its value; an assignment on its own prints nothing.  Another form of the assignment operator, "<-", is also in common use. > X = 2 > X [1] 2 > X <- 5 > X [1] 5 > X * X [1] 25
  • 7. Comments  All text after the pound sign "#" within the same line is considered a comment. > X = 2 # this is a comment > X [1] 2 > # 5 is assigned to variable X > X <- 5 > X [1] 5
  • 8. Getting Help  R provides extensive documentation. To get help on a particular topic, call help() with the topic name.  For example,  > help("if")  starting httpd help server ... done  The help content then opens in the web browser.
  • 9. Basic Data Types  There are several basic R data types that are of frequent occurrence in routine R calculations.  Numeric  Integer  Complex  Logical  Character  Factor
  • 10. Numeric  Decimal values are called numerics in R. It is the default computational data type.  If we assign a decimal value to a variable x as follows, x will be of numeric type. > x = 10.5 # assign a decimal value > x # print the value of x [1] 10.5 > class(x) # print the class name of x [1] "numeric"
  • 11. Numeric  Furthermore, even if we assign an integer to a variable k, it is still being saved as a numeric value. > k = 1 > k # print the value of k [1] 1 > class(k) # print the class name of k [1] "numeric"  The fact that k is not an integer can be confirmed with the is.integer function. > is.integer(k) # is k an integer? [1] FALSE
  • 12. Integer  In order to create an integer variable in R, we invoke the as.integer function.  For example, > y = as.integer(3) > y # print the value of y [1] 3 > class(y) # print the class name of y [1] "integer" > is.integer(y) # is y an integer? [1] TRUE
  • 13. Complex  A complex value in R is defined via the pure imaginary value i.  For example, > z = 1 + 2i # create a complex number > z # print the value of z [1] 1+2i > class(z) # print the class name of z [1] "complex"  The following does not return a complex result, because -1 is a numeric value rather than a complex one; R returns NaN with a warning. > sqrt(-1) # square root of -1 [1] NaN  Warning message: In sqrt(-1) : NaNs produced
  • 14. Complex  Instead, we have to use the complex value -1 + 0i.  For example, > sqrt(-1+0i) # square root of -1+0i [1] 0+1i  An alternative is to coerce -1 into a complex value. > sqrt(as.complex(-1)) [1] 0+1i
  • 15. Logical  A logical value is often created via comparison between variables.  For example, > x = 1; y = 2 # sample values > z = x > y # is x larger than y? > z # print the logical value [1] FALSE > class(z) # print the class name of z [1] "logical"
  • 16. Logical  The standard logical operations are "&" (and), "|" (or) and "!" (negation).  For example, > u = TRUE; v = FALSE > u & v # u AND v [1] FALSE > u | v # u OR v [1] TRUE > !u # negation of u [1] FALSE
  • 17. Character  A character object is used to represent string values in R. We convert objects into character values with the as.character() function. For example, > x = as.character(3.14) > x # print the character string [1] "3.14" > class(x) # print the class name of x [1] "character" > x = as.character("hai") > x # print the character string [1] "hai" > class(x) # print the class name of x [1] "character"
  • 18. Factor  The factor data type is used to represent categorical data. (i.e. data of which the value range is a collection of codes).  For example, to create a vector of length five of type factor do the following: >sex <- c("male","male","female","male","female")  The object sex is a character object. You need to transform it to factor. >sex <- factor(sex) >sex [1] male male female male female Levels: female male  Use the function levels to see the different levels a factor variable has.
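The Factor slide mentions levels() without showing it. A minimal sketch, reusing the sex vector from the slide; nlevels(), table() and as.integer() are standard base R functions added here for illustration:

```r
# Same sample data as on the Factor slide
sex <- factor(c("male", "male", "female", "male", "female"))

levels(sex)      # the distinct categories, sorted alphabetically
nlevels(sex)     # how many categories there are
table(sex)       # how often each category occurs
as.integer(sex)  # the integer codes stored underneath the labels
```

Because the levels are sorted alphabetically, "female" receives code 1 and "male" code 2.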
  • 19. Data structures  Before you can perform statistical analysis in R, your data has to be structured in some coherent way. To store your data R has the following structures:  Vector  Matrix  Array  Data frame  Time-series  List
  • 20. Vectors  A vector is a sequence of data elements of the same basic type.  Members in a vector are officially called components.  For example, here is a vector containing three numeric values 2, 3, 5. > c(2, 3, 5) [1] 2 3 5  Here is a vector of logical values. > c(TRUE, FALSE, TRUE, FALSE, FALSE) [1] TRUE FALSE TRUE FALSE FALSE
  • 21. Combining Vectors  Vectors can be combined via the function c.  For example, the following two vectors n and s are combined into a new vector containing the elements of both. Because a vector holds a single type, the numeric values are coerced to character strings. > n = c(2, 3, 5) > s = c("aa", "bb", "cc", "dd", "ee") > c(n, s) [1] "2" "3" "5" "aa" "bb" "cc" "dd" "ee"
  • 22. Vector Arithmetics  Arithmetic operations of vectors are performed member-by-member.  For example, suppose we have two vectors a and b. > a = c(1, 3, 5, 7) > b = c(1, 2, 4, 8)  If we add a and b together, the sum is a vector whose members are the sums of the corresponding members of a and b. > a + b [1] 2 5 9 15  Similarly for subtraction, multiplication and division, we get new vectors via member-wise operations.
  • 23. Vector Recycling Rule  If two vectors are of unequal length, the shorter one will be recycled in order to match the longer vector.  For example, sum is computed by recycling values of the shorter vector. > u = c(10, 20, 30) > v = c(1, 2, 3, 4, 5, 6, 7, 8, 9) > u + v [1] 11 22 33 14 25 36 17 28 39
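The slide's example uses lengths 3 and 9, where recycling fits exactly. A short sketch of what happens when the longer length is not a multiple of the shorter one; the vector w is a hypothetical addition, and the handler only serves to show the warning text:

```r
u <- c(10, 20, 30)
w <- c(1, 2, 3, 4)   # hypothetical vector whose length is not a multiple of 3

# R still recycles u, but signals a warning; we report it and carry on
result <- withCallingHandlers(
  u + w,
  warning = function(cond) {
    message("R warned: ", conditionMessage(cond))
    invokeRestart("muffleWarning")
  }
)
result  # 11 22 33 14: the fourth element reuses u[1]
```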
  • 24. Vector Index  We retrieve values in a vector by declaring an index inside a single square bracket "[ ]" operator.  For example, > s = c("aa", "bb", "cc", "dd", "ee") > s[3] [1] "cc"
  • 25. Vector Negative Index  If the index is negative, it strips the member whose position has the same absolute value as the negative index.  For example, > s = c("aa", "bb", "cc", "dd", "ee") > s[-3] [1] "aa" "bb" "dd" "ee"  Out-of-Range Index: if an index is out-of-range, a missing value is reported via the symbol NA. > s[10] [1] NA
  • 26. Numeric Index Vector  A new vector can be sliced from a given vector with a numeric index vector, which consists of member positions of the original vector to be retrieved.  For example, > s = c("aa", "bb", "cc", "dd", "ee") > s[c(2, 3)] [1] "bb" "cc"
  • 27. Vector Duplicate Indexes  The index vector allows duplicate values. Hence the following retrieves a member twice in one operation.  For example, > s = c("aa", "bb", "cc", "dd", "ee") > s[c(2, 3, 3)] [1] "bb" "cc" "cc"
  • 28. Vector Out-of-Order Indexes  The index vector can even be out-of-order. Here is a vector slice with the order of first and second members reversed.  For example, > s = c("aa", "bb", "cc", "dd", "ee") > s[c(2, 1, 3)] [1] "bb" "aa" "cc"
  • 29. Vector Range Index  To produce a vector slice between two indexes, we can use the colon operator ":".  For example, > s = c("aa", "bb", "cc", "dd", "ee") > s[2:4] [1] "bb" "cc" "dd"
  • 30. Named Vector Members  We can assign names to vector members.  For example, the following variable v is a character string vector with two members. > v = c("Mary", "Sue") > v [1] "Mary" "Sue"  We now name the first member as First, and the second as Last. > names(v) = c("First", "Last") > v First Last "Mary" "Sue"
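Once members are named, they can also be retrieved by name. A brief sketch using the same vector v as above; unname() is a base R helper added here for illustration:

```r
v <- c("Mary", "Sue")
names(v) <- c("First", "Last")

v["First"]             # look up one member by its name
v[c("Last", "First")]  # reorder with a character index vector
unname(v["Last"])      # drop the name and keep only the value
```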
  • 32. Matrices  A matrix is a collection of data elements arranged in a row-column layout.  A matrix can be regarded as a generalization of a vector.  As with vectors, all the elements of a matrix must be of the same data type.  A matrix can be generated in several ways.  Use the function dim  Use the function matrix
  • 33. Matrices  Use the function dim > x <- 1:8 > dim(x) <- c(2,4) > x [,1] [,2] [,3] [,4] [1,] 1 3 5 7 [2,] 2 4 6 8  Use the function matrix > A = matrix(c(2, 4, 3, 1, 5, 7), nrow=2, ncol=3, byrow = TRUE) > A [,1] [,2] [,3] [1,] 2 4 3 [2,] 1 5 7  The argument names can be omitted when the values are given in order. > A <- matrix(c(2, 4, 3, 1, 5, 7), 2, 3, byrow=TRUE)
  • 34. Accessing Matrices  An element at the mth row, nth column of A can be accessed by the expression A[m, n]. > A[2, 3] [1] 7  The entire mth row of A can be extracted as A[m, ]. > A[2, ] [1] 1 5 7  We can also extract more than one row or column at a time. > A[ ,c(1,3)] [,1] [,2] [1,] 2 3 [2,] 1 7
  • 35. Calculations on matrices  We construct the transpose of a matrix by interchanging its columns and rows with the function t . > t(A) # transpose of A [,1] [,2] [1,] 2 1 [2,] 4 5 [3,] 3 7  We can deconstruct a matrix by applying the c function, which combines all column vectors into one, column by column. > c(A) [1] 2 1 4 5 3 7
  • 36. Arrays  In R, arrays are generalizations of vectors and matrices.  A vector is a one-dimensional array and a matrix is a two-dimensional array.  As with vectors and matrices, all the elements of an array must be of the same data type.  A one-dimensional array of two elements may be constructed as follows. > x = array(c(T,F),dim=c(2)) > print(x) [1] TRUE FALSE
  • 37. Arrays  A three dimensional array - 3 by 3 by 3 - may be created as follows. > z = array(1:27,dim=c(3,3,3)) > dim(z) [1] 3 3 3 > print(z) , , 1 [,1] [,2] [,3] [1,] 1 4 7 [2,] 2 5 8 [3,] 3 6 9 , , 2 [,1] [,2] [,3] [1,] 10 13 16 [2,] 11 14 17 [3,] 12 15 18 , , 3 [,1] [,2] [,3] [1,] 19 22 25 [2,] 20 23 26 [3,] 21 24 27
  • 38. Accessing Arrays  R arrays are accessed in a manner similar to arrays in other languages: by integer index, starting at 1 (not 0).  For example, fixing the third index at 3 returns a 3 by 3 matrix. > z[,,3] [,1] [,2] [,3] [1,] 19 22 25 [2,] 20 23 26 [3,] 21 24 27  Specifying two of the three indices returns a one-dimensional vector. > z[,3,3] [1] 25 26 27
  • 39. Accessing Arrays  Specifying all three indices returns a single element of the 3 by 3 by 3 array. > z[3,3,3] [1] 27  Index vectors allow more complex partitioning of an array. > z[,c(2,3),c(2,3)] , , 1 [,1] [,2] [1,] 13 16 [2,] 14 17 [3,] 15 18 , , 2 [,1] [,2] [1,] 22 25 [2,] 23 26 [3,] 24 27
  • 40. Lists  A list is a collection of R objects.  list() creates a list. unlist() transforms a list into a vector.  The objects in a list do not have to be of the same type or length. > x <- c(1:4) > y <- FALSE > z <- matrix(c(1:4),nrow=2,ncol=2) > myList <- list(x,y,z) > myList [[1]] [1] 1 2 3 4 [[2]] [1] FALSE [[3]] [,1] [,2] [1,] 1 3 [2,] 2 4
  • 41. Data Frame  A data frame is used for storing data in a spreadsheet-like table.  It is a list of vectors of equal length.  Most statistical modeling routines in R require a data frame as input.  For example, > weight = c(150, 135, 210, 140) > height = c(65, 61, 70, 65) > gender = c("Fe","Fe","Ma","Fe") > study = data.frame(weight,height,gender) # make the data frame > study weight height gender 1 150 65 Fe 2 135 61 Fe 3 210 70 Ma 4 140 65 Fe
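The Lists slide builds a list but does not show how to get components back out. A minimal sketch, assuming named components (the names numbers, flag and grid are hypothetical, chosen only for this illustration):

```r
# Component names are hypothetical, added so elements can be accessed by name
myList <- list(numbers = 1:4, flag = FALSE, grid = matrix(1:4, nrow = 2))

myList[["numbers"]]    # double brackets return the component itself
myList$flag            # $ is shorthand for [[ ]] with a name
class(myList["grid"])  # single brackets return a sub-list, so this is "list"

flat <- unlist(myList) # flatten all components into one vector
length(flat)           # 4 + 1 + 4 = 9 elements
```

Note that unlist() coerces everything to a common type, so the logical FALSE becomes the integer 0 here.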
  • 42. Creating a data frame  The dataframe may be created directly using data.frame().  For example, the dataframe is created - naming each vector composing the dataframe as part of the argument list. > patientID <- c(1, 2, 3, 4) > age <- c(25, 34, 28, 52) > diabetes <- c("Type1", "Type2", "Type1", "Type1") > status <- c("Poor", "Improved", "Excellent", "Poor") > patientdata <- data.frame(patientID, age, diabetes, status) > patientdata patientID age diabetes status 1 1 25 Type1 Poor 2 2 34 Type2 Improved 3 3 28 Type1 Excellent 4 4 52 Type1 Poor
  • 43. Accessing data frame elements  Use the subscript notation or specify column names to identify the elements in the patient data frame. >patientdata$age [1] 25 34 28 52 >patientdata[1:2] patientID age 1 1 25 2 2 34 3 3 28 4 4 52 >table(patientdata$diabetes, patientdata$status) Excellent Improved Poor Type1 1 0 2 Type2 0 1 0 >patientdata[c("diabetes", "status")] diabetes status 1 Type1 Poor 2 Type2 Improved 3 Type1 Excellent 4 Type1 Poor
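Beyond selecting whole columns, rows can be selected by position or by a logical condition. A short sketch on the same patient data; the age threshold of 30 is an arbitrary value picked for illustration:

```r
patientdata <- data.frame(
  patientID = c(1, 2, 3, 4),
  age       = c(25, 34, 28, 52),
  diabetes  = c("Type1", "Type2", "Type1", "Type1"),
  status    = c("Poor", "Improved", "Excellent", "Poor")
)

patientdata$age                      # one column as a plain vector
patientdata[2, "status"]             # a single cell: row 2, column "status"
patientdata[patientdata$age > 30, ]  # all rows where age exceeds 30
nrow(patientdata[patientdata$diabetes == "Type1", ])  # number of Type1 patients
```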
  • 44. Functions  Most tasks are performed by calling a function in R. All R functions have three parts:  the body(), the code inside the function.  the formals(), the list of arguments which controls how you can call the function.  the environment(), the “map” of the location of the function’s variables.  The general form of a function is given by: functionname <- function(arg1, arg2,...) { Body of function: a collection of valid statements }
  • 45. Functions  Example 1: Creating a function, called f1, which adds a pair of numbers. f1 <- function(x, y) { x+y } f1( 3, 4) [1] 7
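The three parts of a function named earlier (formals, body, environment) can be inspected directly. A small sketch using the f1 function from Example 1:

```r
f1 <- function(x, y) {
  x + y
}

names(formals(f1))  # the argument names
body(f1)            # the code inside the braces
environment(f1)     # where f1 was defined (the global environment at the prompt)
f1(3, 4)            # calling the function
```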
  • 46. Functions  Example 2: Creating a function, called readinteger. readinteger <- function() { n <- readline(prompt="Enter an integer: ") return(as.integer(n)) } print(readinteger()) Enter an integer: 55 [1] 55
  • 47. Functions Example 3: Passing extra arguments on to another function with "..." x <- rnorm(100) y <- x + rnorm(100) plot(x, y) my.plot <- function(..., pch.new=15) { plot(..., pch=pch.new) } my.plot(x, y)
  • 48. Control flow  A list of constructions to perform testing and looping in R.  These allow you to control the flow of execution of a script typically inside of a function. Common ones include:  if, else  switch  for  while  repeat  break  next  return
  • 49. Simple if  Syntax: if (test_expression) {statement}  Example: x <- 5 if(x > 0) { print("Positive number") }  Output: [1] "Positive number"  Example: x <- 4 == 3 if (x) { "4 equals 3" }  Output: nothing is printed, because x is FALSE and there is no else branch.
  • 50. if...else  Syntax: if (test_expression) { statement1 } else { statement2 }  Note that else must be on the same line as the closing brace of the if block.  Example: x <- -5 if(x > 0) { print("Positive number") } else { print("Negative number") }  Output: [1] "Negative number"
  • 51. Nested if...else  Syntax: if ( test_expression1) { statement1 } else if ( test_expression2) { statement2 } else if ( test_expression3) { statement3 } else statement4  Only one statement will get executed depending upon the test_expressions.  Example: x <- 0 if (x < 0) { print("Negative number") } else if (x > 0) { print("Positive number") } else print("Zero")  Output: [1] "Zero"
  • 52. ifelse()  There is a vector equivalent form of the if...else statement in R, the ifelse() function.  Syntax: ifelse(test_expression, x, y)  Example: > a = c(5,7,2,9) > ifelse(a %% 2 == 0,"even","odd")  Output: [1] "odd" "odd" "even" "odd"
  • 53. for  A for loop is used to iterate over a vector in R programming.  Syntax: for (val in sequence) {statement}  Example: v <- c("this", "is", "the", "R", "for", "loop") for(i in v) { print(i) }  Output: [1] "this" [1] "is" [1] "the" [1] "R" [1] "for" [1] "loop"
  • 54. Nested for loops  We can use a for loop within another for loop to iterate over two things at once (e.g., rows and columns of a matrix).  Example: for(i in 1:3) { for(j in 1:3) { print(paste(i,j)) } }  Output: [1] "1 1" [1] "1 2" [1] "1 3" [1] "2 1" [1] "2 2" [1] "2 3" [1] "3 1" [1] "3 2" [1] "3 3"
  • 55. while  while loops repeat a block of code as long as a specific condition remains true.  Syntax: while (test_expression) { statement }  Example: i <- 1 while (i < 6) { print(i) i = i+1 }  Output: [1] 1 [1] 2 [1] 3 [1] 4 [1] 5
  • 56. repeat  The easiest loop to master in R is repeat.  All it does is execute the same code over and over until you tell it to stop.  Syntax: repeat {statement}  Example: x <- 1 repeat { print(x) x = x+1 if (x == 6){ break } }  Output: [1] 1 [1] 2 [1] 3 [1] 4 [1] 5
  • 57. break  A break statement is used inside a loop to stop the iterations and transfer control to the first statement after the loop.  Example: x <- 1:5 for (val in x) { if (val == 3){ break } print(val) }  Output: [1] 1 [1] 2
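The control-flow list also names next and switch, which have no slide of their own. A minimal sketch of both; the helper describe() and its labels are hypothetical, made up for this illustration:

```r
# next skips the rest of the current iteration and continues with the following one
kept <- integer(0)
for (val in 1:5) {
  if (val == 3) {
    next
  }
  kept <- c(kept, val)
}
kept  # every value except 3

# switch selects one branch by name; the final unnamed value acts as the default
describe <- function(kind) {   # hypothetical helper
  switch(kind,
         num = "a number",
         chr = "a string",
         "something else")
}
describe("num")
describe("lgl")
```

Compared with break, which leaves the loop entirely, next only abandons the current pass.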
  • 58. Replication  The rep() repeats its input several times.  Another related function, replicate() calls an expression several times.  rep will repeat the same random number several times, but replicate gives a different number each time  Example: >rep(runif(1), 5) [1] 0.04573 0.04573 0.04573 0.04573 0.04573 >replicate(5, runif(1)) [1] 0.5839 0.3689 0.1601 0.9176 0.5388
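The difference comes from when runif(1) is evaluated. A reproducible sketch; the seed value 42 is arbitrary and only makes repeated runs give the same draws:

```r
set.seed(42)                          # arbitrary seed, for repeatability only
same <- rep(runif(1), 5)              # runif(1) is evaluated once, then copied
different <- replicate(5, runif(1))   # runif(1) is evaluated five separate times

length(unique(same))       # a single distinct value
length(unique(different))  # normally five distinct values

rep(1:2, times = 3)  # repeat the whole vector: 1 2 1 2 1 2
rep(1:2, each = 3)   # repeat each element: 1 1 1 2 2 2
```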
  • 59. Packages  Packages are collections of R functions, compiled code, data, documentation, and tests, in a well-defined format.  The directory where packages are stored is called the library.  R comes with a standard set of packages.  Others are available for download and installation.  Once installed, they have to be loaded into the session to be used. >.libPaths() # get library location >library() # see all packages installed >search() # see packages currently loaded
  • 60. Adding Packages  You can expand the types of analyses you do by adding other packages.  To add a package, download and install it, for example with install.packages().
  • 61. Loading Packages  To load a package that is already installed on your machine, call the library function with the name of the package you want to load.  For example, the lattice package should already be installed, but it won’t automatically be loaded. We can load it with library() or require(). >library(lattice)  Related calls: >library(eda) # load package "eda" >require(eda) # the same >library() # list all available packages >library(lib.loc = .Library) # list all packages in the default library >library(help = eda) # documentation on package "eda"
  • 62. Importing and Exporting Data There are many ways to get data in and out of R. Most programs (e.g. Excel), as well as humans, know how to deal with rectangular tables in the form of tab-delimited text files. Normally, you would start your R session by reading in some data to be analysed. This can be done with the read.table function. Download the sample data to your local directory... >x <- read.table("sample.txt", header = TRUE) Also: read.delim, read.csv, scan >write.csv(x, file = "samplenew.csv") Also: write.matrix, write.table, write
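A self-contained round trip can be tried without the sample file. This sketch writes a small made-up table to a temporary file and reads it back; the column names id and score are hypothetical:

```r
# A small hypothetical table, written to a temporary file so nothing is left behind
x <- data.frame(id = 1:3, score = c(9.5, 7.25, 8))
path <- tempfile(fileext = ".csv")

write.csv(x, file = path, row.names = FALSE)  # omit the row-number column
y <- read.csv(path)                           # read.csv assumes a header row by default

all.equal(x, y)  # checks that the round trip preserved the data
unlink(path)     # remove the temporary file
```

Setting row.names = FALSE matters here: otherwise write.csv adds a column of row numbers that would come back as an extra variable.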
  • 63. Frequently used Operators <- Assign; + Sum; - Difference; * Multiplication; / Division; ^ Exponent; %% Modulus (remainder); %*% Matrix product; %/% Integer division; %in% Membership test; | Or; & And; < Less than; > Greater than; <= Less than or equal; >= Greater than or equal; ! Not; != Not equal; == Is equal
  • 64. Frequently used Functions c Concatenate; cbind, rbind Combine vectors by column or by row; min Minimum; max Maximum; length Number of values; dim Number of rows and columns; floor Largest integer not greater than the value; which Indices of TRUE values; table Counts; summary Generic statistics; sort, order, rank Sort, order, rank a vector; print Show value; cat Print as characters; paste Concatenate as character strings; round Round; apply Repeat over rows or columns
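A few of the less familiar operators in action. This is an illustrative sketch; the matrix A is a made-up 2 by 2 example:

```r
7 %% 3                   # remainder after division
7 %/% 3                  # integer division
2 ^ 3                    # exponent
c(2, 9) %in% c(1, 2, 3)  # is each left-hand value present on the right?

A <- matrix(1:4, nrow = 2)  # hypothetical matrix with columns (1, 2) and (3, 4)
A * A                       # element-wise product
A %*% A                     # matrix product
```

The last two lines show why * and %*% should not be confused: the first multiplies matching elements, the second performs row-by-column matrix multiplication.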
  • 65. Statistical Functions rnorm, dnorm, pnorm, qnorm Normal distribution: random sample, density, cdf and quantiles; lm, glm, anova Model fitting; loess, lowess Smooth curve fitting; sample Resampling (bootstrap, permutation); .Random.seed Random number generation; mean, median Location statistics; var, cor, cov, mad, range Scale statistics; svd, qr, chol, eigen Linear algebra
  • 66. Graphical Functions plot Generic plot, e.g. scatter; points Add points; lines, abline Add lines; text, mtext Add text; legend Add a legend; axis Add axes; box Add box around all axes; par Plotting parameters (lots!); colors, palette Use colors