SlideShare ist ein Scribd-Unternehmen logo
1 von 32
R- An Introduction and
Application
2
Outline

Why R, and R Paradigm

R Overview

R Interface

R Workspace

Help

R Packages

Input/Output

Reusing Results

References, Tutorials and links
3
Why R?

It's free!

It runs on a variety of platforms including Windows, Unix and
MacOS.

It provides an unparalleled platform for programming new
statistical methods in an easy and straightforward manner

It contains advanced statistical routines not yet available in
other packages.

It has state-of-the-art graphics capabilities.
4
How to download?
 Google it using R or CRAN
(Comprehensive R Archive Network)
 http://www.r-project.org
R-Interface
5
6
Basic functions
>In import file section of the Environment
Import the file from the desktop
>
7
8
An example
> # An example
> x <- c(1:10)
> x[(x>8) | (x<5)]
> # yields 1 2 3 4 9 10
> # How it works
> x <- c(1:10)
> X
>1 2 3 4 5 6 7 8 9 10
> x > 8
> F F F F F F F F T T
> x < 5
> T T T T F F F F F F
> x > 8 | x < 5
> T T T T F F F F T T
> x[c(T,T,T,T,F,F,F,F,T,T)]
> 1 2 3 4 9 10
9
R Warning !

R is a case sensitive language

FOO, Foo, and foo are three
different objects
10
R Help
 Once R is installed, there is a comprehensive
built-in help system and we can use any of the
following:
 help.start() # general help
help(foo) # help about function foo
?foo # same thing
apropos("foo") # list all function containing string foo
example(foo) # show an example of function foo
11
R Datasets
R comes with a number of sample datasets that
you can experiment with. Type
> data( )
to see the available datasets. The results will
depend on which packages you have loaded.
Type
help(datasetname)
for details on a sample dataset.
12
R Packages
 When you download R, already a number (around 30)
of packages are downloaded as well..
> search()
[1] ".GlobalEnv" "package:stats" "package:graphics"
[4] "package:grDevices" "package:datasets" "package:utils"
[7] "package:methods" "Autoloads" "package:base"
Data Manipulation
14
Creating new variables

Use the assignment operator <- to create new variables. A
wide array of operators and functions are available here.

# Three examples for doing the same computations
mydata$sum <- mydata$x1 + mydata$x2
mydata$mean <- (mydata$x1 + mydata$x2)/2
attach(mydata)
mydata$sum <- x1 + x2
mydata$mean <- (x1 + x2)/2
detach(mydata)

mydata <- transform( mydata,
sum = x1 + x2,
mean = (x1 + x2)/2
)
15
Creating new variables
Recoding variables

# create 2 age categories
mydata$agecat <- ifelse(mydata$age > 70,
c("older"), c("younger"))

# another example: create 3 age categories
attach(mydata)
mydata$agecat[age > 75] <- "Elder"
mydata$agecat[age > 45 & age <= 75] <- "Middle Aged"
mydata$agecat[age <= 45] <- "Young"
detach(mydata)
Applied Statistical Computing and
Graphics 16
Arithmetic Operators
Operator Description
+ addition
- subtraction
* multiplication
/ division
^ or ** exponentiation
17
Logical Operators
Operator Description
< less than
<= less than or equal to
> greater than
>= greater than or equal to
== exactly equal to
!= not equal to
!x Not x
x | y x OR y
x & y x AND y
isTRUE(x) test if x is TRUE
18
Sorting

To sort a dataframe in R, use the order( ) function. By
default, sorting is ASCENDING. Prepend the sorting variable
by a minus sign to indicate DESCENDING order. Here are
some examples.

# sorting examples using the mtcars dataset
data(mtcars)
# sort by mpg
newdata = mtcars[order(mtcars$mpg),]
# sort by mpg and cyl
newdata <- mtcars[order(mtcars$mpg, mtcars$cyl),]
#sort by mpg (ascending) and cyl (descending)
newdata <- mtcars[order(mtcars$mpg, -mtcars$cyl),]
19
Merging
To merge two dataframes (datasets) horizontally, use
the merge function. In most cases, you join two
dataframes by one or more common key variables
(i.e., an inner join).
# merge two dataframes by ID
total <- merge(dataframeA,dataframeB,by="ID")
# merge two dataframes by ID and Country
total <-
merge(dataframeA,dataframeB,by=c("ID","Country
"))
20
Merging
ADDING ROWS
To join two dataframes (datasets) vertically, use the rbind
function. The two dataframes must have the same
variables, but they do not have to be in the same order.
total <- rbind(dataframeA, dataframeB)
If dataframeA has variables that dataframeB does not, then either:
Delete the extra variables in dataframeA or
Create the additional variables in dataframeB and set them to NA
(missing)
before joining them with rbind.
Graphs
22
Creating a Graph

In R, graphs are typically created interactively.
# Creating a Graph
attach(mtcars)
plot(wt, mpg)
abline(lm(mpg~wt))
title("Regression of MPG on Weight")

The plot( ) function opens a graph window and
plots weight vs. miles per gallon.
The next line of code adds a regression line to this
graph. The final line adds a title.
Statistics
24
T test

The t.test( ) function produces a variety of t-tests. Unlike most
statistical packages, the default assumes unequal variance and
applies the Welsh df modification.# independent 2-group t-test
t.test(y~x) # where y is numeric and x is a binary factor

# independent 2-group t-test
t.test(y1,y2) # where y1 and y2 are numeric

# paired t-test
t.test(y1,y2,paired=TRUE) # where y1 & y2 are numeric

# one samle t-test
t.test(y,mu=3) # Ho: mu=3
25
Multiple (Linear) Regression

R provides comprehensive support for multiple linear regression.
The topics below are provided in order of increasing complexity.

Fitting the Model

# Multiple Linear Regression Example
fit <- lm(y ~ x1 + x2 + x3, data=mydata)
summary(fit) # show results

# Other useful functions
coefficients(fit) # model coefficients
confint(fit, level=0.95) # CIs for model parameters
fitted(fit) # predicted values
residuals(fit) # residuals
anova(fit) # anova table
vcov(fit) # covariance matrix for model parameters
influence(fit) # regression diagnostics
26
Comparing Models

You can compare nested models with the anova( ) function. The
following code provides a simultaneous test that x3 and x4 add to
linear prediction above and beyond x1 and x2.

# compare models
fit1 <- lm(y ~ x1 + x2 + x3 + x4, data=mydata)
fit2 <- lm(y ~ x1 + x2)
anova(fit1, fit2)
27
ANOVA

If you have been analyzing ANOVA
designs in traditional statistical packages,
you are likely to find R's approach less
coherent and user-friendly. A good online
presentation on ANOVA in R is available
online.
28
Multiple Comparisons

You can get Tukey HSD tests using the function
below. By default, it calculates post hoc
comparisons on each factor in the model. You
can specify specific factors as an option. Again,
remember that results are based on Type I SS!

# Tukey Honestly Significant Differences
TukeyHSD(fit) # where fit comes from aov()
29
Visualizing Results
 Use box plots and line plots to visualize group differences.
interaction.plot( ) in the base stats package produces plots for two-way
interactions. plotmeans( ) in the gplots package produces mean plots for
single factors, and includes confidence intervals.

# Two-way Interaction Plot
attach(mtcars)
gears <- factor(gears)
cyl <- factor(cyl)
interaction.plot(cyl, gear, mpg, type="b", col=c(1:3),
leg.bty="o", leg.bg="beige", lwd=2, pch=c(18,24,22),
xlab="Number of Cylinders",
ylab="Mean Miles Per Gallon",
main="Interaction Plot")

# Plot Means with Error Bars
library(gplots)
attach(mtcars)
cyl <- factor(cyl)
plotmeans(mpg~cyl,xlab="Number of Cylinders",
ylab="Miles Per Gallon, main="Mean Plotnwith 95% CI")
30
Tutorials
Each of the following tutorials are in PDF format.
 P. Kuhnert & B. Venables, An Introduction to R: Software for Statistical
Modeling & Computing
 J.H. Maindonald, Using R for Data Analysis and Graphics
 B. Muenchen, R for SAS and SPSS Users
 W.J. Owen, The R Guide
 W.N. Venebles & D. M. Smith, An Introduction to R
Applied Statistical Computing and Graphics 31
Applied Statistical Computing and Graphics 32

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2izahn
 
Data analysis with R
Data analysis with RData analysis with R
Data analysis with RShareThis
 
Data visualization using R
Data visualization using RData visualization using R
Data visualization using RUmmiya Mohammedi
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programmingizahn
 
Introduction to R
Introduction to RIntroduction to R
Introduction to RAjay Ohri
 
How to get started with R programming
How to get started with R programmingHow to get started with R programming
How to get started with R programmingRamon Salazar
 
Descriptive Statistics with R
Descriptive Statistics with RDescriptive Statistics with R
Descriptive Statistics with RKazuki Yoshida
 
Exploratory Data Analysis
Exploratory Data AnalysisExploratory Data Analysis
Exploratory Data AnalysisUmair Shafique
 
R programming presentation
R programming presentationR programming presentation
R programming presentationAkshat Sharma
 
Intro to RStudio
Intro to RStudioIntro to RStudio
Intro to RStudioegoodwintx
 
Quantitative Data Analysis using R
Quantitative Data Analysis using RQuantitative Data Analysis using R
Quantitative Data Analysis using RTaddesse Kassahun
 
2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factors2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factorskrishna singh
 

Was ist angesagt? (20)

Data Visualization With R
Data Visualization With RData Visualization With R
Data Visualization With R
 
Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2Introduction to R Graphics with ggplot2
Introduction to R Graphics with ggplot2
 
Data analysis with R
Data analysis with RData analysis with R
Data analysis with R
 
Data visualization using R
Data visualization using RData visualization using R
Data visualization using R
 
Introduction to R Programming
Introduction to R ProgrammingIntroduction to R Programming
Introduction to R Programming
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
 
Class ppt intro to r
Class ppt intro to rClass ppt intro to r
Class ppt intro to r
 
Unit 1 - R Programming (Part 2).pptx
Unit 1 - R Programming (Part 2).pptxUnit 1 - R Programming (Part 2).pptx
Unit 1 - R Programming (Part 2).pptx
 
How to get started with R programming
How to get started with R programmingHow to get started with R programming
How to get started with R programming
 
Descriptive Statistics with R
Descriptive Statistics with RDescriptive Statistics with R
Descriptive Statistics with R
 
R programming
R programmingR programming
R programming
 
R programming
R programmingR programming
R programming
 
Exploratory Data Analysis
Exploratory Data AnalysisExploratory Data Analysis
Exploratory Data Analysis
 
R programming presentation
R programming presentationR programming presentation
R programming presentation
 
Intro to RStudio
Intro to RStudioIntro to RStudio
Intro to RStudio
 
R Language Introduction
R Language IntroductionR Language Introduction
R Language Introduction
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Data Management in R
Data Management in RData Management in R
Data Management in R
 
Quantitative Data Analysis using R
Quantitative Data Analysis using RQuantitative Data Analysis using R
Quantitative Data Analysis using R
 
2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factors2. R-basics, Vectors, Arrays, Matrices, Factors
2. R-basics, Vectors, Arrays, Matrices, Factors
 

Ähnlich wie R studio

Presentation on use of r statistics
Presentation on use of r statisticsPresentation on use of r statistics
Presentation on use of r statisticsKrishna Dhakal
 
Stata cheat sheet: data processing
Stata cheat sheet: data processingStata cheat sheet: data processing
Stata cheat sheet: data processingTim Essam
 
Introduction to R - from Rstudio to ggplot
Introduction to R - from Rstudio to ggplotIntroduction to R - from Rstudio to ggplot
Introduction to R - from Rstudio to ggplotOlga Scrivner
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Yao Yao
 
4 R Tutorial DPLYR Apply Function
4 R Tutorial DPLYR Apply Function4 R Tutorial DPLYR Apply Function
4 R Tutorial DPLYR Apply FunctionSakthi Dasans
 
Get started with R lang
Get started with R langGet started with R lang
Get started with R langsenthil0809
 
R-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfR-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfKabilaArun
 
R-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfR-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfattalurilalitha
 
Rattle Graphical Interface for R Language
Rattle Graphical Interface for R LanguageRattle Graphical Interface for R Language
Rattle Graphical Interface for R LanguageMajid Abdollahi
 
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdf
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdfSTAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdf
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdfSOUMIQUE AHAMED
 
Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Soham Kulkarni
 
Pumps, Compressors and Turbine Fault Frequency Analysis
Pumps, Compressors and Turbine Fault Frequency AnalysisPumps, Compressors and Turbine Fault Frequency Analysis
Pumps, Compressors and Turbine Fault Frequency AnalysisUniversity of Illinois,Chicago
 
R programming slides
R  programming slidesR  programming slides
R programming slidesPankaj Saini
 
Pumps, Compressors and Turbine Fault Frequency Analysis
Pumps, Compressors and Turbine Fault Frequency AnalysisPumps, Compressors and Turbine Fault Frequency Analysis
Pumps, Compressors and Turbine Fault Frequency AnalysisUniversity of Illinois,Chicago
 

Ähnlich wie R studio (20)

Presentation on use of r statistics
Presentation on use of r statisticsPresentation on use of r statistics
Presentation on use of r statistics
 
R basics
R basicsR basics
R basics
 
Introduction to r
Introduction to rIntroduction to r
Introduction to r
 
Oct.22nd.Presentation.Final
Oct.22nd.Presentation.FinalOct.22nd.Presentation.Final
Oct.22nd.Presentation.Final
 
Stata cheat sheet: data processing
Stata cheat sheet: data processingStata cheat sheet: data processing
Stata cheat sheet: data processing
 
Introduction to R - from Rstudio to ggplot
Introduction to R - from Rstudio to ggplotIntroduction to R - from Rstudio to ggplot
Introduction to R - from Rstudio to ggplot
 
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
Mini-lab 1: Stochastic Gradient Descent classifier, Optimizing Logistic Regre...
 
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
NCCU: Statistics in the Criminal Justice System, R basics and Simulation - Pr...
 
4 R Tutorial DPLYR Apply Function
4 R Tutorial DPLYR Apply Function4 R Tutorial DPLYR Apply Function
4 R Tutorial DPLYR Apply Function
 
An Intoduction to R
An Intoduction to RAn Intoduction to R
An Intoduction to R
 
Get started with R lang
Get started with R langGet started with R lang
Get started with R lang
 
R-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfR-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdf
 
R-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfR-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdf
 
R-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdfR-Language-Lab-Manual-lab-1.pdf
R-Language-Lab-Manual-lab-1.pdf
 
Rattle Graphical Interface for R Language
Rattle Graphical Interface for R LanguageRattle Graphical Interface for R Language
Rattle Graphical Interface for R Language
 
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdf
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdfSTAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdf
STAT-522 (Data Analysis Using R) by SOUMIQUE AHAMED.pdf
 
Compiler Construction for DLX Processor
Compiler Construction for DLX Processor Compiler Construction for DLX Processor
Compiler Construction for DLX Processor
 
Pumps, Compressors and Turbine Fault Frequency Analysis
Pumps, Compressors and Turbine Fault Frequency AnalysisPumps, Compressors and Turbine Fault Frequency Analysis
Pumps, Compressors and Turbine Fault Frequency Analysis
 
R programming slides
R  programming slidesR  programming slides
R programming slides
 
Pumps, Compressors and Turbine Fault Frequency Analysis
Pumps, Compressors and Turbine Fault Frequency AnalysisPumps, Compressors and Turbine Fault Frequency Analysis
Pumps, Compressors and Turbine Fault Frequency Analysis
 

Mehr von Kinza Irshad

Why Pakistan will survive ppt by Kinza IRSHAD
Why Pakistan will survive ppt by Kinza IRSHADWhy Pakistan will survive ppt by Kinza IRSHAD
Why Pakistan will survive ppt by Kinza IRSHADKinza Irshad
 
Paper on sea role in trans boundry river management by Kinza Irshad
Paper on sea role in trans boundry river management by Kinza IrshadPaper on sea role in trans boundry river management by Kinza Irshad
Paper on sea role in trans boundry river management by Kinza IrshadKinza Irshad
 
Regional aspects of development and planning
Regional aspects of development and planningRegional aspects of development and planning
Regional aspects of development and planningKinza Irshad
 
PROJECT MANAGEMENT
PROJECT MANAGEMENT PROJECT MANAGEMENT
PROJECT MANAGEMENT Kinza Irshad
 
Memorizing techniquess
Memorizing techniquessMemorizing techniquess
Memorizing techniquessKinza Irshad
 
Composition &amp; structure of the atmosphere
Composition &amp; structure of the atmosphereComposition &amp; structure of the atmosphere
Composition &amp; structure of the atmosphereKinza Irshad
 
Hospital waste incineration
Hospital waste incineration Hospital waste incineration
Hospital waste incineration Kinza Irshad
 
Strategic Environmental Assessment
Strategic Environmental Assessment Strategic Environmental Assessment
Strategic Environmental Assessment Kinza Irshad
 
CPEC, A game changer
CPEC, A game changerCPEC, A game changer
CPEC, A game changerKinza Irshad
 
Attractions and Distraction
Attractions and DistractionAttractions and Distraction
Attractions and DistractionKinza Irshad
 
Halqa e asar final ppt
Halqa e asar final pptHalqa e asar final ppt
Halqa e asar final pptKinza Irshad
 
Leadership influence
Leadership influenceLeadership influence
Leadership influenceKinza Irshad
 
Protein computational analysis
Protein computational analysisProtein computational analysis
Protein computational analysisKinza Irshad
 
Protein computational analysis
Protein computational analysisProtein computational analysis
Protein computational analysisKinza Irshad
 
Lignocellulyitc enzymes, COMSATS Vehari
Lignocellulyitc enzymes, COMSATS VehariLignocellulyitc enzymes, COMSATS Vehari
Lignocellulyitc enzymes, COMSATS VehariKinza Irshad
 
Impact of agriculture on climate change
Impact of agriculture on climate change Impact of agriculture on climate change
Impact of agriculture on climate change Kinza Irshad
 

Mehr von Kinza Irshad (20)

Why Pakistan will survive ppt by Kinza IRSHAD
Why Pakistan will survive ppt by Kinza IRSHADWhy Pakistan will survive ppt by Kinza IRSHAD
Why Pakistan will survive ppt by Kinza IRSHAD
 
Paper on sea role in trans boundry river management by Kinza Irshad
Paper on sea role in trans boundry river management by Kinza IrshadPaper on sea role in trans boundry river management by Kinza Irshad
Paper on sea role in trans boundry river management by Kinza Irshad
 
Regional aspects of development and planning
Regional aspects of development and planningRegional aspects of development and planning
Regional aspects of development and planning
 
PROJECT MANAGEMENT
PROJECT MANAGEMENT PROJECT MANAGEMENT
PROJECT MANAGEMENT
 
Memorizing techniquess
Memorizing techniquessMemorizing techniquess
Memorizing techniquess
 
Composition &amp; structure of the atmosphere
Composition &amp; structure of the atmosphereComposition &amp; structure of the atmosphere
Composition &amp; structure of the atmosphere
 
Lbm degradation
Lbm degradationLbm degradation
Lbm degradation
 
Hospital waste incineration
Hospital waste incineration Hospital waste incineration
Hospital waste incineration
 
Strategic Environmental Assessment
Strategic Environmental Assessment Strategic Environmental Assessment
Strategic Environmental Assessment
 
CPEC, A game changer
CPEC, A game changerCPEC, A game changer
CPEC, A game changer
 
Attractions and Distraction
Attractions and DistractionAttractions and Distraction
Attractions and Distraction
 
Halqa e asar final ppt
Halqa e asar final pptHalqa e asar final ppt
Halqa e asar final ppt
 
Leadership edited
Leadership editedLeadership edited
Leadership edited
 
Leadership influence
Leadership influenceLeadership influence
Leadership influence
 
Management
Management Management
Management
 
Protein computational analysis
Protein computational analysisProtein computational analysis
Protein computational analysis
 
Protein computational analysis
Protein computational analysisProtein computational analysis
Protein computational analysis
 
Lignocellulyitc enzymes, COMSATS Vehari
Lignocellulyitc enzymes, COMSATS VehariLignocellulyitc enzymes, COMSATS Vehari
Lignocellulyitc enzymes, COMSATS Vehari
 
Impact of agriculture on climate change
Impact of agriculture on climate change Impact of agriculture on climate change
Impact of agriculture on climate change
 
Beat plastic
Beat plasticBeat plastic
Beat plastic
 

Kürzlich hochgeladen

FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...Jack Cole
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaManalVerma4
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelBoston Institute of Analytics
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 

Kürzlich hochgeladen (20)

FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
why-transparency-and-traceability-are-essential-for-sustainable-supply-chains...
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
IBEF report on the Insurance market in India
IBEF report on the Insurance market in IndiaIBEF report on the Insurance market in India
IBEF report on the Insurance market in India
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis modelDecoding Movie Sentiments: Analyzing Reviews with Data Analysis model
Decoding Movie Sentiments: Analyzing Reviews with Data Analysis model
 
Insurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis ProjectInsurance Churn Prediction Data Analysis Project
Insurance Churn Prediction Data Analysis Project
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 

R studio

  • 1. R- An Introduction and Application
  • 2. 2 Outline  Why R, and R Paradigm  R Overview  R Interface  R Workspace  Help  R Packages  Input/Output  Reusing Results  References, Tutorials and links
  • 3. 3 Why R?  It's free!  It runs on a variety of platforms including Windows, Unix and MacOS.  It provides an unparalleled platform for programming new statistical methods in an easy and straightforward manner  It contains advanced statistical routines not yet available in other packages.  It has state-of-the-art graphics capabilities.
  • 4. 4 How to download?  Google it using R or CRAN (Comprehensive R Archive Network)  http://www.r-project.org
  • 6. 6
  • 7. Basic functions >In import file section of the Environment Import the file from the desktop > 7
  • 8. 8 An example > # An example > x <- c(1:10) > x[(x>8) | (x<5)] > # yields 1 2 3 4 9 10 > # How it works > x <- c(1:10) > X >1 2 3 4 5 6 7 8 9 10 > x > 8 > F F F F F F F F T T > x < 5 > T T T T F F F F F F > x > 8 | x < 5 > T T T T F F F F T T > x[c(T,T,T,T,F,F,F,F,T,T)] > 1 2 3 4 9 10
  • 9. 9 R Warning !  R is a case sensitive language  FOO, Foo, and foo are three different objects
  • 10. 10 R Help  Once R is installed, there is a comprehensive built-in help system and we can use any of the following:  help.start() # general help help(foo) # help about function foo ?foo # same thing apropos("foo") # list all function containing string foo example(foo) # show an example of function foo
  • 11. 11 R Datasets R comes with a number of sample datasets that you can experiment with. Type > data( ) to see the available datasets. The results will depend on which packages you have loaded. Type help(datasetname) for details on a sample dataset.
  • 12. 12 R Packages  When you download R, already a number (around 30) of packages are downloaded as well.. > search() [1] ".GlobalEnv" "package:stats" "package:graphics" [4] "package:grDevices" "package:datasets" "package:utils" [7] "package:methods" "Autoloads" "package:base"
  • 14. 14 Creating new variables  Use the assignment operator <- to create new variables. A wide array of operators and functions are available here.  # Three examples for doing the same computations mydata$sum <- mydata$x1 + mydata$x2 mydata$mean <- (mydata$x1 + mydata$x2)/2 attach(mydata) mydata$sum <- x1 + x2 mydata$mean <- (x1 + x2)/2 detach(mydata)  mydata <- transform( mydata, sum = x1 + x2, mean = (x1 + x2)/2 )
  • 15. 15 Creating new variables Recoding variables  # create 2 age categories mydata$agecat <- ifelse(mydata$age > 70, c("older"), c("younger"))  # another example: create 3 age categories attach(mydata) mydata$agecat[age > 75] <- "Elder" mydata$agecat[age > 45 & age <= 75] <- "Middle Aged" mydata$agecat[age <= 45] <- "Young" detach(mydata)
  • 16. Applied Statistical Computing and Graphics 16 Arithmetic Operators Operator Description + addition - subtraction * multiplication / division ^ or ** exponentiation
  • 17. 17 Logical Operators Operator Description < less than <= less than or equal to > greater than >= greater than or equal to == exactly equal to != not equal to !x Not x x | y x OR y x & y x AND y isTRUE(x) test if x is TRUE
  • 18. 18 Sorting  To sort a dataframe in R, use the order( ) function. By default, sorting is ASCENDING. Prepend the sorting variable by a minus sign to indicate DESCENDING order. Here are some examples.  # sorting examples using the mtcars dataset data(mtcars) # sort by mpg newdata = mtcars[order(mtcars$mpg),] # sort by mpg and cyl newdata <- mtcars[order(mtcars$mpg, mtcars$cyl),] #sort by mpg (ascending) and cyl (descending) newdata <- mtcars[order(mtcars$mpg, -mtcars$cyl),]
  • 19. 19 Merging To merge two dataframes (datasets) horizontally, use the merge function. In most cases, you join two dataframes by one or more common key variables (i.e., an inner join). # merge two dataframes by ID total <- merge(dataframeA,dataframeB,by="ID") # merge two dataframes by ID and Country total <- merge(dataframeA,dataframeB,by=c("ID","Country "))
  • 20. 20 Merging ADDING ROWS To join two dataframes (datasets) vertically, use the rbind function. The two dataframes must have the same variables, but they do not have to be in the same order. total <- rbind(dataframeA, dataframeB) If dataframeA has variables that dataframeB does not, then either: Delete the extra variables in dataframeA or Create the additional variables in dataframeB and set them to NA (missing) before joining them with rbind.
  • 22. 22 Creating a Graph  In R, graphs are typically created interactively. # Creating a Graph attach(mtcars) plot(wt, mpg) abline(lm(mpg~wt)) title("Regression of MPG on Weight")  The plot( ) function opens a graph window and plots weight vs. miles per gallon. The next line of code adds a regression line to this graph. The final line adds a title.
  • 24. 24 T test  The t.test( ) function produces a variety of t-tests. Unlike most statistical packages, the default assumes unequal variance and applies the Welsh df modification.# independent 2-group t-test t.test(y~x) # where y is numeric and x is a binary factor  # independent 2-group t-test t.test(y1,y2) # where y1 and y2 are numeric  # paired t-test t.test(y1,y2,paired=TRUE) # where y1 & y2 are numeric  # one samle t-test t.test(y,mu=3) # Ho: mu=3
  • 25. 25 Multiple (Linear) Regression  R provides comprehensive support for multiple linear regression. The topics below are provided in order of increasing complexity.  Fitting the Model  # Multiple Linear Regression Example fit <- lm(y ~ x1 + x2 + x3, data=mydata) summary(fit) # show results  # Other useful functions coefficients(fit) # model coefficients confint(fit, level=0.95) # CIs for model parameters fitted(fit) # predicted values residuals(fit) # residuals anova(fit) # anova table vcov(fit) # covariance matrix for model parameters influence(fit) # regression diagnostics
  • 26. 26 Comparing Models  You can compare nested models with the anova( ) function. The following code provides a simultaneous test that x3 and x4 add to linear prediction above and beyond x1 and x2.  # compare models fit1 <- lm(y ~ x1 + x2 + x3 + x4, data=mydata) fit2 <- lm(y ~ x1 + x2) anova(fit1, fit2)
  • 27. 27 ANOVA  If you have been analyzing ANOVA designs in traditional statistical packages, you are likely to find R's approach less coherent and user-friendly. A good online presentation on ANOVA in R is available online.
  • 28. 28 Multiple Comparisons  You can get Tukey HSD tests using the function below. By default, it calculates post hoc comparisons on each factor in the model. You can specify specific factors as an option. Again, remember that results are based on Type I SS!  # Tukey Honestly Significant Differences TukeyHSD(fit) # where fit comes from aov()
  • 29. 29 Visualizing Results  Use box plots and line plots to visualize group differences. interaction.plot( ) in the base stats package produces plots for two-way interactions. plotmeans( ) in the gplots package produces mean plots for single factors, and includes confidence intervals.  # Two-way Interaction Plot attach(mtcars) gears <- factor(gears) cyl <- factor(cyl) interaction.plot(cyl, gear, mpg, type="b", col=c(1:3), leg.bty="o", leg.bg="beige", lwd=2, pch=c(18,24,22), xlab="Number of Cylinders", ylab="Mean Miles Per Gallon", main="Interaction Plot")  # Plot Means with Error Bars library(gplots) attach(mtcars) cyl <- factor(cyl) plotmeans(mpg~cyl,xlab="Number of Cylinders", ylab="Miles Per Gallon, main="Mean Plotnwith 95% CI")
  • 30. 30 Tutorials Each of the following tutorials are in PDF format.  P. Kuhnert & B. Venables, An Introduction to R: Software for Statistical Modeling & Computing  J.H. Maindonald, Using R for Data Analysis and Graphics  B. Muenchen, R for SAS and SPSS Users  W.J. Owen, The R Guide  W.N. Venebles & D. M. Smith, An Introduction to R
  • 31. Applied Statistical Computing and Graphics 31
  • 32. Applied Statistical Computing and Graphics 32