R Intro, Week 1
Scott Chamberlain [modified from Haldre Rogers]
September 9, 2011
Don't just listen to me! Other intros to R:
  http://www.stat.duke.edu/programs/gcc/ResourcesDocuments/RTutorial.pdf
  http://www.cyclismo.org/tutorial/R/
  http://www.r-tutor.com/r-introduction
  Quick R: http://www.statmethods.net/
  http://www.bioconductor.org/help/course-materials/2011/CSAMA/Monday/Morning%20Talks/R_intro.pdf
R user frameworks
R from the command line (OSX and PC): just type "R" into the command line, and have fun!
R itself: http://www.r-project.org/
RStudio, a good choice: http://www.rstudio.org/
RevolutionR [free academic version], sort of the SAS-ised version of R: http://www.revolutionanalytics.com/downloads/free-academic.php
  Uses the proprietary .xdf file format, which speeds up computation times
Many other ways to use R, including GUIs, other IDEs, and a huge variety of text editors: https://github.com/RatRiceEEB/RIntroCode/wiki/R-Resources
If you are afraid of the code interface, use Rattle, R Commander, Deducer, or Red R
  With these interfaces you can learn which code runs behind each button you press
R user frameworks, cont.
R from Python
  RPy: http://rpy.sourceforge.net/
C++ from R
  Rcpp package: http://cran.r-project.org/web/packages/Rcpp/index.html and http://dirk.eddelbuettel.com/code/rcpp.html
  Can hugely speed up computation times by writing functions in C++; the R function then calls the compiled code instead of running R code
  E.g., http://helmingstay.blogspot.com/2011/06/efficient-loops-in-r-complexity-versus.html & http://dirk.eddelbuettel.com/code/rcpp.examples.html
Excel from R
  XLConnect package: http://cran.r-project.org/web/packages/XLConnect/index.html
And more... see for yourself
R Tips
R can crash
  Do not use R's built-in text editor or write code solely in the R console. Instead use any text editor that integrates with R. See here for links: https://github.com/RatRiceEEB/RIntroCode/wiki/R-Resources
When asking for help on listservs or help websites, use BRIEF and REPRODUCIBLE examples
  Not doing this makes people not want to help you!
R overwrites files with the same file name without warning!!!!
  Make sure you want to overwrite a file before doing so
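One way to guard against the silent overwriting mentioned in the tips is to check for the file first. This is only a minimal sketch: `safe_write_csv` is a hypothetical helper name, not part of base R or of the original slides, and the demo writes to a temporary file.

```r
# Hypothetical helper: refuse to overwrite an existing file unless asked to.
safe_write_csv <- function(df, path, overwrite = FALSE) {
  if (file.exists(path) && !overwrite) {
    stop("File already exists: ", path)
  }
  write.csv(df, path, row.names = FALSE)
  invisible(path)
}

out_file <- tempfile(fileext = ".csv")   # temporary path for the demo
safe_write_csv(head(iris), out_file)     # first write succeeds

# A second write to the same path now fails instead of silently overwriting.
second_try <- tryCatch(
  safe_write_csv(head(iris), out_file),
  error = function(e) "refused"
)
```

Passing `overwrite = TRUE` restores the default R behaviour when you really do want to replace the file.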
Style
Not this kind of style

This kind of style!!!
Style
Style is important so YOU and OTHERS can read your code and actually use it
Google style guide: http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html#generallayout
Henrik Bengtsson style guide: http://www1.maths.lth.se/help/R/RCC/
Hadley Wickham's style guide: https://github.com/hadley/devtools/wiki/Style
Preparing your data for R
What makes clean data?
  Correct spelling
  Identical capitalization (e.g., Premna vs. premna)
    R is case-sensitive: if myvector <- c(3, 4, 5), calling Myvector does not work!
  No spaces in column names (R turns spaces into ".")
    Generally try to avoid them; use underscores instead
  NA or blank (if using csv) for missing values
  Use find and replace to get rid of trailing spaces after words
I generally keep an .xls and a .csv file, so you can always recreate work in R with the .csv file and still modify the .xls file
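The points about capitalization, spaces, and trailing spaces can be checked quickly in an R session. The sketch below uses made-up column names and species labels purely for illustration.

```r
myvector <- c(3, 4, 5)

# Names are case-sensitive: 'Myvector' was never created.
exists("myvector")   # TRUE
exists("Myvector")   # FALSE

# Column names with spaces: make.names() shows what R would turn them into,
# while gsub() swaps the spaces for underscores instead.
raw_names   <- c("leaf area", "plant height")
dotted      <- make.names(raw_names)      # "leaf.area" "plant.height"
clean_names <- gsub(" ", "_", raw_names)  # "leaf_area" "plant_height"

# Capitalization and trailing spaces create spurious categories.
species <- c("Premna", "premna", "Premna ")
n_before <- length(unique(species))            # 3 distinct values
cleaned  <- tolower(trimws(species))
n_after  <- length(unique(cleaned))            # 1 distinct value
```

Whether to lowercase everything, as done here, is a judgment call; the point is only that all spellings of one category must match exactly.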
Bringing data into R
Create a csv file
  One worksheet only
  No special formatting, filters, comments, etc.
  Copy only the columns and rows with your data to the CSV, as R will sometimes read in columns without data
Name your variables well
  Self-explanatory, unique, lowercase, short-ish, one-word names
In R, set the working directory
  setwd("/Users/ScottMac/Dropbox/R Group/Week1_R-Intro")
What is the working directory? getwd()
What is in the working directory? dir()
Read in data
  CSV files: iris.df <- read.csv("iris_df.csv", header=TRUE)
  Clipboard: read.csv("clipboard") reads in copied data as if it were a file (the "clipboard" name is platform-dependent; it works on Windows)
  From web: read.csv("http://explore.data.gov/download/pwaj-zn2n/CSV")
  From Excel files (using the XLConnect package): iris.df <- readWorksheetFromFile("/Users/ScottMac/Dropbox/R Group/Week1_R-Intro/iris_df.xlsx", sheet="Sheet1")
Write data
  write.csv(dataframe, "dataframename.csv"), OR
  save(iris, file="iris.RData") [and load("iris.RData") to open in R]
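The read and write calls on this slide can be tried end to end without touching your own files. This is a sketch that assumes the built-in `iris` dataset and a temporary directory; the file names simply mirror the ones used on the slide.

```r
# Write the built-in iris data to a CSV in a temporary directory.
csv_path <- file.path(tempdir(), "iris_df.csv")
write.csv(iris, csv_path, row.names = FALSE)

# Read it back in, as on the slide.
iris.df <- read.csv(csv_path, header = TRUE)
dim(iris.df)      # rows and columns that were read

# Save the object in R's own format. Note the 'file =' argument name:
# without it, save() treats the string as another object to save.
rdata_path <- file.path(tempdir(), "iris.RData")
save(iris.df, file = rdata_path)

rm(iris.df)        # remove the object from the session...
load(rdata_path)   # ...and restore it under its original name
```

`load()` restores objects under the names they were saved with, so nothing needs to be assigned on that line.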
R data structures
Scalar: Object with a single value, either numeric or character (in R, a vector of length one)
Vector: Sequence of values of a single type (e.g., all numeric or all character); may include NA
List: Arbitrary collection of objects, possibly of different types; a very useful R object
Character: Text, e.g., "this is some text"
Factor: Like character vectors, but only w/ values in predefined "levels"
Matrix: Two-dimensional; all values must be of the same type (often numeric)
Dataframe: Each column can be of a different class
Immutable dataframe: special dataframe used in the plyr package for faster dataframe manipulation; it references the original dataframe rather than copying it
Function
Environment
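A short session makes the differences between these structures concrete. The object names below are arbitrary and chosen only for this sketch.

```r
x <- 5                        # a "scalar" is just a vector of length 1
v <- c(1.5, NA, 3)            # numeric vector with a missing value
mixed <- c(1, "a")            # mixing types coerces everything to character

# Factor: values restricted to predefined levels.
f <- factor(c("low", "high", "low"), levels = c("low", "high"))

m <- matrix(1:6, nrow = 2)    # 2 x 3 matrix, all elements the same type

# List: elements may have different types and lengths.
l <- list(number = 1, text = "this is some text", vec = v)

# Data frame: columns of equal length, each with its own class.
df <- data.frame(id = 1:3, size = f, value = v)
col_classes <- sapply(df, class)
```

Running `str()` on any of these objects prints a compact description of its type and contents.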
Exploring dataframes
str(dataframe) gives column formats and dimensions
head(dataframe) and tail(dataframe) give the first and last 6 rows
names(dataframe) gives column names
row.names(dataframe) gives row names
attributes(dataframe) gives column and row names and object class
summary(dataframe) gives a lot of good information
  Make sure variables are in the appropriate form: character/string, numeric, factor, integer, logical
  Make sure mins, maxs, means, etc. seem right
Make sure you don't have typing errors, which would make Premna and premna two separate factor levels
  Use unique(iris$species) to see all unique values of a column
  Or use levels(spider$species) to see the different levels
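The exploration functions above can be tried on the `iris` dataset that ships with R. Note one assumption: the built-in dataset capitalizes its column as `Species`, whereas the slide's own CSV apparently used a lowercase `species` column.

```r
str(iris)                 # column types and dimensions
first_rows <- head(iris)  # first 6 rows by default
tail(iris, 3)             # last 3 rows; the second argument sets the count
names(iris)               # column names
summary(iris)             # min, max, mean, quartiles, and factor counts

# Check a categorical column for stray spellings.
species_values <- unique(iris$Species)
species_levels <- levels(iris$Species)
```

If a typo had crept in, it would show up here as an extra unique value or level.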
To attach or not to attach... that is the question
Some like to use 'attach' to make dataframe variables accessible by name within the R session
Generally, 'attach' is frowned upon by R junkies
  Use dataframe$y, or data=dataframe, or dataframe[, "y"], or dataframe[, 2]
To detach the object, use: detach()
I recommend: do not use attach, but do what you want
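The alternatives to attach listed above look like this in practice. The sketch uses the built-in `iris` data; `with()` is a further base-R option that the slide does not mention.

```r
# Four ways to reach a column without attach():
a <- mean(iris$Sepal.Length)          # dataframe$y
b <- mean(iris[, "Sepal.Length"])     # dataframe[, "y"]
c1 <- mean(iris[, 1])                 # dataframe[, 1], by column position

# data = dataframe, for functions that accept a formula
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)

# with() evaluates an expression inside the data frame, temporarily
d <- with(iris, mean(Sepal.Length))
```

All four means are the same number; the forms differ only in how explicit they are about where the column comes from.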
R Packages
3,262 packages!!!!
Packages are extensions written by anyone for any purpose, usually installed with install.packages("packagename"), then loaded with require(packagename) or library(packagename)
Use ?functionname for help on any function in base R or in loaded R packages
In RStudio, just press tab when in parentheses after the function name to see function options!!!
Explore packages at the CRAN site: http://cran.r-project.org/web/packages/
Inside-R package reference: http://www.inside-r.org/packages
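Installing and loading are separate steps, which the following sketch tries to show. To keep it runnable offline it loads `tools`, a package that ships with R; the install line is commented out and "packagename" is a placeholder, as on the slide.

```r
# Install once per machine (needs an internet connection), not run here:
# install.packages("packagename")

# Load once per session. 'tools' ships with R, so no install is needed.
library(tools)

# requireNamespace() reports whether a package is installed
# without attaching it to the search path.
has_tools <- requireNamespace("tools", quietly = TRUE)

# Help for a function or a whole package (interactive use):
# ?read.csv
# help(package = "tools")
```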
Data manipulation
Packages: plyr, data.table, doBy, sqldf, reshape2, and more
Comparison of packages
  Modified from code from the Recipes, scripts and Genomics blog: https://gist.github.com/878919
  data.table is by far the fastest!!! BUT plyr may win on ease of use and flexibility? See for yourself
Also, see the examples in the tutorial code for the reshape2 package for neat data manipulation tricks
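For orientation, the kind of grouped summary these packages are built for can be written in base R alone. This is a stand-in sketch, not the benchmark code from the linked gist, and it uses none of the packages named above.

```r
# Mean sepal length per species, returned as a data frame.
means_by_species <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)

# The same summary as a named vector.
by_tapply <- tapply(iris$Sepal.Length, iris$Species, mean)
```

plyr, data.table, and the others express this split-apply-combine pattern with their own syntax and differ mainly in speed and convenience.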
Visualizations
A few different approaches:
  Base graphics
  Lattice graphics
  Grid graphics
  ggplot2 graphics
Further reading: http://www.slideshare.net/dataspora/a-survey-of-r-graphics
An example: [figure not preserved in this transcript]
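As a small illustration of the first approach, here is a base graphics scatterplot. It is not the figure from the original slide; the dataset, labels, and colours are assumptions made for this sketch, and the plot is written to a temporary PDF so it also runs without a display.

```r
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)                       # open a PDF graphics device

# Scatterplot of petal size, coloured by species.
plot(iris$Petal.Length, iris$Petal.Width,
     col = as.integer(iris$Species), pch = 19,
     xlab = "Petal length (cm)", ylab = "Petal width (cm)",
     main = "Iris petals by species")
legend("topleft", legend = levels(iris$Species), col = 1:3, pch = 19)

dev.off()                            # close the device to finish the file
```

In an interactive session you can drop the `pdf()` and `dev.off()` lines and the plot appears in the graphics window.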
More on ggplot2 graphics
There are classes taught by Hadley Wickham here at Rice if you want to learn more!
  Data visualization (Stat645): http://had.co.nz/stat645/
  Statistical computing (Stat405): http://had.co.nz/stat405/
Hadley's website is really helpful: http://had.co.nz/ggplot2/
The ggplot2 google groups site: https://groups.google.com/forum/#!forum/ggplot2
QUICK RSTUDIO RUN THROUGH
Keyboard shortcuts!! http://www.rstudio.org/docs/using/keyboard_shortcuts
USE CASE HERE [see intro_usecase.R file]

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 

R Introduction

  • 1. R Intro, Week 1. Scott Chamberlain [modified from Haldre Rogers]. September 9, 2011
  • 2. Don’t just listen to me! Other intros to R: http://www.stat.duke.edu/programs/gcc/ResourcesDocuments/RTutorial.pdf, http://www.cyclismo.org/tutorial/R/, http://www.r-tutor.com/r-introduction, Quick R (http://www.statmethods.net/), and http://www.bioconductor.org/help/course-materials/2011/CSAMA/Monday/Morning%20Talks/R_intro.pdf
  • 3. R user frameworks. R from the command line (OS X and PC): just type “R” into the command line and have fun! R itself: http://www.r-project.org/ RStudio, a good choice: http://www.rstudio.org/ RevolutionR [free academic version], sort of the SAS-ised version of R: http://www.revolutionanalytics.com/downloads/free-academic.php It uses the proprietary .xdf file format, which speeds up computation times. There are many other ways to use R, including GUIs, other IDEs, and a huge variety of text editors: https://github.com/RatRiceEEB/RIntroCode/wiki/R-Resources If you are afraid of the code interface, use Rattle, R Commander, Deducer, or Red R. With these interfaces you can learn which code does what after pressing buttons.
  • 4. R user frameworks, cont. R from Python: RPy (http://rpy.sourceforge.net/). C++ from R: the Rcpp package (http://cran.r-project.org/web/packages/Rcpp/index.html and http://dirk.eddelbuettel.com/code/rcpp.html). It can hugely speed up computation when you write the slow functions in C++; the R function then calls the compiled code instead of running R code. E.g., http://helmingstay.blogspot.com/2011/06/efficient-loops-in-r-complexity-versus.html and http://dirk.eddelbuettel.com/code/rcpp.examples.html Excel from R: the XLConnect package (http://cran.r-project.org/web/packages/XLConnect/index.html). And more... see for yourself.
  • 5. R Tips. R can crash, so do not use R’s built-in text editor or write code solely in the R console. Instead, use any text editor that integrates with R. See here for links: https://github.com/RatRiceEEB/RIntroCode/wiki/R-Resources When asking for help on listservs or help websites, use BRIEF and REPRODUCIBLE examples. Not doing this makes people not want to help you! R's write functions overwrite an existing file with the same file name without asking! Make sure you want to overwrite a file before doing so.
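One way to guard against the silent overwrite mentioned in the tips is to check for the file first. This is only a minimal sketch: `write_csv_safely` is a hypothetical helper name, not a base R function, and the temporary path stands in for a real file.

```r
# Hypothetical helper (not part of base R): refuse to overwrite an existing
# file unless the caller explicitly asks for it.
write_csv_safely <- function(df, path, overwrite = FALSE) {
  if (file.exists(path) && !overwrite) {
    stop("File already exists: ", path)
  }
  write.csv(df, path, row.names = FALSE)
  invisible(path)
}

out <- tempfile(fileext = ".csv")      # throwaway path for the demo
write_csv_safely(head(iris), out)      # first write succeeds

# The second write to the same path is refused; capture the error message.
second_try <- tryCatch(
  write_csv_safely(head(iris), out),
  error = function(e) conditionMessage(e)
)
print(second_try)
```

Passing `overwrite = TRUE` makes the replacement a deliberate choice rather than an accident.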
  • 7. Not this kind of style

  • 8. This kind of style!!!
  • 9. Style. Style is important so YOU and OTHERS can read your code and actually use it. Google style guide: http://google-styleguide.googlecode.com/svn/trunk/google-r-style.html#generallayout Henrik Bengtsson's style guide: http://www1.maths.lth.se/help/R/RCC/ Hadley Wickham's style guide: https://github.com/hadley/devtools/wiki/Style
  • 10. Preparing your data for R. What makes clean data? Correct spelling. Identical capitalization (e.g., Premna vs. premna): R is case-sensitive, so if myvector <- c(3, 4, 5), calling Myvector does not work! No spaces within column names (R turns spaces into “.” on import); generally avoid them and use underscores instead. NA or blank (if using csv) for missing values. Use find and replace to get rid of trailing spaces after words. I generally keep both an .xls and a .csv file, so I can always recreate work in R from the .csv file and still modify the .xls file.
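The cleaning rules above can also be applied inside R. The sketch below uses made-up column names and species values purely for illustration; `trimws()` needs R 3.2.0 or later.

```r
# R is case-sensitive: myvector and Myvector are different names.
myvector <- c(3, 4, 5)
has_lower <- exists("myvector")   # TRUE
has_upper <- exists("Myvector")   # FALSE, unless you defined it elsewhere

# Made-up column names with spaces and inconsistent capitalization.
raw_names <- c("Leaf Area", "Species ", "plot id")
dotted <- make.names(raw_names)   # what R does by default: spaces become "."
clean_names <- tolower(gsub(" +", "_", trimws(raw_names)))
print(clean_names)                # "leaf_area" "species" "plot_id"

# Trailing spaces and capitalization create spurious categories.
species <- c("Premna", "premna", "Premna ")
cleaned <- tolower(trimws(species))
print(unique(cleaned))            # "premna"
```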
  • 11. Bringing data into R. Create a csv file: one worksheet only; no special formatting, filters, comments, etc. Copy only the columns and rows that contain your data to the CSV, as R will sometimes read in empty columns. Name your variables well: self-explanatory, unique, lowercase, short-ish, one-word names. In R, set the working directory: setwd("/Users/ScottMac/Dropbox/R Group/Week1_R-Intro"). What is the working directory? getwd(). What is in the working directory? dir(). Read in data. CSV files: iris.df <- read.csv("iris_df.csv", header=T). Clipboard: read.csv("clipboard") reads in copied data as if you were pasting it (this form works on Windows; on a Mac use read.csv(pipe("pbpaste"))). From the web: read.csv("http://explore.data.gov/download/pwaj-zn2n/CSV"). From Excel files (using the XLConnect package): iris.df <- readWorksheetFromFile("/Users/ScottMac/Dropbox/R Group/Week1_R-Intro/iris_df.xlsx", sheet="Sheet1"). Write data: write.csv(dataframe, "dataframename.csv"), OR save(iris, file="iris.RData") [and load("iris.RData") to open in R].
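To try the read and write steps without touching your own files, a round trip through R's temporary directory works. This is a sketch under that assumption: it uses the built-in `iris` data and `tempdir()` instead of the working directory from the slide.

```r
# Write the built-in iris data to a CSV in a temporary directory.
csv_path <- file.path(tempdir(), "iris_df.csv")
write.csv(iris, csv_path, row.names = FALSE)

# Read it back; header = TRUE means the first row holds the column names.
iris.df <- read.csv(csv_path, header = TRUE)
print(dim(iris.df))               # 150 rows, 5 columns

# save() needs the file= argument; load() restores objects by their names.
rdata_path <- file.path(tempdir(), "iris.RData")
save(iris.df, file = rdata_path)
rm(iris.df)
load(rdata_path)                  # iris.df exists again
print(names(iris.df))
```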
  • 12. R data structures. Scalar: object with a single value, either numeric or character (in R this is simply a vector of length one). Vector: sequence of values of a single type (numeric, character, logical, ...), which may include NA. List: arbitrary collection of objects of any type; a very useful R object. Character: text, e.g., "this is some text". Factor: like a character vector, but only with values in predefined “levels”. Matrix: two-dimensional; all values must be of the same type (most often numeric). Dataframe: each column can be of a different class. Immutable dataframe: special dataframe used in the plyr package (idata.frame); it references the original dataframe for faster calculations. Function. Environment.
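A quick way to get a feel for these structures is to build one of each. The object names below are made up for illustration.

```r
x_scalar <- 42                        # a "scalar" is a vector of length one
x_vector <- c(1.5, NA, 3)             # one type per vector; NA is allowed
x_mixed  <- c(1, "a")                 # mixing types coerces everything to character
x_char   <- "this is some text"
x_factor <- factor(c("low", "high", "low"), levels = c("low", "high"))
x_matrix <- matrix(1:6, nrow = 2)     # 2 x 3; every cell has the same type
x_list   <- list(n = 1, s = "a", v = x_vector)        # anything goes
x_df     <- data.frame(id = 1:3, name = c("a", "b", "c"))  # columns may differ in class
x_fun    <- function(v) mean(v, na.rm = TRUE)         # functions are objects too
x_env    <- new.env()                 # environments hold name-to-object bindings

str(x_list)
str(x_df)
print(x_fun(x_vector))                # 2.25
```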
  • 13. Exploring dataframes. str(dataframe) gives column formats and dimensions. head(dataframe) and tail(dataframe) give the first and last 6 rows. names(dataframe) gives column names. row.names(dataframe) gives row names. attributes(dataframe) gives column names, row names, and object class. summary(dataframe) gives a lot of good information. Make sure variables are in the appropriate form: character/string, numeric, factor, integer, logical. Make sure mins, maxs, means, etc. seem right. Make sure you don’t have typing errors, so that Premna and premna do not end up as two separate factor levels. Use unique(iris$species) to see all unique values of a column, or use levels(spider$species) to see the different levels of a factor.
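The same checks can be run on R's built-in `iris` data as a sketch. Note that the built-in data set names its column `Species` with a capital S, unlike the lowercase column in the slide's own files; the small `species` factor is made up to show the typo problem.

```r
str(iris)                  # column types and dimensions
first_rows <- head(iris)   # first 6 rows
last_rows  <- tail(iris, 3)  # last 3 rows (default is 6)
names(iris)                # column names
summary(iris)              # min, max, mean, counts per level

unique(iris$Species)       # all distinct values in a column
levels(iris$Species)       # levels of a factor column

# A capitalization typo shows up as an extra factor level.
species <- factor(c("Premna", "premna", "Premna"))
print(nlevels(species))            # 2
species_fixed <- factor(tolower(as.character(species)))
print(nlevels(species_fixed))      # 1
```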
  • 14. To attach or not to attach... that is the question. Some like to use ‘attach’ to make dataframe variables accessible by name within the R session. Generally, ‘attach’ is frowned upon by R junkies. Instead, use dataframe$y, or data=dataframe, or dataframe[, "y"], or dataframe[, 2]. To detach the object, use detach(dataframe). I recommend: do not use attach, but do what you want.
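The alternatives to attach all reach the same column. Here is a small sketch with a made-up dataframe; `with()` is a further base R option that the slide does not mention.

```r
df <- data.frame(x = 1:5, y = c(2, 4, 6, 8, 10))   # made-up example data

m1 <- mean(df$y)            # by name with $
m2 <- mean(df[, "y"])       # by column name in brackets
m3 <- mean(df[, 2])         # by column position
fit <- lm(y ~ x, data = df) # modelling functions take data=
m4 <- with(df, mean(y))     # evaluates the expression inside df (not on the slide)

print(c(m1, m2, m3, m4))    # all 6
print(coef(fit))
```

None of these leave anything attached to the search path, so there is nothing to detach afterwards.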
  • 15. R Packages. 3,262 packages!!!! Packages are extensions written by anyone for any purpose. Install a package once with install.packages("packagename"), then load it in each session with library(packagename) or require(packagename). Use ?functionname for help on any function in base R or in a loaded package. In RStudio, just press tab when inside the parentheses after the function name to see the function's options! Explore packages at the CRAN site: http://cran.r-project.org/web/packages/ Inside-R package reference: http://www.inside-r.org/packages
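A sketch of the install-then-load pattern, written so that it never installs anything on its own. The package name `plyr` is only an illustration, and `notARealPackage` is a deliberately nonexistent name.

```r
pkg <- "plyr"   # illustration only; any CRAN package name works here

if (requireNamespace(pkg, quietly = TRUE)) {
  library(pkg, character.only = TRUE)   # load it for this session
} else {
  message("Not installed. Run install.packages(\"", pkg, "\") once, ",
          "then library(", pkg, ") in each session.")
}

# require() returns TRUE or FALSE instead of stopping with an error,
# whereas library() stops if the package is missing.
loaded_stats <- require(stats)
loaded_fake  <- suppressWarnings(
  require("notARealPackage", character.only = TRUE, quietly = TRUE)
)
print(c(loaded_stats, loaded_fake))

# Help pages: ?mean or help("mean") in an interactive session.
```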
  • 16. Data manipulation. Packages: plyr, data.table, doBy, sqldf, reshape2, and more. Comparison of packages, modified from code on the Recipes, scripts and Genomics blog: https://gist.github.com/878919 data.table is by far the fastest!!! BUT plyr may win on ease of use and flexibility. See for yourself. Also, see the examples in the tutorial code for the reshape2 package for neat data manipulation tricks.
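For a baseline before trying those packages, here is the typical group-and-summarise task in base R only. This is a sketch on the built-in `iris` data, not the comparison code from the gist; the plyr call in the comment is shown for orientation and is not run.

```r
# Mean sepal length per species, two base R ways.
by_species <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(by_species)

means <- tapply(iris$Sepal.Length, iris$Species, mean)
print(means)

# With plyr installed, a roughly equivalent call would be:
#   plyr::ddply(iris, "Species", plyr::summarise, mean_len = mean(Sepal.Length))
```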
  • 17. Visualizations. A few different approaches: base graphics, lattice graphics, grid graphics, and ggplot2 graphics. Further reading: http://www.slideshare.net/dataspora/a-survey-of-r-graphics An example: [figure not preserved in this transcript]
  • 18. More on ggplot2 graphics. There are classes taught by Hadley Wickham here at Rice if you want to learn more! Data visualization (Stat645): http://had.co.nz/stat645/ Statistical computing (Stat405): http://had.co.nz/stat405/ Hadley’s website is really helpful: http://had.co.nz/ggplot2/ The ggplot2 Google Groups site: https://groups.google.com/forum/#!forum/ggplot2
  • 19. QUICK RSTUDIO RUN-THROUGH. Keyboard shortcuts!! http://www.rstudio.org/docs/using/keyboard_shortcuts
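As a stand-in for the visualization approaches listed on slide 17, here is a minimal base graphics sketch on the built-in `iris` data. It is not the plot from the original deck; it writes to a temporary PDF so it also runs without a display.

```r
plot_path <- tempfile(fileext = ".pdf")   # throwaway output file
pdf(plot_path)

# Scatterplot coloured by species (Species is a factor in the built-in data).
plot(iris$Sepal.Length, iris$Petal.Length,
     col = as.integer(iris$Species), pch = 19,
     xlab = "Sepal length (cm)", ylab = "Petal length (cm)",
     main = "Iris: petal vs. sepal length")
legend("topleft", legend = levels(iris$Species), col = 1:3, pch = 19)

dev.off()                                 # close the device to finish the file
print(plot_path)
```

In an interactive session you can drop the `pdf()` and `dev.off()` lines to draw straight to the plot window.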
  • 20. USE CASE HERE [see intro_usecase.R file]

Editor's notes

  1. header=T (short for header=TRUE) means the first row contains variable names
  2. Some numbers are actually factors: think of 0/1 for dead/alive, or zip codes (what would an average zip code mean?)
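The second note can be checked directly. In this sketch the survival codes and zip codes are made-up values.

```r
survival <- c(0, 1, 1, 0, 1)         # made-up 0/1 codes for dead/alive
zipcode  <- c(77005, 77030, 77005)   # made-up zip codes

print(mean(zipcode))   # R computes it happily, but the number is meaningless

# Converting to factors makes R treat the values as categories.
status <- factor(survival, levels = c(0, 1), labels = c("dead", "alive"))
print(table(status))

zip_f <- factor(zipcode)
print(nlevels(zip_f))  # 2 distinct zip codes
# mean(zip_f) would now return NA with a warning, which is the right reaction.
```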