2. Today
Title: R Language and Analytics as a Profession
Agenda:
1 hour hands-on R coding (beginner/intermediate level)
1 hour talk on Analytics as a Profession
Speaker: Mr. Ajay Ohri
Venue: Room No. 511, Department of Management Studies (DMS), Vishwakarma Bhawan, IIT Delhi – 110016
Date & Time: 14 June 2014, Saturday, 14:30 – 16:30 Hrs
Directions to Reach the Venue:
Option A: Get down at Malviya Nagar or Hauz Khas Metro Station and ask an auto to take you near Katwariya Sarai.
Option B: Get down at Hauz Khas Metro Station. Take a bus (511 or 511A) or a battery-powered rickshaw to Sanskrit Vidya Peeth (near Katwariya Sarai).
Or take Bus 764 and get down at IIT Hostel Gate. Walk a short distance to reach DMS.
PLEASE NOTE-
DMS or IIT Delhi has no role in organizing this event.
3. New Delhi R Meetup
http://www.meetup.com/New-Delhi-R-UseR-Group/
298 members
2 Years
Sponsored
Non Commercial Group Only
5. R from the Console
● only a limited number of lines of code can be submitted at a time
● one graph can be viewed at a time
● best for either beginners or dedicated command-line users
● no syntax prompting
● help is in a separate window
6. R Syntax- most important
- # adding a hash (#) comments out the rest of the line
comments make your code more readable
- ?keyword opens the help page for that keyword (searches loaded packages)
- ??keyword searches for that keyword in all the installed documentation
Assignment
● objectname1 = subset(df, df$var1 > X)
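The syntax points above can be tried out with a minimal sketch like the one below. The data frame `df`, its columns and the cutoff value are hypothetical, made up only for illustration.

```r
# Everything after a hash is ignored by R.
x <- c(3, 8, 15)   # `<-` is the conventional assignment operator
y = x * 2          # `=` also assigns at the top level

# Help lookups (run these interactively):
# ?mean        opens the help page for mean()
# ??"median"   searches all installed documentation

# Hypothetical data frame, used only to illustrate assigning a subset.
df <- data.frame(var1 = c(1, 5, 9), var2 = c("a", "b", "c"))
objectname1 = subset(df, var1 > 4)   # keep rows where var1 exceeds 4
print(objectname1)
```

Inside subset() the column can be named directly (var1), so the df$ prefix is optional.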
7. My first 25 R Commands
What’s here?
● ls()
● getwd()
● setwd()
● dir()
● rm()
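A short sketch of the "What's here?" commands, assuming a throwaway session. The object names tmp_a and tmp_b are hypothetical, and the working directory is switched to R's temporary folder so nothing on disk is changed.

```r
tmp_a <- 1          # hypothetical objects, created only for this demo
tmp_b <- "two"

ls()                # names of objects in the workspace
rm(tmp_a)           # remove one object
ls()                # tmp_a is gone, tmp_b remains

old_dir <- getwd()  # remember the current working directory
setwd(tempdir())    # move to a scratch directory
dir()               # list files in the working directory
setwd(old_dir)      # go back to where we started
```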
What’s in my object?
● str()
● class()
● dim()
● length()
● names()
● nrow() # and ncol()
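The inspection commands above, applied to a small hypothetical data frame (the name df and its columns are assumptions for illustration):

```r
# Hypothetical data frame with 4 rows and 2 columns.
df <- data.frame(id = 1:4, score = c(2.5, 3.0, 4.5, 5.0))

str(df)      # compact overview: column names, types, first values
class(df)    # "data.frame"
dim(df)      # 4 2  (rows, columns)
length(df)   # 2    (for a data frame, length is the number of columns)
names(df)    # "id" "score"
nrow(df)     # 4
ncol(df)     # 2
```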
How do I select or change stuff
● data.frame.name$variable
● data.frame[row,column]
● subset(df, df$var1 > X & df$var2 < Y | df$var3 == "text")
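A minimal sketch of selecting and changing values. The data frame, the thresholds X and Y and all values are hypothetical. Note that & is evaluated before |, so the condition reads as (var1 > X and var2 < Y) or var3 == "text".

```r
# Hypothetical data, made up for illustration.
df <- data.frame(
  var1 = c(10, 20, 30, 40),
  var2 = c(5, 1, 7, 2),
  var3 = c("text", "other", "other", "text"),
  stringsAsFactors = FALSE
)
X <- 15
Y <- 6

df$var1          # one column as a vector
df[2, 1]         # row 2, column 1
df[, "var2"]     # all rows of one column

# Rows where (var1 > X and var2 < Y) or var3 equals "text".
picked <- subset(df, var1 > X & var2 < Y | var3 == "text")
print(picked)

# Changing a value in place: set var2 to 0 in row 3.
df[3, "var2"] <- 0
```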
Function
● function1=function(x,y,z){x^2+2*x*y+(z/10)-23}
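The function above written out and called once. The argument values are arbitrary, chosen so the arithmetic is easy to check by hand.

```r
# The value of the last expression in the body is returned automatically.
function1 <- function(x, y, z) {
  x^2 + 2 * x * y + (z / 10) - 23
}

# 3^2 + 2*3*2 + 10/10 - 23 = 9 + 12 + 1 - 23
function1(3, 2, 10)   # -1
```

Multiplication must always be written with *, since R does not accept 2x as shorthand for 2*x.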
Math
● log(x)
● mean(x)
● sd(x)
● median(x)
● exp(x)
Packages
● install.packages("FOO")
● library(FOO)
● update.packages()
What can I do?
● read.table()
● write.table()
● summary()
● table()
● plot()
● hist()
● boxplot()
● library(Hmisc) describe()
● library(Hmisc) summarize()
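A sketch of the base R commands in this list, assuming a small hypothetical data set written to a temporary file and plots sent to a temporary PDF. The Hmisc functions are left out because the package may not be installed.

```r
# Hypothetical data, made up for illustration.
scores <- data.frame(group = c("a", "a", "b", "b", "b"),
                     value = c(4, 6, 5, 9, 7))

# Write to a tab-separated file and read it back.
path <- tempfile(fileext = ".txt")
write.table(scores, path, sep = "\t", row.names = FALSE, quote = FALSE)
loaded <- read.table(path, header = TRUE, sep = "\t")

summary(loaded$value)   # min, quartiles, mean, max
table(loaded$group)     # counts per group

# Send plots to a temporary PDF so the sketch also runs without a display.
pdf(tempfile(fileext = ".pdf"))
plot(loaded$value)                       # scatter plot against index
hist(loaded$value)                       # histogram
boxplot(value ~ group, data = loaded)    # one box per group
dev.off()
```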
15. My next 25 R Commands
What’s missing?
● is.na()
● na.omit()
● na.rm=T
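A small sketch of handling missing values, using a hypothetical numeric vector:

```r
x <- c(4, NA, 10, NA, 1)   # hypothetical data with two missing values

is.na(x)                   # FALSE TRUE FALSE TRUE FALSE
sum(is.na(x))              # 2 missing values
na.omit(x)                 # drops the NA entries

mean(x)                    # NA, because missing values propagate
mean(x, na.rm = TRUE)      # 5, computed on the non-missing values
```

Writing na.rm = TRUE in full is safer than na.rm = T, because T is an ordinary variable that can be overwritten.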
Operators
● diff
● lag
● cumsum
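A sketch of the three operators on a hypothetical sales vector. One point worth knowing: base R's lag() works on time series objects and shifts the time index rather than moving values inside a plain vector.

```r
sales <- c(10, 12, 15, 11)   # hypothetical values

diff(sales)     # 2 3 -4   (change from one element to the next)
cumsum(sales)   # 10 22 37 48   (running total)

# lag() expects a time series; k = -1 moves the series one period later.
ts_sales <- ts(sales, start = 1)
lagged <- stats::lag(ts_sales, k = -1)
time(lagged)    # 2 3 4 5   (same values, shifted time index)
```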
Data Mining
● kmeans
● arules::apriori
● tm::tm_map
References-
http://www.statmethods.net/advstats/cluster.html
http://cran.r-project.org/web/packages/arulesViz/vignettes/arulesViz.pdf
http://cran.r-project.org/web/packages/tm/vignettes/tm.pdf
http://www.rdatamining.com/examples/association-rules
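Of the data mining commands above, kmeans ships with base R, so it can be sketched without installing anything. The example below uses the built-in iris data. The choice of 3 clusters and the seed are assumptions made for illustration. arules and tm are not shown here since they need separate installation.

```r
set.seed(42)   # k-means starts from random centers; fix the seed to repeat results

# Cluster the four numeric measurements into 3 groups.
fit <- kmeans(iris[, 1:4], centers = 3, nstart = 10)

fit$centers                        # one row of means per cluster
table(fit$cluster, iris$Species)   # compare clusters with the known species
```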
Modeling
● cor(x)
● lm(y ~ x)
● car::vif(a)
● car::outlierTest(a)
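A minimal modeling sketch on the built-in mtcars data. The choice of variables is an assumption for illustration. vif() and outlierTest() come from the car package, so they appear only as comments in case it is not installed.

```r
# Correlation matrix of three numeric columns.
cor(mtcars[, c("mpg", "wt", "hp")])

# Linear regression: fuel economy explained by weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)
summary(fit)   # coefficients, R-squared, p-values
coef(fit)      # just the estimated coefficients

# With the car package installed, diagnostics on the fitted model:
# car::vif(fit)           # variance inflation factors
# car::outlierTest(fit)   # test for outlying observations
```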
System
● system.time()
● Sys.Date()
● Sys.time()
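A quick sketch of the system commands. The loop being timed is arbitrary.

```r
# Time an expression; the result has user, system and elapsed seconds.
timing <- system.time(for (i in 1:1e5) sqrt(i))
timing["elapsed"]

today <- Sys.Date()   # current date, class "Date"
now   <- Sys.time()   # current date and time, class "POSIXct"
print(today)
print(now)
```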
What more can I do?
● b=ajay[sample(nrow(ajay),replace=F,size=0.05*nrow(ajay)),]
● png("graph.png") # write plot to a PNG file
● dev.off() # close the file when the plot is done
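A sketch of both ideas: drawing a 5% random sample of rows, and saving a plot to a PNG file. The data frame ajay is a hypothetical stand-in with 200 rows, and the file goes to R's temporary folder. The PNG step is skipped if this R build has no PNG support.

```r
set.seed(1)   # make the random sample repeatable

# Hypothetical data frame standing in for the real one.
ajay <- data.frame(id = 1:200, value = rnorm(200))

# 5% random sample of rows, without replacement.
b <- ajay[sample(nrow(ajay), size = 0.05 * nrow(ajay), replace = FALSE), ]
nrow(b)       # 10

# Save a plot to a PNG file: open the device, plot, then close it.
if (capabilities("png")) {
  out <- file.path(tempdir(), "graph.png")
  png(out)
  hist(ajay$value)
  dev.off()   # nothing is written to disk until the device is closed
}
```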
Data Manipulation
● as.* conversion functions (e.g. as.numeric, as.character)
● substr
● nchar
● paste
● difftime
● strptime
● lubridate::mdy
● apply functions
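A sketch of the base R data manipulation functions in this list, with hypothetical inputs. lubridate::mdy is omitted because it needs the lubridate package.

```r
as.numeric("42") + 1                 # 43: text converted to a number

code <- "DEL-2014-06"                # hypothetical string
substr(code, 1, 3)                   # "DEL": characters 1 to 3
nchar(code)                          # 11: number of characters
paste("R", "Meetup", sep = " ")      # "R Meetup"

# Parse date-times from text, then take their difference.
start <- strptime("2014-06-14 14:30", "%Y-%m-%d %H:%M", tz = "UTC")
end   <- strptime("2014-06-14 16:30", "%Y-%m-%d %H:%M", tz = "UTC")
gap   <- difftime(end, start, units = "hours")
gap                                  # Time difference of 2 hours

# apply family: run a function over each element of a list.
means <- sapply(list(a = 1:3, b = 4:6), mean)
means                                # a = 2, b = 5
```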
16. My favorite 15 R Packages
Data Mining
● tm
● arulesViz
● forecast
GUIs
● rattle
● Rcmdr
○ epack plugin
○ KMggplot2 plugin
● Deducer
Visualization
● ggplot2
● ggmap
Data Handling
● Dates- lubridate
● Analysis - Hmisc
● RCurl
● XML
● jsonlite
17. Some more R packages
slidify
http://slidify.org/
quantmod
http://www.quantmod.com/
ROCR
http://rocr.bioinf.mpi-sb.mpg.de/
rCharts
http://rcharts.io/
18. My favorite R documentation
CRAN Views
http://cran.r-project.org/web/views/
R Documentation
http://www.rdocumentation.org/
Inside R
http://www.inside-r.org/
33. Teaching R
Multiple ways to do the same thing in R - Resolve confusion
GUIs can be a shortcut initially - Selective introduction to packages
Command line and ?help will be needed later on - Emphasize documentation
Pace of learning should match the audience
Huge scope, hence keep it pertinent to needs
Analytics is not Statistics
R is more than a computer language or syntax
Projects are the best teachers
34. R Project for Researchers
● creating packages for analytics relevant to industry
○ e.g. telecom churn, RFM, LTV, retail
● any takers?