R for Pirates
     Mandi Walls
      @lnxchk
 EscConf, Boston, MA
  October 27, 2011
whoami

• stats misfit
• R tinkerer
• large-farm runner
• not a professional statistician :D
What is R
• Scripting language for stats work
• Inspired by earlier S (for statistics)
  developed at AT&T
• FOSS
• Syntax inherits through Algol family, so
  looks somewhat like C/C++
What Does R Do?

•   Manipulate data

•   Complex Modeling and
    Computation

•   Graphics and
    Visualization
Why R?


• WHY NOT!?
But Other Math Stuff!
•   Mathematica
•   MatLab
•   Minitab
•   MAPLE
•   Excel (yes. shutup h8rs. ask your CFOs what they
    use)
•   R provides sophisticated statistical and modeling
    capabilities, and is extensible through your own code
Get R


• Available for Linux, Mac, Windows
• http://www.r-project.org/
Fire!

•   R console on Mac

•   Interactive interpreter
    for your R needs

•   Can also run from the
    command line: R
R Basics
•   R considers all elements
    to be vectors

•   A single number is a
    one-element vector

•   Use <- for assignment

•   Use c() to concatenate
    values into a vector
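A quick sketch of those basics, runnable in any R console. The variable names are just for illustration, and the numbers are borrowed from the load averages that show up later in the deck.

```r
# Assign a single number; it is still a vector of length 1
x <- 8
length(x)        # 1

# Build a longer vector with c(), then concatenate one more value onto it
loads <- c(3.79, 3.11, 2.94)
loads <- c(loads, 4.81)
length(loads)    # 4
loads[4]         # 4.81
```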
Let’s see that again
Practice Datasets


•   data()

•   shows the sample sets
    included with your R
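As a small example, assuming a stock R install where the `datasets` package is available, one of the bundled sets can be loaded by name. `mtcars` is used here only because it ships with base R.

```r
# data() with no arguments lists the bundled datasets (most useful interactively).
# Passing a name loads that dataset into the workspace.
data(mtcars)
head(mtcars, 3)   # first three rows
dim(mtcars)       # 32 rows, 11 columns
```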
Functions

•   Looks familiar!

•   Let’s see one!

•   “evencount” counts the number of even ints in a vector
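The function body from the slide is not in the text, so here is a minimal sketch of what an `evencount` might look like. The implementation is an assumption, not the original code.

```r
# Count the even integers in a vector (sketch; not the slide's original code)
evencount <- function(x) {
  k <- 0
  for (n in x) {
    if (n %% 2 == 0) k <- k + 1   # %% is the modulo operator
  }
  k                               # the last expression is the return value
}

evencount(c(1, 2, 3, 4, 6))       # 3

# The same count without a loop, using vectorized comparison
sum(c(1, 2, 3, 4, 6) %% 2 == 0)   # 3
```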
Datatypes
•   Vectors, the important ones

•   Scalars are really single-element vectors

•   Character strings

•   Matrices, rectangular arrays of numbers

•   Lists

•   Tables, useful for data transitions and temp work
Vectors
•   R’s most-used data structure

•   All elements in a vector must have the same mode
    or data type

•   To add values to a vector, you concatenate into it
    with the c() function

•   Many mathematical functions can be applied to a
    whole vector; vectors can also be traversed like arrays

•   Index starts at 1, not 0!
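A short sketch of those vector rules in action. The values are made up for illustration.

```r
v <- c(2, 4, 6)
v[1]          # 2: indexing starts at 1
v * 10        # arithmetic applies to every element: 20 40 60
mean(v)       # 4

# Mixing modes coerces every element to a single mode
mixed <- c(1, "a")
mode(mixed)   # "character"

# Traverse the vector like an array
total <- 0
for (i in seq_along(v)) total <- total + v[i]
total         # 12
```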
Scalars

•   One-element vectors

    > x <- 8

    > x[1]

    [1] 8

•   also climb your rigging


Character Strings
•   Single-element vectors
    with mode character

    > y <- "abc"
    > length(y)
    [1] 1
    > mode(y)
    [1] "character"

•   Can do normal string
    things, like

    > t <- paste("yo","dawg")
    > t
    [1] "yo dawg"
    > u <- strsplit(t,"")
    > u
    [[1]]
    [1] "y" "o" " " "d" "a" "w" "g"
Matrices
•   Two-dimensional array

    > m <- rbind(c(1,4),c(2,2))

    > m
           [,1] [,2]
    [1,]      1    4
    [2,]      2    2
    > m[1,2]
    [1] 4
    > m[1,]
    [1] 1 4
Lists
•   Contain elements of different types

•   Have a particular syntax

    > x <- list(u=2, v="abc")
    > x
    $u
    [1] 2

    $v
    [1] "abc"

    > x$u
    [1] 2
Data Frames
•   Matrices are limited to only a single type for all elements
•   A data frame can contain different types of data; it can be read
    in from a file or created on the fly
    > df <- data.frame(list(kids=c("Olivia","Madison"),ages=c(10,8)))

    > df

           kids ages

    1   Olivia    10

    2 Madison      8

    > df$ages

    [1] 10    8
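To illustrate the "read in from a file" route, here is a sketch that writes the same two rows to a temporary CSV and reads them back. The temporary file and the name `kids_df` are assumptions for the example.

```r
# Write a tiny CSV to a temporary file, then read it back as a data frame
path <- tempfile(fileext = ".csv")
writeLines(c("kids,ages", "Olivia,10", "Madison,8"), path)

kids_df <- read.csv(path)
str(kids_df)          # one text column, one numeric column
mean(kids_df$ages)    # 9
```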
Putting R to Work

•   Read in a log file:
    access <- read.table("access.log", header=FALSE)
    > head(access)
                V1 V2 V3                    V4     V5                          V6  V7   V8
    1 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500]  GET /menu/menu.js HTTP/1.1 401  401
    2 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500]  GET /menu/menu.js HTTP/1.1 200 1970
    3 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500] GET /menu/menu.css HTTP/1.1 200 2258
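Here is a self-contained sketch of the same read, using inline text instead of a file. It assumes the log is in Apache common log format with the request in double quotes, which is why the request lands in a single column (V6) and the return code in V7.

```r
# Three log lines matching the head() output above (quoting is an assumption)
lines <- c(
  '192.168.1.10 - - [23/Oct/2011:07:03:33 -0500] "GET /menu/menu.js HTTP/1.1" 401 401',
  '192.168.1.10 - - [23/Oct/2011:07:03:33 -0500] "GET /menu/menu.js HTTP/1.1" 200 1970',
  '192.168.1.10 - - [23/Oct/2011:07:03:33 -0500] "GET /menu/menu.css HTTP/1.1" 200 2258'
)

# read.table splits on whitespace and keeps quoted strings together
access <- read.table(text = lines, header = FALSE, stringsAsFactors = FALSE)
ncol(access)        # 8 columns, V1 through V8
table(access$V7)    # count of requests per return code
sum(access$V8)      # total bytes sent: 4629
```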
Fun with Plots
• This plot series is going to
   make use of the “return
   codes” from the access log

• We’ll do a series of plots
   that gradually get more
   sophisticated

• This is a basic histogram of
   the data; it’s not much fun
Barplot
barplot(table(access[,7]))
Barplot v2
barplot(table(access[,7]),ylab="Number of Pages",xlab="Return Code",main="Plot of Return Codes")
Barplot v3
barplot(table(access[,7]),ylab="Number of Pages",xlab="Return Code",
main="Plot of Return Codes", col=heat.colors(length(table(access[,7]))))
Barplot v4




Source: wikipedia, http://en.wikipedia.org/wiki/Bar_%28establishment%29
Writing Graphical Output to Files
•   Set up the output target by calling a graphics function:

•   pdf(), png(), jpeg(), etc

•   jpeg("/var/www/images/returncodes-date.jpg")

•   Call the plot function you have chosen, then call dev.off()

•   Can be used in batch mode to create graphics from your data
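The open-device, plot, close-device sequence might look like the sketch below. It uses pdf() and a temporary file so it runs anywhere; the output path and the sample return codes are made up for the example.

```r
# Open a graphics device that writes to a file (hypothetical temp path)
out <- tempfile(fileext = ".pdf")
codes <- table(c(200, 200, 401, 404, 200))   # made-up return codes

pdf(out)                                     # 1. set up the output target
barplot(codes, xlab = "Return Code", ylab = "Number of Pages")  # 2. plot
invisible(dev.off())                         # 3. close the device to finish the file

file.exists(out)                             # TRUE once dev.off() has run
```

Swapping pdf() for png() or jpeg() changes only the first call, provided the R build supports that device.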
Shopping is Hard, Let’s Do Math
•   Read in some load averages (one-min)

    loadavg<-read.table("load_avg.txt")

    head(loadavg)
        V1
    1 3.79
    2 3.11
    3 2.94
    4 4.81
Summary Stats
•   Summarize the data with one function call

•   Gives the min, max, mean, median, and quartiles
    summary(loadavg)
           V1
     Min.   :0.760
     1st Qu.:1.390
     Median :1.970
     Mean   :2.302
     3rd Qu.:3.080
     Max.   :5.070
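The same numbers can be pulled out one at a time. This sketch uses only the first four load averages shown earlier, so its results differ from the full-file summary above.

```r
loadavg <- c(3.79, 3.11, 2.94, 4.81)    # first four readings only

summary(loadavg)                        # min, quartiles, median, mean, max
quantile(loadavg, c(0.25, 0.5, 0.75))   # just the quartiles
median(loadavg)                         # 3.45
range(loadavg)                          # 2.94 4.81
```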
Summary Stats as Boxplot
Same Thing, 3 Datacenters
    > cpu<-read.table("cpu")

    > head(cpu)

        V1  V2

    1 3.78 smq

    2 2.57 smq

    3 3.69 smq

    4 0.86 smq

•   Looks like there are outliers. That could spell
    trouble! You found them with R awesomeness.
    Hooray!

boxplot(cpu[,1] ~ cpu[,2], xlab="Load Average at Time t, by Datacenter", ylab="One-Minute Load Average",
        main="Box Plot of One-Minute Load Average, FEs", col=topo.colors(3))
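The points a boxplot draws as outliers can also be listed as numbers with boxplot.stats(). This is a sketch on made-up readings; "smq" is the datacenter label from the slide, while "dc2" and all the values are hypothetical.

```r
# Made-up load averages for two datacenters ("dc2" is a hypothetical label)
cpu <- data.frame(
  V1 = c(2.1, 2.4, 2.2, 2.6, 2.3, 9.5,  1.0, 1.2, 1.1, 1.3, 1.2, 1.1),
  V2 = rep(c("smq", "dc2"), each = 6)
)

# For each datacenter, list the values beyond 1.5 * IQR from the hinges,
# which is the rule boxplot() uses for the points it draws individually
outliers <- lapply(split(cpu$V1, cpu$V2), function(v) boxplot.stats(v)$out)
outliers$smq    # 9.5
outliers$dc2    # numeric(0): nothing unusual
```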
Running R in Your Workflow
  •   The little bit of boxplotting we did earlier, in a script:
[mandi@mandi ~]$ cat sample.R
#!/usr/bin/env Rscript
cpu<-read.table("cpu")
jpeg("./sample.jpg")
boxplot(cpu[,1] ~ cpu[,2], xlab="Load Average at Time t, by Datacenter",
        ylab="One-Minute Load Average",
        main="Box Plot of One-Minute Load Average, FEs", col=heat.colors(3))
dev.off()
[mandi@mandi ~]$ Rscript sample.R > /dev/null
[mandi@mandi ~]$ ls -l sample.jpg
-rw-rw-r-- 1 mandi staff 20137 Oct 24 20:44 sample.jpg
Hey!


•   I made a graph with a
    script!
What Else?
•   R can read data input from a variety of files with regular
    formats

•   R can also fetch data from the internet using the url()
    function

•   R has a number of functions available for dealing with
    reading data, creating data frames or other structures, and
    converting string text into numerical data modes

•   Extended packages provide support for structured data
    formats like JSON.
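As one small illustration of converting string text into numbers, here is a sketch on made-up readings. The commented-out url() line uses a hypothetical address and is not meant to be run as-is.

```r
# Readings that arrived as text, with one unparseable entry (made-up data)
raw <- c("3.79", "3.11", "n/a", "4.81")
mode(raw)                                   # "character"

# as.numeric() converts what it can; "n/a" becomes NA (with a warning)
vals <- suppressWarnings(as.numeric(raw))
mode(vals)                                  # "numeric"
mean(vals, na.rm = TRUE)                    # average of the three valid readings

# Fetching over the network follows the same pattern (hypothetical URL):
# remote <- read.csv(url("http://example.com/load_avg.csv"))
```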
References
• http://www.slideshare.net/dataspora/an-interactive-introduction-to-r-programming-language-for-statistics
• http://www.harding.edu/fmccown/R/
• Art of R Programming, Norman Matloff, Copyright
  2011 No Starch Press
• Statistical Analysis with R, John M. Quick, Copyright
  2011 Packt Publishing

R for Pirates. ESCCONF October 27, 2011

  • 1. R for Pirates Mandi Walls @lnxchk EscConf, Boston, MA October 27, 2011
  • 2. whoami • stats misfit • R tinkerer • large-farm runner • not a professional statistician :D
  • 3. What is R • Scripting language for stats work • Inspired by earlier S (for statistics) developed at AT&T • FOSS • Syntax inherits through Algol family, so looks somewhat like C/C++
  • 4. What Does R Do? • Manipulate data • Complex Modeling and Computation • Graphics and Visualization
  • 6. But Other Math Stuff! • Mathematica • MatLab • Minitab • MAPLE • Excel (yes. shutup h8rs. ask your CFOs what they use) • R provides sophisticated statistical and modeling capabilities, and is extendible through your own code
  • 7. Get R • Available for Linux, Mac, Windows • http://www.r-project.org/
  • 8. Fire! • R console on Mac • Interactive interpreter for your R needs • Can also run from the command line: R
  • 9. R Basics • R considers all elements to be vectors • A single number is a one-element vector • Use <- for assignment • Use c() to concatenate values into a vector
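The basics on this slide can be tried directly at the console. This is a minimal sketch; the variable name `loads` and its values are made up for illustration.

```r
# Assign a single number; R stores it as a one-element vector.
x <- 8
length(x)        # 1

# Build a longer vector by concatenating values with c().
loads <- c(3.79, 3.11, 2.94)

# c() also appends: concatenate the old vector with a new value.
loads <- c(loads, 4.81)
length(loads)    # 4
```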
  • 11. Practice Datasets • data() • shows the sample sets included with your R
  • 12. Functions • Looks familiar! • Let’s see one! • “evencount” counts the number of even ints in a vector
  • 13.
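The code for `evencount` did not survive in this transcript. What follows is a hypothetical reconstruction, assuming the function simply loops over a vector and counts the values divisible by 2; the original slide may have looked different.

```r
# Hypothetical reconstruction: count the even integers in vector x.
evencount <- function(x) {
  k <- 0
  for (n in x) {
    if (n %% 2 == 0) k <- k + 1   # %% is R's modulo operator
  }
  k                               # the last expression is the return value
}

evencount(c(1, 2, 4, 7, 10))
```

Because R arithmetic is vectorized, the same count could also be written as `sum(x %% 2 == 0)`.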
  • 14. Datatypes • Vectors, the important ones • Scalars are really single-element vectors • Character strings • Matrices, rectangular arrays of numbers • Lists • Tables, useful for data transitions and temp work
  • 15. Vectors • R’s most-used data structure • All elements in a vector must have the same mode or data type • To add values to a vector, you concatenate into it with the c() function • Many mathematical functions can be performed on a vector, they can also be traversed like arrays • Index starts at 1, not 0!
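A short sketch of the vector behavior described on this slide, using made-up values:

```r
v <- c(10, 20, 30)

v[1]                 # first element; indexing starts at 1
doubled <- v * 2     # arithmetic is applied to every element
mean(v)

# Mixing types coerces every element to a single mode.
mixed <- c(1, "a")
mode(mixed)          # "character"

# Vectors can be traversed like arrays.
for (i in seq_along(v)) {
  print(v[i])
}
```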
  • 16. Scalars • One-element vectors
        > x <- 8
        > x[1]
        [1] 8
        • also climb your rigging ©Disney.
  • 17. Character Strings • Single-element vectors with mode character • Can do normal string things, like
        > t <- paste("yo","dawg")
        > t
        [1] "yo dawg"
        > u <- strsplit(t,"")
        > u
        [[1]]
        [1] "y" "o" " " "d" "a" "w" "g"

        > y <- "abc"
        > length(y)
        [1] 1
        > mode(y)
        [1] "character"
  • 18. Matrices • Two-dimensional array
        > m <- rbind(c(1,4),c(2,2))
        > m
             [,1] [,2]
        [1,]    1    4
        [2,]    2    2
        > m[1,2]
        [1] 4
        > m[1,]
        [1] 1 4
  • 19. Lists • Contain elements of different types • Have a particular syntax
        > x <- list(u=2, v="abc")
        > x
        $u
        [1] 2

        $v
        [1] "abc"

        > x$u
        [1] 2
  • 20. Data Frames • Matrices are limited to only a single type for all elements • A data frame can contain different types of data, can be read in from a file or created in realtime
        > df <- data.frame(list(kids=c("Olivia","Madison"),ages=c(10,8)))
        > df
             kids ages
        1  Olivia   10
        2 Madison    8
        > df$ages
        [1] 10  8
  • 21. Putting R to Work • Read in a log file:
        > access <- read.table("access.log", header=FALSE)
        > head(access)
                    V1 V2 V3                    V4     V5                          V6  V7   V8
        1 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500]  GET /menu/menu.js HTTP/1.1 401  401
        2 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500]  GET /menu/menu.js HTTP/1.1 200 1970
        3 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500] GET /menu/menu.css HTTP/1.1 200 2258
  • 22. Fun with Plots • This plot series is going to make use of the “return codes” from the access log • We’ll do a series of plots that gradually get more sophisticated • This is a basic histogram of the data; it’s not much fun
  • 24. Barplot v2
        barplot(table(access[,7]),ylab="Number of Pages",xlab="Return Code",main="Plot of Return Codes")
  • 25. Barplot v3
        barplot(table(access[,7]),ylab="Number of Pages",xlab="Return Code",main="Plot of Return Codes", col=heat.colors(length(x)))
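The barplot calls rely on `table()` to turn a column of return codes into counts. The sketch below shows that step on made-up codes standing in for `access[,7]`. The `x` passed to `heat.colors()` in Barplot v3 is not defined anywhere in this transcript; the sketch assumes the intent was one color per bar, so it sizes the palette from the table itself.

```r
# Made-up return codes standing in for access[,7].
codes <- c(200, 200, 404, 200, 401, 404, 200)

# table() tallies how often each distinct value occurs.
counts <- table(codes)
counts

# Assumption: one color per bar, so ask for as many colors as there are codes.
bar_colors <- heat.colors(length(counts))
```

Passing these to `barplot(counts, col = bar_colors)` would then draw one bar per return code.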
  • 26. Barplot v4 Source: wikipedia, http://en.wikipedia.org/wiki/Bar_%28establishment%29
  • 27. Writing Graphical Output to Files • Set up the output target by calling a graphics device function: pdf(), png(), jpeg(), etc. • For example: jpeg("/var/www/images/returncodes-date.jpg") • Call the plot function you have chosen, then call dev.off() • Can be used in batch mode to create graphics from your data
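The open-device, plot, close-device sequence can be sketched as follows. This version uses `pdf()` rather than `jpeg()` because the PDF device is available on every R build; the output path and the return codes are made up for illustration.

```r
# Hypothetical output path in R's per-session temp directory.
out <- file.path(tempdir(), "returncodes.pdf")

# Made-up return codes standing in for the access log column.
codes <- c(200, 200, 404, 200, 401)

pdf(out)                                   # 1. open the file device
barplot(table(codes),                      # 2. draw the plot into it
        xlab = "Return Code",
        ylab = "Number of Pages",
        main = "Plot of Return Codes")
dev.off()                                  # 3. close the device so the file is written

file.exists(out)
```

Swapping `pdf(out)` for `png()` or `jpeg()` with a matching file extension should work the same way where those devices are available.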
  • 28. Shopping is Hard, Let’s Do Math • Read in some load averages (one-min)
        loadavg<-read.table("load_avg.txt")
        head(loadavg)
            V1
        1 3.79
        2 3.11
        3 2.94
        4 4.81
  • 29. Summary Stats • Summarize the data with one function call • Gives the min, max, mean, median, and quartiles
        summary(loadavg)
               V1
         Min.   :0.760
         1st Qu.:1.390
         Median :1.970
         Mean   :2.302
         3rd Qu.:3.080
         Max.   :5.070
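The numbers that `summary()` reports can also be computed one at a time. This sketch uses only the four load averages visible in the `head(loadavg)` output, so its results will not match the full-file summary on the slide.

```r
# The four values shown by head(loadavg); the full file has more rows.
loads <- c(3.79, 3.11, 2.94, 4.81)

summary(loads)                     # all six statistics in one call

lo  <- min(loads)
hi  <- max(loads)
avg <- mean(loads)
med <- median(loads)
qs  <- quantile(loads, c(0.25, 0.75))   # first and third quartiles
```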
  • 30. Summary Stats as Boxplot
  • 31. Same Thing, 3 Datacenters
        > cpu<-read.table("cpu")
        > head(cpu)
            V1  V2
        1 3.78 smq
        2 2.57 smq
        3 3.69 smq
        4 0.86 smq
        boxplot(cpu[,1] ~ cpu[,2], xlab="Load Average at Time t, by Datacenter", ylab="One-Minute Load Average", main="Box Plot of One-Minute Load Average, FEs", col=topo.colors(3))
        • Looks like there are outliers. That could spell trouble! You found them with R awesomeness. Hooray!
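To list the outliers rather than only see them on the plot, one option is `boxplot.stats()`, which applies the same rule a boxplot uses to draw its outlier points. This is a sketch on made-up data in the same two-column shape as the `cpu` file; the datacenter name `dfw` is hypothetical.

```r
# Made-up data shaped like the "cpu" file: load average, then datacenter.
cpu <- data.frame(
  V1 = c(1.0, 1.2, 1.1, 1.3, 9.5, 2.0, 2.1, 2.2, 1.9),
  V2 = c(rep("smq", 5), rep("dfw", 4))
)

# Split the loads by datacenter, then keep the points a boxplot would flag.
outliers <- lapply(split(cpu$V1, cpu$V2),
                   function(v) boxplot.stats(v)$out)
outliers
```

Here `smq` has one extreme value (9.5) and `dfw` has none, so the second list element comes back empty.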
  • 32. Running R in Your Workflow • The little bit of boxplotting we did earlier, in a script:
        [mandi@mandi ~]$ cat sample.R
        #!/usr/bin/env Rscript
        cpu<-read.table("cpu")
        jpeg("./sample.jpg")
        boxplot(cpu[,1] ~ cpu[,2], xlab="Load Average at Time t, by Datacenter", ylab="One-Minute Load Average", main="Box Plot of One-Minute Load Average, FEs", col=heat.colors(3))
        dev.off()
        [mandi@mandi ~]$ Rscript sample.R > /dev/null
        [mandi@mandi ~]$ ls -l sample.jpg
        -rw-rw-r-- 1 mandi staff 20137 Oct 24 20:44 sample.jpg
  • 33. Hey! • I made a graph with a script!
  • 34. What Else? • R can read data input from a variety of files with regular formats • R can also fetch data from the internet using the url() function • R has a number of functions available for dealing with reading data, creating data frames or other structures, and converting string text into numerical data modes • Extended packages provide support for structured data formats like JSON.
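A small sketch of reading regular text into a data frame and converting strings to numbers. It parses an inline string through `read.table()`'s `text` argument so that it runs without any file or network access; the sample rows and the `dfw` label are made up.

```r
# Inline stand-in for a whitespace-separated file (made-up rows).
raw <- "3.79 smq\n2.57 smq\n0.86 dfw"

# read.table() guesses each column's type: V1 numeric, V2 character.
cpu <- read.table(text = raw, header = FALSE, stringsAsFactors = FALSE)
str(cpu)

# Convert string text into numeric mode explicitly.
n     <- as.numeric("3.14")
codes <- as.numeric(c("401", "200"))
```

The same `read.table()` call accepts a file name or a `url()` connection in place of `text`.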
  • 35. References • http://www.slideshare.net/dataspora/an-interactive-introduction-to-r-programming-language-for-statistics • http://www.harding.edu/fmccown/R/ • Art of R Programming, Norman Matloff, Copyright 2011 No Starch Press • Statistical Analysis with R, John M. Quick, Copyright 2011 Packt Publishing
