R for Pirates
     Mandi Walls
      @lnxchk
 EscConf, Boston, MA
  October 27, 2011
whoami

• stats misfit
• R tinkerer
• large-farm runner
• not a professional statistician :D
What is R
• Scripting language for stats work
• Inspired by earlier S (for statistics)
  developed at AT&T
• FOSS
• Syntax inherits through Algol family, so
  looks somewhat like C/C++
What Does R Do?

•   Manipulate data

•   Complex Modeling and
    Computation

•   Graphics and
    Visualization
Why R?


• WHY NOT!?
But Other Math Stuff!
•   Mathematica
•   MatLab
•   Minitab
•   MAPLE
•   Excel (yes. shutup h8rs. ask your CFOs what they
    use)
•   R provides sophisticated statistical and modeling
    capabilities, and is extendible through your own code
Get R


• Available for Linux, Mac, Windows
• http://www.r-project.org/
Fire!

•   R console on Mac

•   Interactive interpreter
    for your R needs

•   Can also run from the
    command line: R
R Basics
•   R considers all elements
    to be vectors

•   A single number is a
    one-element vector

•   Use <- for assignment

•   Use c() to concatenate
    values into a vector
Let’s see that again
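A minimal sketch of the basics above, typed at the R console. The values are made up for illustration.

```r
# A single number is stored as a one-element vector
x <- 8
length(x)

# Build a vector with c(), then extend it by concatenating again
v <- c(3, 1, 4)
v <- c(v, 1, 5)
v
```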
Practice Datasets


•   data()

•   shows the sample sets
    included with your R
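As a rough example, here is how you might poke at one of the bundled sets. mtcars is used only because it ships with the standard datasets package; any other bundled set would do.

```r
# data() with no arguments lists the bundled sets in an interactive session
# data()

# Load one set by name and take a look at it
data(mtcars)
head(mtcars, 3)   # first three rows
str(mtcars)       # column names and types
```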
Functions

•   Looks familiar!

•   Let’s see one!

•   “evencount” counts the number of even ints in a vector
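The function itself is not in the text of the slides, so what follows is only a guess at its shape: a sketch of an evencount that loops over a vector and counts the even values. The body and the sample input are assumptions.

```r
# Count the even integers in a vector (assumed implementation)
evencount <- function(x) {
  k <- 0
  for (n in x) {
    if (n %% 2 == 0) k <- k + 1   # %% is the modulo operator
  }
  k                               # last value is the return value
}

evencount(c(1, 2, 4, 7, 10))

# The same count without a loop, using vectorized comparison
sum(c(1, 2, 4, 7, 10) %% 2 == 0)
```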
Datatypes
•   Vectors, the important ones

•   Scalars are really single-element vectors

•   Character strings

•   Matrices, rectangular arrays of numbers

•   Lists

•   Tables, useful for data transitions and temp work
Vectors
•   R’s most-used data structure

•   All elements in a vector must have the same mode
    or data type

•   To add values to a vector, you concatenate into it
    with the c() function

•   Many mathematical functions can be applied to a whole
    vector at once; vectors can also be traversed like arrays

•   Index starts at 1, not 0!
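A small sketch of those vector rules, with made-up values:

```r
v <- c(10, 20, 30)

v[1]       # indexing starts at 1, so this is the first element
v * 2      # arithmetic applies to every element
sum(v)
mean(v)

# Mixing types forces everything into one mode
mixed <- c(1, "two", 3)
mode(mixed)

# Traverse like an array
for (i in seq_along(v)) print(v[i])
```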
Scalars

•   One-element vectors

    > x <- 8

    > x[1]

    [1] 8

•   also climb your rigging


                                  ©Disney.
Character Strings
•   Single-element vectors with mode character

    > y <- "abc"
    > length(y)
    [1] 1
    > mode(y)
    [1] "character"

•   Can do normal string things, like

    > t <- paste("yo","dawg")
    > t
    [1] "yo dawg"
    > u <- strsplit(t,"")
    > u
    [[1]]
    [1] "y" "o" " " "d" "a" "w" "g"
Matrices
•   Two-dimensional array

    > m <- rbind(c(1,4),c(2,2))

    > m
           [,1] [,2]
    [1,]      1    4
    [2,]      2    2
    > m[1,2]
    [1] 4
    > m[1,]
    [1] 1 4
Lists
•   Contain elements of different types

•   Have a particular syntax

    > x <- list(u=2, v="abc")
    > x
    $u
    [1] 2

    $v
    [1] "abc"

    > x$u
    [1] 2
Data Frames
•   Matrices are limited to only a single type for all elements
•   A data frame can contain different types of data, can be read
    in from a file or created in realtime
    > df <- data.frame(list(kids=c("Olivia","Madison"),ages=c(10,8)))
    > df
         kids ages
    1  Olivia   10
    2 Madison    8
    > df$ages
    [1] 10  8
Putting R to Work

•   Read in a log file:
    > access <- read.table("access.log", header=FALSE)
    > head(access)
                V1 V2 V3                    V4     V5                          V6  V7   V8
    1 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500]  GET /menu/menu.js HTTP/1.1 401  401
    2 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500]  GET /menu/menu.js HTTP/1.1 200 1970
    3 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500] GET /menu/menu.css HTTP/1.1 200 2258
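To try this without a log file handy, here is a sketch that feeds the same three lines to read.table() through its text argument. It assumes the request field is double-quoted in the raw log, as in the common Apache format; the variable names are just for illustration.

```r
# Three access-log lines, quoted request field assumed
log_lines <- c(
  '192.168.1.10 - - [23/Oct/2011:07:03:33 -0500] "GET /menu/menu.js HTTP/1.1" 401 401',
  '192.168.1.10 - - [23/Oct/2011:07:03:33 -0500] "GET /menu/menu.js HTTP/1.1" 200 1970',
  '192.168.1.10 - - [23/Oct/2011:07:03:33 -0500] "GET /menu/menu.css HTTP/1.1" 200 2258'
)

# Whitespace-separated fields; the quoted request stays in one column
access <- read.table(text = log_lines, header = FALSE)
ncol(access)

# Column 7 holds the return code; table() counts each distinct value
codes <- table(access[, 7])
codes
```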
Fun with Plots
• This plot series is going to
   make use of the “return
   codes” from the access log

• We’ll do a series of plots
   that gradually get more
   sophisticated

• This is a basic histogram of
   the data; it’s not much fun
Barplot
barplot(table(access[,7]))
Barplot v2
barplot(table(access[,7]),ylab="Number of Pages",xlab="Return Code",main="Plot of Return Codes")
Barplot v3
codes <- table(access[,7])
barplot(codes,ylab="Number of Pages",xlab="Return Code",main="Plot of Return Codes", col=heat.colors(length(codes)))
Barplot v4




Source: wikipedia, http://en.wikipedia.org/wiki/Bar_%28establishment%29
Writing Graphical Output to Files
•   Set up the output target by calling a graphics function:

•   pdf(), png(), jpeg(), etc

•   jpeg("/var/www/images/returncodes-date.jpg")

•   Call the plot function you have chosen, then call dev.off()

•   Can be used in batch mode to create graphics from your data
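The steps above, sketched end to end. This uses pdf() and a temp file so it runs anywhere; the return codes and the file name are made up.

```r
# Made-up return codes, counted with table()
codes <- table(c(200, 200, 401, 404, 200))

# 1. Open the output target
out <- file.path(tempdir(), "returncodes.pdf")
pdf(out)

# 2. Call the plot function; it draws into the file, not the screen
barplot(codes, xlab = "Return Code", ylab = "Number of Pages")

# 3. Close the device so the file is written out
dev.off()

file.exists(out)
```

Swap pdf() for png() or jpeg() to get a bitmap instead, if your R build supports those devices.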
Shopping is Hard, Let’s Do Math
•   Read in some load averages (one-min)

    loadavg<-read.table("load_avg.txt")

    head(loadavg)
        V1
    1 3.79
    2 3.11
    3 2.94
    4 4.81
Summary Stats
•   Summarize the data with one function call

•   Gives the min, max, mean, median, and quartiles
    summary(loadavg)
           V1
     Min.   :0.760
     1st Qu.:1.390
     Median :1.970
     Mean   :2.302
     3rd Qu.:3.080
     Max.   :5.070
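A quick sketch of the same idea on just the four load averages shown by head() earlier, so the numbers will not match the full-file summary above. The individual functions give you each piece separately.

```r
# Only the first four readings from the load average file
load <- c(3.79, 3.11, 2.94, 4.81)

summary(load)                    # min, quartiles, median, mean, max in one call

# The same pieces, one at a time
mean(load)
median(load)
quantile(load, c(0.25, 0.75))    # first and third quartiles
range(load)                      # min and max
```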
Summary Stats as Boxplot
Same Thing, 3 Datacenters
    > cpu<-read.table("cpu")
    > head(cpu)
        V1  V2
    1 3.78 smq
    2 2.57 smq
    3 3.69 smq
    4 0.86 smq

    boxplot(cpu[,1] ~ cpu[,2], xlab="Load Average at Time t, by Datacenter", ylab="One-Minute Load Average", main="Box Plot of One-Minute Load Average, FEs", col=topo.colors(3))

•   Looks like there are outliers. That could spell
    trouble! You found them with R awesomeness.
    Hooray!
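If you want the outliers as numbers rather than dots on a plot, one option is boxplot.stats(), which applies the same 1.5 x IQR rule the box plot uses by default. This is only a sketch: the readings are invented, and "abc" is a hypothetical second datacenter name.

```r
# Invented load readings for two datacenters
cpu <- data.frame(
  V1 = c(1.0, 1.1, 1.2, 1.3, 9.0,  2.0, 2.1, 2.2, 2.3, 2.4),
  V2 = rep(c("smq", "abc"), each = 5)
)

# For each datacenter, pull the points a box plot would flag
outliers <- tapply(cpu$V1, cpu$V2, function(v) boxplot.stats(v)$out)

outliers[["smq"]]   # the 9.0 reading stands out
outliers[["abc"]]   # nothing flagged here
```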
Running R in Your Workflow
  •   The little bit of boxplotting we did earlier, in a script:
[mandi@mandi ~]$ cat sample.R
#!/usr/bin/env Rscript
cpu<-read.table("cpu")
jpeg("./sample.jpg")
boxplot(cpu[,1] ~ cpu[,2],
        xlab="Load Average at Time t, by Datacenter",
        ylab="One-Minute Load Average",
        main="Box Plot of One-Minute Load Average, FEs",
        col=heat.colors(3))
dev.off()
[mandi@mandi ~]$ Rscript sample.R > /dev/null
[mandi@mandi ~]$ ls -l sample.jpg
-rw-rw-r-- 1 mandi staff 20137 Oct 24 20:44 sample.jpg
Hey!


•   I made a graph with a
    script!
What Else?
•   R can read data input from a variety of files with regular
    formats

•   R can also fetch data from the internet using the url()
    function

•   R has a number of functions available for dealing with
    reading data, creating data frames or other structures, and
    converting string text into numerical data modes

•   Extended packages provide support for structured data
    formats like JSON.
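As one small example of the conversion point above, here is a sketch that turns string text into numeric data and wraps it in a data frame. The values and column names are made up.

```r
# Numbers that arrived as text, e.g. scraped or pasted in
raw <- c("3.79", "3.11", "2.94", "4.81")
mode(raw)                  # still character at this point

# Convert to numeric so the math functions work
load <- as.numeric(raw)
mean(load)

# Build a data frame from the converted values
df <- data.frame(sample = seq_along(load), load = load)
df
```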
References
• http://www.slideshare.net/dataspora/an-interactive-introduction-to-r-programming-language-for-statistics
• http://www.harding.edu/fmccown/R/
• Art of R Programming, Norman Matloff, Copyright
  2011 No Starch Press
• Statistical Analysis with R, John M. Quick, Copyright
  2011 Packt Publishing

Weitere ähnliche Inhalte

Was ist angesagt?

Python Pandas
Python PandasPython Pandas
Python PandasSunil OS
 
Machine Learning Live
Machine Learning LiveMachine Learning Live
Machine Learning LiveMike Anderson
 
Migrating from matlab to python
Migrating from matlab to pythonMigrating from matlab to python
Migrating from matlab to pythonActiveState
 
Clojure Intro
Clojure IntroClojure Intro
Clojure Introthnetos
 
Spark 4th Meetup Londond - Building a Product with Spark
Spark 4th Meetup Londond - Building a Product with SparkSpark 4th Meetup Londond - Building a Product with Spark
Spark 4th Meetup Londond - Building a Product with Sparksamthemonad
 
Python for R Users
Python for R UsersPython for R Users
Python for R UsersAjay Ohri
 
Is there a perfect data-parallel programming language? (Experiments with More...
Is there a perfect data-parallel programming language? (Experiments with More...Is there a perfect data-parallel programming language? (Experiments with More...
Is there a perfect data-parallel programming language? (Experiments with More...Julian Hyde
 
Apache Flink Training: DataSet API Basics
Apache Flink Training: DataSet API BasicsApache Flink Training: DataSet API Basics
Apache Flink Training: DataSet API BasicsFlink Forward
 
Merge Multiple CSV in single data frame using R
Merge Multiple CSV in single data frame using RMerge Multiple CSV in single data frame using R
Merge Multiple CSV in single data frame using RYogesh Khandelwal
 
Monitoring Your ISP Using InfluxDB Cloud and Raspberry Pi
Monitoring Your ISP Using InfluxDB Cloud and Raspberry PiMonitoring Your ISP Using InfluxDB Cloud and Raspberry Pi
Monitoring Your ISP Using InfluxDB Cloud and Raspberry PiInfluxData
 
Introduction to R programming
Introduction to R programmingIntroduction to R programming
Introduction to R programmingAlberto Labarga
 
2019-01-29 - Demystifying Kotlin Coroutines
2019-01-29 - Demystifying Kotlin Coroutines2019-01-29 - Demystifying Kotlin Coroutines
2019-01-29 - Demystifying Kotlin CoroutinesEamonn Boyle
 

Was ist angesagt? (20)

Clojure class
Clojure classClojure class
Clojure class
 
Python Pandas
Python PandasPython Pandas
Python Pandas
 
Machine Learning Live
Machine Learning LiveMachine Learning Live
Machine Learning Live
 
Scala
ScalaScala
Scala
 
Migrating from matlab to python
Migrating from matlab to pythonMigrating from matlab to python
Migrating from matlab to python
 
Meetup slides
Meetup slidesMeetup slides
Meetup slides
 
Clojure Intro
Clojure IntroClojure Intro
Clojure Intro
 
Spark 4th Meetup Londond - Building a Product with Spark
Spark 4th Meetup Londond - Building a Product with SparkSpark 4th Meetup Londond - Building a Product with Spark
Spark 4th Meetup Londond - Building a Product with Spark
 
Python for R Users
Python for R UsersPython for R Users
Python for R Users
 
Is there a perfect data-parallel programming language? (Experiments with More...
Is there a perfect data-parallel programming language? (Experiments with More...Is there a perfect data-parallel programming language? (Experiments with More...
Is there a perfect data-parallel programming language? (Experiments with More...
 
Apache Flink Training: DataSet API Basics
Apache Flink Training: DataSet API BasicsApache Flink Training: DataSet API Basics
Apache Flink Training: DataSet API Basics
 
Merge Multiple CSV in single data frame using R
Merge Multiple CSV in single data frame using RMerge Multiple CSV in single data frame using R
Merge Multiple CSV in single data frame using R
 
Language R
Language RLanguage R
Language R
 
Monitoring Your ISP Using InfluxDB Cloud and Raspberry Pi
Monitoring Your ISP Using InfluxDB Cloud and Raspberry PiMonitoring Your ISP Using InfluxDB Cloud and Raspberry Pi
Monitoring Your ISP Using InfluxDB Cloud and Raspberry Pi
 
Introduction to R programming
Introduction to R programmingIntroduction to R programming
Introduction to R programming
 
Jug java7
Jug java7Jug java7
Jug java7
 
R Language Introduction
R Language IntroductionR Language Introduction
R Language Introduction
 
2019-01-29 - Demystifying Kotlin Coroutines
2019-01-29 - Demystifying Kotlin Coroutines2019-01-29 - Demystifying Kotlin Coroutines
2019-01-29 - Demystifying Kotlin Coroutines
 
Collections
CollectionsCollections
Collections
 
Haskell
HaskellHaskell
Haskell
 

Ähnlich wie R for Pirates. ESCCONF October 27, 2011

Unit I - 1R introduction to R program.pptx
Unit I - 1R introduction to R program.pptxUnit I - 1R introduction to R program.pptx
Unit I - 1R introduction to R program.pptxSreeLaya9
 
Advanced Data Analytics with R Programming.ppt
Advanced Data Analytics with R Programming.pptAdvanced Data Analytics with R Programming.ppt
Advanced Data Analytics with R Programming.pptAnshika865276
 
Learning python
Learning pythonLearning python
Learning pythonFraboni Ec
 
Learning python
Learning pythonLearning python
Learning pythonJames Wong
 
SMS Spam Filter Design Using R: A Machine Learning Approach
SMS Spam Filter Design Using R: A Machine Learning ApproachSMS Spam Filter Design Using R: A Machine Learning Approach
SMS Spam Filter Design Using R: A Machine Learning ApproachReza Rahimi
 
AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)Paul Chao
 

Ähnlich wie R for Pirates. ESCCONF October 27, 2011 (20)

محاضرة برنامج التحليل الكمي R program د.هديل القفيدي
محاضرة برنامج التحليل الكمي   R program د.هديل القفيديمحاضرة برنامج التحليل الكمي   R program د.هديل القفيدي
محاضرة برنامج التحليل الكمي R program د.هديل القفيدي
 
محاضرة برنامج التحليل الكمي R program د.هديل القفيدي
محاضرة برنامج التحليل الكمي   R program د.هديل القفيديمحاضرة برنامج التحليل الكمي   R program د.هديل القفيدي
محاضرة برنامج التحليل الكمي R program د.هديل القفيدي
 
Ggplot2 v3
Ggplot2 v3Ggplot2 v3
Ggplot2 v3
 
Unit I - 1R introduction to R program.pptx
Unit I - 1R introduction to R program.pptxUnit I - 1R introduction to R program.pptx
Unit I - 1R introduction to R program.pptx
 
Matlab lec1
Matlab lec1Matlab lec1
Matlab lec1
 
Aggregate.pptx
Aggregate.pptxAggregate.pptx
Aggregate.pptx
 
MATLAB Programming
MATLAB Programming MATLAB Programming
MATLAB Programming
 
Advanced Data Analytics with R Programming.ppt
Advanced Data Analytics with R Programming.pptAdvanced Data Analytics with R Programming.ppt
Advanced Data Analytics with R Programming.ppt
 
Learning python
Learning pythonLearning python
Learning python
 
Learning python
Learning pythonLearning python
Learning python
 
Learning python
Learning pythonLearning python
Learning python
 
Learning python
Learning pythonLearning python
Learning python
 
Learning python
Learning pythonLearning python
Learning python
 
Learning python
Learning pythonLearning python
Learning python
 
Learning python
Learning pythonLearning python
Learning python
 
SMS Spam Filter Design Using R: A Machine Learning Approach
SMS Spam Filter Design Using R: A Machine Learning ApproachSMS Spam Filter Design Using R: A Machine Learning Approach
SMS Spam Filter Design Using R: A Machine Learning Approach
 
C
CC
C
 
AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)AI與大數據數據處理 Spark實戰(20171216)
AI與大數據數據處理 Spark實戰(20171216)
 
Modern C++
Modern C++Modern C++
Modern C++
 
R training2
R training2R training2
R training2
 

Mehr von Mandi Walls

DOD Raleigh Gamedays with Chaos Engineering.pdf
DOD Raleigh Gamedays with Chaos Engineering.pdfDOD Raleigh Gamedays with Chaos Engineering.pdf
DOD Raleigh Gamedays with Chaos Engineering.pdfMandi Walls
 
Addo reducing trauma in organizations with SLOs and chaos engineering
Addo  reducing trauma in organizations with SLOs and chaos engineeringAddo  reducing trauma in organizations with SLOs and chaos engineering
Addo reducing trauma in organizations with SLOs and chaos engineeringMandi Walls
 
Full Service Ownership
Full Service OwnershipFull Service Ownership
Full Service OwnershipMandi Walls
 
PagerDuty: Best Practices for On Call Teams
PagerDuty: Best Practices for On Call TeamsPagerDuty: Best Practices for On Call Teams
PagerDuty: Best Practices for On Call TeamsMandi Walls
 
InSpec at DevOps ATL Meetup January 22, 2020
InSpec at DevOps ATL Meetup January 22, 2020InSpec at DevOps ATL Meetup January 22, 2020
InSpec at DevOps ATL Meetup January 22, 2020Mandi Walls
 
Prescriptive Security with InSpec - All Things Open 2019
Prescriptive Security with InSpec - All Things Open 2019Prescriptive Security with InSpec - All Things Open 2019
Prescriptive Security with InSpec - All Things Open 2019Mandi Walls
 
Using Chef InSpec for Infrastructure Security
Using Chef InSpec for Infrastructure SecurityUsing Chef InSpec for Infrastructure Security
Using Chef InSpec for Infrastructure SecurityMandi Walls
 
Adding Security to Your Workflow With InSpec - SCaLE17x
Adding Security to Your Workflow With InSpec - SCaLE17xAdding Security to Your Workflow With InSpec - SCaLE17x
Adding Security to Your Workflow With InSpec - SCaLE17xMandi Walls
 
Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018
Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018
Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018Mandi Walls
 
BuildStuff.LT 2018 InSpec Workshop
BuildStuff.LT 2018 InSpec WorkshopBuildStuff.LT 2018 InSpec Workshop
BuildStuff.LT 2018 InSpec WorkshopMandi Walls
 
InSpec Workshop at Velocity London 2018
InSpec Workshop at Velocity London 2018InSpec Workshop at Velocity London 2018
InSpec Workshop at Velocity London 2018Mandi Walls
 
DevOpsDays InSpec Workshop
DevOpsDays InSpec WorkshopDevOpsDays InSpec Workshop
DevOpsDays InSpec WorkshopMandi Walls
 
Adding Security and Compliance to Your Workflow with InSpec
Adding Security and Compliance to Your Workflow with InSpecAdding Security and Compliance to Your Workflow with InSpec
Adding Security and Compliance to Your Workflow with InSpecMandi Walls
 
InSpec - June 2018 at Open28.be
InSpec - June 2018 at Open28.beInSpec - June 2018 at Open28.be
InSpec - June 2018 at Open28.beMandi Walls
 
habitat at docker bud
habitat at docker budhabitat at docker bud
habitat at docker budMandi Walls
 
Ingite Slides for InSpec
Ingite Slides for InSpecIngite Slides for InSpec
Ingite Slides for InSpecMandi Walls
 
Habitat at LinuxLab IT
Habitat at LinuxLab ITHabitat at LinuxLab IT
Habitat at LinuxLab ITMandi Walls
 
InSpec Workshop DevSecCon 2017
InSpec Workshop DevSecCon 2017InSpec Workshop DevSecCon 2017
InSpec Workshop DevSecCon 2017Mandi Walls
 
Habitat Workshop at Velocity London 2017
Habitat Workshop at Velocity London 2017Habitat Workshop at Velocity London 2017
Habitat Workshop at Velocity London 2017Mandi Walls
 
InSpec Workflow for DevOpsDays Riga 2017
InSpec Workflow for DevOpsDays Riga 2017InSpec Workflow for DevOpsDays Riga 2017
InSpec Workflow for DevOpsDays Riga 2017Mandi Walls
 

Mehr von Mandi Walls (20)

DOD Raleigh Gamedays with Chaos Engineering.pdf
DOD Raleigh Gamedays with Chaos Engineering.pdfDOD Raleigh Gamedays with Chaos Engineering.pdf
DOD Raleigh Gamedays with Chaos Engineering.pdf
 
Addo reducing trauma in organizations with SLOs and chaos engineering
Addo  reducing trauma in organizations with SLOs and chaos engineeringAddo  reducing trauma in organizations with SLOs and chaos engineering
Addo reducing trauma in organizations with SLOs and chaos engineering
 
Full Service Ownership
Full Service OwnershipFull Service Ownership
Full Service Ownership
 
PagerDuty: Best Practices for On Call Teams
PagerDuty: Best Practices for On Call TeamsPagerDuty: Best Practices for On Call Teams
PagerDuty: Best Practices for On Call Teams
 
InSpec at DevOps ATL Meetup January 22, 2020
InSpec at DevOps ATL Meetup January 22, 2020InSpec at DevOps ATL Meetup January 22, 2020
InSpec at DevOps ATL Meetup January 22, 2020
 
Prescriptive Security with InSpec - All Things Open 2019
Prescriptive Security with InSpec - All Things Open 2019Prescriptive Security with InSpec - All Things Open 2019
Prescriptive Security with InSpec - All Things Open 2019
 
Using Chef InSpec for Infrastructure Security
Using Chef InSpec for Infrastructure SecurityUsing Chef InSpec for Infrastructure Security
Using Chef InSpec for Infrastructure Security
 
Adding Security to Your Workflow With InSpec - SCaLE17x
Adding Security to Your Workflow With InSpec - SCaLE17xAdding Security to Your Workflow With InSpec - SCaLE17x
Adding Security to Your Workflow With InSpec - SCaLE17x
 
Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018
Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018
Habitat talk at CodeMonsters Sofia, Bulgaria Nov 27 2018
 
BuildStuff.LT 2018 InSpec Workshop
BuildStuff.LT 2018 InSpec WorkshopBuildStuff.LT 2018 InSpec Workshop
BuildStuff.LT 2018 InSpec Workshop
 
InSpec Workshop at Velocity London 2018
InSpec Workshop at Velocity London 2018InSpec Workshop at Velocity London 2018
InSpec Workshop at Velocity London 2018
 
DevOpsDays InSpec Workshop
DevOpsDays InSpec WorkshopDevOpsDays InSpec Workshop
DevOpsDays InSpec Workshop
 
Adding Security and Compliance to Your Workflow with InSpec
Adding Security and Compliance to Your Workflow with InSpecAdding Security and Compliance to Your Workflow with InSpec
Adding Security and Compliance to Your Workflow with InSpec
 
InSpec - June 2018 at Open28.be
InSpec - June 2018 at Open28.beInSpec - June 2018 at Open28.be
InSpec - June 2018 at Open28.be
 
habitat at docker bud
habitat at docker budhabitat at docker bud
habitat at docker bud
 
Ingite Slides for InSpec
Ingite Slides for InSpecIngite Slides for InSpec
Ingite Slides for InSpec
 
Habitat at LinuxLab IT
Habitat at LinuxLab ITHabitat at LinuxLab IT
Habitat at LinuxLab IT
 
InSpec Workshop DevSecCon 2017
InSpec Workshop DevSecCon 2017InSpec Workshop DevSecCon 2017
InSpec Workshop DevSecCon 2017
 
Habitat Workshop at Velocity London 2017
Habitat Workshop at Velocity London 2017Habitat Workshop at Velocity London 2017
Habitat Workshop at Velocity London 2017
 
InSpec Workflow for DevOpsDays Riga 2017
InSpec Workflow for DevOpsDays Riga 2017InSpec Workflow for DevOpsDays Riga 2017
InSpec Workflow for DevOpsDays Riga 2017
 

Kürzlich hochgeladen

Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 

Kürzlich hochgeladen (20)

Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 

R for Pirates. ESCCONF October 27, 2011

  • 1. R for Pirates Mandi Walls @lnxchk EscConf, Boston, MA October 27, 2011
  • 2. whoami • stats misfit • R tinkerer • large-farm runner • not a professional statistician :D
  • 3. What is R • Scripting language for stats work • Inspired by earlier S (for statistics) developed at AT&T • FOSS • Syntax inherits through Algol family, so looks somewhat like C/C++
  • 4. What Does R Do? • Manipulate data • Complex Modeling and Computation • Graphics and Visualization
  • 6. But Other Math Stuff! • Mathematica • MatLab • Minitab • MAPLE • Excel (yes. shutup h8rs. ask your CFOs what they use) • R provides sophisticated statistical and modeling capabilities, and is extendible through your own code
  • 7. Get R • Available for Linux, Mac, Windows • http://www.r-project.org/
  • 8. Fire! • R console on Mac • Interactive interpreter for your R needs • Can also run from the command line: R
  • 9. R Basics • R considers all elements to be vectors • A single number is a one-element vector • Use <- for assignment • Use c() to concatenate values into a vector
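A minimal sketch of the basics on this slide, using made-up values (the names `n` and `crew` are illustrative, not from the talk):

```r
# A single number is already a vector of length one.
n <- 42
length(n)
is.vector(n)

# c() concatenates values (or whole vectors) into one vector.
crew <- c(3, 1, 4)
crew <- c(crew, 1, 5)   # "append" by concatenating onto the existing vector
crew
length(crew)
```

Note that `<-` is the assignment operator here; `crew` is rebuilt by `c()` rather than modified in place.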
  • 11. Practice Datasets • data() • shows the sample sets included with your R
  • 12. Functions • Looks familiar! • Let’s see one! • “evencount” counts the number of even ints in a vector
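The slide that showed the function body did not survive in this transcript. The following is a hypothetical reconstruction of what an `evencount` could look like, not the author's original code:

```r
# Hypothetical sketch: count the even integers in a vector.
evencount <- function(x) {
  k <- 0
  for (n in x) {
    if (n %% 2 == 0) k <- k + 1   # %% is the modulo operator
  }
  k   # the last evaluated expression is the return value
}

evencount(c(1, 2, 3, 4, 6))

# The same count without an explicit loop, using vectorized comparison:
sum(c(1, 2, 3, 4, 6) %% 2 == 0)
```

The loop version reads like C, which is presumably the "looks familiar" point; the one-line version is closer to how R code is usually written.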
  • 14. Datatypes • Vectors, the important ones • Scalars are really single-element vectors • Character strings • Matrices, rectangular arrays of numbers • Lists • Tables, useful for data transitions and temp work
  • 15. Vectors • R’s most-used data structure • All elements in a vector must have the same mode or data type • To add values to a vector, you concatenate into it with the c() function • Many mathematical functions can be performed on a vector, they can also be traversed like arrays • Index starts at 1, not 0!
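A short sketch of the vector rules above, with illustrative values (`v` and `mixed` are made-up names):

```r
v <- c(10, 20, 30)

v[1]        # first element: indexing starts at 1
v * 2       # arithmetic applies to every element at once
mean(v)

# Mixing types forces a single mode: the numbers become strings.
mixed <- c(1, "two", 3)
mode(mixed)

# Traversing a vector like an array
for (i in seq_along(v)) {
  cat("element", i, "is", v[i], "\n")
}
```

`seq_along(v)` is used instead of `1:length(v)` because it also behaves sensibly when the vector is empty.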
  • 16. Scalars • One-element vectors • also climb your rigging ©Disney.
        > x <- 8
        > x[1]
        [1] 8
  • 17. Character Strings • Single-element vectors with mode character • Can do normal string things, like
        > t <- paste("yo","dawg")
        > t
        [1] "yo dawg"
        > u <- strsplit(t,"")
        > u
        [[1]]
        [1] "y" "o" " " "d" "a" "w" "g"
        > y <- "abc"
        > length(y)
        [1] 1
        > mode(y)
        [1] "character"
  • 18. Matrices • Two-dimensional array
        > m <- rbind(c(1,4),c(2,2))
        > m
             [,1] [,2]
        [1,]    1    4
        [2,]    2    2
        > m[1,2]
        [1] 4
        > m[1,]
        [1] 1 4
  • 19. Lists • Contain elements of different types • Have a particular syntax
        > x <- list(u=2, v="abc")
        > x
        $u
        [1] 2

        $v
        [1] "abc"

        > x$u
        [1] 2
  • 20. Data Frames • Matrices are limited to a single type for all elements • A data frame can contain different types of data, and can be read in from a file or created on the fly
        > df <- data.frame(list(kids=c("Olivia","Madison"),ages=c(10,8)))
        > df
             kids ages
        1  Olivia   10
        2 Madison    8
        > df$ages
        [1] 10  8
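As a rough illustration of why mixed column types matter, here is a sketch that rebuilds the slide's data frame and filters it. The `stringsAsFactors = FALSE` argument and the `older` variable are additions for this example, not part of the talk:

```r
# Same data as the slide; stringsAsFactors = FALSE keeps names as plain strings
# on both old and new versions of R.
df <- data.frame(kids = c("Olivia", "Madison"),
                 ages = c(10, 8),
                 stringsAsFactors = FALSE)

sapply(df, class)   # one class per column: character and numeric

# Rows are selected with a logical condition on a column.
older <- df[df$ages > 8, ]
older

# Columns can be combined in one expression.
df$kids[df$ages == max(df$ages)]
```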
  • 21. Putting R to Work • Read in a log file:
        access <- read.table("access.log", header=FALSE)
        > head(access)
                    V1 V2 V3                    V4     V5                          V6  V7   V8
        1 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500]  GET /menu/menu.js HTTP/1.1 401  401
        2 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500]  GET /menu/menu.js HTTP/1.1 200 1970
        3 192.168.1.10  -  - [23/Oct/2011:07:03:33 -0500] GET /menu/menu.css HTTP/1.1 200 2258
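To try this without a real web server, the sketch below writes the three sample lines to a temporary file and reads them back. It assumes the request field is double-quoted in the raw log, as in the usual Apache log format; that assumption is consistent with the eight columns shown above but is not stated in the talk.

```r
# Sample lines modelled on the slide; the quoting of the request is an assumption.
log_lines <- c(
  '192.168.1.10 - - [23/Oct/2011:07:03:33 -0500] "GET /menu/menu.js HTTP/1.1" 401 401',
  '192.168.1.10 - - [23/Oct/2011:07:03:33 -0500] "GET /menu/menu.js HTTP/1.1" 200 1970',
  '192.168.1.10 - - [23/Oct/2011:07:03:33 -0500] "GET /menu/menu.css HTTP/1.1" 200 2258'
)
log_file <- tempfile(fileext = ".log")
writeLines(log_lines, log_file)

# read.table splits on whitespace and treats a quoted string as one field.
access <- read.table(log_file, header = FALSE, stringsAsFactors = FALSE)
ncol(access)

# Column 7 holds the return code; table() counts how often each one occurs.
codes <- table(access[, 7])
codes
```

The `codes` table is the same kind of object that the later barplot slides pass to `barplot()`.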
  • 22. Fun with Plots • This plot series is going to make use of the “return codes” from the access log • We’ll do a series of plots that gradually get more sophisticated • This is a basic histogram of the data; it’s not much fun
  • 24. Barplot v2 barplot(table(access[,7]),ylab="Number of Pages",xlab="Return Code",main="Plot of Return Codes")
  • 25. Barplot v3 barplot(table(access[,7]),ylab="Number of Pages",xlab="Return Code",main="Plot of Return Codes", col=heat.colors(length(table(access[,7]))))
  • 26. Barplot v4 Source: wikipedia, http://en.wikipedia.org/wiki/Bar_%28establishment%29
  • 27. Writing Graphical Output to Files • Set up the output target by calling a graphics function: • pdf(), png(), jpeg(), etc. • jpeg("/var/www/images/returncodes-date.jpg") • Call the plot function you have chosen, then call dev.off() • Can be used in batch mode to create graphics from your data
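The open-device, plot, close-device sequence can be sketched as follows. This example swaps in `pdf()` for the slide's `jpeg()` because the PDF device is available in every R build, writes to a temporary directory rather than a web root, and uses made-up return codes:

```r
# Made-up return codes standing in for access[,7].
codes <- table(c(200, 200, 200, 401, 404))

out <- file.path(tempdir(), "returncodes.pdf")

pdf(out)                                   # 1. open the file device
barplot(codes,
        ylab = "Number of Pages",
        xlab = "Return Code",
        main = "Plot of Return Codes",
        col  = heat.colors(length(codes))) # 2. draw; output goes to the file
invisible(dev.off())                       # 3. close the device to flush the file

file.exists(out)
```

Forgetting `dev.off()` is the usual reason an output file ends up empty or truncated, so it is worth keeping the three steps together in batch scripts.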
  • 28. Shopping is Hard, Let’s Do Math • Read in some load averages (one-min)
        loadavg<-read.table("load_avg.txt")
        head(loadavg)
            V1
        1 3.79
        2 3.11
        3 2.94
        4 4.81
  • 29. Summary Stats • Summarize the data with one function call • Gives the min, max, mean, median, and quartiles
        summary(loadavg)
               V1
         Min.   :0.760
         1st Qu.:1.390
         Median :1.970
         Mean   :2.302
         3rd Qu.:3.080
         Max.   :5.070
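For readers without the `load_avg.txt` file, here is a small sketch that uses only the four load averages visible on the previous slide, so its numbers will differ from the full summary above:

```r
# Only the four values shown by head(loadavg) in the talk, not the full data set.
x <- c(3.79, 3.11, 2.94, 4.81)

summary(x)      # min, quartiles, median, mean, max in one call

# The same pieces are available individually.
min(x)
max(x)
mean(x)
median(x)
quantile(x, c(0.25, 0.75))
```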
  • 30. Summary Stats as Boxplot
  • 31. Same Thing, 3 Datacenters
        > cpu<-read.table("cpu")
        > head(cpu)
            V1  V2
        1 3.78 smq
        2 2.57 smq
        3 3.69 smq
        4 0.86 smq
        boxplot(cpu[,1] ~ cpu[,2], xlab="Load Average at Time t, by Datacenter", ylab="One-Minute Load Average", main="Box Plot of One-Minute Load Average, FEs", col=topo.colors(3))
    • Looks like there are outliers. That could spell trouble! You found them with R awesomeness. Hooray!
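The outliers a boxplot draws as separate points can also be listed as numbers. The sketch below uses `boxplot.stats()` on made-up data; the values and the datacenter name `dc2` are hypothetical, and only `smq` appears in the talk:

```r
# Made-up load averages for two datacenters.
cpu <- data.frame(
  V1 = c(1.2, 1.4, 1.3, 1.5, 9.8,    # smq, with one suspicious reading
         2.1, 2.0, 2.3, 2.2, 2.4),   # dc2
  V2 = rep(c("smq", "dc2"), each = 5),
  stringsAsFactors = FALSE
)

# boxplot.stats()$out returns the points beyond the whiskers,
# i.e. the ones boxplot() would draw as individual dots.
outliers <- tapply(cpu$V1, cpu$V2, function(v) boxplot.stats(v)$out)
outliers[["smq"]]
outliers[["dc2"]]
```

This uses the same default rule as `boxplot()` (points more than 1.5 times the box height beyond the box), so the listed values should match what the plot shows.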
  • 32. Running R in Your Workflow • The little bit of boxplotting we did earlier, in a script:
        [mandi@mandi ~]$ cat sample.R
        #!/usr/bin/env Rscript
        cpu<-read.table("cpu")
        jpeg("./sample.jpg")
        boxplot(cpu[,1] ~ cpu[,2], xlab="Load Average at Time t, by Datacenter", ylab="One-Minute Load Average", main="Box Plot of One-Minute Load Average, FEs", col=heat.colors(3))
        dev.off()
        [mandi@mandi ~]$ Rscript sample.R > /dev/null
        [mandi@mandi ~]$ ls -l sample.jpg
        -rw-rw-r-- 1 mandi staff 20137 Oct 24 20:44 sample.jpg
  • 33. Hey! • I made a graph with a script!
  • 34. What Else? • R can read data input from a variety of files with regular formats • R can also fetch data from the internet using the url() function • R has a number of functions available for dealing with reading data, creating data frames or other structures, and converting string text into numerical data modes • Extended packages provide support for structured data formats like JSON.
  • 35. References • http://www.slideshare.net/dataspora/an-interactive-introduction-to-r-programming-language-for-statistics • http://www.harding.edu/fmccown/R/ • Art of R Programming, Norman Matloff, Copyright 2011 No Starch Press • Statistical Analysis with R, John M. Quick, Copyright 2011 Packt Publishing
