SlideShare a Scribd company logo
1 of 38
Download to read offline
Stat405             Problem solving


                             Hadley Wickham
Monday, 13 September 2010
1. Projects
               2. Saving data
               3. Slot machine question
               4. Basic strategy
               5. Turning ideas into code



Monday, 13 September 2010
Projects

                   By Thursday, each group should send
                   Hadley an email with
                   1. the group name
                   2. the group members




Monday, 13 September 2010
Saving
                             data
Monday, 13 September 2010
Recall

                   To load data,
                   slots <- read.csv(“slots.csv”)




Monday, 13 September 2010
Your turn

                   Guess the name of the function you might
                   use to write an R object back to a csv file
                   on disk. Use it to save slots to
                   slots-2.csv.
                   What happens if you now read in
                   slots.csv with read.csv?



Monday, 13 September 2010
write.csv(slots, "slots-2.csv")
     slots2 <- read.csv("slots-2.csv")

     head(slots)
     head(slots2)

     str(slots)
     str(slots2)

     # Better, but still loses factor levels
     write.csv(slots, file = "slots-3.csv", row.names = F)
     slots3 <- read.csv("slots-3.csv")



Monday, 13 September 2010
Saving data

               # For long-term storage
               write.csv(slots, file = "slots.csv",
                 row.names = FALSE)

               # For short-term caching
               # Preserves factors etc.
               save(slots, file = "slots.rdata")




Monday, 13 September 2010
.csv             .rdata

                            read.csv()          load()
                write.csv( row.names =
                        FALSE)                  save()

                  Only data frames          Any R object
                   Can be read by any
                        program
                                              Only by R
                                          Short term caching of
                Long term storage        expensive computations

Monday, 13 September 2010
Compression
                   Easy to store compressed files to save
                   space:
                   write.csv(slots,
                     file = bzfile("slots.csv.bz2"),
                     row.names = FALSE)
                   slots4 <- read.csv("slots.csv.bz2")
                   Files stored with save() are automatically
                   compressed.


Monday, 13 September 2010
Slot
                            machine
                            question
Monday, 13 September 2010
Slots
                   Casino claims that slot machines have
                   prize payout of 92%. Is this claim true?
                   mean(slots$prize)
                   t.test(slots$prize, mu = 0.92)
                   qplot(prize, data = slots, binwidth = 1)
                   Can we do better?


Monday, 13 September 2010
Doing that lots of times, and calculating
                   the payoff should give us a better idea of
                   the payoff. But first we need some way to
                   calculate the prize from the windows.


                   Solution: Write a function



Monday, 13 September 2010
Strategy
                   1. Break complex tasks into smaller parts
                   2. Use words to describe how each part
                   should work
                   3. Translate words to R
                   4. When all parts work, combine into a
                   function (next class)



Monday, 13 September 2010
DD DD DD                  800
                                            windows <- c("7", "C", "C")
             7   7   7                 80
                                            # How can we calculate the
            BBB BBB BBB                40
                                            # payoff?
            BB BB BB                   25
             B   B   B                 10
             C   C   C                 10
            Any bar Any bar Any bar     5
                C           C   *       5
                C           *   C       5
                C           *   *       2
                                            DD doubles any winning
                *           C   *       2   combination. Two DD
                *           *   C       2   quadruples. DD is wild.
Monday, 13 September 2010
Your turn


                   We can simplify this table into 3 basic
                   cases of prizes. What are they? Take 3
                   minutes to brainstorm with a partner.




Monday, 13 September 2010
Cases

                   1. All windows have same value
                   2. A bar (B, BB, or BBB) in every window
                   3. Cherries and diamonds
                   4. (No prize)




Monday, 13 September 2010
Same values




Monday, 13 September 2010
Same values

                   1. Check whether all windows are the
                      same. How?




Monday, 13 September 2010
Same values

                   1. Check whether all windows are the
                      same. How?
                   2. If so, look up prize value. How?




Monday, 13 September 2010
Same values

                   1. Check whether all windows are the
                      same. How?
                   2. If so, look up prize value. How?

                                    With a partner, brainstorm
                                    for 2 minutes on how to
                                    solve one of these problems
Monday, 13 September 2010
# Same value
         same <- length(unique(windows)) == 1

         # OR
         same <- windows[1] == windows[2] &&
                 windows[2] == windows[3]

         if (same) {
           # Lookup value
         }



Monday, 13 September 2010
&&, || vs. &, |
                   Use && and || to combine sub-conditions and
                   return a single TRUE or FALSE
                            Not & and | - these return vectors when
                            given vectors
                            && and || are “short-circuiting”: they do the
                            minimum amount of work




Monday, 13 September 2010
If
                   if (condition) {
                            expression
                   }
                   Condition should be a logical vector of
                   length 1




Monday, 13 September 2010
if (TRUE) {
       # This will be run
     }

     if (FALSE) {
       # This won't be run
     } else {
       # This will be
     }

     # Single line form: (not recommended)
     if (TRUE) print("True!)
     if (FALSE) print("True!)


Monday, 13 September 2010
if (TRUE) {
       # This will be run
     }

     if (FALSE) {
       # This won't be run
     } else {
       # This will be
     }
         Note indenting.
         Very important!
     # Single line form: (not recommended)
     if (TRUE) print("True!)
     if (FALSE) print("True!)


Monday, 13 September 2010
x <- 5
     if (x < 5) print("x < 5")
     if (x == 5) print("x == 5")

     x <- 1:5
     if (x < 3) print("What should happen here?")

     if (x[1] < x[2]) print("x1 < x2")
     if (x[1] < x[2] && x[2] < x[3]) print("Asc")
     if (x[1] < x[2] || x[2] < x[3]) print("Asc")




Monday, 13 September 2010
if (window[1] == "DD") {
       prize <- 800
     } else if (windows[1] == "7") {
       prize <- 80
     } else if (windows[1] == "BBB") ...

     # Or use               subsetting
     c("DD" =               800, "7" =   80,   "BBB"   =   40)
     c("DD" =               800, "7" =   80,   "BBB"   =   40)["BBB"]
     c("DD" =               800, "7" =   80,   "BBB"   =   40)["0"]
     c("DD" =               800, "7" =   80,   "BBB"   =   40)[window[1]]



Monday, 13 September 2010
Your turn


                   Complete the previous code so that if all
                   the values in win are the same, then prize
                   variable will be set to the correct amount.




Monday, 13 September 2010
All bars

                   How can we determine if all of the
                   windows are B, BB, or BBB?
                   (windows[1] == "B" ||
                    windows[1] == "BB" ||
                    windows[1] === "BBB") && ... ?




Monday, 13 September 2010
All bars

                   How can we determine if all of the
                   windows are B, BB, or BBB?
                   (windows[1] == "B" ||
                    windows[1] == "BB" ||
                    windows[1] === "BBB") && ... ?

                                   Take 1 minute to brainstorm
                                   possible solutions
Monday, 13 September 2010
windows[1] %in% c("B", "BB", "BBB")
     windows %in% c("B", "BB", "BBB")

     allbars <- windows %in% c("B", "BB", "BBB")
     allbars[1] & allbars[2] & allbars[3]
     all(allbars)


     # See also ?any for the complement




Monday, 13 September 2010
Your turn

                   Complete the previous code so that the
                   correct value of prize is set if all the
                   windows are the same, or they are all
                   bars




Monday, 13 September 2010
payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40,
       "BB" = 25, "B" = 10, "C" = 10, "0" = 0)

     same <- length(unique(windows)) == 1
     allbars <- all(windows %in% c("B", "BB", "BBB"))

     if (same) {
       prize <- payoffs[windows[1]]
     } else if (allbars) {
       prize <- 5
     }



Monday, 13 September 2010
Cherries

                   Need numbers of cherries, and numbers
                   of diamonds (hint: use sum)
                   Then need to look up values (like for the
                   first case) and multiply together




Monday, 13 September 2010
cherries <- sum(windows == "C")
         diamonds <- sum(windows == "DD")

         c(0, 2, 5)[cherries + 1] *
           c(1, 2, 4)[diamonds + 1]




Monday, 13 September 2010
payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40,
       "BB" = 25, "B" = 10, "C" = 10, "0" = 0)

     same <- length(unique(windows)) == 1
     allbars <- all(windows %in% c("B", "BB", "BBB"))

     if (same) {
       prize <- payoffs[windows[1]]
     } else if (allbars) {
       prize <- 5
     } else {
       cherries <- sum(windows == "C")
       diamonds <- sum(windows == "DD")

          prize <- c(0, 2, 5)[cherries + 1] *
            c(1, 2, 4)[diamonds + 1]
     }

Monday, 13 September 2010
Writing a function

                   Now we need to wrap up this code in to a
                   reusable fashion. We need a function
                   Have used functions a lot, next time we’ll
                   learn how to write one.




Monday, 13 September 2010

More Related Content

Similar to 07 problem-solving

Similar to 07 problem-solving (20)

08 functions
08 functions08 functions
08 functions
 
05 subsetting
05 subsetting05 subsetting
05 subsetting
 
06 data
06 data06 data
06 data
 
07 Problem Solving
07 Problem Solving07 Problem Solving
07 Problem Solving
 
04 reports
04 reports04 reports
04 reports
 
04 Reports
04 Reports04 Reports
04 Reports
 
08 Functions
08 Functions08 Functions
08 Functions
 
Progressive Advancement, by way of progressive enhancement
Progressive Advancement, by way of progressive enhancementProgressive Advancement, by way of progressive enhancement
Progressive Advancement, by way of progressive enhancement
 
10 simulation
10 simulation10 simulation
10 simulation
 
10 simulation
10 simulation10 simulation
10 simulation
 
Playing With Ruby
Playing With RubyPlaying With Ruby
Playing With Ruby
 
Distributed Counters in Cassandra (Cassandra Summit 2010)
Distributed Counters in Cassandra (Cassandra Summit 2010)Distributed Counters in Cassandra (Cassandra Summit 2010)
Distributed Counters in Cassandra (Cassandra Summit 2010)
 
ActiveSupport
ActiveSupportActiveSupport
ActiveSupport
 
Qconsp domesticando dragoes com soluções escaláveis
Qconsp   domesticando dragoes com soluções escaláveisQconsp   domesticando dragoes com soluções escaláveis
Qconsp domesticando dragoes com soluções escaláveis
 
StORM preview
StORM previewStORM preview
StORM preview
 
Phplx mongodb
Phplx mongodbPhplx mongodb
Phplx mongodb
 
09 bootstrapping
09 bootstrapping09 bootstrapping
09 bootstrapping
 
Immutability
ImmutabilityImmutability
Immutability
 
Taming Pythons with ZooKeeper
Taming Pythons with ZooKeeperTaming Pythons with ZooKeeper
Taming Pythons with ZooKeeper
 
13 case-study
13 case-study13 case-study
13 case-study
 

More from Hadley Wickham (19)

27 development
27 development27 development
27 development
 
24 modelling
24 modelling24 modelling
24 modelling
 
23 data-structures
23 data-structures23 data-structures
23 data-structures
 
Graphical inference
Graphical inferenceGraphical inference
Graphical inference
 
22 spam
22 spam22 spam
22 spam
 
21 spam
21 spam21 spam
21 spam
 
20 date-times
20 date-times20 date-times
20 date-times
 
19 tables
19 tables19 tables
19 tables
 
18 cleaning
18 cleaning18 cleaning
18 cleaning
 
17 polishing
17 polishing17 polishing
17 polishing
 
16 critique
16 critique16 critique
16 critique
 
15 time-space
15 time-space15 time-space
15 time-space
 
14 case-study
14 case-study14 case-study
14 case-study
 
12 adv-manip
12 adv-manip12 adv-manip
12 adv-manip
 
11 adv-manip
11 adv-manip11 adv-manip
11 adv-manip
 
03 extensions
03 extensions03 extensions
03 extensions
 
02 large
02 large02 large
02 large
 
01 intro
01 intro01 intro
01 intro
 
25 fin
25 fin25 fin
25 fin
 

07 problem-solving

  • 1. Stat405 Problem solving Hadley Wickham Monday, 13 September 2010
  • 2. 1. Projects 2. Saving data 3. Slot machine question 4. Basic strategy 5. Turning ideas into code Monday, 13 September 2010
  • 3. Projects By Thursday, each group should send Hadley an email with 1. the group name 2. the group members Monday, 13 September 2010
  • 4. Saving data Monday, 13 September 2010
  • 5. Recall To load data, slots <- read.csv(“slots.csv”) Monday, 13 September 2010
  • 6. Your turn Guess the name of the function you might use to write an R object back to a csv file on disk. Use it to save slots to slots-2.csv. What happens if you now read in slots.csv with read.csv? Monday, 13 September 2010
  • 7. write.csv(slots, "slots-2.csv") slots2 <- read.csv("slots-2.csv") head(slots) head(slots2) str(slots) str(slots2) # Better, but still loses factor levels write.csv(slots, file = "slots-3.csv", row.names = F) slots3 <- read.csv("slots-3.csv") Monday, 13 September 2010
  • 8. Saving data # For long-term storage write.csv(slots, file = "slots.csv", row.names = FALSE) # For short-term caching # Preserves factors etc. save(slots, file = "slots.rdata") Monday, 13 September 2010
  • 9. .csv .rdata read.csv() load() write.csv( row.names = FALSE) save() Only data frames Any R object Can be read by any program Only by R Short term caching of Long term storage expensive computations Monday, 13 September 2010
  • 10. Compression Easy to store compressed files to save space: write.csv(slots, file = bzfile("slots.csv.bz2"), row.names = FALSE) slots4 <- read.csv("slots.csv.bz2") Files stored with save() are automatically compressed. Monday, 13 September 2010
  • 11. Slot machine question Monday, 13 September 2010
  • 12. Slots Casino claims that slot machines have prize payout of 92%. Is this claim true? mean(slots$prize) t.test(slots$prize, mu = 0.92) qplot(prize, data = slots, binwidth = 1) Can we do better? Monday, 13 September 2010
  • 13. Doing that lots of times, and calculating the payoff should give us a better idea of the payoff. But first we need some way to calculate the prize from the windows. Solution: Write a function Monday, 13 September 2010
  • 14. Strategy 1. Break complex tasks into smaller parts 2. Use words to describe how each part should work 3. Translate words to R 4. When all parts work, combine into a function (next class) Monday, 13 September 2010
  • 15. DD DD DD 800 windows <- c("7", "C", "C") 7 7 7 80 # How can we calculate the BBB BBB BBB 40 # payoff? BB BB BB 25 B B B 10 C C C 10 Any bar Any bar Any bar 5 C C * 5 C * C 5 C * * 2 DD doubles any winning * C * 2 combination. Two DD * * C 2 quadruples. DD is wild. Monday, 13 September 2010
  • 16. Your turn We can simplify this table into 3 basic cases of prizes. What are they? Take 3 minutes to brainstorm with a partner. Monday, 13 September 2010
  • 17. Cases 1. All windows have same value 2. A bar (B, BB, or BBB) in every window 3. Cherries and diamonds 4. (No prize) Monday, 13 September 2010
  • 18. Same values Monday, 13 September 2010
  • 19. Same values 1. Check whether all windows are the same. How? Monday, 13 September 2010
  • 20. Same values 1. Check whether all windows are the same. How? 2. If so, look up prize value. How? Monday, 13 September 2010
  • 21. Same values 1. Check whether all windows are the same. How? 2. If so, look up prize value. How? With a partner, brainstorm for 2 minutes on how to solve one of these problems Monday, 13 September 2010
  • 22. # Same value same <- length(unique(windows)) == 1 # OR same <- windows[1] == windows[2] && windows[2] == windows[3] if (same) { # Lookup value } Monday, 13 September 2010
  • 23. &&, || vs. &, | Use && and || to combine sub-conditions and return a single TRUE or FALSE Not & and | - these return vectors when given vectors && and || are “short-circuiting”: they do the minimum amount of work Monday, 13 September 2010
  • 24. If if (condition) { expression } Condition should be a logical vector of length 1 Monday, 13 September 2010
  • 25. if (TRUE) { # This will be run } if (FALSE) { # This won't be run } else { # This will be } # Single line form: (not recommended) if (TRUE) print("True!) if (FALSE) print("True!) Monday, 13 September 2010
  • 26. if (TRUE) { # This will be run } if (FALSE) { # This won't be run } else { # This will be } Note indenting. Very important! # Single line form: (not recommended) if (TRUE) print("True!) if (FALSE) print("True!) Monday, 13 September 2010
  • 27. x <- 5 if (x < 5) print("x < 5") if (x == 5) print("x == 5") x <- 1:5 if (x < 3) print("What should happen here?") if (x[1] < x[2]) print("x1 < x2") if (x[1] < x[2] && x[2] < x[3]) print("Asc") if (x[1] < x[2] || x[2] < x[3]) print("Asc") Monday, 13 September 2010
  • 28. if (window[1] == "DD") { prize <- 800 } else if (windows[1] == "7") { prize <- 80 } else if (windows[1] == "BBB") ... # Or use subsetting c("DD" = 800, "7" = 80, "BBB" = 40) c("DD" = 800, "7" = 80, "BBB" = 40)["BBB"] c("DD" = 800, "7" = 80, "BBB" = 40)["0"] c("DD" = 800, "7" = 80, "BBB" = 40)[window[1]] Monday, 13 September 2010
  • 29. Your turn Complete the previous code so that if all the values in win are the same, then prize variable will be set to the correct amount. Monday, 13 September 2010
  • 30. All bars How can we determine if all of the windows are B, BB, or BBB? (windows[1] == "B" || windows[1] == "BB" || windows[1] === "BBB") && ... ? Monday, 13 September 2010
  • 31. All bars How can we determine if all of the windows are B, BB, or BBB? (windows[1] == "B" || windows[1] == "BB" || windows[1] === "BBB") && ... ? Take 1 minute to brainstorm possible solutions Monday, 13 September 2010
  • 32. windows[1] %in% c("B", "BB", "BBB") windows %in% c("B", "BB", "BBB") allbars <- windows %in% c("B", "BB", "BBB") allbars[1] & allbars[2] & allbars[3] all(allbars) # See also ?any for the complement Monday, 13 September 2010
  • 33. Your turn Complete the previous code so that the correct value of prize is set if all the windows are the same, or they are all bars Monday, 13 September 2010
  • 34. payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40, "BB" = 25, "B" = 10, "C" = 10, "0" = 0) same <- length(unique(windows)) == 1 allbars <- all(windows %in% c("B", "BB", "BBB")) if (same) { prize <- payoffs[windows[1]] } else if (allbars) { prize <- 5 } Monday, 13 September 2010
  • 35. Cherries Need numbers of cherries, and numbers of diamonds (hint: use sum) Then need to look up values (like for the first case) and multiply together Monday, 13 September 2010
  • 36. cherries <- sum(windows == "C") diamonds <- sum(windows == "DD") c(0, 2, 5)[cherries + 1] * c(1, 2, 4)[diamonds + 1] Monday, 13 September 2010
  • 37. payoffs <- c("DD" = 800, "7" = 80, "BBB" = 40, "BB" = 25, "B" = 10, "C" = 10, "0" = 0) same <- length(unique(windows)) == 1 allbars <- all(windows %in% c("B", "BB", "BBB")) if (same) { prize <- payoffs[windows[1]] } else if (allbars) { prize <- 5 } else { cherries <- sum(windows == "C") diamonds <- sum(windows == "DD") prize <- c(0, 2, 5)[cherries + 1] * c(1, 2, 4)[diamonds + 1] } Monday, 13 September 2010
  • 38. Writing a function Now we need to wrap up this code in to a reusable fashion. We need a function Have used functions a lot, next time we’ll learn how to write one. Monday, 13 September 2010