Crowdsourcing for Multimedia Retrieval
Marco Tagliasacchi
Politecnico di Milano, Italy
  
Outline
• Crowdsourcing applications in multimedia retrieval
• Aggregating annotations
• Aggregating and learning
• Crowdsourcing at work
  
Crowdsourcing applications in multimedia retrieval
  
Crowdsourcing
• Crowdsourcing is an example of human computing
• Use an online community of human workers to complete useful tasks
• The task is outsourced to an undefined public
• Main idea: design tasks that are
    • Easy for humans
    • Hard for machines
  
Crowdsourcing
• Crowdsourcing platforms
    • Paid contributors
        • Amazon Mechanical Turk (www.mturk.com)
        • CrowdFlower (crowdflower.com)
        • oDesk (www.odesk.com)
        • …
    • Volunteers
        • Foldit (www.fold.it)
        • Duolingo (www.duolingo.com)
        • …
              	
  
Applications in multimedia retrieval
• Create annotated data sets for training
    • Reduces both cost and time needed to gather annotations,
    • …but annotations might be noisy!
• Validate the output of multimedia retrieval systems
• Query expansion / reformulation
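
One common way to cope with noisy annotations is to ask several workers to label the same item and keep the most frequent answer. The sketch below illustrates that idea only; the slide does not name an aggregation method, so majority voting, the helper name `majority_vote`, the alphabetical tie-break, and the toy labels are all assumptions.

```python
from collections import Counter


def majority_vote(labels):
    """Hypothetical helper: return the most frequent label in a list.

    Ties are broken alphabetically so the result is deterministic
    (an assumption made for this sketch).
    """
    if not labels:
        raise ValueError("need at least one label")
    counts = Counter(labels)
    best_count = max(counts.values())
    winners = sorted(label for label, c in counts.items() if c == best_count)
    return winners[0]


# Toy data (invented for illustration): three workers label each image.
annotations = {
    "img_001": ["cat", "cat", "dog"],
    "img_002": ["dog", "dog", "dog"],
    "img_003": ["cat", "dog", "dog"],
}

# One consensus label per image.
consensus = {item: majority_vote(votes) for item, votes in annotations.items()}
```

With three votes per item, a single wrong label is outvoted, which is the reason redundancy helps when individual annotations are unreliable.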
  
Creating annotated training sets
[Sorokin and Forsyth, 2008]
• Collect annotations for computer vision data sets
    • people segmentation

[Figure: example annotations for Protocol 1 and Protocol 2]

Creating annotated training sets
[Sorokin and Forsyth, 2008]
• Collect annotations for computer vision data sets
    • people segmentation and pose annotation

[Figure: example annotations for Protocol 2, Protocol 3, and Protocol 4]




                        Figure 1. Example results show the example results obtained from the annotation experiments. The first column is the implementation of

Creating annotated training sets
[Sorokin and Forsyth, 2008]
• Observations:
    • Annotators make errors
    • Quality of annotators is heterogeneous
    • The quality of the annotations depends on the difficulty of the task

[Figure: Experiment 3: trace the boundary of the person. Metric: area(XOR)/area(AND), the lower the better. Mean 0.21, std 0.14, median 0.16. Example annotations A to G and the knee of the curve are marked on the plot.]

[Figure: Experiment 4: click on 14 landmarks. Metric: mean error in pixels between annotation points, the lower the better. Mean 8.71, std 6.29, median 7.35. Example annotations A to G and the knee of the curve are marked on the plot. Landmark labels: Head, Neck, lShoulder, rShoulder, lElbow, rElbow, lWrist, rWrist, lHip, rHip, lKnee, rKnee, lAnkle, rAnkle.]
                             12                                                                                                                                                            12                                                                               4
                                                                                                                                                                                                                                                                            4
                                                                                                                                                               3
                                                                                                                                                               3
                                                                                                                                                                        4
                                                                                                                                                                        4                  12                                                                               4
                                                                                                                                                                            7
                                                                                                                                                                            7
                                                           7
                                                           7                                       3                                                                                                                                                                                                        7
                                                                                                                                                                                                                                                                                                            7                     3                                                                   3
                                                                                                                                                                                                                   3                                                               8                        7
                                                                                              3
                                                                                              3
                                                                                                  100       4
                                                                                                            4

                                                                                                            5
                                                                                                            5
                                                                                                                110           120       130        140
                                                                                                                                                         150       160
                                                                                                                                                                            5
                                                                                                                                                                            5
                                                                                                                                                                                170         180       190
                                                                                                                                                                                                            200    100           110         120      130       140   150       160
                                                                                                                                                                                                                                                                                   88
                                                                                                                                                                                                                                                                                             5 170
                                                                                                                                                                                                                                                                                             5
                                                                                                                                                                                                                                                                                             5               180     190   200    100   110   120    130   140   150   160   170   180   190   200    100   110   120   130   140

                         5
                         5                                                                                                                                    2
                                                                                                                                                              2
                                             2
                                             2
                                                                                      2
                                                                                      2                                                                                                                                                        2
                                                                                                                                                                                                                                               2
                                                                                                                                                                                                                                               2
                                                                                                            6
                                                                                                            6                                                               6
                                                                                                                                                                            6
                                                                                                                                                          1
                                                                                                                                                          1                                                                  1
                                                                                                                                                                                                                             1
                                                                                                                                                                                                                             1
                                   1
                                   1
                6
                6                                                                     1
                                                                                      1                                                                                                                                                                                             6
                                                                                                                                                                                                                                                                                    6
                                                                                                                                                                                                                                                                                    6




                                                                                    Figure 6. Quality details per landmark. We present analysis of annotation quality per landmark in experiment 4. We
Figure 5. Quality details. We presentbest pair forof annotation quality forbetween 35th4. For every image the best fitting between points “C” and
                                        detailed analysis all annotations experiments 3 and and 65th percentiles -                                                          “E” of experiment 4 in fig. 5.
pair of annotations is selected. The score of the best pair is shown in the figure. For experiment 3 we score annotations by the area of
their symmetric difference (XOR) divided bysame scale:union(OR). For experimenttowe compute the average distance between the
                                       the the area of their from image 100 4 200 on horizontal axis and from 3 pixels to 13 pixels                                         of error on the vertical axis. T
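The experiment 3 score named in the Figure 5 caption, the area of the symmetric difference (XOR) divided by the area of the union (OR), can be sketched as follows. The binary-mask representation and the function name `xor_over_union` are assumptions made for illustration; they are not taken from the cited paper.

```python
def xor_over_union(mask_a, mask_b):
    """Area of the symmetric difference divided by the area of the union.

    Masks are equal-sized 2D lists of 0/1 values (an assumed representation).
    0.0 means the two annotated regions coincide; 1.0 means they are disjoint.
    """
    xor_area = 0
    union_area = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            xor_area += int(bool(a) != bool(b))   # pixel in exactly one mask
            union_area += int(bool(a) or bool(b))  # pixel in at least one mask
    if union_area == 0:
        return 0.0  # both masks empty: treated as perfect agreement (assumption)
    return xor_area / union_area


# Two annotations that share one pixel out of three covered pixels.
score = xor_over_union([[1, 1], [0, 0]], [[1, 0], [1, 0]])
```

Lower scores indicate better agreement between a pair of annotations, so selecting the "best pair" for an image amounts to taking the pair with the smallest score.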
+	
  
        Creating annotated training sets
        [Soleymani and Larson, 2010]

        n    MediaEval 2010 Affect Task

        n    Use of Amazon Mechanical Turk to annotate the Affect Task Corpus

        n    126 videos (2-5 mins in length)

        n    Annotate
              n    Mood (e.g., pleased, helpless, energetic, etc.)
              n    Emotion (e.g., sadness, joy, anger, etc.)
              n    Boredom (nine-point rating scale)
              n    Like (nine-point rating scale)
  
+	
  
        Creating annotated training sets
        [Nowak and Ruger, 2010]

        n    Crowdsourcing image concepts. 53 concepts, e.g.,
              n    Abstract categories: partylife, beach holidays, snow, etc.
              n    Time of the day: day, night, no visual cue
              n    …

        n    Subset of 99 images from the ImageCLEF2009 dataset

        Paper excerpt shown on the slide:

        Place contains three mutual exclusive concepts, namely Indoor, Outdoor and No Visual Place. In contrast several optional concepts belong to the category Landscape Elements. The task of the annotators was to choose exactly one concept for categories with mutual exclusive concepts and to select all applicable concepts for optional designed concepts. All photos were annotated at an image-based level. The annotator tagged the whole image with all applicable concepts and then continued with the next image.

        3.3.1 Design of HIT Template
        The design of the HITs at MTurk for the im[…]tion task is similar to the annotation tool that w[…] to the expert annotators (see Sec. 3.2). Each H[…] of the annotation of one image with all applica[…]cepts. It is arranged as a question survey and […] into three sections. The section Scene Descript[…] section Representation each contain four questio[…]tion Pictured Objects consists of three questions […] each section the image to be annotated is pres[…] repetition of the image ensures that the turke[…] while answering the questions without scrolling […] of the document. Fig. 2 illustrates the questi[…] section Representation.

        Figure 1: Annotation tool that was used for the acquisition of expert annotations.
+	
  
        Creating annotated training sets
        [Nowak and Ruger, 2010]

        n    Study of expert and non-expert labeling

              n    Inter-annotator agreement among experts:
                    n  Very high

              n    Influence of the expert ground truth on concept-based retrieval ranking:
                    n  Very limited

              n    Inter-annotator agreement among non-experts:
                    n  High, although not as good as among experts

              n    Influence of averaged annotations (experts vs. non-experts) on concept-based retrieval ranking:
                    n  Averaging filters out noisy non-expert annotations
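The last point, that averaging filters out noisy non-expert annotations, can be illustrated with a minimal sketch. It assumes each worker gives a 0/1 vote per concept and that the averaged vote is thresholded at 0.5; the data layout, the threshold, and the function name `average_annotations` are illustrative assumptions, not the procedure used in the cited study.

```python
def average_annotations(votes, threshold=0.5):
    """Average per-worker 0/1 votes for each concept and threshold the mean.

    votes: list of dicts, one per worker, mapping concept name -> 0 or 1
           (an assumed layout; all workers rate the same concepts).
    Returns a dict mapping concept name -> aggregated 0/1 label.
    """
    if not votes:
        raise ValueError("at least one worker annotation is required")
    averaged = {}
    for concept in votes[0]:
        mean = sum(worker[concept] for worker in votes) / len(votes)
        averaged[concept] = int(mean >= threshold)
    return averaged


# Three workers label one image; the third worker is noisy.
workers = [
    {"beach holidays": 1, "snow": 0},
    {"beach holidays": 1, "snow": 0},
    {"beach holidays": 0, "snow": 1},
]
labels = average_annotations(workers)  # the single dissenting vote is outvoted
```

With an odd number of workers and a 0.5 threshold this reduces to majority voting per concept.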
  
+	
  
        Creating annotated training sets
        [Vondrick et al., 2010]

        n    Crowdsourcing object tracking in video

        n    Annotators draw bounding boxes

        Fig. 2: Our video labeling user interface. All previously labeled entities are shown […]
+	
  
        Creating annotated training sets
        [Vondrick et al., 2010]

        n    Annotators label the enclosing bounding box of an entity every T frames

        n    Bounding boxes at intermediate time instants are interpolated

        n    Interesting trade-off between
              n    Cost of MTurk workers
              n    Cost of interpolation on Amazon EC2 cloud

        Figure panels: (a) Field drills, (b) Basketball players
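To make the keyframe idea concrete, the sketch below fills in the boxes between two annotated frames that are T frames apart. It assumes plain linear interpolation and an `(x_min, y_min, x_max, y_max)` box format; both are simplifying assumptions for illustration, and the cited work may use a more elaborate interpolation scheme.

```python
def interpolate_boxes(box_start, box_end, T):
    """Linearly interpolate bounding boxes between two keyframes.

    box_start, box_end: (x_min, y_min, x_max, y_max) at frames 0 and T
                        (an assumed box format).
    T: number of frames between the two annotated keyframes (T >= 1).
    Returns a list of T + 1 boxes, including both keyframes.
    """
    if T < 1:
        raise ValueError("T must be at least 1")
    boxes = []
    for t in range(T + 1):
        alpha = t / T  # 0.0 at the first keyframe, 1.0 at the second
        boxes.append(tuple((1 - alpha) * s + alpha * e
                           for s, e in zip(box_start, box_end)))
    return boxes


# Annotators label frames 0 and 2; frame 1 is filled in automatically.
track = interpolate_boxes((0, 0, 10, 10), (10, 20, 20, 30), T=2)
```

A larger T means fewer boxes for workers to draw but more reliance on the interpolation, which is the trade-off the slide refers to.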
+	
  
        Creating annotated training sets
        [Urbano et al., 2010]

        n  Goal: evaluation of music information retrieval systems

        n  Use crowdsourcing as an alternative to experts to create ground truths of partially ordered lists

        n  Good agreement (92% complete + partial) with experts

        Paper excerpt shown on the slide:

        4.1 HIT Design
        The use of preference judgments is prone to have a very simple HIT design (see Figure 4). We asked workers to listen to the […] the two incipits to compare. Next, they were asked what variation was more similar to the original melody, allowing 3 options: A is more similar, B is more similar, and they are either equally similar or dissimilar. We indicated them that if one melody was part of another one, they had to be considered equally similar, so as to comply with the original guidelines. As optional questions, they were asked for their musical background, if any, and for comments or suggestions to give us some feedback.

        Preference Judgments (table column shown on the slide; the other columns are cut off):
              C<F, D<F, E<F, A<F, G=F, B<F
              C=B, D>B, E>B, A=B
              C=A, D=E
              -
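One way to turn three-option preference judgments into a partially ordered list is a quicksort-like partition around a pivot, sketched below. This is an illustrative reading under stated assumptions, not the authors' exact procedure: the `prefer` callback, its return convention, and the choice of the first item as pivot are all hypothetical.

```python
def partially_ordered_list(items, prefer):
    """Group items into ordered relevance groups using pairwise judgments.

    prefer(a, b) is an assumed callback returning
      -1 if a is judged more similar to the query than b,
       0 if they are judged equally similar (or dissimilar),
       1 if b is judged more similar than a.
    Returns a list of groups, most similar group first.
    """
    if not items:
        return []
    pivot, rest = items[0], items[1:]
    better, same, worse = [], [pivot], []
    for item in rest:
        judgment = prefer(item, pivot)  # one crowd judgment per comparison
        if judgment < 0:
            better.append(item)
        elif judgment > 0:
            worse.append(item)
        else:
            same.append(item)
    return (partially_ordered_list(better, prefer)
            + [same]
            + partially_ordered_list(worse, prefer))


# Toy judgments derived from made-up similarity scores (higher = more similar).
scores = {"A": 2, "B": 2, "C": 2, "D": 1, "E": 1, "F": 3, "G": 3}

def prefer(a, b):
    return (scores[b] > scores[a]) - (scores[a] > scores[b])

groups = partially_ordered_list(list("FCDEAGB"), prefer)
```

Because items are only compared against pivots, fewer judgments are needed than comparing every pair. With real, noisy judgments the result can depend on which items serve as pivots.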
+	
  
        Validate the output of MIR systems
        [Snoek et al., 2010][Freiburg et al., 2011]

        n      Search engine for archival rock ‘n’ roll concert video

        n      Use of crowdsourcing to improve, extend and share automatically detected concepts in video fragments

        Figure 1: Eleven common concert concepts we detect automatically, and for which we collect user-feedback: Audience, Close-up, Hands, Pinkpop hat, Keyboard, Guitar player, Drummer, Over the shoulder, Singer, Stage, Pinkpop logo.

        Figure 2: Timeline-based video player where colored dots correspond to automated visual detection results. Users can navigate directly to fragments of interest by interaction with the colored dots, which pop-up a feedback overlay as displayed in Figure 3.

        Figure 4: Results for Experiment 2: Quality vs […]. Vertical axis: Video Fragments; horizontal axis: User-Feedback Agreement (>50% to >90%); series: Excluded correct fragment labels, Crowdsourcing errors.
+	
  
        Validate the output of MIR systems
        [Steiner et al., 2011]

        n    Propose a browser extension to navigate detected events in videos
              n    Visual events (shot changes)
              n    Occurrence events (analysis of metadata by means of NLP to detect named entities)
              n    Interest-based events (click counters on detected visual events)

        Paper excerpt shown on the slide (Crowdsourcing Event Detection in YouTube Videos):

        […] through a combination of textual, visual, and behavioral analysis techniques. When a user starts watching a video, three event detection processes start:

        Visual Event Detection Process. We detect shots in the video by visually analyzing its content [19]. We do this with the help of a browser extension, i.e., the whole process runs on the client-side using the modern HTML5 [12] JavaScript APIs of the <video> and <canvas> elements. As soon as the shots have been detected, we offer the user the choice to quickly jump into a specific shot by clicking on a representative still frame.

        Occurrence Event Detection Process. We analyze the available video metadata using NLP techniques, as outlined in [18]. The detected named entities are presented to the user in a list, and upon click via a timeline-like user interface allow for jumping into one of the shots where the named entity occurs.

        Interest-based Event Detection Process. As soon as the visual events have been detected, we attach JavaScript event listeners to each of the shots and count clicks on shots as an expression of interest in those shots.

        Fig. 2: Screenshot of the YouTube browser extension, showing the three different event […]
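The interest-based events amount to a click counter per detected shot. The sketch below mirrors that idea in Python rather than in the browser extension's JavaScript; the class name `ShotInterestTracker` and its methods are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter


class ShotInterestTracker:
    """Counts clicks on detected shots as an expression of interest."""

    def __init__(self, shot_ids):
        # Start every detected shot at zero clicks.
        self.clicks = Counter({shot_id: 0 for shot_id in shot_ids})

    def on_click(self, shot_id):
        """Handler to call whenever a user clicks a shot's still frame."""
        if shot_id not in self.clicks:
            raise KeyError(f"unknown shot: {shot_id}")
        self.clicks[shot_id] += 1

    def ranked(self):
        """Shot ids ordered from most to least clicked."""
        return [shot_id for shot_id, _ in self.clicks.most_common()]


# Three detected shots; users click shot 2 twice and shot 0 once.
tracker = ShotInterestTracker([0, 1, 2])
for clicked in (2, 0, 2):
    tracker.on_click(clicked)
ranking = tracker.ranked()
```

Shots with equal click counts keep their original detection order in the ranking.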
+	
  
        Validate the output of MIR systems
        [Goeau et al., 2011]

        n    Visual plant species identification
              n    Based on local visual features
              n    Crowdsourced validation

        Figure 1: GUI of the web application.
+	
  
                     Validate the output of MIR systems
                     [Yan et al., 2010]

                     n    CrowdSearch combines
                           n    Automated image search
                                 n  Local processing on mobile phones + backend processing
                           n    Real-time human validation of search results
                                 n  Amazon Mechanical Turk

                     n    Studies the trade-off in terms of
                           n  Delay
                           n  Accuracy
                           n  Cost
                           n  More on this later…

                     Excerpt from [Yan et al., 2010]: To balance these tradeoffs, CrowdSearch uses an adaptive algorithm that uses delay and result prediction models of human responses to judiciously use human validation. Once a candidate image is validated, it is returned to the user as a valid search result.

                     3. CROWDSOURCING FOR SEARCH
                     In this section, we first provide a background of the Amazon Mechanical Turk (AMT). We then discuss several design choices that we make while using crowdsourcing for image validation including: 1) how to construct tasks such that they are likely to be answered quickly, 2) how to minimize human error and bias, and 3) how to price a validation task to minimize delay.

                     Figure 2: Shown are an image search query, candi-


           +	
  
                    Query expansion / reformulation
                    [Harris, 2012]

                    n    Search YouTube user generated content
                    n    Natural language queries are restated and given as input to
                          n    YouTube search interface
                          n    Students
                          n    Crowd in MTurk

                    Figure: Overview of the video retrieval process involving YouTube's search interface, students, and the crowd.
                    Table: Overall MAP scores for each search strategy (Student Search, Crowdsourcing, YouTube Search).
                    Table: MAP scores for each search strategy, broken down by search category (Easy, Medium, Difficult).
+	
  
        Aggregating annotations
+	
  
        Annotation model

        n     A set of objects to annotate  $i = 1, \ldots, I$

        n     A set of annotators  $j = 1, \ldots, J$

        n     Types of annotations
               n    Binary
               n    Categorical (multi-class)
               n    Numerical
               n    Other

        	
  
+	
  
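        Annotation model

        Diagram: true labels $y_1, y_2, y_3$ of the objects (left) are connected to the annotators (right); each connection carries one annotation, e.g. $y_1^1, y_2^1, y_1^2, y_2^3, y_3^3, y_1^4, y_1^5, y_2^5$.

        Annotations  $y_i^j \in L$
              n    Binary  $|L| = 2$
              n    Multi-class  $|L| > 2$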
+	
  
        Aggregating annotations

        n    Majority voting (baseline)
              n    For each object, assign the label that received the largest number of votes

        n    Aggregating annotations
              n    [Dawid and Skene, 1979]
              n    [Snow et al., 2008]
              n    [Whitehill et al., 2009]
              n    …

        n    Aggregating and learning
              n    [Sheng et al., 2008]
              n    [Donmez et al., 2009]
              n    [Raykar et al., 2010]
              n    …
  
+	
  
        Aggregating annotations
        Majority voting

        n    Assume that
              n    The annotator quality is independent from the object  $P(y_i^j = y_i) = p_j$
              n    All annotators have the same quality  $p_j = p$

        n    The integrated quality of majority voting using $J = 2N + 1$ annotators is

                       $$q = P(y^{MV} = y) = \sum_{l=0}^{N} \binom{2N+1}{l} \, p^{2N+1-l} \cdot (1 - p)^{l}$$
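
As a quick numerical check of the integrated quality formula, the sketch below evaluates q for an odd number of annotators with identical quality p. The function name and the restriction to odd counts are choices made for this example.

```python
from math import comb


def integrated_quality(p, n_annotators):
    """Probability that a majority vote over n_annotators is correct.

    Assumes binary labels, independent annotators, and the same
    individual quality p for every annotator; n_annotators = 2N + 1.
    """
    if n_annotators < 1 or n_annotators % 2 == 0:
        raise ValueError("n_annotators must be a positive odd number")
    n_max_wrong = (n_annotators - 1) // 2  # N: at most N annotators may be wrong
    return sum(
        comb(n_annotators, l) * p ** (n_annotators - l) * (1 - p) ** l
        for l in range(n_max_wrong + 1)
    )


for p in (0.4, 0.5, 0.7):
    print(p, [round(integrated_quality(p, n), 3) for n in (1, 3, 5, 7)])
```

With p above 0.5 the integrated quality grows with the number of annotators, with p equal to 0.5 it stays at 0.5, and with p below 0.5 adding annotators makes the consensus worse.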
+	
  
               Aggregating annotations
               Majority voting

               Figure 2: The relationship between integrated labeling quality, individual quality, and the number of labelers.
               (Plot: integrated quality versus number of labelers, from 1 to 13, with one curve per individual quality p = 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0)
+	
  
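        Aggregating annotations
        [Snow et al., 2008]

        n    Binary labels:  $y_i^j \in \{0, 1\}$

        n    The true label is estimated by evaluating the posterior log-odds, i.e.,

                 $$\log \frac{P(y_i = 1 \mid y_i^1, \ldots, y_i^J)}{P(y_i = 0 \mid y_i^1, \ldots, y_i^J)}$$

        n    Applying Bayes' theorem, and assuming that the annotations are conditionally independent given the true label,

                 $$\underbrace{\log \frac{P(y_i = 1 \mid y_i^1, \ldots, y_i^J)}{P(y_i = 0 \mid y_i^1, \ldots, y_i^J)}}_{\text{posterior}} = \underbrace{\sum_{j} \log \frac{P(y_i^j \mid y_i = 1)}{P(y_i^j \mid y_i = 0)}}_{\text{likelihood}} + \underbrace{\log \frac{P(y_i = 1)}{P(y_i = 0)}}_{\text{prior}}$$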
+	
  
        Aggregating annotations
        [Snow et al., 2008]

        n    How to estimate $P(y_i^j \mid y_i = 1)$ and $P(y_i^j \mid y_i = 0)$?

        n    Gold standard:
              n    Some objects have known labels
              n    Ask the annotators to annotate these objects
              n    Compute the empirical p.m.f. for the object(s) with known labels

                     $$P(y^j = 1 \mid y = 1) = \frac{\text{Number of correct annotations}}{\text{Number of annotations of objects with label} = 1}$$

              n    Compute the performance of annotator $j$ (independent from the object)

                     $$P(y_1^j \mid y_1 = 1) = P(y_2^j \mid y_2 = 1) = \ldots = P(y_I^j \mid y_I = 1) = P(y^j \mid y = 1)$$
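
The gold-standard estimate can be illustrated with a small counting routine. This is a sketch under stated assumptions: the dictionary-based data layout and the optional add-one (Laplace) smoothing are not part of the slides, and smoothing is included only to avoid probabilities of exactly 0 or 1 when few gold objects are available.

```python
def estimate_annotator_model(gold, responses, smoothing=1.0):
    """Estimate P(y^j = 1 | y = 1) and P(y^j = 1 | y = 0) for one annotator.

    gold:      {object_id: known true label, 0 or 1}
    responses: {object_id: label given by annotator j, 0 or 1}
    smoothing: pseudo-count added to each cell (assumption of this sketch;
               with smoothing=0 this is the plain empirical p.m.f.)
    """
    counts = {0: [0, 0], 1: [0, 0]}  # counts[true_label][given_label]
    for obj, given in responses.items():
        if obj in gold:  # only objects with a known label are used
            counts[gold[obj]][given] += 1
    p1_given_1 = (counts[1][1] + smoothing) / (sum(counts[1]) + 2 * smoothing)
    p1_given_0 = (counts[0][1] + smoothing) / (sum(counts[0]) + 2 * smoothing)
    return p1_given_1, p1_given_0


# Hypothetical gold labels and the answers of a single annotator
gold = {"a": 1, "b": 1, "c": 1, "d": 0, "e": 0}
answers = {"a": 1, "b": 1, "c": 0, "d": 0, "e": 1, "f": 1}  # "f" has no gold label
print(estimate_annotator_model(gold, answers, smoothing=0.0))
print(estimate_annotator_model(gold, answers))
```

The probabilities for a response of 0 follow as the complements, since the labels are binary.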
+	
  
        Aggregating annotations
        [Snow et al., 2008]

        n    Each annotator's vote is weighted by the log-likelihood ratio for their given response (Naïve Bayes)

        n    More reliable annotators are weighted more

                     $$\log \frac{P(y_i = 1 \mid y_i^1, \ldots, y_i^J)}{P(y_i = 0 \mid y_i^1, \ldots, y_i^J)} = \sum_{j} \log \frac{P(y_i^j \mid y_i = 1)}{P(y_i^j \mid y_i = 0)} + \log \frac{P(y_i = 1)}{P(y_i = 0)}$$

        n    Issue: Obtaining a gold standard is costly!
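
The weighted vote can be sketched as a direct evaluation of the posterior log-odds. The per-annotator probabilities are assumed to be already available (for instance from a gold standard) and to lie strictly between 0 and 1; the function names, the data layout, and the example numbers are hypothetical.

```python
from math import log


def posterior_log_odds(labels, models, prior_pos=0.5):
    """Naive Bayes log-odds that the true label is 1.

    labels: {annotator: label given to this object, 0 or 1}
    models: {annotator: (P(y^j = 1 | y = 1), P(y^j = 1 | y = 0))}
    prior_pos: prior probability P(y = 1)
    """
    score = log(prior_pos / (1.0 - prior_pos))  # prior term
    for annotator, given in labels.items():
        p1_pos, p1_neg = models[annotator]
        if given == 1:
            score += log(p1_pos / p1_neg)  # log-likelihood ratio of a "1" vote
        else:
            score += log((1.0 - p1_pos) / (1.0 - p1_neg))  # ratio of a "0" vote
    return score


def naive_bayes_label(labels, models, prior_pos=0.5):
    """Decide 1 if the posterior log-odds are positive, 0 otherwise."""
    return 1 if posterior_log_odds(labels, models, prior_pos) > 0 else 0


# One reliable annotator ("a") disagrees with two weak ones ("b", "c")
models = {"a": (0.9, 0.1), "b": (0.6, 0.4), "c": (0.6, 0.4)}
labels = {"a": 0, "b": 1, "c": 1}
print(posterior_log_odds(labels, models), naive_bayes_label(labels, models))
```

In this example a plain majority vote would output 1, while the weighted vote follows the single reliable annotator and outputs 0, which shows how more reliable annotators are weighted more.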
  
              	
  
+	
  
        Aggregating annotations
        [Kumar and Lease, 2011]

        n    With very accurate annotators, it is better to label more examples once  ($p_j \sim U(0.6, 1.0)$)

              Figure 1: $p_{1:w} \sim U(0.6, 1.0)$. With very accurate annotators, generating multiple labels (to improve consensus label accuracy) provides little benefit. Instead, labeling effort is better spent single labeling more examples.

        n    With very noisy annotators, aggregating labels helps, if annotator accuracies are taken into account  ($p_j \sim U(0.3, 0.7)$)

              Figure 2: $p_{1:w} \sim U(0.4, 0.6)$. With very noisy annotators, single labeling yields such poor training data that there is no benefit from labeling more examples (i.e. a flat learning rate). MV just aggregates this noise to produce more noise. In contrast, by modeling worker accuracies and weighting their labels appropriately, NB can improve consensus labeling accuracy (and thereby classifier accuracy).

              Figure 3: $p_{1:w} \sim U(0.3, 0.7)$. With greater variance in accuracies vs. Figure 2, NB further improves.

              SL: Single Labeling; MV: Majority Voting; NB: Naïve Bayes

Crowdsourcing for Multimedia Retrieval

  • 1. Crowdsourcing for Multimedia Retrieval
       Marco Tagliasacchi, Politecnico di Milano, Italy
  • 2. Outline
       - Crowdsourcing applications in multimedia retrieval
       - Aggregating annotations
       - Aggregating and learning
       - Crowdsourcing at work
  • 3. Crowdsourcing applications in multimedia retrieval
  • 4. Crowdsourcing
       - Crowdsourcing is an example of human computing
       - Use an online community of human workers to complete useful tasks
       - The task is outsourced to an undefined public
       - Main idea: design tasks that are
           - Easy for humans
           - Hard for machines
  • 5. Crowdsourcing
       - Crowdsourcing platforms
           - Paid contributors
               - Amazon Mechanical Turk (www.mturk.com)
               - CrowdFlower (crowdflower.com)
               - oDesk (www.odesk.com)
               - …
           - Volunteers
               - Foldit (www.fold.it)
               - Duolingo (www.duolingo.com)
               - …
  • 6. Applications in multimedia retrieval
       - Create annotated data sets for training
           - Reduces both cost and time needed to gather annotations,
           - …but annotations might be noisy!
       - Validate the output of multimedia retrieval systems
       - Query expansion / reformulation
  • 7. Creating annotated training sets [Sorokin and Forsyth, 2008]
       - Collect annotations for computer vision data sets
           - people segmentation
       [Figure: example annotations for Protocol 1 and Protocol 2]
  • 8. Creating annotated training sets [Sorokin and Forsyth, 2008]
       - Collect annotations for computer vision data sets
           - people segmentation and pose annotation
       [Figure 1 (Protocols 2, 3 and 4): example results obtained from the annotation experiments. …]
  • 9. Creating annotated training sets [Sorokin and Forsyth, 2008]
       - Observations:
           - Annotators make errors
           - Quality of annotators is heterogeneous
           - The quality of the annotations depends on the difficulty of the task
       [Plot, Experiment 3: trace the boundary of the person. area(XOR)/area(AND). The lower the better. Mean 0.21, std 0.14, median 0.16]
       [Plot, Experiment 4: click on 14 landmarks. Mean error in pixels between annotation points. The lower the better. Mean 8.71, std 6.29, median 7.35. Landmarks: Head, Neck, lShoulder, rShoulder, lElbow, rElbow, lWrist, rWrist, lHip, rHip, lKnee, rKnee, lAnkle, rAnkle]
       Figure 5. Quality details. We present detailed analysis of annotation quality for experiments 3 and 4. For every image the best fitting pair of annotations is selected. The score of the best pair is shown in the figure. For experiment 3 we score annotations by the area of their symmetric difference (XOR) divided by the area of their union (OR). For experiment 4 we compute the average distance between the …
       Figure 6. Quality details per landmark. We present analysis of annotation quality per landmark in experiment 4. …
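The two quality scores mentioned for experiments 3 and 4 can be sketched as follows. This is a minimal illustration, assuming that segmentations are given as equally sized binary masks (nested lists) and that landmarks are given as lists of (x, y) points in the same order. The function names and toy inputs are hypothetical. The sketch follows the caption's definition (symmetric difference over union); the plot title normalizes by area(AND) instead.

```python
import math


def xor_over_union(mask_a, mask_b):
    """Area of the symmetric difference divided by the area of the union.

    Both masks are equally sized nested lists of 0/1 values. Lower is better;
    0.0 means the two segmentations coincide.
    """
    xor_area = 0
    union_area = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            in_a, in_b = bool(a), bool(b)
            xor_area += in_a != in_b
            union_area += in_a or in_b
    return xor_area / union_area if union_area else 0.0


def mean_landmark_error(points_a, points_b):
    """Average Euclidean distance in pixels between corresponding landmarks."""
    distances = [
        math.hypot(ax - bx, ay - by)
        for (ax, ay), (bx, by) in zip(points_a, points_b)
    ]
    return sum(distances) / len(distances)


# Toy inputs: two 2x3 masks and two landmark lists with two points each.
seg_score = xor_over_union([[1, 1, 0], [0, 0, 0]], [[0, 1, 1], [0, 0, 0]])
pose_score = mean_landmark_error([(0, 0), (3, 4)], [(0, 0), (0, 0)])
print(seg_score, pose_score)
```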
  • 10. Creating annotated training sets [Soleymani and Larson, 2010]
       - MediaEval 2010 Affect Task
       - Use of Amazon Mechanical Turk to annotate the Affect Task Corpus
       - 126 videos (2-5 mins in length)
       - Annotate
           - Mood (e.g., pleased, helpless, energetic, etc.)
           - Emotion (e.g., sadness, joy, anger, etc.)
           - Boredom (nine point rating scale)
           - Like (nine point rating scale)
  • 11. Creating annotated training sets [Nowak and Ruger, 2010]
       - Crowdsourcing image concepts. 53 concepts, e.g.,
           - Abstract categories: partylife, beach holidays, snow, etc.
           - Time of the day: day, night, no visual cue
           - …
       - Subset of 99 images from the ImageCLEF2009 dataset
       [Paper excerpt: Place contains three mutual exclusive concepts, namely Indoor, Outdoor and No Visual Place. In contrast several optional concepts belong to the category Landscape Elements. The task of the annotators was to choose exactly one concept for categories with mutual exclusive concepts and to select all applicable concepts for optional designed concepts. All photos were annotated at an image-based level. The annotator tagged the whole image with all applicable concepts and then continued with the next image.]
       [Paper excerpt, right column cropped: 3.3.1 Design of HIT Template. … It is arranged as a question survey and […] into three sections. …]
       Figure 1: Annotation tool that was used for the acquisition of expert annotations.
  • 12. Creating annotated training sets [Nowak and Ruger, 2010]
       - Study of expert and non-expert labeling
       - Inter-annotation agreement among experts:
           - very high
       - Influence of the expert ground truth on concept-based retrieval ranking:
           - very limited
       - Inter-annotation agreement among non-experts
           - High, although not as good as among experts
       - Influence of averaged annotations (experts vs. non experts) on concept-based retrieval ranking:
           - Averaging filters out noisy non-expert annotations
  • 13. Creating annotated training sets [Vondrick et al., 2010]
       - Crowdsourcing object tracking in video
       - Annotators draw bounding boxes
       Fig. 2: Our video labeling user interface. All previously labeled entities are shown …
  • 14. Creating annotated training sets [Vondrick et al., 2010]
       - Annotators label the enclosing bounding box of an entity every T frames
       - Bounding boxes at intermediate time instants are interpolated
       - Interesting trade-off between
           - Cost of MTurk workers
           - Cost of interpolation on Amazon EC2 cloud
       [Figure panels: (a) Field drills, (b) Basketball players]
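The idea of annotating only every T frames and filling in the frames in between can be sketched with plain linear interpolation. This is an assumption made for illustration only: the slide does not state which interpolation method is used, and the reference to cloud computing cost suggests a more elaborate one. The function name, the (x, y, width, height) box format and the toy keyframes are hypothetical.

```python
def interpolate_boxes(box_start, box_end, num_frames):
    """Linearly interpolate (x, y, width, height) boxes between two keyframes.

    box_start is the annotated box at frame 0 and box_end the annotated box at
    frame num_frames; the result holds one box per frame, 0..num_frames inclusive.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    boxes = []
    for t in range(num_frames + 1):
        alpha = t / num_frames  # 0.0 at the first keyframe, 1.0 at the second
        boxes.append(
            tuple((1 - alpha) * s + alpha * e for s, e in zip(box_start, box_end))
        )
    return boxes


# Toy keyframes annotated T = 4 frames apart.
keyframe_a = (10.0, 20.0, 50.0, 80.0)
keyframe_b = (30.0, 20.0, 50.0, 100.0)
track = interpolate_boxes(keyframe_a, keyframe_b, 4)
print(track[2])
```

A larger T lowers the number of boxes that workers must draw, at the price of interpolated boxes that may drift from the true object position.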
  • 15. Creating annotated training sets [Urbano et al., 2010]
       - Goal: evaluation of music information retrieval systems
       - Use crowdsourcing as an alternative to experts to create ground truths of partially ordered lists
       - Good agreement (92% complete + partial) with experts
       [Paper excerpt, partly cropped: 4.1 HIT Design. The use of preference judgments is prone to have a very simple HIT design (see Figure 4). We asked workers to listen to […] the two incipits to compare. Next, they were asked what variation was more similar to the original melody, allowing 3 options: A is more similar, B is more similar, and they are either equally similar or dissimilar. We indicated them that if one melody was part of another one, they had to be considered equally similar, so as to comply with the original guidelines. As optional questions, they were asked for their musical background, if any, and for comments or suggestions to give us some feedback.]
       [Table residue, preference judgments per iteration: C<F, D<F, E<F, A<F, G=F, B<F; C=B, D>B, E>B, A=B; D=E]
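The preference judgments listed in the excerpt (each document compared against a pivot and judged less, equally or more similar) suggest a quicksort-like way of building a partially ordered list. The following is a minimal sketch of that idea under stated assumptions: the pivot is simply the first item, judgments are consistent, and the relevance values are toy numbers invented for illustration. The function names are hypothetical and the code is not the procedure of the cited paper.

```python
def ordered_groups(items, compare):
    """Partition items into groups ordered from most to least similar.

    compare(a, b) returns a positive number if a is judged more similar to the
    query than b, a negative number if less similar, and 0 if equally similar.
    Items judged equal to a pivot end up in the same group.
    """
    if not items:
        return []
    pivot = items[0]  # assumption: the first item serves as pivot
    more, equal, less = [], [pivot], []
    for item in items[1:]:
        judgment = compare(item, pivot)
        if judgment > 0:
            more.append(item)
        elif judgment < 0:
            less.append(item)
        else:
            equal.append(item)
    return ordered_groups(more, compare) + [equal] + ordered_groups(less, compare)


# Toy relevance levels standing in for crowdsourced preference judgments.
toy_relevance = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 3, "G": 3}


def toy_judgment(a, b):
    return toy_relevance[a] - toy_relevance[b]


groups = ordered_groups(["F", "C", "D", "E", "A", "G", "B"], toy_judgment)
print(groups)
```

With real workers the judgments may be inconsistent, so each comparison would typically be collected from several workers and aggregated before it is used.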
  • 16. Validate the output of MIR systems [Snoek et al., 2010][Freiburg et al., 2011]
       - Search engine for archival rock 'n' roll concert video
       - Use of crowdsourcing to improve, extend and share automatically detected concepts in video fragments
       Figure 1: Eleven common concert concepts we detect automatically, and for which we collect user-feedback: Audience, Close-up, Hands, Pinkpop hat, Keyboard, Guitar player, Drummer, Over the shoulder, Singer, Stage, Pinkpop logo.
       Figure 2: Timeline-based video player where colored dots correspond to automated visual detection results. Users can navigate directly to fragments of interest by interaction with the colored dots, which pop-up a feedback overlay as displayed in Figure 3.
       Figure 4: Results for Experiment 2: Quality vs … [Axes: Video Fragments vs. User-Feedback Agreement (>50% to >90%); series: Excluded correct fragment labels, Crowdsourcing errors]
  • 17. +   Validate the output of MIR systems [Steiner et al., 2011]
        n  Propose a browser extension to navigate detected events in videos
              n  Visual events (shot changes)
              n  Occurrence events (analysis of metadata by means of NLP to detect named entities)
              n  Interest-based events (click counters on detected visual events)
        [Fig. 2: Screenshot of the YouTube browser extension]
  • 18. +   Validate the output of MIR systems [Goeau et al., 2011]
        n  Visual plant species identification
              n  Based on local visual features
              n  Crowdsourced validation
        [Figure 1: GUI of the web application]
  • 19. +   Validate the output of MIR systems [Yan et al., 2010]
        n  CrowdSearch combines
              n  Automated image search
                    n  Local processing on mobile phones + backend processing
              n  Real-time human validation of search results
                    n  Amazon Mechanical Turk
        n  Studies the trade-off in terms of
              n  Delay
              n  Accuracy
              n  Cost
        n  More on this later…
        [Figure 2: an image search query, candidate images, and duplicated validation tasks]
  • 20. +   Query expansion / reformulation [Harris, 2012]
        n  Search YouTube user generated content
        n  Natural language queries are restated and given as input to the YouTube search interface
              n  Students
              n  Crowd in MTurk
        [Tables: MAP scores for each search strategy (Student Search, Crowdsourcing, YouTube Search), overall and broken down by search category (easy, medium, difficult)]
  • 21. +   Aggregating annotations
  • 22. +   Annotation model
        n  A set of objects to annotate: $i = 1, \ldots, I$
        n  A set of annotators: $j = 1, \ldots, J$
        n  Types of annotations
              n  Binary
              n  Categorical (multi-class)
              n  Numerical
              n  Other
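  • 23. +   Annotation model
        n  Annotations: $y_i^j \in L$ is the label given to object $i$ by annotator $j$; $y_i$ is the true label of object $i$
        n  Binary: $|L| = 2$
        n  Multi-class: $|L| > 2$
        [Diagram: objects with their true labels on one side, annotators on the other; each object-annotator link carries an annotation $y_i^j$]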
  • 24. +   Aggregating annotations
        n  Majority voting (baseline)
              n  For each object, assign the label that received the largest number of votes
        n  Aggregating annotations
              n  [Dawid and Skene, 1979]
              n  [Snow et al., 2008]
              n  [Whitehill et al., 2009]
              n  …
        n  Aggregating and learning
              n  [Sheng et al., 2008]
              n  [Donmez et al., 2009]
              n  [Raykar et al., 2010]
              n  …
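The majority voting baseline can be sketched in a few lines of Python. This is only an illustration: the function name `majority_vote` and the dictionary layout of the annotations are assumptions made here, not something the slides prescribe, and ties are broken arbitrarily (first label seen wins).

```python
from collections import Counter


def majority_vote(annotations):
    """Return the most voted label for each object.

    annotations: dict mapping an object id to the list of labels
    given by the annotators (hypothetical layout, for illustration).
    """
    consensus = {}
    for obj, labels in annotations.items():
        counts = Counter(labels)
        # most_common(1) yields the (label, count) pair with the highest
        # count; on a tie the label encountered first is returned.
        consensus[obj] = counts.most_common(1)[0][0]
    return consensus


votes = {
    "img1": [1, 1, 0],
    "img2": [0, 0, 1],
    "img3": ["cat", "dog", "cat"],
}
print(majority_vote(votes))
```

The same function works for binary and multi-class labels, since it only counts votes.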
  • 25. +   Aggregating annotations: majority voting
        n  Assume that
              n  The annotator quality is independent from the object: $P(y_i^j = y_i) = p_j$
              n  All annotators have the same quality: $p_j = p$
        n  The integrated quality of majority voting using $J = 2N + 1$ annotators is
              $$q = P(y^{MV} = y) = \sum_{l=0}^{N} \binom{2N+1}{l} \, p^{2N+1-l} \, (1-p)^{l}$$
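To get a feel for the integrated quality formula, it can be evaluated numerically. The sketch below assumes the setting of the slide (2N + 1 annotators, all with the same quality p, binary labels); the function name `integrated_quality` is a hypothetical helper introduced here.

```python
from math import comb


def integrated_quality(p, n):
    """Probability that the majority of 2n + 1 annotators is correct.

    p: probability that a single annotator gives the true label
       (assumed identical for all annotators).
    n: the majority is correct when at most n annotators are wrong.
    """
    total = 2 * n + 1
    # Sum over l = number of wrong annotators, from 0 to n.
    return sum(
        comb(total, l) * p ** (total - l) * (1 - p) ** l
        for l in range(n + 1)
    )


for p in (0.4, 0.5, 0.7, 0.9):
    row = [round(integrated_quality(p, n), 3) for n in range(0, 7)]
    print(f"p={p}: {row}")
```

With p above 0.5 the quality grows with the number of annotators, with p = 0.5 it stays at 0.5, and with p below 0.5 adding annotators makes the majority worse.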
  • 26. +   Aggregating annotations: majority voting
        [Figure 2: The relationship between integrated labeling quality, individual quality, and the number of labelers. Integrated quality is plotted against the number of labelers (1 to 13) for p = 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
  • 27. +   Aggregating annotations [Snow et al., 2008]
        n  Binary labels: $y_i^j \in \{0, 1\}$
        n  The true label is estimated by evaluating the posterior log-odds, i.e.,
              $$\log \frac{P(y_i = 1 \mid y_i^1, \ldots, y_i^J)}{P(y_i = 0 \mid y_i^1, \ldots, y_i^J)}$$
        n  Applying Bayes' theorem
              $$\log \frac{P(y_i = 1 \mid y_i^1, \ldots, y_i^J)}{P(y_i = 0 \mid y_i^1, \ldots, y_i^J)} = \sum_{j=1}^{J} \log \frac{P(y_i^j \mid y_i = 1)}{P(y_i^j \mid y_i = 0)} + \log \frac{P(y_i = 1)}{P(y_i = 0)}$$
              (left-hand side: posterior; sum: likelihood; last term: prior)
  • 28. +   Aggregating annotations [Snow et al., 2008]
        n  How to estimate $P(y_i^j \mid y_i = 1)$ and $P(y_i^j \mid y_i = 0)$?
        n  Gold standard:
              n  Some objects have known labels
              n  Ask to annotate these objects
              n  Compute the empirical p.m.f. for the object(s) with known labels
                    $$P(y^j = 1 \mid y = 1) = \frac{\text{Number of correct annotations}}{\text{Number of annotations of objects with label} = 1}$$
        n  Compute the performance of annotator $j$ (independent from the object)
              $$P(y_1^j \mid y_1 = 1) = P(y_2^j \mid y_2 = 1) = \ldots = P(y_I^j \mid y_I = 1) = P(y^j \mid y = 1)$$
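The gold-standard estimate can be sketched as follows. The function name `estimate_annotator_model`, the dictionary layout, and the `smoothing` parameter are assumptions made for this illustration; add-one smoothing is included only to avoid zero probabilities when an annotator has few gold answers, and setting `smoothing=0` gives the plain empirical frequencies of the slide.

```python
def estimate_annotator_model(gold, answers, smoothing=1.0):
    """Estimate P(y^j = 1 | y = 1) and P(y^j = 1 | y = 0) for one annotator.

    gold:    dict object id -> known true label (0 or 1).
    answers: dict object id -> label given by annotator j (0 or 1).
    Returns a dict {true_label: P(annotator says 1 | true_label)}.
    """
    said_one = {0: 0.0, 1: 0.0}  # times the annotator answered 1
    seen = {0: 0.0, 1: 0.0}      # gold objects the annotator labeled
    for obj, true_label in gold.items():
        if obj in answers:
            seen[true_label] += 1
            said_one[true_label] += answers[obj]
    # Two possible answers (0 and 1), hence 2 * smoothing in the denominator.
    return {
        t: (said_one[t] + smoothing) / (seen[t] + 2 * smoothing)
        for t in (0, 1)
    }


gold = {"a": 1, "b": 1, "c": 0, "d": 0}
answers = {"a": 1, "b": 0, "c": 0, "d": 0}
print(estimate_annotator_model(gold, answers, smoothing=0.0))
print(estimate_annotator_model(gold, answers))
```

Because the model is assumed independent from the object, the two numbers returned here are reused for every object the annotator labels.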
  • 29. +   Aggregating annotations [Snow et al., 2008]
        n  Each annotator vote is weighted by the log-likelihood ratio for their given response (Naïve Bayes)
              n  More reliable annotators are weighted more
              $$\log \frac{P(y_i = 1 \mid y_i^1, \ldots, y_i^J)}{P(y_i = 0 \mid y_i^1, \ldots, y_i^J)} = \sum_{j=1}^{J} \log \frac{P(y_i^j \mid y_i = 1)}{P(y_i^j \mid y_i = 0)} + \log \frac{P(y_i = 1)}{P(y_i = 0)}$$
        n  Issue: obtaining a gold standard is costly!
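The weighted vote can be sketched by summing one log-likelihood ratio per annotator and adding the prior log-odds. The function name `posterior_log_odds` and the way annotator models are passed in are assumptions of this example; the per-annotator probabilities would come from a gold standard as described above, and they must lie strictly between 0 and 1 for the logarithms to be defined.

```python
from math import log


def posterior_log_odds(labels, models, prior_one=0.5):
    """Posterior log-odds that the true label of one object is 1.

    labels: dict annotator id -> label given to this object (0 or 1).
    models: dict annotator id -> (P(y^j = 1 | y = 1), P(y^j = 1 | y = 0)).
    prior_one: prior probability P(y = 1).
    """
    score = log(prior_one / (1.0 - prior_one))  # prior term
    for j, label in labels.items():
        p1_given_1, p1_given_0 = models[j]
        if label == 1:
            score += log(p1_given_1 / p1_given_0)
        else:
            score += log((1.0 - p1_given_1) / (1.0 - p1_given_0))
    return score


# One reliable annotator says 1, two weak annotators say 0.
models = {"a": (0.9, 0.1), "b": (0.6, 0.4), "c": (0.6, 0.4)}
labels = {"a": 1, "b": 0, "c": 0}
score = posterior_log_odds(labels, models)
print(score, "-> label", 1 if score > 0 else 0)
```

In this example the log-odds equal log 4 (about 1.39), so the estimated label is 1 even though a plain majority vote would choose 0: the single reliable annotator outweighs the two weak ones.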
  • 30. +   Aggregating annotations [Kumar and Lease, 2011]
        n  With very accurate annotators, it is better to label more examples once: $p_j \sim U(0.6, 1.0)$
        n  With very noisy annotators, aggregating labels helps, if annotator accuracies are taken into account: $p_j \sim U(0.3, 0.7)$
        n  SL: Single Labeling; MV: Majority Voting; NB: Naïve Bayes
        [Figure 1: $p_{1:w} \sim U(0.6, 1.0)$. With very accurate annotators, generating multiple labels (to improve consensus label accuracy) provides little benefit. Instead, labeling effort is better spent single labeling more examples.]
        [Figure 2: $p_{1:w} \sim U(0.4, 0.6)$. With very noisy annotators, single labeling yields such poor training data that there is no benefit from labeling more examples (i.e. a flat learning rate). MV just aggregates this noise to produce more noise. In contrast, by modeling worker accuracies and weighting their labels appropriately, NB can improve consensus labeling accuracy (and thereby classifier accuracy).]
        [Figure 3: $p_{1:w} \sim U(0.3, 0.7)$. With greater variance in accuracies vs. Figure 2, NB further improves.]