Machine Learning at PeerIndex

Ferenc Huszár
@fhuszar

Wednesday, 16 May 12
PeerIndex.com: understand your influence




PeerPerks.com: free stuff for influencers




Machine Learning @ PeerIndex

                   •   The usual stuff
                       •   topic modelling/classification of tweets/statuses/URLs
                       •   identity resolution across Twitter, Facebook, LinkedIn
                       •   spambot/fraud detection: identifying people gaming the system
                       •   sentiment classification: happy/sad/neutral

                   •   The really exciting stuff
                       •   inferring networks of influence (more about this later)
                       •   visualising different aspects of influence in an engaging way
                       •   influence maximisation (submodular optimisation)




Inferring networks of influence

           Social network                Propagation probabilities p_{i,j}

           Information cascade logs

     http://www.pcworld.com/article/239719
          1079306 2011-08-25T00:03:06+01:00
          4549198 2011-08-25T04:32:25+01:00
          2662975 2011-08-25T00:35:11+01:00
          2333224 2011-08-25T01:43:18+01:00
          3141371 2011-08-25T01:52:06+01:00
          3482720 2011-08-25T07:18:24+01:00
          1403682 2011-08-25T03:52:58+01:00
          4679657 2011-08-25T01:07:48+01:00
            32460 2011-08-25T01:11:43+01:00

     http://techcrunch.com/2011/11/21/...
           259725 2011-10-24T03:32:19+01:00
            76539 2011-10-24T03:32:23+01:00
          1922351 2011-10-24T04:28:47+01:00
             9183 2011-10-24T03:30:57+01:00
          3335398 2011-10-24T03:34:01+01:00
          1616885 2011-10-24T03:48:16+01:00
            82198 2011-10-24T03:48:29+01:00
           906390 2011-10-24T23:13:51+01:00
          1051322 2011-10-24T03:40:02+01:00
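
A minimal sketch of how a cascade log like the one above could be turned into an ordered cascade. It assumes each row is a numeric user id followed by an ISO 8601 share time; the slide does not label the columns, so that reading is an assumption, and `parse_cascade` is a hypothetical helper name.

```python
from datetime import datetime


def parse_cascade(lines):
    """Parse 'user_id timestamp' rows and return them earliest share first.

    Assumption: column 1 is a user id, column 2 is an ISO 8601 share time.
    """
    events = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed rows
        user_id, stamp = parts
        events.append((int(user_id), datetime.fromisoformat(stamp)))
    # Sorting by share time gives the cascade order u_1, u_2, ..., u_n.
    events.sort(key=lambda event: event[1])
    return events


log = [
    "1079306 2011-08-25T00:03:06+01:00",
    "4549198 2011-08-25T04:32:25+01:00",
    "2662975 2011-08-25T00:35:11+01:00",
    "32460 2011-08-25T01:11:43+01:00",
]
cascade = parse_cascade(log)
order = [user for user, _ in cascade]
```

Sorting matters because the rows in the log are not in time order, while the likelihood later in the deck depends on who shared before whom.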




Heuristic approaches to estimate p_{i,j}


                •      purely based on local network structure
                           $p_{i,j} = \frac{1}{d_{\mathrm{in}}(j)}$

                •      trivalency “model”: my personal favourite :)
                           $p_{i,j} \in \{0.1, 0.01, 0.001\}$, chosen randomly

                •      data-driven heuristics
                           $p_{i,j} = \frac{\text{number of items shared by } j \text{ after } i \text{ shared it}}{\text{number of items shared by } i}$
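
The three heuristics can be sketched in a few lines of Python. This is an illustration under assumptions, not PeerIndex's implementation: edges are taken to be `(i, j)` pairs meaning "i can influence j", the trivalency values are assumed to be the usual three-value set {0.1, 0.01, 0.001}, and `share_times` is a hypothetical mapping from user to the time each item was shared.

```python
import random
from collections import defaultdict


def weighted_cascade(edges):
    """Structure-only heuristic: p[i, j] = 1 / d_in(j)."""
    in_degree = defaultdict(int)
    for _, j in edges:
        in_degree[j] += 1
    return {(i, j): 1.0 / in_degree[j] for i, j in edges}


def trivalency(edges, rng=None):
    """Assign each edge one of three fixed values at random (assumed value set)."""
    rng = rng or random.Random()
    return {edge: rng.choice([0.1, 0.01, 0.001]) for edge in edges}


def data_driven(edges, share_times):
    """p[i, j] = (# items j shared after i shared them) / (# items i shared).

    share_times[user][item] is the time at which user shared item (hypothetical layout).
    """
    probs = {}
    for i, j in edges:
        items_i = share_times.get(i, {})
        items_j = share_times.get(j, {})
        if not items_i:
            probs[(i, j)] = 0.0  # i never shared anything: no evidence of influence
            continue
        followed = sum(
            1 for item, t in items_i.items()
            if item in items_j and items_j[item] > t
        )
        probs[(i, j)] = followed / len(items_i)
    return probs


edges = [(1, 3), (2, 3), (1, 2)]
share_times = {1: {"a": 1, "b": 5}, 2: {"a": 3, "b": 2}}
structural = weighted_cascade(edges)
empirical = data_driven(edges, share_times)
```

Here user 2 shared item "a" after user 1 but item "b" before, so the data-driven estimate for the edge (1, 2) is 1/2.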



              How do you solve this with machine learning?

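The likelihood

          $P(D \mid \theta)$, where $D$ is the cascade log and $\theta = \{p_{i,j}\}$ are the propagation probabilities

          http://www.pcworld.com/article/239719
               1079306 2011-08-25T00:03:06+01:00
               4549198 2011-08-25T04:32:25+01:00
               2662975 2011-08-25T00:35:11+01:00
               2333224 2011-08-25T01:43:18+01:00
               3141371 2011-08-25T01:52:06+01:00
               3482720 2011-08-25T07:18:24+01:00
               1403682 2011-08-25T03:52:58+01:00
               4679657 2011-08-25T01:07:48+01:00
                 32460 2011-08-25T01:11:43+01:00

            What's the probability of the cascade $u_1, u_2, u_3, \ldots, u_n$?

             for subsequent users in cascade

             $$p_{0,u_1}\,\bigl(1 - (1 - p_{0,u_2})(1 - p_{u_1,u_2})\bigr)\cdots = \prod_{i=1}^{n}\Bigl(1 - \prod_{j=0}^{i-1}\bigl(1 - p_{u_j,u_i}\bigr)\Bigr)$$

             where $u_0 = 0$ is the extra index that appears in $p_{0,u_1}$

             for users that are not in cascade

             $$\prod_{u \notin \{u_1,\ldots,u_n\}}\ \prod_{v \in \text{users}} \bigl(1 - p_{u,v}\bigr)$$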
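
A rough sketch of evaluating this likelihood in log space for a single cascade. Several things here are assumptions rather than the author's method: index 0 is treated as an extra source that can reach every user, missing edges get probability 0, and the second product is read as "no source or cascade member passed the item on to a non-participant u", i.e. factors of the form (1 − p_{v,u}). The function name `cascade_log_likelihood` is hypothetical.

```python
import math


def cascade_log_likelihood(cascade, all_users, p, source=0):
    """Log-probability of one ordered cascade under independent edge activations.

    cascade   : users in the order they shared, u_1 ... u_n
    all_users : every user in the network
    p         : dict mapping (sender, receiver) -> propagation probability
    source    : assumed extra node with index 0 that can reach anyone
    """
    def prob(i, j):
        return p.get((i, j), 0.0)  # assumption: absent edges never propagate

    log_lik = 0.0
    earlier = [source]
    for user in cascade:
        # Probability that none of the earlier sharers (or the source) reached this user.
        not_reached = 1.0
        for sender in earlier:
            not_reached *= 1.0 - prob(sender, user)
        reached = 1.0 - not_reached
        if reached <= 0.0:
            return float("-inf")  # the observed share is impossible under p
        log_lik += math.log(reached)
        earlier.append(user)

    in_cascade = set(cascade)
    for user in all_users:
        if user in in_cascade or user == source:
            continue
        # Non-participants: every potential sender must have failed to reach them.
        for sender in earlier:
            stay = 1.0 - prob(sender, user)
            if stay <= 0.0:
                return float("-inf")
            log_lik += math.log(stay)
    return log_lik


p = {(0, "a"): 0.5, (0, "b"): 0.1, ("a", "b"): 0.2, (0, "c"): 0.1, ("a", "c"): 0.3}
log_lik = cascade_log_likelihood(["a", "b"], {"a", "b", "c"}, p)
```

Working in log space avoids underflow when the product runs over many users, which is the regime a real network would be in.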


Maximum likelihood at scale



                   •   data too sparse to learn one parameter per edge

                   •   large-scale gradient-based optimisation is costly

                   •   Solution: combine an ensemble of heuristics with ML

                   •   use heuristics to compute probabilities at scale

                   •   use ML to tune parameters on small-scale data
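
One way to picture the combination, as a toy sketch only: blend two heuristic estimates with a single weight and pick that weight by maximum likelihood on a small labelled sample. The slides do not say how the heuristics are combined, so the linear blend, the grid search, and the names `bernoulli_log_likelihood` and `tune_mixture_weight` are all assumptions made for illustration.

```python
import math


def bernoulli_log_likelihood(weight, examples):
    """Log-likelihood of observed outcomes under a blend of two heuristic estimates.

    examples: (h1, h2, shared) triples, where h1 and h2 are heuristic probabilities
    for an edge and shared says whether the item actually propagated along it.
    """
    total = 0.0
    for h1, h2, shared in examples:
        blended = weight * h1 + (1.0 - weight) * h2
        blended = min(max(blended, 1e-9), 1.0 - 1e-9)  # keep log() finite
        total += math.log(blended) if shared else math.log(1.0 - blended)
    return total


def tune_mixture_weight(examples, grid_size=101):
    """Grid-search the single blend weight in [0, 1] on a small sample."""
    candidates = [k / (grid_size - 1) for k in range(grid_size)]
    return max(candidates, key=lambda w: bernoulli_log_likelihood(w, examples))


# Toy sample in which the first heuristic agrees with the outcomes.
examples = [
    (0.9, 0.1, True),
    (0.8, 0.2, True),
    (0.1, 0.7, False),
    (0.2, 0.9, False),
]
best_weight = tune_mixture_weight(examples)
```

The point of the design is that only one parameter is learned, so a small sample suffices, while the heuristics themselves stay cheap enough to evaluate on every edge.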




Influence maximisation


                   • Select a set of users to maximise outreach
                   • Influence of people combines non-linearly
                   • In many models it combines submodularly:
                       $A \subseteq B \implies f(A \cup \{x\}) - f(A) \ge f(B \cup \{x\}) - f(B)$

                       • these functions are fun to optimise
                       • they pop up many times in machine learning
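
A small sketch of why such functions are pleasant to optimise: for monotone submodular objectives, greedily adding the user with the largest marginal gain is the standard approach. Here a deterministic coverage function stands in for expected outreach, which is an assumption; `reach` (user to the set of people they reach) and `greedy_select` are hypothetical names.

```python
def coverage(reach, chosen):
    """f(S): number of distinct people reached by the users in S."""
    covered = set()
    for user in chosen:
        covered |= reach[user]
    return len(covered)


def greedy_select(reach, budget):
    """Pick up to `budget` users, each time taking the largest marginal gain."""
    chosen, covered = [], set()
    for _ in range(budget):
        best_user, best_gain = None, 0
        for user in sorted(reach):  # sorted for deterministic tie-breaking
            if user in chosen:
                continue
            gain = len(reach[user] - covered)
            if gain > best_gain:
                best_user, best_gain = user, gain
        if best_user is None:
            break  # nobody adds anything new
        chosen.append(best_user)
        covered |= reach[best_user]
    return chosen, covered


reach = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 2},
}
chosen, covered = greedy_select(reach, budget=2)

# Diminishing returns: adding "b" helps the smaller set at least as much.
gain_small = coverage(reach, ["d", "b"]) - coverage(reach, ["d"])
gain_large = coverage(reach, ["d", "a", "b"]) - coverage(reach, ["d", "a"])
```

In this toy instance "b" adds two new people to {d} but only one to {d, a}, which is the inequality above with A = {d}, B = {d, a} and x = b.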

Wrap up

                   •   two lines of ‘data’ products: PeerIndex, PeerPerks

                   •   lots of ‘standard’ machine learning tasks

                   •   some uniquely exciting problems
                       •   inferring propagation probabilities
                       •   computing the expected number of users one reaches
                       •   combining all aspects into a single number and visualising it
                       •   influence maximisation




Thanks


            We’re hiring ML scientists, interns and engineers...
                                @fhuszar
                           fh@peerindex.com




Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 

Kürzlich hochgeladen (20)

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

Machine Learning at PeerIndex

  • 1. Machine Learning at PeerIndex @fhuszar Ferenc Huszár Wednesday, 16 May 12
  • 2. PeerIndex.com: understand your influence
  • 3. PeerPerks.com: free stuff for influencers
  • 4. PeerPerks: free stuff for influencers
  • 5-14. Machine Learning @ PeerIndex
      • The usual stuff
          • topic modelling/classification of tweets/statuses/URLs
          • identity resolution across Twitter, Facebook, LinkedIn
          • spambot/fraud detection: identify people gaming the system
          • sentiment classification: happy/sad/neutral
      • The really exciting stuff
          • inferring networks of influence (more about this later)
          • visualising different aspects of influence in an engaging way
          • influence maximisation: submodular optimisation
  • 15-18. Inferring networks of influence
      • Social network
      • Propagation probabilities $p_{i,j}$
      • Information cascade logs:

        http://www.pcworld.com/article/239719    http://techcrunch.com/2011/11/21/...
        1079306  2011-08-25T00:03:06+01:00       259725   2011-10-24T03:32:19+01:00
        4549198  2011-08-25T04:32:25+01:00       76539    2011-10-24T03:32:23+01:00
        2662975  2011-08-25T00:35:11+01:00       1922351  2011-10-24T04:28:47+01:00
        2333224  2011-08-25T01:43:18+01:00       9183     2011-10-24T03:30:57+01:00
        3141371  2011-08-25T01:52:06+01:00       3335398  2011-10-24T03:34:01+01:00
        3482720  2011-08-25T07:18:24+01:00       1616885  2011-10-24T03:48:16+01:00
        1403682  2011-08-25T03:52:58+01:00       82198    2011-10-24T03:48:29+01:00
        4679657  2011-08-25T01:07:48+01:00       906390   2011-10-24T23:13:51+01:00
        32460    2011-08-25T01:11:43+01:00       1051322  2011-10-24T03:40:02+01:00
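
A minimal sketch of how a cascade log of this shape could be turned into an ordered cascade. It assumes each row is a (user id, ISO 8601 timestamp) pair recording when that user shared the URL; the function name `order_cascade` and the in-memory layout are hypothetical, not PeerIndex's actual pipeline.

```python
from datetime import datetime


def order_cascade(rows):
    """Sort (user_id, ISO 8601 timestamp) pairs by share time.

    Returns the user ids in the order in which they shared the URL.
    """
    parsed = [(datetime.fromisoformat(ts), user) for user, ts in rows]
    parsed.sort()
    return [user for _, user in parsed]


# First four rows of the pcworld.com log shown on the slide.
rows = [
    (1079306, "2011-08-25T00:03:06+01:00"),
    (4549198, "2011-08-25T04:32:25+01:00"),
    (2662975, "2011-08-25T00:35:11+01:00"),
    (2333224, "2011-08-25T01:43:18+01:00"),
]
cascade = order_cascade(rows)
```

The rows in the log are not in time order, so sorting by timestamp is needed before the cascade can be read as a sequence $u_1, u_2, \dots, u_n$.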
  • 19-23. Heuristic approaches to estimate $p_{i,j}$
      • purely based on local network structure: $p_{i,j} \propto \frac{1}{d_{in}(j)}$
      • trivalency “model” (my personal favourite :)): $p_{i,j} \in \{0.1, 0.01, 0.001\}$, chosen randomly
      • data-driven heuristics: $p_{i,j} \propto \frac{\text{number of items shared by } j \text{ after } i \text{ shared it}}{\text{number of items shared by } i}$
      • How do you solve this with machine learning?
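
The three heuristics can be sketched as follows. This is an illustration under stated assumptions, not the talk's implementation: an edge `(i, j)` means that `i` can influence `j`, the proportionality constant is taken to be 1, cascades are lists of users in share order, and the trivalency values use the customary set {0.1, 0.01, 0.001} from the influence-maximisation literature. All function names are hypothetical.

```python
import random
from collections import defaultdict


def weighted_cascade(edges):
    """p[i, j] = 1 / d_in(j): uses only the local network structure."""
    in_degree = defaultdict(int)
    for _, j in edges:
        in_degree[j] += 1
    return {(i, j): 1.0 / in_degree[j] for i, j in edges}


def trivalency(edges, rng):
    """Each edge gets one of three fixed values, picked at random."""
    return {edge: rng.choice([0.1, 0.01, 0.001]) for edge in edges}


def data_driven(edges, cascades):
    """p[i, j] = (#items shared by j after i shared them) / (#items shared by i)."""
    edge_set = set(edges)
    shared = defaultdict(int)      # number of items shared by i
    followed = defaultdict(int)    # number of items j shared after i did
    for cascade in cascades:
        for pos, i in enumerate(cascade):
            shared[i] += 1
            for j in cascade[pos + 1:]:
                if (i, j) in edge_set:
                    followed[i, j] += 1
    return {
        (i, j): (followed[i, j] / shared[i]) if shared[i] else 0.0
        for i, j in edges
    }


# Toy graph: a influences b and c, b influences c.
edges = [("a", "c"), ("b", "c"), ("a", "b")]
cascades = [["a", "b", "c"], ["a", "c"], ["b"]]

p_structure = weighted_cascade(edges)
p_random = trivalency(edges, random.Random(0))
p_data = data_driven(edges, cascades)
```

On the toy data, `c` has two incoming edges, so both get 0.5 under the structural heuristic, while the data-driven estimate for `("a", "c")` is 1.0 because `c` shared both items that `a` shared.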
  • 25-35. The likelihood $P(D \mid \theta)$
      • Data $D$: an information cascade log, here for http://www.pcworld.com/article/239719:

        1079306  2011-08-25T00:03:06+01:00
        4549198  2011-08-25T04:32:25+01:00
        2662975  2011-08-25T00:35:11+01:00
        2333224  2011-08-25T01:43:18+01:00
        3141371  2011-08-25T01:52:06+01:00
        3482720  2011-08-25T07:18:24+01:00
        1403682  2011-08-25T03:52:58+01:00
        4679657  2011-08-25T01:07:48+01:00
        32460    2011-08-25T01:11:43+01:00
      • Parameters $\theta$: the propagation probabilities $p_{i,j}$
      • What’s the probability of the cascade $u_1, u_2, u_3, \dots, u_n$?
      • For subsequent users in the cascade (with $u_0 = 0$, matching the $p_{0,u_1}$ term):
        $$p_{0,u_1} \bigl(1 - (1 - p_{0,u_2})(1 - p_{u_1,u_2})\bigr) \cdots = \prod_{i=1}^{n} \Bigl(1 - \prod_{j=0}^{i-1} (1 - p_{u_j,u_i})\Bigr)$$
      • For users that are not in the cascade:
        $$\prod_{u \in \{u_1 \dots u_n\}} \; \prod_{v \in \text{users},\, v \notin \{u_1 \dots u_n\}} (1 - p_{u,v})$$
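
A sketch of the cascade log-likelihood described on these slides: each user in the cascade must have been reached by at least one earlier sharer, and each user outside the cascade must have been missed by every sharer. Several things here are assumptions: the id `SOURCE = 0` stands for the "0" node in the $p_{0,u_1}$ term, missing edges are treated as probability 0, and the names `cascade_log_likelihood` and `safe_log` are made up for illustration.

```python
import math

SOURCE = 0  # assumed id of the node "0" appearing in p_{0,u_1}


def safe_log(x):
    """log(x), returning -inf instead of raising when x <= 0."""
    return math.log(x) if x > 0.0 else float("-inf")


def cascade_log_likelihood(cascade, users, p):
    """Log-probability of one cascade given propagation probabilities p.

    cascade: user ids in share order (u_1, ..., u_n)
    users:   all user ids in the network
    p:       dict mapping (i, j) to p_{i,j}; absent edges count as 0
    """
    in_cascade = set(cascade)
    log_lik = 0.0

    # Users in the cascade: at least one earlier node reached them.
    for idx, u_i in enumerate(cascade):
        earlier = [SOURCE] + list(cascade[:idx])
        nobody_reached = math.prod(1.0 - p.get((u_j, u_i), 0.0) for u_j in earlier)
        log_lik += safe_log(1.0 - nobody_reached)

    # Users outside the cascade: every sharer failed to reach them.
    for u in cascade:
        for v in users:
            if v not in in_cascade:
                log_lik += safe_log(1.0 - p.get((u, v), 0.0))

    return log_lik


# Toy example: users 1 and 2 share the item, user 3 does not.
p = {(0, 1): 0.5, (0, 2): 0.1, (1, 2): 0.5, (1, 3): 0.2, (2, 3): 0.1}
log_lik = cascade_log_likelihood([1, 2], [1, 2, 3], p)
```

For the toy example the three factors are 0.5 for user 1, 1 - 0.9 · 0.5 = 0.55 for user 2, and 0.8 · 0.9 = 0.72 for the missed user 3, giving a likelihood of 0.198. Working in log space keeps long cascades from underflowing.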
  • 36-41. Maximum likelihood at scale
      • data too sparse to learn one parameter per edge
      • large-scale gradient-based optimisation is costly
      • Solution: combine an ensemble of heuristics with ML
          • use heuristics to compute probabilities at scale
          • use ML to tune parameters on small-scale data
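
One way the "heuristics plus tuning" idea could look in practice is sketched below. The slides do not say how the ensemble is combined, so everything specific here is an assumption: two heuristic estimates are blended with a single weight `w`, and `w` is chosen by grid search to maximise the cascade likelihood on a small sample. The helper names and the source id 0 are hypothetical.

```python
import math


def log_likelihood(cascades, users, p, source=0):
    """Sum of cascade log-likelihoods; absent edges count as probability 0."""
    total = 0.0
    for cascade in cascades:
        members = set(cascade)
        for idx, u_i in enumerate(cascade):
            earlier = [source] + list(cascade[:idx])
            missed = math.prod(1.0 - p.get((u_j, u_i), 0.0) for u_j in earlier)
            if missed >= 1.0:
                return float("-inf")
            total += math.log(1.0 - missed)
        for u in cascade:
            for v in users:
                if v not in members:
                    q = p.get((u, v), 0.0)
                    if q >= 1.0:
                        return float("-inf")
                    total += math.log(1.0 - q)
    return total


def blend(h1, h2, w):
    """Convex combination w * h1 + (1 - w) * h2 of two heuristic estimates."""
    keys = set(h1) | set(h2)
    return {k: w * h1.get(k, 0.0) + (1.0 - w) * h2.get(k, 0.0) for k in keys}


def tune_weight(h1, h2, cascades, users, grid):
    """Pick the blend weight with the highest likelihood on a small sample."""
    return max(grid, key=lambda w: log_likelihood(cascades, users, blend(h1, h2, w)))


# Toy sample: user 2 always follows user 1, user 3 never shares.
users = [1, 2, 3]
cascades = [[1, 2], [1, 2], [1, 2]]
h1 = {(0, 1): 0.5, (1, 2): 0.9, (1, 3): 0.1, (2, 3): 0.1}  # fits the sample
h2 = {(0, 1): 0.5, (1, 2): 0.1, (1, 3): 0.9, (2, 3): 0.9}  # contradicts it
best_w = tune_weight(h1, h2, cascades, users, [0.0, 0.25, 0.5, 0.75, 1.0])
```

Only one number is learned, so a small sample of cascades is enough, while the heuristics themselves remain cheap to evaluate on every edge of the full graph.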
  • 43-48. Influence maximisation
      • Select a set of users to maximise outreach
      • Influence of people combines non-linearly
      • In many models it combines sub-modularly:
        $$A \subseteq B \implies f(A \cup \{x\}) - f(A) \ge f(B \cup \{x\}) - f(B)$$
      • these functions are fun to optimise
      • submodularity pops up many times in machine learning
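
The diminishing-returns property above is what makes a simple greedy strategy attractive: for monotone submodular functions, repeatedly adding the element with the largest marginal gain is known to achieve at least a (1 - 1/e) fraction of the optimum. The slides do not state which optimiser is used, so the sketch below is only an illustration. It replaces the influence function with a toy coverage function (the number of distinct followers reached), and the names `greedy_select`, `coverage` and `reach` are hypothetical.

```python
def greedy_select(candidates, f, k):
    """Pick up to k candidates, each time adding the largest marginal gain."""
    selected = []
    for _ in range(k):
        remaining = [c for c in candidates if c not in selected]
        if not remaining:
            break
        current = f(selected)
        best = max(remaining, key=lambda c: f(selected + [c]) - current)
        selected.append(best)
    return selected


# Toy stand-in for influence: which followers each user reaches.
reach = {
    "ann": {1, 2, 3, 7},
    "bob": {3, 4, 6},
    "cat": {1, 2},
    "dan": {5},
}


def coverage(users):
    """Number of distinct followers reached by a set of users (submodular)."""
    covered = set()
    for user in users:
        covered |= reach[user]
    return len(covered)


chosen = greedy_select(list(reach), coverage, k=2)

# Diminishing returns: adding "cat" helps less once "ann" is already selected.
gain_alone = coverage(["cat"]) - coverage([])
gain_after_ann = coverage(["ann", "cat"]) - coverage(["ann"])
```

Here "cat" is never picked even though she reaches two followers on her own, because "ann" already covers both of them; this is the non-linear combination of influence mentioned on the slide.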
  • 50-56. Wrap up
      • two lines of ‘data’ products: PeerIndex, PeerPerks
      • lots of ‘standard’ machine learning tasks
      • some uniquely exciting problems
          • inferring propagation probabilities
          • computing the expected number of users one reaches out to
          • putting all aspects together into a single number, and visualising it
          • influence maximisation
  • 57. Thanks. We’re hiring ML scientists, interns and engineers... @fhuszar fh@peerindex.com
