The Anatomy of a Large-Scale Social Search Engine, WWW 2010
•  Damon Horowitz, Sepandar D. Kamvar
•  The Anatomy of a Large-Scale Social Search Engine
•  WWW 2010

•  Aardvark QA
•  web
•  QA
•  Google
•  Aardvark
•  Google
“Do you have any good babysitter recommendations in Palo Alto for my 6-year-old twins? I’m looking for somebody that won’t let them watch TV.”
•  Crawler and Indexer
•  Query Analyzer
•  Ranking Function
•  UI
\[
\begin{aligned}
s(u_i, u_j, q) &= p(u_i \mid u_j) \cdot p(u_i \mid q) \\
               &= p(u_i \mid u_j) \sum_{t \in T} p(u_i \mid t)\, p(t \mid q)
\end{aligned}
\]


•  p(u_i | u_j): quality score
•  p(u_i | q): relevance score
•  u: user, q: question, t: topic
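The scoring formula above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `routing_score`, the dictionary layout, and all probabilities are hypothetical and do not come from the slides.

```python
from typing import Dict


def routing_score(p_connect: float,
                  p_user_given_topic: Dict[str, float],
                  p_topic_given_query: Dict[str, float]) -> float:
    """s(u_i, u_j, q) = p(u_i|u_j) * sum over t of p(u_i|t) * p(t|q).

    p_connect           : p(u_i|u_j), the quality score
    p_user_given_topic  : topic -> p(u_i|t) for one candidate u_i
    p_topic_given_query : topic -> p(t|q) for the question
    """
    relevance = sum(p_user_given_topic.get(topic, 0.0) * p_tq
                    for topic, p_tq in p_topic_given_query.items())
    return p_connect * relevance


# Hypothetical numbers, chosen only to exercise the formula.
p_t_q = {"babysitting": 0.7, "palo alto": 0.3}
candidates = {
    "alice": (0.5, {"babysitting": 0.2, "palo alto": 0.1}),
    "bob": (0.9, {"palo alto": 0.1}),
}
scores = {name: routing_score(p_conn, p_ut, p_t_q)
          for name, (p_conn, p_ut) in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy example the less connected candidate still ranks first, because the relevance sum outweighs the difference in the quality score.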
  
P(ui|t)
\[ p(u_i \mid t) = \frac{p(t \mid u_i)\, p(u_i)}{p(t)} \]
\[ s(t \mid u_i) = p(t \mid u_i) + \gamma \sum_{u \in U} p(t \mid u) \]
\[ \sum_{t \in T} p(t \mid u_i) = 1 \]
•  facebook
•  blog
•  twitter
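The three formulas on this slide (Bayes inversion, smoothing with γ, and normalization over topics) can be sketched as follows. This is a minimal sketch, not the system's implementation: the slide does not say which users belong to U, so the code assumes U is every other user in the dictionary; the value of γ, the uniform prior p(u), and all names and numbers are likewise hypothetical.

```python
from typing import Dict

Profile = Dict[str, float]  # topic -> p(t|u)


def smooth(profiles: Dict[str, Profile], user: str, gamma: float) -> Profile:
    """s(t|u_i) = p(t|u_i) + gamma * sum_{u in U} p(t|u), renormalized to sum to 1.

    Assumption: U is every user other than u_i.
    """
    topics = {t for prof in profiles.values() for t in prof}
    raw = {}
    for t in topics:
        others = sum(prof.get(t, 0.0)
                     for name, prof in profiles.items() if name != user)
        raw[t] = profiles[user].get(t, 0.0) + gamma * others
    total = sum(raw.values())
    return {t: value / total for t, value in raw.items()}


def invert(profiles: Dict[str, Profile],
           prior: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Bayes rule: p(u_i|t) = p(t|u_i) p(u_i) / p(t), with p(t) = sum_u p(t|u) p(u)."""
    p_t: Dict[str, float] = {}
    for user, prof in profiles.items():
        for t, value in prof.items():
            p_t[t] = p_t.get(t, 0.0) + value * prior[user]
    return {t: {user: profiles[user].get(t, 0.0) * prior[user] / mass
                for user in profiles}
            for t, mass in p_t.items() if mass > 0}


# Hypothetical profiles; each p(t|u) already sums to 1 over topics.
profiles = {
    "alice": {"cooking": 1.0},
    "bob": {"cooking": 0.5, "tennis": 0.5},
}
smoothed_alice = smooth(profiles, "alice", gamma=0.5)
p_user_given_topic = invert(profiles, prior={"alice": 0.5, "bob": 0.5})
```

After smoothing, "alice" gets a small non-zero weight on "tennis" even though her own profile never mentions it, which is the point of borrowing mass from other users.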
P(ui|uj)                    	
P(t|q):
•  Non Question Classifier
•  Inappropriate Question Classifier
•  Trivial Question Classifier
•  Location Sensitive Classifier
P(t|q):
     –  Keyword Match Topic Mapper
     –  Taxonomy Topic Mapper
          •  SVM 3000
     –  Salient Term Topic Mapper
          •  tf-idf
     –  User Tag Topic Mapper
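As an illustration of the simplest mapper in the list, here is a sketch of a keyword-match topic mapper. The slide text describing it did not survive, so the behavior is an assumption: question terms that string-match a known profile topic are kept, and the weight is spread uniformly to form p(t|q). The function name, the topic list, and the uniform weighting are all hypothetical.

```python
import re
from typing import Dict, Iterable


def keyword_match_topics(question: str,
                         known_topics: Iterable[str]) -> Dict[str, float]:
    """Return a hypothetical p(t|q): uniform weight over matched topics."""
    # Lowercase and keep only word-like tokens, then pad with spaces so
    # that multi-word topics match on whole-word boundaries only.
    tokens = re.findall(r"[a-z0-9']+", question.lower())
    padded = " " + " ".join(tokens) + " "
    hits = [t for t in known_topics if f" {t.lower()} " in padded]
    if not hits:
        return {}
    weight = 1.0 / len(hits)
    return {t: weight for t in hits}


question = ("Do you have any good babysitter recommendations in Palo Alto "
            "for my 6-year-old twins?")
p_t_q = keyword_match_topics(question, ["babysitter", "palo alto", "tennis"])
```

A real system would presumably weight matches rather than treat them equally; the uniform split is used here only to keep the example small.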
     –  Topic Expertise: p(ui|q)
     –  Connectedness: p(ui|uj)
     –  Availability
  
     –  Google PC
•  Mobile Google Aardvark
     –  Google Aardvark
              Average query length    Unique queries
Aardvark      18.6 words              98.1%
Web search    2.2 to 2.9 words        57 to 63%
     –  fact
•  57.2% 10
     –  facebook 15.7% 15
•  6 37
•  87.7%
•  2.08
•  97.7% 3
•  174,605
•  1,199,323
•  Google
     –  200 Aardvark
     –  Aardvark google 5
     –  10

Aardvark    5    71.5%    3.93 ± 1.23
Google      2    70.5%    3.07 ± 1.46
•  “ ” Aardvark
•  Aardvark
•  Aardvark
•  “ ”
Lab seminar20100604
