"BEYOND PAGES: SUPPORTING EFFICIENT, SCALABLE ENTITY SEARCH WITH DUAL-INVERSION INDEX"
Tao Cheng, Kevin Chen-Chuan Chang
University of Illinois at Urbana-Champaign
Presented by: Mahesh Gupta
CSE 6339 Web Search Mining & Integration, Paper Presentation

WHAT IS THIS PAPER ALL ABOUT?

ENTITY SEARCH
Example query: Cowboy Stadium #Location

IS CONTEXT MATCHING ALONE ENOUGH?

HENCE…
Example query: Cowboy Stadium #Location

COMPUTATIONAL CHALLENGES

WHAT IS AN INDEX HERE?
Keyword    Document, position
Cowboy     (D10,12); (D12,34); (D46,257); ……
Stadium    (D10,13); (D34,134); (D146,357); ……
……         ……

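The keyword-to-(document, position) table above can be sketched in a few lines. This is a minimal illustration only: the two-document corpus, the whitespace tokenizer, and the function name `build_positional_index` are assumptions made for the example, not part of the paper or the slides.

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map each lowercased keyword to a list of (doc_id, position) postings."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        # Positions are token offsets from a plain whitespace split (an assumption).
        for position, token in enumerate(docs[doc_id].split()):
            index[token.lower()].append((doc_id, position))
    return dict(index)

# Hypothetical corpus, for illustration only.
docs = {
    10: "the Cowboy Stadium opened in Arlington",
    12: "a cowboy rode past the stadium",
}
index = build_positional_index(docs)
print(index["cowboy"])   # [(10, 1), (12, 1)]
print(index["stadium"])  # [(10, 2), (12, 5)]
```

Because documents are visited in sorted order, each posting list comes out ordered by document and then by position, which is the ordering the slide's lists suggest.
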
INTRODUCING DUAL-INVERSION INDEX

DOCUMENT INVERTED INDEX
Keyword posting lists (document, position):
D2,12   D6,17   D9,34   D9,357   D97,45
D6,18   D9,35   D56,55  D64,5    D97,46

DOCUMENT INVERTED INDEX (CONTINUED)
Entity posting list (document, position, entity value):
D6,23,'Arlington TX'
D9,45,'United States'
D97,50,'North Texas'
……

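One way to read the two kinds of posting lists together is as a join: an entity occurrence answers the query when every keyword appears near it in the same document. The sketch below uses the postings shown on the slides. Which keyword owns which list, the fixed `window` of 10 tokens, and the function name `entity_matches` are assumptions for illustration; the paper's actual matching and scoring are not recoverable from the slide text.

```python
from collections import defaultdict

# Postings taken from the slides; the keyword labels are an assumption.
keyword_postings = {
    "cowboy":  [("D2", 12), ("D6", 17), ("D9", 34), ("D9", 357), ("D97", 45)],
    "stadium": [("D6", 18), ("D9", 35), ("D56", 55), ("D64", 5), ("D97", 46)],
}
# Entity postings for the #Location type: (document, position, value).
location_postings = [
    ("D6", 23, "Arlington TX"),
    ("D9", 45, "United States"),
    ("D97", 50, "North Texas"),
]

def entity_matches(keywords, entity_postings, window=10):
    """Return (doc, value) for entities with every keyword within `window` tokens."""
    per_keyword = []
    for kw in keywords:
        positions = defaultdict(list)
        for doc, pos in keyword_postings[kw]:
            positions[doc].append(pos)
        per_keyword.append(positions)
    matches = []
    for doc, entity_pos, value in entity_postings:
        # Every keyword needs at least one occurrence close to the entity.
        if all(
            any(abs(p - entity_pos) <= window for p in positions.get(doc, []))
            for positions in per_keyword
        ):
            matches.append((doc, value))
    return matches

print(entity_matches(["cowboy", "stadium"], location_postings))
# [('D6', 'Arlington TX'), ('D97', 'North Texas')]
```

With this window the D9 occurrence is dropped because the nearest "cowboy" posting is 11 tokens away; widening the window to 12 would admit it.
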
DI-INDEX -> IS IT EFFICIENT NOW?

DI-INDEX -> DATA PARTITIONING
D6,23,P8
D9,45,P86
D97,50,P8
……

ENTITY-INVERTED INDEX

EI-INDEX (CONTINUED)
((D6,23,P8),17)   ((D9,45,P86),34)   ((D97,50,P8),45)   ……
((D23,23,P8),18)  ((D9,45,P86),35)   ((D97,50,P8),46)   ……

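The entries above pair an entity occurrence with a keyword position, so a multi-keyword query can be answered by intersecting the lists on the entity occurrence. The following is a rough sketch under several assumptions: the first list is taken to belong to "cowboy" and the second to "stadium", the third field (P8, P86) is treated as an opaque tag because its meaning is not recoverable from the slide text, and the helper names are hypothetical.

```python
from collections import Counter

# Entries as shown on the slide: ((doc, entity position, tag), keyword position).
# The keyword labels are an assumption.
e_inverted = {
    "cowboy":  [(("D6", 23, "P8"), 17), (("D9", 45, "P86"), 34), (("D97", 50, "P8"), 45)],
    "stadium": [(("D23", 23, "P8"), 18), (("D9", 45, "P86"), 35), (("D97", 50, "P8"), 46)],
}

def co_occurring(index, keywords):
    """Entity occurrences that appear in the list of every keyword."""
    sets = [{occurrence for occurrence, _ in index[kw]} for kw in keywords]
    return set.intersection(*sets)

def count_by_tag(occurrences):
    """Count matching occurrences per tag, a stand-in for per-entity aggregation."""
    return Counter(tag for _, _, tag in occurrences)

shared = co_occurring(e_inverted, ["cowboy", "stadium"])
print(sorted(shared))  # [('D9', 45, 'P86'), ('D97', 50, 'P8')]
print(count_by_tag(shared))
```

The first entries of the two lists name different documents (D6 and D23) on the slide, so they do not join here; the data is used exactly as shown.
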
EI-INDEX PARTITIONING

COMPARISON
              D-Inverted                E-Inverted
Join          Fast (why?)               Faster (why?)
Aggregation   Central (why?)            Distributed (why?)
Space         Minimal Overhead (why?)   Large (why?)

CAN BOTH INDEXES CO-EXIST? (DUAL-INVERSION INDEX)

SUMMARY