7.1 Search and Lucene.Net
Ash Prasad

Don’t forget to include #DNNCon in your tweets!

@DNNCon
Agenda
• History and New Objectives
• Architecture
• Lucene / Lucene.Net
• Crawlers, Entities, Controllers
• Ranking, Synonyms, Ignore Words, Stemming
• Security Trimming
• Module Integration, New Crawler

History of Search
• Platform Edition
  • SQL Server
  • ISearchable
  • [Diagram: Scheduler, Modules (ISearchable), SQL]
• Commercial Edition
  • Lucene 2.9.2
  • URL and Files
  • [Diagram: Scheduler, Lucene]

Objectives of New Search
• Handle diverse Content
  • (CMS, Social, Localized, 3rd Party Modules)
• Consistent User Experience
• Simple for Module Developers
• Uniform Architecture
  • Feature-based differentiation

Architecture

What’s Lucene
• Java-based indexing and search technology
• Managed by Apache
• NoSQL database
• Near real-time, Spellchecking, Highlighting, Ranking, Synonyms
• Many companies use Lucene directly or customize it
• Facebook’s Graph Search uses a similar ‘Inverted Index’
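An inverted index maps each term to the documents that contain it, so a query becomes a lookup rather than a scan. The following is a minimal sketch in plain Python, not Lucene's actual data structure; the names `build_inverted_index` and `search` and the sample documents are assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term of the query (AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample content, for illustration only.
docs = {
    1: "DNN search uses Lucene",
    2: "Lucene is a search library",
    3: "Modules provide content",
}
index = build_inverted_index(docs)
print(sorted(search(index, "lucene search")))
```

The lookup cost depends on the number of query terms and matching documents, not on the total size of the content, which is the main reason search engines favor this structure.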

What’s Lucene.Net
• Line-by-line port from Java to C#
• Maintains high-performance requirements
• A bit behind Java releases
• Who Uses Lucene.Net
  • Products - RavenDB, Orchard, Umbraco, SubText
  • Commercial Sites - BBC UK Top Gear, AutoDesk, Koders.Com

Lucene – A Document Store
• Flexible Schema
• Consists of Documents
  • Which are collections of Fields
• Documents can have different sets of Fields
  • Field("ID", "xxx-yyy-999"), Field("Title", "My best doc")
  • Field("Owner", "Ash"), Field("Locale", "en-US")
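To make the flexible schema concrete, here is a small sketch that models each document as a plain mapping of field names to values. It is a Python stand-in, not the Lucene.Net `Document`/`Field` API; `find_by_field` is a hypothetical helper.

```python
# Each document is just a collection of named fields; there is no shared schema.
documents = [
    {"ID": "xxx-yyy-999", "Title": "My best doc"},
    {"Owner": "Ash", "Locale": "en-US"},
]


def find_by_field(docs, name, value):
    """Return documents whose field `name` equals `value`.

    Documents that do not have the field at all are simply skipped.
    """
    return [doc for doc in docs if name in doc and doc[name] == value]


print(find_by_field(documents, "Owner", "Ash"))
print(find_by_field(documents, "Title", "Missing"))
```

Because nothing forces the two documents to share fields, a query on `Owner` never fails on the first document; it just does not match it.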

Lucene – A Document Store (Contd.)

• Denormalized (No Referential Integrity)
• Deletion – Done through a flag
  • Compact reclaims deleted space

• Update is Delete + Insert
• Boost = Ranking
• Unicode compliant
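The deletion and update behavior above can be sketched with a toy append-only store. This is an illustration of the idea only, under the assumption that each entry carries a "deleted" flag; `TinyStore` and its methods are hypothetical and do not mirror Lucene's segment files.

```python
class TinyStore:
    """Toy append-only store: deletes set a flag, compact() reclaims space."""

    def __init__(self):
        self._slots = []  # each slot: {"key", "doc", "deleted"}

    def add(self, key, doc):
        self._slots.append({"key": key, "doc": doc, "deleted": False})

    def delete(self, key):
        # Deletion only flags matching slots; nothing is removed yet.
        for slot in self._slots:
            if slot["key"] == key:
                slot["deleted"] = True

    def update(self, key, doc):
        # An update is a delete followed by an insert.
        self.delete(key)
        self.add(key, doc)

    def compact(self):
        # Physically drop flagged slots to reclaim their space.
        self._slots = [s for s in self._slots if not s["deleted"]]

    def live_docs(self):
        return [s["doc"] for s in self._slots if not s["deleted"]]

    def slot_count(self):
        return len(self._slots)


store = TinyStore()
store.add("a", {"Title": "First"})
store.update("a", {"Title": "First (edited)"})
print(store.slot_count(), store.live_docs())
store.compact()
print(store.slot_count())
```

After the update, the old version still occupies a slot until `compact()` runs, which is why a periodic compaction step matters for index size.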

Book consulted for Search
• Book on version 3.0
• ~ 500 pages
• Very useful

Search Phases

Content Acquisition
• Crawling
• ISearchable
• ModuleSearchBase
• URL
• Doc / PDF

Content Indexing
• Text Analysis
• Ranking
• Synonyms
• Ignore Words
• Stemming

Content Search
• Querying
• Sorting
• Security Trimming
• Boolean Search
• Highlighting

Crawlers
• Platform
  • Site Crawler
    • Module and Tab Metadata
    • Module Content (ModuleSearchBase/ISearchable)
• Commercial Edition
  • File Crawler
    • Uses IFilter to extract text from PDF/Office files
  • URL Crawler
    • Internal and External URLs

Search Entities
• SearchType
  • Distinguishes Crawlers
• SearchDocument
  • Properties of a content item
  • Stored in the Index
• SearchQuery
  • Parameters to execute a Query
• SearchResult
  • Derived from SearchDocument

Search Entities – Indexing vs. Querying

Controllers
• SearchController
  • For Querying
• InternalSearchController
  • For Adding / Updating / Deleting
• LuceneController
  • Interacts with Lucene

Ranking = Boosting
• Doc and/or Field can be boosted in Lucene
• DNN does Field boosts (Default - 10)
  • Title (50)
  • Tag (40)
  • Keyword (35)
  • Description (20)
  • Author (15)
• Configured manually through HostSettings
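A rough sketch of how field boosts influence ranking, using the boost values listed above. The scoring rule here (add the boost of every field that contains the term) is a deliberate simplification and an assumption; Lucene's real scoring formula also weighs term frequency, document frequency, and length. The `Body` field and sample documents are hypothetical.

```python
# Field boosts as listed on the slide; fields not listed use the default.
FIELD_BOOSTS = {"Title": 50, "Tag": 40, "Keyword": 35, "Description": 20, "Author": 15}
DEFAULT_BOOST = 10


def score(doc, term):
    """Sum the boost of every field whose text contains the term."""
    term = term.lower()
    total = 0
    for field, text in doc.items():
        if term in text.lower().split():
            total += FIELD_BOOSTS.get(field, DEFAULT_BOOST)
    return total


# Hypothetical documents: one matches in the title, one only in the body.
docs = [
    {"Title": "Getting started", "Body": "Install lucene first"},
    {"Title": "Lucene basics", "Body": "An introduction"},
]
ranked = sorted(docs, key=lambda d: score(d, "lucene"), reverse=True)
print([d["Title"] for d in ranked])
```

A match in `Title` outranks a match in an unboosted field, which is the effect the boost table is meant to produce.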

Synonyms and Ignore Words
• Synonyms are injected into Index
• Ignore Words are removed from Index
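Both steps happen while text is analyzed for indexing. A minimal sketch, assuming simple whitespace tokenization; the word lists and the `analyze` function are hypothetical and stand in for whatever the analyzer chain actually does.

```python
# Hypothetical word lists, for illustration only.
IGNORE_WORDS = {"the", "a", "of"}
SYNONYMS = {"car": ["automobile"], "photo": ["picture", "image"]}


def analyze(text):
    """Return the terms that would be written to the index."""
    terms = []
    for token in text.lower().split():
        if token in IGNORE_WORDS:
            continue  # ignore words never reach the index
        terms.append(token)
        terms.extend(SYNONYMS.get(token, []))  # synonyms are added alongside
    return terms


print(analyze("The photo of a car"))
```

Injecting synonyms at index time means a query for "automobile" can match content that only ever said "car", at the cost of a larger index.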

Stemming
• Converts words to their root
• PorterStemFilter is used
• country, countries = countri
• breathe, breathes, breathing, breathed = breath
• fishing, fished = fish ("fisher" is left unchanged by the original Porter rules)
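To show the flavor of suffix stripping, here is a toy stemmer. It is not the Porter algorithm (which has several more steps and conditions on the length of the remaining stem); `toy_stem` and its handful of rules are assumptions chosen only to cover a few of the words above.

```python
def toy_stem(word):
    """Strip a few common English suffixes. A toy, not the Porter stemmer."""
    word = word.lower()
    # Plural endings.
    if word.endswith("ies"):
        word = word[:-3] + "i"
    elif word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    # Verb endings.
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
    elif word.endswith("ed") and len(word) > 4:
        word = word[:-2]
    # A final y becomes i; otherwise a trailing e is dropped.
    if word.endswith("y") and len(word) > 2:
        word = word[:-1] + "i"
    elif word.endswith("e") and len(word) > 4:
        word = word[:-1]
    return word


for w in ["country", "countries", "breathe", "breathes", "breathing", "fished"]:
    print(w, "->", toy_stem(w))
```

Because the same function runs at index time and at query time, "countries" in a document and "country" in a query end up as the same indexed term.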

Security Trimming
• Done through Collectors (Callback)
• Each Doc found is sent to the Collector
• Collector accepts or rejects each Doc based on Permission
  • Site Crawler - Module / Tab Permission
  • File Crawler - Folder Permission
  • User Crawler - Profile Permission
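The collector pattern can be sketched as a callback that sees every hit and decides whether to keep it. This is a Python stand-in under stated assumptions, not the Lucene.Net `Collector` class or DNN's permission model; `PermissionCollector`, `run_query`, `role_check`, and the `view_roles` field are all hypothetical.

```python
class PermissionCollector:
    """Receives every matching document and keeps only the permitted ones."""

    def __init__(self, user_roles, has_view_permission):
        self.user_roles = user_roles
        self.has_view_permission = has_view_permission  # callback per document
        self.accepted = []

    def collect(self, doc):
        if self.has_view_permission(doc, self.user_roles):
            self.accepted.append(doc)


def run_query(index, term, collector):
    """Send each document that matches the term to the collector."""
    for doc in index:
        if term in doc["text"].lower():
            collector.collect(doc)
    return collector.accepted


def role_check(doc, user_roles):
    # Hypothetical rule: the user needs at least one of the roles on the doc.
    return bool(set(doc["view_roles"]) & set(user_roles))


index = [
    {"id": 1, "text": "Public search tips", "view_roles": ["All Users"]},
    {"id": 2, "text": "Admin search settings", "view_roles": ["Administrators"]},
]
hits = run_query(index, "search", PermissionCollector(["All Users"], role_check))
print([d["id"] for d in hits])
```

Trimming inside the collector means unauthorized documents never reach the result list, so paging and result counts are computed only over what the user may see.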

Module Integration
• ModuleSearchBase
  • New abstract class with just one method
  • Defined in BusinessControllerClass
  • GetModifiedSearchDocuments
    • Returns New, Changed and Deleted content
    • Delta based
    • Granular Permission, Localization, etc.
• ISearchable continues to work (no delta)
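The delta idea behind GetModifiedSearchDocuments can be sketched as "return only what changed since the last run, including deletions". The following is a Python stand-in, not the DNN method itself; the `since` cutoff, the item layout, and the `unique_key`/`is_deleted` keys are assumptions for illustration.

```python
from datetime import datetime


def get_modified_search_documents(items, since):
    """Return search documents for items changed after `since`.

    Deleted items are returned too, flagged so the indexer can remove them.
    """
    documents = []
    for item in items:
        if item["modified"] > since:
            documents.append({
                "unique_key": item["id"],
                "title": item["title"],
                "is_deleted": item.get("deleted", False),
            })
    return documents


# Hypothetical module content with last-modified timestamps.
items = [
    {"id": "a", "title": "Old post", "modified": datetime(2013, 1, 1)},
    {"id": "b", "title": "New post", "modified": datetime(2013, 6, 1)},
    {"id": "c", "title": "Removed post", "modified": datetime(2013, 6, 2), "deleted": True},
]
last_run = datetime(2013, 5, 1)
for doc in get_modified_search_documents(items, last_run):
    print(doc["unique_key"], doc["is_deleted"])
```

Compared with re-submitting all content on every run, a delta keeps indexing time proportional to the amount of change rather than the size of the module's data.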

New Crawler (How to)
• Define a new SearchType
  • Optionally use IsPrivate to hide from site search
• Implement BaseResultController (2 methods)
  • HasViewPermission
  • GetDocUrl
• Create Scheduled Task
  • Call AddSearchDocuments to inject content
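The crawler steps can be walked through with a small stand-in. Everything here is hypothetical Python written to mirror the roles named on the slide (search type, result controller with a permission check and a URL lookup, a scheduled task that submits documents); none of it is the DNN API, and the "product" content type is invented for the example.

```python
class InMemoryIndex:
    """Stand-in for the search index; not a DNN class."""

    def __init__(self):
        self.documents = []

    def add_search_documents(self, docs):
        self.documents.extend(docs)


class ProductResultController:
    """Plays the result-controller role for a hypothetical 'product' type."""

    def has_view_permission(self, doc, user):
        # Hypothetical rule: hidden products are visible to staff only.
        return not doc["hidden"] or user.get("is_staff", False)

    def get_doc_url(self, doc):
        return "/products/" + doc["unique_key"]


# Step 1: the new search type (is_private would hide it from site search).
SEARCH_TYPE = {"name": "product", "is_private": False}


def crawl_products(products, index):
    """Body of the scheduled task: turn source records into search documents."""
    docs = [
        {"search_type": SEARCH_TYPE["name"], "unique_key": p["sku"],
         "title": p["name"], "hidden": p.get("hidden", False)}
        for p in products
    ]
    index.add_search_documents(docs)
    return len(docs)


index = InMemoryIndex()
crawl_products([{"sku": "p1", "name": "Widget"},
                {"sku": "p2", "name": "Prototype", "hidden": True}], index)
controller = ProductResultController()
visitor = {"is_staff": False}
print([controller.get_doc_url(d) for d in index.documents
       if controller.has_view_permission(d, visitor)])
```

The split matters: the crawler only knows how to produce documents, while the result controller is consulted at query time for permissions and links, so the two can change independently.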
Demo

Recap
• New Search uses Lucene.Net
• Platform has Site Crawler
• Commercial has URL and File Crawlers
• Modules to implement ModuleSearchBase
• New Crawler implements BaseResultController

THANKS TO ALL OF OUR GENEROUS
SPONSORS!


Search features and architecture in DNN 7.1
