SlideShare ist ein Scribd-Unternehmen logo
1 von 25
Downloaden Sie, um offline zu lesen
1
An Overview of the Enterprise Search
Market, & Current Best Practices
Iain Fletcher
ifletcher@Searchtechnologies.com
April 20, 2015
2
Agenda
• A brief overview of the current enterprise search
market
• The convergence of search with analytics
disciplines
• Likely future architectures for search applications
3
Search engines continue to proliferate…
4
High-level Search Engine Classifications
1. Part of a portfolio, many are recently acquired technologies
– E.g. SharePoint, HP Autonomy, IBM/Vivisimo, Dassault/Exalead
2. Stand-alone specialists, often bought to address specific apps
– E.g. GSA, Coveo, Attivio, Sinequa, Recommind
3. Open source, with or without support or proprietary add-ons
– Raw: E.g. Lucene, Solr, Elasticsearch
– With support/add-ons: E.g. LucidWorks, Cloudera Search, Elastic
4. Cloud-based services, typically based on open source technology
– E.g. Amazon Cloudsearch, MS Azure search
5
The dominant market share is with SharePoint, open
source, and the Google Search Appliance
• SharePoint 2013 search is credible, and bundled
– Search teams are under pressure to use it, or to provide a
compelling reason to do otherwise
• Solr and Elasticsearch are robust and reliable
– Thanks to very wide-spread deployment
• The Google brand sells search – and a lot of GSAs have
been shipped during the past few years
Market Observations
6
Functional Observations
• Core indexing / searching is generally fast and reliable
– Search is a maturing technology
• Key differences remain in peripheral functionality, such as
content processing prior to indexing. For example:
– Coveo, Attivio, Sinequa all have well-developed indexing
pipelines, UI tools, and a range of data connectors
– SharePoint and GSA have limited content processing
functionality and rely on 3rd parties for connectivity
– Solr, Elasticsearch, AWS Cloudsearch and Azure search don’t
provide a formal indexing pipeline, UI, or connectors
7
Further Observations
• The search engines with less focus on peripheral issues
(such as content processing and connectivity) have
dominant market share
• Connectivity remains challenging, especially when
combined with continual data growth
• The movement of data sets to the cloud adds further
complexity
– Hybrid indexing environments will be with us for some years
8
Content Processing / Text Analysis Examples
• Normalization
– Names, dates, synonyms, spelling
• Entity identification and resolution
• Additional metadata from content analysis
• Categorization
• Document vector extraction
• Splitting and concatenation
• Dupe & near-dupe detection
• Link analysis
• Ingesting external signals
• Security enforcement and analysis
Index
security
category
metadata
9
Future Directions
So what will search architectures look like in the future?
Important Influences:
• The need for organizational and analytical agility
• The convergence of search and (“big data”) analytics
• Continual growth in data volumes, and churn in repository
/ storage fashions
10
Converging Architectures
Let’s take a brief look at:
1. The “Big Data Architecture”, evangelized by IBM,
Cloudera, etc.
2. Contemporary Search Architectures
Background Info
11
The Big Data Architecture
Designed for Structured Data
12
The Traditional Search Architecture
Integrated Search EngineContent
Sources
Connectors Index Pipeline Search
IndexEmployee
Directory
CMS
File Share
UI
Etc.
Designed for Unstructured Content
13
The Traditional Search Architecture
Integrated Search EngineContent
Sources
Connectors Index Pipeline Search
IndexEmployee
Directory
CMS
File Share
UI
Etc.
• A few documents-per-second?
• There are only 2.6 million seconds in a month
• If you change something significant in the index
pipeline, you will need to re-index
RE-INDEX
14
A Better Search Architecture
• Re-indexing rates greatly improved
• “Touch-time” with repositories can be managed autonomously
Search EngineContent
Sources
Connectors
Index
Pipeline
Search
Index
Employee
Directory
CMS
Etc.
RE-INDEX
Content
Processing
Staging
Repository
Iterative
Development
15
The Future Architecture?
Hadoop
Search EngineContent
Sources
Connectors
Index
Pipeline
Search
IndexEmployee
Directory
CMS
Etc.
RE-INDEX
Content
Processing
Staging
Repository
Iterative
Development
• This environment will encourage ever more sophisticated content
processing
• We expect much innovation in text analytics during the next few years
• Driven by cheap, easily available processing power
• The deliverable is a richer search index
16
The Future Architecture
Hadoop
Search EngineContent
Sources
Connectors
Index
Pipeline
Search
IndexEmployee
Directory
CMS
Etc.
RE-INDEX
Content
Processing
Staging
Repository
Iterative
Development
• Google.com works something like this
for 10+ years
17
An Integrated Search/Analytics Architecture
Hadoop
Content
Sources
Connectors
/ Crawlers
CMS
File system
Rapid, & ad hoc Indexing
Content
Processing
Staging
Repository
Iterative
Development
ETL
Data
Sources
Data
Warehouse
Logfiles
Etc.
OSINT Search
App.
Search
App.
Analysis
App.
Analysis
App.
• Encourages agile exploitation of data and content resources
18
Summary
• Search and Analytics are tending towards to the same
architecture
• Autonomous connectivity and content processing systems
simplify and de-risk projects
• The “search index” is a mature technology, and becoming a
commodity
– Thanks to open source alternatives setting high standards
• The centre of attention is shifting from the index to the
content preparation
– This perhaps fits well with the profile of dominant market
leaders: SharePoint, GSA, Solr, Elasticsearch….
19
Conclusion
• The foundation of great search and analytical applications
is a clean, rich and detailed index
• Much of the innovation during the next years will be in
content analytics
– The architecture discussed makes it easy to adopt new ideas
and products
– And it promotes agility, experimentation, and innovation
• In a data-driven world, agility is vital
20
The analyst quote….
And finally….
“Enterprise Search Can Bring Big Data Within Reach”
• Multiple, purpose-built indexes that are derived from enriched
content are necessary.
http://blogs.gartner.com/darin-stewart/2014/04/01/enterprise-search-can-bring-big-data-within-reach/
* Darin Stewart, Enterprise Search Can Bring Big Data Within Reach, April 2014 Blog
21
An Overview of the Current Enterprise
Search Market, & Current Best Practices
Iain Fletcher
ifletcher@Searchtechnologies.com
April 20, 2015
Thank you!
22
Spare Slides
23
Reference Architecture
Content
sources
Connectors
Indexes
Semantics
Text Mining
Quality
Metrics
Content Processing Pipelines
Big Data Framework
Indexes
Query
parsing
Search Engine
Web Browser
Staging
Repository
24CONFIDENTIAL
25CONFIDENTIAL

Weitere ähnliche Inhalte

Andere mochten auch

II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
II-SDV 2015 The International Information Conference on Search, Data Mining a...
II-SDV 2015 The International Information Conference on Search, Data Mining a...II-SDV 2015 The International Information Conference on Search, Data Mining a...
II-SDV 2015 The International Information Conference on Search, Data Mining a...Dr. Haxel Consult
 
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...II-SDV 2017 in Nice - The International Information Conference on Search, Dat...
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...Dr. Haxel Consult
 

Andere mochten auch (9)

II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
II-SDV 2015 The International Information Conference on Search, Data Mining a...
II-SDV 2015 The International Information Conference on Search, Data Mining a...II-SDV 2015 The International Information Conference on Search, Data Mining a...
II-SDV 2015 The International Information Conference on Search, Data Mining a...
 
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...II-SDV 2017 in Nice - The International Information Conference on Search, Dat...
II-SDV 2017 in Nice - The International Information Conference on Search, Dat...
 

Mehr von Dr. Haxel Consult

AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementAI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementDr. Haxel Consult
 
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...Dr. Haxel Consult
 
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...Dr. Haxel Consult
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...Dr. Haxel Consult
 
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...Dr. Haxel Consult
 
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...Dr. Haxel Consult
 
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...Dr. Haxel Consult
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...Dr. Haxel Consult
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...Dr. Haxel Consult
 
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...Dr. Haxel Consult
 
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...Dr. Haxel Consult
 
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...Dr. Haxel Consult
 
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...Dr. Haxel Consult
 
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...Dr. Haxel Consult
 
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...Dr. Haxel Consult
 
AI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterAI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterDr. Haxel Consult
 
AI-SDV 2022: Lighthouse IP
AI-SDV 2022: Lighthouse IPAI-SDV 2022: Lighthouse IP
AI-SDV 2022: Lighthouse IPDr. Haxel Consult
 
AI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCAI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCDr. Haxel Consult
 
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...Dr. Haxel Consult
 
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...Dr. Haxel Consult
 

Mehr von Dr. Haxel Consult (20)

AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering ManagementAI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
AI-SDV 2022: Henry Chang Patent Intelligence and Engineering Management
 
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
 
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
AI-SDV 2022: The race to net zero: Tracking the green industrial revolution t...
 
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
AI-SDV 2022: Accommodating the Deep Learning Revolution by a Development Proc...
 
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
AI-SDV 2022: Domain Knowledge makes Artificial Intelligence Smart Linda Ander...
 
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
AI-SDV 2022: Embedding-based Search Vs. Relevancy Search: comparing the new w...
 
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
AI-SDV 2022: Rolling out web crawling at Boehringer Ingelheim - 10 years of e...
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...
 
AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...AI-SDV 2022: Machine learning based patent categorization: A success story in...
AI-SDV 2022: Machine learning based patent categorization: A success story in...
 
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
AI-SDV 2022: Finding the WHAT – Will AI help? Nils Newman (Search Technology,...
 
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
AI-SDV 2022: New Insights from Trademarks with Natural Language Processing Al...
 
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
AI-SDV 2022: Extracting information from tables in documents Holger Keibel (K...
 
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
AI-SDV 2022: Scientific publishing in the age of data mining and artificial i...
 
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
AI-SDV 2022: AI developments and usability Linus Wretblad (IPscreener / Uppdr...
 
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
AI-SDV 2022: Where’s the one about…? Looney Tunes® Revisited Jay Ven Eman (CE...
 
AI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance CenterAI-SDV 2022: Copyright Clearance Center
AI-SDV 2022: Copyright Clearance Center
 
AI-SDV 2022: Lighthouse IP
AI-SDV 2022: Lighthouse IPAI-SDV 2022: Lighthouse IP
AI-SDV 2022: Lighthouse IP
 
AI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOCAI-SDV 2022: New Product Introductions: CENTREDOC
AI-SDV 2022: New Product Introductions: CENTREDOC
 
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
AI-SDV 2022: Possibilities and limitations of AI-boosted multi-categorization...
 
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
AI-SDV 2022: Big data analytics platform at Bayer – Turning bits into insight...
 

KĂźrzlich hochgeladen

在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查ydyuyu
 
💚😋 Salem Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Salem Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋💚😋 Salem Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Salem Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋nirzagarg
 
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort ServiceBusty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort ServiceDelhi Call girls
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdfMatthew Sinclair
 
Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...
Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...
Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...SUHANI PANDEY
 
Real Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirtReal Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirtrahman018755
 
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...SUHANI PANDEY
 
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdfMatthew Sinclair
 
20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdf20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdfMatthew Sinclair
 
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency""Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency"growthgrids
 
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
2nd Solid Symposium: Solid Pods vs Personal Knowledge GraphsEleniIlkou
 
Pirangut | Call Girls Pune Phone No 8005736733 Elite Escort Service Available...
Pirangut | Call Girls Pune Phone No 8005736733 Elite Escort Service Available...Pirangut | Call Girls Pune Phone No 8005736733 Elite Escort Service Available...
Pirangut | Call Girls Pune Phone No 8005736733 Elite Escort Service Available...SUHANI PANDEY
 
VIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...tanu pandey
 
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLLucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLimonikaupta
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableSeo
 
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdfpdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdfJOHNBEBONYAP1
 

KĂźrzlich hochgeladen (20)

在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查在线制作约克大学毕业证(yu毕业证)在读证明认证可查
在线制作约克大学毕业证(yu毕业证)在读证明认证可查
 
💚😋 Salem Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Salem Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋💚😋 Salem Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
💚😋 Salem Escort Service Call Girls, 9352852248 ₹5000 To 25K With AC💚😋
 
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort ServiceBusty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
Busty Desi⚡Call Girls in Vasundhara Ghaziabad >༒8448380779 Escort Service
 
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
20240510 QFM016 Irresponsible AI Reading List April 2024.pdf
 
Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...
Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...
Wadgaon Sheri $ Call Girls Pune 10k @ I'm VIP Independent Escorts Girls 80057...
 
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
6.High Profile Call Girls In Punjab +919053900678 Punjab Call GirlHigh Profil...
 
Real Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirtReal Men Wear Diapers T Shirts sweatshirt
Real Men Wear Diapers T Shirts sweatshirt
 
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
Sarola * Female Escorts Service in Pune | 8005736733 Independent Escorts & Da...
 
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
20240507 QFM013 Machine Intelligence Reading List April 2024.pdf
 
📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱
📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱
📱Dehradun Call Girls Service 📱☎️ +91'905,3900,678 ☎️📱 Call Girls In Dehradun 📱
 
20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdf20240508 QFM014 Elixir Reading List April 2024.pdf
20240508 QFM014 Elixir Reading List April 2024.pdf
 
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency""Boost Your Digital Presence: Partner with a Leading SEO Agency"
"Boost Your Digital Presence: Partner with a Leading SEO Agency"
 
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
2nd Solid Symposium: Solid Pods vs Personal Knowledge Graphs
 
valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...
valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...
valsad Escorts Service ☎️ 6378878445 ( Sakshi Sinha ) High Profile Call Girls...
 
Pirangut | Call Girls Pune Phone No 8005736733 Elite Escort Service Available...
Pirangut | Call Girls Pune Phone No 8005736733 Elite Escort Service Available...Pirangut | Call Girls Pune Phone No 8005736733 Elite Escort Service Available...
Pirangut | Call Girls Pune Phone No 8005736733 Elite Escort Service Available...
 
VIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Pollachi 7001035870 Whatsapp Number, 24/07 Booking
 
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...Pune Airport ( Call Girls ) Pune  6297143586  Hot Model With Sexy Bhabi Ready...
Pune Airport ( Call Girls ) Pune 6297143586 Hot Model With Sexy Bhabi Ready...
 
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRLLucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
Lucknow ❤CALL GIRL 88759*99948 ❤CALL GIRLS IN Lucknow ESCORT SERVICE❤CALL GIRL
 
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service AvailableCall Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
Call Girls Ludhiana Just Call 98765-12871 Top Class Call Girl Service Available
 
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdfpdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
pdfcoffee.com_business-ethics-q3m7-pdf-free.pdf
 

II-SDV 2015, 20 - 21 April, in Nice

  • 1. 1 An Overview of the Enterprise Search Market, & Current Best Practices Iain Fletcher ifletcher@Searchtechnologies.com April 20, 2015
  • 2. 2 Agenda • A brief overview of the current enterprise search market • The convergence of search with analytics disciplines • Likely future architectures for search applications
  • 3. 3 Search engines continue to proliferate…
  • 4. 4 High-level Search Engine Classifications 1. Part of a portfolio, many are recently acquired technologies – E.g. SharePoint, HP Autonomy, IBM/Vivisimo, Dassault/Exalead 2. Stand-alone specialists, often bought to address specific apps – E.g. GSA, Coveo, Attivio, Sinequa, Recommind 3. Open source, with or without support or proprietary add-ons – Raw: E.g. Lucene, Solr, Elasticsearch – With support/add-ons: E.g. LucidWorks, Cloudera Search, Elastic 4. Cloud-based services, typically based on open source technology – E.g. Amazon Cloudsearch, MS Azure search
  • 5. 5 The dominant market share is with SharePoint, open source, and the Google Search Appliance • SharePoint 2013 search is credible, and bundled – Search teams are under pressure to use it, or to provide a compelling reason to do otherwise • Solr and Elasticsearch are robust and reliable – Thanks to very wide-spread deployment • The Google brand sells search – and a lot of GSAs have been shipped during the past few years Market Observations
  • 6. 6 Functional Observations • Core indexing / searching is generally fast and reliable – Search is a maturing technology • Key differences remain in peripheral functionality, such as content processing prior to indexing. For example: – Coveo, Attivio, Sinequa all have well-developed indexing pipelines, UI tools, and a range of data connectors – SharePoint and GSA have limited content processing functionality and rely on 3rd parties for connectivity – Solr, Elasticsearch, AWS Cloudsearch and Azure search don’t provide a formal indexing pipeline, UI, or connectors
  • 7. 7 Further Observations • The search engines with less focus on peripheral issues (such as content processing and connectivity) have dominant market share • Connectivity remains challenging, especially when combined with continual data growth • The movement of data sets to the cloud adds further complexity – Hybrid indexing environments will be with us for some years
  • 8. 8 Content Processing / Text Analysis Examples • Normalization – Names, dates, synonyms, spelling • Entity identification and resolution • Additional metadata from content analysis • Categorization • Document vector extraction • Splitting and concatenation • Dupe & near-dupe detection • Link analysis • Ingesting external signals • Security enforcement and analysis Index security category metadata
  • 9. 9 Future Directions So what will search architectures look like in the future? Important Influences: • The need for organizational and analytical agility • The convergence of search and (“big data”) analytics • Continual growth in data volumes, and churn in repository / storage fashions
  • 10. 10 Converging Architectures Let’s take a brief look at: 1. The “Big Data Architecture”, evangelized by IBM, Cloudera, etc. 2. Contemporary Search Architectures Background Info
  • 11. 11 The Big Data Architecture Designed for Structured Data
  • 12. 12 The Traditional Search Architecture Integrated Search EngineContent Sources Connectors Index Pipeline Search IndexEmployee Directory CMS File Share UI Etc. Designed for Unstructured Content
  • 13. 13 The Traditional Search Architecture Integrated Search EngineContent Sources Connectors Index Pipeline Search IndexEmployee Directory CMS File Share UI Etc. • A few documents-per-second? • There are only 2.6 million seconds in a month • If you change something significant in the index pipeline, you will need to re-index RE-INDEX
  • 14. 14 A Better Search Architecture • Re-indexing rates greatly improved • “Touch-time” with repositories can be managed autonomously Search EngineContent Sources Connectors Index Pipeline Search Index Employee Directory CMS Etc. RE-INDEX Content Processing Staging Repository Iterative Development
  • 15. 15 The Future Architecture? Hadoop Search EngineContent Sources Connectors Index Pipeline Search IndexEmployee Directory CMS Etc. RE-INDEX Content Processing Staging Repository Iterative Development • This environment will encourage ever more sophisticated content processing • We expect much innovation in text analytics during the next few years • Driven by cheap, easily available processing power • The deliverable is a richer search index
  • 16. 16 The Future Architecture Hadoop Search EngineContent Sources Connectors Index Pipeline Search IndexEmployee Directory CMS Etc. RE-INDEX Content Processing Staging Repository Iterative Development • Google.com works something like this for 10+ years
  • 17. 17 An Integrated Search/Analytics Architecture Hadoop Content Sources Connectors / Crawlers CMS File system Rapid, & ad hoc Indexing Content Processing Staging Repository Iterative Development ETL Data Sources Data Warehouse Logfiles Etc. OSINT Search App. Search App. Analysis App. Analysis App. • Encourages agile exploitation of data and content resources
  • 18. 18 Summary • Search and Analytics are tending towards to the same architecture • Autonomous connectivity and content processing systems simplify and de-risk projects • The “search index” is a mature technology, and becoming a commodity – Thanks to open source alternatives setting high standards • The centre of attention is shifting from the index to the content preparation – This perhaps fits well with the profile of dominant market leaders: SharePoint, GSA, Solr, Elasticsearch….
  • 19. 19 Conclusion • The foundation of great search and analytical applications is a clean, rich and detailed index • Much of the innovation during the next years will be in content analytics – The architecture discussed makes it easy to adopt new ideas and products – And it promotes agility, experimentation, and innovation • In a data-driven world, agility is vital
  • 20. 20 The analyst quote…. And finally…. “Enterprise Search Can Bring Big Data Within Reach” • Multiple, purpose-built indexes that are derived from enriched content are necessary. http://blogs.gartner.com/darin-stewart/2014/04/01/enterprise-search-can-bring-big-data-within-reach/ * Darin Stewart, Enterprise Search Can Bring Big Data Within Reach, April 2014 Blog
  • 21. 21 An Overview of the Current Enterprise Search Market, & Current Best Practices Iain Fletcher ifletcher@Searchtechnologies.com April 20, 2015 Thank you!
  • 23. 23 Reference Architecture Content sources Connectors Indexes Semantics Text Mining Quality Metrics Content Processing Pipelines Big Data Framework Indexes Query parsing Search Engine Web Browser Staging Repository