ENTERPRISE SEARCH
by jonathan.rey@searchbox.com
ENTERPRISE SEARCH
PANORAMA
IBM
Dassault Systèmes
Open source
Google
Commercial solutions (all acquired in the past 5 years)
WHAT IS SEARCHBOX?
Searchbox builds on Apache Solr and:
Offers various Solr plugins
Offers a Search Framework for developing custom
search engines tied to business needs
Offers Search-as-a-Service (in the cloud)
ENTERPRISE SEARCH VS.
WEB SEARCH
Productivity
Heterogeneous data
False negative / False positive
Structured, semi-structured, unstructured data
PRODUCTIVITY
Web search is
driven by the
engine.
HETEROGENEOUS DATA
Intra/extranet
Websites
CMS
Filesystem
Repository (Files, XML)
FALSE NEGATIVES ARE A
KILLER
On the web it’s ok not to
find a specific document
Not an option within a
company
Real time indexing
Liability concerns
Compliance (Why this
result?)
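The cost of false negatives can be made concrete with the standard precision and recall measures. The sketch below is illustrative only: the document names and the `precision_recall` helper are assumptions, not part of the original slides.

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Return (precision, recall) for one query.

    retrieved: documents the engine returned
    relevant:  documents that should have been returned
    """
    true_positives = len(retrieved & relevant)
    false_positives = len(retrieved - relevant)   # returned but not wanted
    false_negatives = len(relevant - retrieved)   # wanted but missed

    precision = true_positives / (true_positives + false_positives) if retrieved else 0.0
    recall = true_positives / (true_positives + false_negatives) if relevant else 0.0
    return precision, recall


# Hypothetical example: one relevant document is missed (a false negative)
# and one irrelevant document is returned (a false positive).
relevant = {"contract-2012.pdf", "contract-2013.pdf", "addendum.docx"}
retrieved = {"contract-2012.pdf", "contract-2013.pdf", "newsletter.html"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In an enterprise setting the missed `addendum.docx` is the costly error: recall below 1.0 means a document that exists was not found.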
ENTERPRISE DATA IS...
Structured
Semi-structured
Unstructured
Web Search today is “semi-structured”
and somewhat consistent.
BUSINESS-CENTRIC
DATA
Data Normalization
Adaptation to
business needs
=> Goal: Productivity
gain
LinkedIn is a great
“enterprise search”
example
WHAT IS BIG DATA?
Distributed & disparate data from several sources
Structured, semi-structured, unstructured
Big data & machine learning
Enhance existing unstructured data (tagging, entity
extraction, summarization)
Content curation
FROM BIG TO SMALL
DATA STACK
Scalable Backend infrastructure & archiving
Information Retrieval
Analysis / Discovery
Visualization
SharePoint, Cassandra, Hadoop, Oracle, SAP, MongoDB, ...
Solr, Lucene, Elasticsearch, Business Warehouse,
SAP BW, ...
Searchbox backend
Searchbox
frontend
Big Data
Small Data
OUR APPROACH TO
CONVERGENCE
- Index
- Crawl
- Fields
- Metadata
- Facets
- Filters
- More Like This
- Search Framework
- Presets
- Templating
- Tagging
- Summarization
- Sorting
Connect
Discover
Lift /
Enhance
Specialize
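Several of the features listed above (facets, filters, sorting) correspond to standard Solr query parameters. The following is a minimal sketch of how such a request could be assembled; the host, the core name `documents`, the field names, and the `build_solr_query` helper are all assumptions for illustration and do not describe the Searchbox framework itself.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, text, facet_fields=(), filters=(), sort=None, rows=10):
    """Build a Solr /select URL using standard query parameters.

    q           : the user's search text
    fq          : filter queries that narrow results without affecting scoring
    facet.field : fields for which Solr returns per-value counts
    sort        : e.g. "year desc"
    """
    params = [("q", text), ("rows", rows), ("wt", "json")]
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", field) for field in facet_fields)
    params.extend(("fq", flt) for flt in filters)
    if sort:
        params.append(("sort", sort))
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical core and field names.
url = build_solr_query(
    "http://localhost:8983/solr/documents",
    "enterprise search",
    facet_fields=["author", "year"],
    filters=["type:report"],
    sort="year desc",
)
print(url)
```

The function only builds the URL; sending it requires a running Solr instance with a matching schema.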
CONCLUSION
Working with big data is expensive and
time-consuming
It requires a high level of expertise in multiple fields
(Networking, Programming, ML, NLP, Mathematics,
Statistics, ...)
Information retrieval / mining can serve as an
iterative tool to extract value from big data
SEARCHBOX FOR BIG
DATA
Data centric (Machine learning based enhancements)
Solr storage (Solr 4.x as scalable key-value store)
Hosted Solr Cluster with sharding and replication
Iterative process
Guided administration panel
Human friendly as opposed to CLI
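The idea behind sharding can be sketched as hashing each document ID to pick a shard. This is a deliberately simplified illustration: the `shard_for` and `route` helpers and the document IDs are hypothetical, and the routing used by a real Solr cluster differs in detail.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a shard index with a stable hash.

    MD5 is used only because it is deterministic across runs
    (unlike Python's built-in hash() for strings), not for security.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


def route(doc_ids, num_shards):
    """Group document IDs by the shard they would be stored on."""
    shards = {index: [] for index in range(num_shards)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_shards)].append(doc_id)
    return shards


# Hypothetical document IDs spread over three shards.
doc_ids = [f"doc-{n}" for n in range(12)]
for index, docs in route(doc_ids, 3).items():
    print(index, docs)
```

Replication is orthogonal to this step: each shard would additionally be copied to one or more replicas for availability.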
“Sort by”
“Clickable
tags”
Range
facets with
histogram
Searchbox demo on
http://pubmed.searchbox.com
Search with
autocomplete
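A range facet with a histogram, as in the demo, boils down to counting values per fixed-width bucket. The sketch below shows that computation in plain Python; the `range_facet` helper and the sample publication years are assumptions, and the `start`, `end`, and `gap` arguments are named after Solr's `facet.range.start`, `facet.range.end`, and `facet.range.gap` parameters.

```python
def range_facet(values, start, end, gap):
    """Count integer values in buckets [start, start+gap), [start+gap, ...) up to end.

    Returns a list of (bucket_lower_bound, count) pairs.
    Values outside [start, end) are ignored.
    """
    num_buckets = (end - start + gap - 1) // gap  # ceiling division
    counts = [0] * num_buckets
    for value in values:
        if start <= value < end:
            counts[(value - start) // gap] += 1
    return [(start + i * gap, counts[i]) for i in range(num_buckets)]


# Hypothetical publication years of the documents matching a query.
years = [1998, 2001, 2003, 2004, 2009, 2011, 2012, 2012]
buckets = range_facet(years, start=1995, end=2015, gap=5)

# Render a text histogram, one row per bucket.
for lower, count in buckets:
    print(f"{lower}-{lower + 4}: {'#' * count}")
```

Here `buckets` is `[(1995, 1), (2000, 3), (2005, 1), (2010, 3)]`; a search engine would compute the same counts server-side over the full result set.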

Enterprise search - big data