2. dkd
development
kommunikation
design
Friday, 15 June 2012
3. Welcome
TYPO3 Conference
Quebec, Canada
Olivier Dobberkau, Founder and CIO, dkd
Member of the Expert Advisory Board, TYPO3 Association
Twitter @T3RevNeverend
olivier.dobberkau@dkd.de
4. Everything You Always Wanted to
Know About Search in TYPO3.
But Were Afraid to Ask
6. Woody Allen
Inspiration for this Talk:
Woody Allen movie: "Everything You Always Wanted to Know About Sex * But Were Afraid to Ask"
Internet Movie Database:
http://www.imdb.com/title/tt0068555/
7. Agenda
A short history of Search
Slang
The need to Search
Who is searching and what is (s)he searching for?
Search in TYPO3 with Apache Solr
Questions & Answers
8. History
A short trip through the history of search solutions in
the IT era.
Really short, with lots of missing facts, and not
scientific at all.
9. Scratch your own itch, IBM.
In the beginning was the mainframe
In 1969 IBM develops STAIRS (Storage and
Information Retrieval System)
Full-text search for terminal applications
Performance: "far below anyone's expectations"
First used in the DOJ case against IBM
Source: A History of Online Information Services,
1963-1976 by Charles P. Bourne and Trudi Bellardo
Hahn
10. Internet years are dog years
The Internet changes what is needed from full-text search.
With Lycos, Alltheweb, Infoseek, Excite and
Altavista, search sites compete to answer the question
"How do I find something on the Internet?"
In 1995 it is a race for the love of searching Internet
users.
Yahoo tries to be the directory of websites
11. And then came GOOGLE
Who does not know about Google's secret?
The Anatomy of a Large-Scale Hypertextual Web
Search Engine
http://infolab.stanford.edu/~backrub/google.html
Visionary paper
The technologies and principles it names have become
industry standard and are still changing our IT
industry. (MapReduce, Big Data & PageRank)
A must read!
13. It's all about words!
Information Retrieval (IR)
Term versus Query
Index
Recall & Precision
Relevancy
Index, Inverted Index & Posting List
Recency & Authority
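The terms above can be made concrete with a small sketch. This is an illustration only, not how Lucene or Solr implement their index: it builds a toy inverted index (term to posting list), runs an AND query over it, and computes precision and recall against a hand-labelled set of relevant documents. The sample documents and all function names are assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted posting list of the document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every term of the query (AND)."""
    postings = [set(index.get(term, [])) for term in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


def precision_recall(retrieved, relevant):
    """Precision: share of retrieved docs that are relevant.
    Recall: share of relevant docs that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "search in TYPO3 with Apache Solr",
    2: "a short history of search",
    3: "Apache Solr facets",
}
index = build_inverted_index(docs)
results = search(index, "apache solr")
precision, recall = precision_recall(results, relevant=[1])
```

Here the posting list for "search" is the list of documents 1 and 2, and the query "apache solr" retrieves documents 1 and 3. If only document 1 is judged relevant, the query has full recall but only half precision.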
14. The need to Search
What leads us when we search?
How do we search?
How does what we find change us?
15. People are like Bears
(only less fur)
How do we search?
Marcia Bates, 1989
THE DESIGN OF BROWSING AND BERRYPICKING
TECHNIQUES FOR THE ONLINE SEARCH
INTERFACE
http://pages.gseis.ucla.edu/faculty/bates/berrypicking.html
Every search can be described with this model
16. Marcia J. Bates: Berrypicking techniques for the online search interface (1989)
17. Carrots & Sticks
Search Behavior Patterns, John Ferrara
http://www.boxesandarrows.com/view/search-behavior
Domain Expertise
Search Expertise
Cognitive Style
Goal Type
Mode of seeking
Situational idiosyncrasies
18. Neo: The Matrix
Matrix of Scope/Style of information needs
Scope & Type: Tyler Tate; Sohn et al.; Church & Smyth
http://twigkit.com/blog/2011/12/06/mobile-information-needs.html
19. Search = Success for your Website
Benefits for your Visitors & Users
They will find it on your Website
Serendipity
Better and faster knowledge transfer
Business benefits
ROI
Agility
Awareness and Enablement
20. TYPO3 & Search
Shameless Plug: Apache Solr for TYPO3
I still have some "I love Indexed Search" buttons to
give away.
26. Query Options
Operators
“+” and “-” to require or exclude terms
Soon: “and” and “or” to combine terms
Quotes to tie words together
e.g. “This is a Search with many Terms”
Diacritical Characters
cuvée = cuvee
Søren = Sören = Soeren = Sœren = Soren
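The diacritics handling above can be sketched as a folding step applied to both indexed terms and query terms. This is a minimal illustration, not the analyzer chain Solr uses: it strips combining accents via Unicode decomposition and adds a small, assumed mapping for letters such as "ø" and "œ" that have no decomposition. The `fold` function and the mapping table are hypothetical names for this example.

```python
import unicodedata

# Letters without a Unicode decomposition need an explicit mapping.
# This table is an assumption for the example and is not exhaustive.
SPECIAL_FOLDS = str.maketrans({"ø": "o", "Ø": "O", "œ": "oe", "Œ": "OE"})


def fold(term):
    """Lowercase a term and remove diacritics so spelling variants compare equal."""
    term = term.translate(SPECIAL_FOLDS)
    decomposed = unicodedata.normalize("NFKD", term)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


same_wine = fold("cuvée") == fold("cuvee")
same_name = fold("Søren") == fold("Sören") == fold("Soren")
same_ligature = fold("Sœren") == fold("Soeren")
```

With this sketch "Søren", "Sören" and "Soren" fold to one form, and "Sœren" and "Soeren" fold to another. Treating all five spellings as equal, as on the slide, needs one more rule (for example mapping "oe" to "o"), which is left out here because it depends on the language being indexed.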
27. Query
Takes care of Access Control Rights
Autocomplete
Did you mean?
29. Results
Search results link to the result
Page Browser
Sorting
Relevancy (Score)
Author
Date (cr_date of TYPO3 Page)
Your own criteria
30. Results
View helpers to display additional information such as
custom prices & preview images.
Preset filters so that facets are already activated with a
query
31. Results
Field Boosting (Terms in certain fields score
higher. Can be set freely)
Boost Functions (Functions on values of
documents, e.g. newer documents are
ranked higher)
Query Manipulation (Queries can be changed before
they hit Solr)
Elevation (Paid content)
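As a rough sketch of what field boosting and boost functions look like at the Solr level, the request below uses Solr's standard eDisMax parameters `qf` (per-field weights) and `bf` (an additive boost function, here a recency boost). The field names `title`, `content` and `created`, the weights, and the helper name are assumptions for this example; the TYPO3 extension normally sets such values through its own configuration.

```python
from urllib.parse import urlencode


def build_boosted_query(user_query):
    """Build Solr request parameters with field boosts and a recency boost."""
    return {
        "q": user_query,
        "defType": "edismax",
        # Field boosting: a match in the title counts five times as much
        # as a match in the content (assumed field names and weights).
        "qf": "title^5.0 content^1.0",
        # Boost function: documents with a newer 'created' date score higher.
        # 3.16e-11 is roughly 1 / (milliseconds in one year).
        "bf": "recip(ms(NOW,created),3.16e-11,1,1)",
    }


params = build_boosted_query("typo3 search")
query_string = urlencode(params)
```

The resulting query string can be appended to a Solr select URL. Query manipulation in the sense of the slide would then amount to changing this parameter set before the request is sent.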
32. Results
Template Engine: flexible templates to customize
your results listing quickly and easily
Search word highlighting
Spell-Checking: "Did you mean?"
Common Searches
Recent Searches
33. Facets
Type-Facets
Author
Type of Document (Pages, News, Files & many more)
Range-Facets (Work in Progress)
(e.g. 1-10 $ or Slider)
Hierarchical Facets
(Great if you have lots of categorized data, as in news or
file repositories)
Facets can be combined with each other
(e.g. Show me all red & blue shoes)
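A facet request can be sketched with Solr's standard `facet`, `facet.field` and `fq` parameters. The example below is only an illustration under assumed field names (`type`, `color`) and a hypothetical helper: it asks Solr to count values for each facet field and narrows the results with a filter query for the selected facet values.

```python
from urllib.parse import urlencode


def build_facet_query(user_query, facet_fields, selected=None):
    """Build Solr request parameters for faceting plus selected facet filters."""
    params = [("q", user_query), ("facet", "true")]
    for field in facet_fields:
        # One facet.field parameter per field whose values should be counted.
        params.append(("facet.field", field))
    for field, values in (selected or {}).items():
        # Values picked within one facet are OR-ed; separate fq
        # parameters are AND-ed by Solr.
        quoted = " OR ".join(f'"{value}"' for value in values)
        params.append(("fq", f"{field}:({quoted})"))
    return params


# "Show me all red & blue shoes"
params = build_facet_query("shoes", ["type", "color"], {"color": ["red", "blue"]})
query_string = urlencode(params)
```

A list of tuples is used instead of a dictionary because `facet.field` and `fq` may appear several times in one request.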
34. Facets
Geo-Search (Work in Progress)
(e.g. if you want to search for and display the location of data of
a certain type: stores, service points, bus stops)
Geo-IP based on the IP of your visitor
(e.g. Where is the nearest point of sale for your products?)
Facets are TYPO3 content objects
(can be manipulated with TypoScript, e.g. GIFBUILDER)
Filters can be preset
(You can preset certain facets)
...
36. Analysis
Query Logging
Stats on Queries (Work in Progress)
User-based Ranking (Work in Progress)
Integration with analytics tools possible
Roll your own
There might be a Solr Server feature coming
up ...
38. Additional Components
"More like this" component on the details page
can show related additional documents
It's possible to access indexed data
Nutch crawler to index non-TYPO3 websites
Data Import Handler
39. dkd
development
kommunikation
design
Thank You! Merci.
40. Sources
Lucene Scoring for dummies: http://www.supermind.org/blog/378/lucene-scoring-for-dummies
Photos: Søren Schaffstein