7. Sphinx
Sphinx is an open source full-text search server.
It's written in C++ and works on Linux (RedHat, Ubuntu, etc.),
Windows, MacOS, Solaris, FreeBSD, and a few other systems.
Sphinx lets you batch index and search data stored
in an SQL database, NoSQL storage, or just files,
quickly and easily.
8. Sphinx
Text processing features
Searching via SphinxAPI is as simple as
3 lines of code, and querying via
SphinxQL is even simpler.
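As a rough illustration of the SphinxQL route: SphinxQL is spoken over the MySQL wire protocol, so an ordinary MySQL client library can send the query. The sketch below only builds the statement. The index name `articles`, the helper `build_sphinxql`, and the commented-out connection details (port 9306 as the SphinxQL listener, `pymysql` as one possible client) are assumptions for illustration, not something the slides specify.

```python
def build_sphinxql(index, text, limit=10):
    """Return a (statement, params) pair for a full-text SphinxQL query.

    `index` is a hypothetical index name. It is validated here because
    identifiers cannot be passed as bound parameters.
    """
    if not index.replace("_", "").isalnum():
        raise ValueError("unexpected characters in index name")
    # The search text stays a bound parameter so the client library quotes it.
    statement = f"SELECT id FROM {index} WHERE MATCH(%s) LIMIT {int(limit)}"
    return statement, (text,)


statement, params = build_sphinxql("articles", "full text search", limit=5)
print(statement)
print(params)

# Sending it (assumed setup, not executed here):
#   import pymysql
#   conn = pymysql.connect(host="127.0.0.1", port=9306)
#   with conn.cursor() as cur:
#       cur.execute(statement, params)
#       rows = cur.fetchall()
```

Keeping the search text as a parameter rather than formatting it into the string avoids quoting mistakes when the user's query contains apostrophes.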
Sphinx clusters scale up to billions of
documents and tens of millions of search
queries per day, powering top websites
such as Craigslist, DailyMotion, and NetLog.
9. Performance and scalability
Indexing performance: Sphinx indexes up to 10-15 MB
of text per second per single CPU core.
Searching performance: Searching through the
1,000,000-document, 1.2 GB text collection that
the Sphinx developers use for everyday development
and testing runs at 500+ queries/sec on a 2-core
desktop machine with 2 GB of RAM.
Scalability: Biggest known Sphinx cluster indexes
almost 5 billion documents, resulting in over 6 TB of
data.
Busiest known one is, unsurprisingly, Craigslist, a top-10
website in the US that serves 50+ million search queries per day.
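To put the figures on this slide in proportion, here is a small back-of-the-envelope calculation. It assumes 1 GB = 1000 MB, a single core running at the quoted indexing rate with no other overhead, and a query rate sustained around the clock. These are simplifications for illustration, not measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def indexing_seconds(collection_mb, mb_per_second):
    """Time to index a collection at a constant per-core rate."""
    return collection_mb / mb_per_second


def queries_per_day(queries_per_second):
    """Daily query volume if the per-second rate were sustained all day."""
    return queries_per_second * SECONDS_PER_DAY


collection_mb = 1200  # the 1.2 GB test collection, assuming 1 GB = 1000 MB
slow = indexing_seconds(collection_mb, 10)  # at 10 MB/s per core
fast = indexing_seconds(collection_mb, 15)  # at 15 MB/s per core
daily = queries_per_day(500)                # the 2-core desktop figure

print(f"indexing 1.2 GB on one core: {fast:.0f} to {slow:.0f} seconds")
print(f"500 queries/sec sustained: {daily:,} queries per day")
```

Under these assumptions the test collection indexes in roughly one and a half to two minutes on one core, and the desktop figure of 500 queries/sec already corresponds to tens of millions of queries per day.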
10. Key Features
Batch and Real-Time full-text indexes
Non-text attributes support
SQL database indexing
Non-SQL storage indexing
Easy application integration
Advanced full-text searching syntax
Rich database-like querying features
Better relevance ranking
Flexible text processing
Distributed searching
12. Solr is the
popular, blazing fast
open source enterprise
search platform from
the Apache Lucene
project.
13. Its major features include
powerful full-text search, hit
highlighting, faceted
search, dynamic
clustering, database
integration, rich document
(e.g., Word, PDF)
handling, and geospatial search.
14. Solr is written in Java
and runs as a
standalone full-text
search server within a
servlet container such
as Tomcat.
15. Solr Features
Advanced Full-Text Search Capabilities
Optimized for High Volume Web Traffic
Standards Based Open Interfaces - XML, JSON and HTTP
Comprehensive HTML Administration Interfaces
Server statistics exposed over JMX for monitoring
Scalability - Efficient Replication to other Solr Search Servers
Flexible and Adaptable with XML configuration
Extensible Plugin Architecture
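Because the interfaces are plain HTTP with XML or JSON responses, a Solr query is just a URL. The sketch below builds one with the standard library. The host, port, core name `articles`, the field names, and the helper `build_solr_select_url` are hypothetical, and the `/solr/<core>/select` path is assumed (single-core installations may expose `/solr/select` instead).

```python
from urllib.parse import urlencode


def build_solr_select_url(base_url, core, query, rows=10, facet_field=None):
    """Build a URL for Solr's select handler that asks for a JSON response."""
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if facet_field is not None:
        # Faceting is switched on per request with these two parameters.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


url = build_solr_select_url(
    "http://localhost:8080",  # assumed Tomcat address
    "articles",               # hypothetical core name
    "title:sphinx",           # hypothetical field and term
    rows=5,
    facet_field="category",   # hypothetical facet field
)
print(url)
```

The resulting URL can be fetched with any HTTP client, and the response parsed as JSON.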