1. Real time indexes in Sphinx Search 1.10
This presentation covers a new feature of Sphinx Search 1.10: Real Time (RT) indexes
2. About Me
● Yaroslav Vorozhko
● Web developer at Ivinco
● Specializes in search engines and high-load systems
● E-mail: yaroslav@ivinco.com
3. The problem of plain indexes
● Adding new data requires rebuilding the entire index
● The main + delta scheme requires index merges
● Updates depend on the indexer tool
● Not simple to manage
4. Real Time indexes
● What is an RT index?
● Index updates on the fly
● MySQL protocol support via SphinxQL
6. Index schema comparison
● HDD usage
[Chart: HDD usage - Plain vs RT indexes. Y axis: GB (0 to 9). X axis: indexed records (10,000; 100,000; 1,000,000; 2,000,000). Series: Plain index, Real time index.]
7. Index schema comparison
● Single query performance
[Chart: SphinxAPI performance for single query. Y axis: query time, sec. (0 to 0.25). X axis: records in index (10,000; 100,000; 1,000,000; 2,000,000). Series: Plain index, Real time index.]
8. Index schema comparison
● Multi-query performance
[Chart: SphinxAPI performance for multi query. Y axis: query time, sec. (0 to 0.05). X axis: records in index (10,000; 100,000; 1,000,000; 2,000,000). Series: Plain index, Real time index.]
9. Index schema comparison
● Single query performance under insert load
[Chart: SphinxAPI performance for single query and insert loads. Y axis: query time, sec. (0 to 0.2). X axis: records in index (10,000; 100,000; 1,000,000; 2,000,000). Series: RT binlog off, RT binlog 0, RT binlog 1, RT binlog 2, Plain index.]
10. Demonstration
● Easy to create an index
index rt
{
    type = rt
    path = /usr/local/sphinx/data/rt
    rt_field = title
    rt_field = content
    rt_attr_uint = gid
}
11. Demonstration
● Easy CRUD operations
mysql -h 127.0.0.1 -P 9306
INSERT INTO rt VALUES ....
SELECT * FROM rt;
DELETE FROM rt WHERE id=2;
REPLACE INTO rt VALUES ....
● SphinxAPI support
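The CRUD statements above can also be generated from application code. The following is a minimal sketch, not part of the original slides: the helper names (quote, write_stmt, delete_stmt) are hypothetical, the column order (id, title, content, gid) is an assumption based on the index definition on the previous slide, and the escaping is deliberately simplified.

```python
# Hypothetical helpers that build SphinxQL statements for the `rt` index.
# Assumption: columns are (id, title, content, gid), as in the sample config.

def quote(value):
    """Render a Python value as a SphinxQL literal (simplified escaping)."""
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value)
    # Escape backslashes first, then single quotes.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"


def write_stmt(index, doc_id, title, content, gid, replace=False):
    """Build an INSERT statement, or REPLACE to overwrite an existing id."""
    verb = "REPLACE" if replace else "INSERT"
    values = ", ".join(quote(v) for v in (doc_id, title, content, gid))
    return f"{verb} INTO {index} (id, title, content, gid) VALUES ({values})"


def delete_stmt(index, doc_id):
    """Build a DELETE statement for a single document id."""
    return f"DELETE FROM {index} WHERE id={int(doc_id)}"


if __name__ == "__main__":
    print(write_stmt("rt", 1, "First post", "It's alive", 10))
    print(write_stmt("rt", 1, "First post", "Edited text", 10, replace=True))
    print(delete_stmt("rt", 2))
```

The resulting strings could be sent through any MySQL client connected to the SphinxQL port (9306 in the example above); which client library to use is left open here.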
12. Migration
● Simple and easy using existing tools
mysqldump -uroot blog users > users_dump.sql
mysql -h 127.0.0.1 -P9306 < users_dump.sql
13. Migration
● A custom script replaces the "source" block
● Supports all "source" settings
– Connection settings
– SQL query along with pre- and post-queries
– SQL range queries
● Supports filling an index from scratch
● Supports updating an existing index
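To illustrate the range-query setting mentioned above: a "source" block can split one large fetch into steps by substituting $start and $end into the main query. The sketch below shows that stepping logic only. It is not the migration script itself; the function names id_ranges and ranged_queries and the documents table are hypothetical.

```python
# Hypothetical sketch of range-query stepping, not the actual migration script.

def id_ranges(min_id, max_id, step):
    """Yield inclusive (start, end) id ranges covering min_id..max_id."""
    if step <= 0:
        raise ValueError("step must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + step - 1, max_id)
        yield start, end
        start = end + 1


def ranged_queries(template, min_id, max_id, step):
    """Substitute $start and $end into the query template for each range."""
    for start, end in id_ranges(min_id, max_id, step):
        yield template.replace("$start", str(start)).replace("$end", str(end))


if __name__ == "__main__":
    # `documents` is an assumed table name used only for illustration.
    query = "SELECT * FROM documents WHERE id>=$start AND id<=$end"
    for sql in ranged_queries(query, 1, 2345, 1000):
        print(sql)
```

With ids from 1 to 2345 and a step of 1000, this yields three queries covering 1..1000, 1001..2000, and 2001..2345. Each fetched batch could then be written to the RT index with INSERT or REPLACE statements.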
14. Migration
● Mixed indexes are supported
index distributed
{
    type = distributed
    local = plain_main_index
    local = real_time_increment_index
}
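With a mixed setup like the one above, one plausible pattern is to send searches to the distributed index and new documents to the real time part only. The sketch below illustrates that routing under stated assumptions: the function names are hypothetical, the RT index is assumed to have the same columns as the earlier sample (id, title, content, gid), and the escaping is simplified and does not cover full-text query operators.

```python
# Hypothetical routing sketch: reads hit the distributed index,
# writes go to the real time index. Index names come from the config above.

READ_INDEX = "distributed"
WRITE_INDEX = "real_time_increment_index"


def escape(text):
    """Simplified string escaping for a SphinxQL literal."""
    return str(text).replace("\\", "\\\\").replace("'", "\\'")


def search_stmt(keywords, limit=20):
    """Build a full-text SELECT against the distributed index."""
    return (f"SELECT * FROM {READ_INDEX} "
            f"WHERE MATCH('{escape(keywords)}') LIMIT {int(limit)}")


def add_stmt(doc_id, title, content, gid):
    """Build an INSERT for the real time index (assumed column layout)."""
    return (f"INSERT INTO {WRITE_INDEX} (id, title, content, gid) "
            f"VALUES ({int(doc_id)}, '{escape(title)}', "
            f"'{escape(content)}', {int(gid)})")


if __name__ == "__main__":
    print(add_stmt(1001, "New post", "Fresh content", 3))
    print(search_stmt("fresh content"))
```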
17. References
● Sphinx Search: http://sphinxsearch.com/
● Script for migrating from plain to real time indexes: https://launchpad.net/migrate-sphinx-plain-indexes-into-real-time-indexes
● Ivinco blog, a good resource about Sphinx Search: http://www.ivinco.com/blog/
● My blog, another good resource about Sphinx Search, in Russian: http://pro100pro.com/