Serve out any page with an HA Sphinx environment
Art van Scheppingen
Head of Database Engineering
1. Who is Spil Games?
2. What is Sphinx Search?
3. Make Sphinx highly available
4. How does Spil Games use Sphinx?
5. Sphinx benchmarks
6. Questions?
Overview
Who are we?
Who is Spil Games?
• Game publishers & distributors
• Company founded in 2001
• 130+ employees
• 150M+ unique visitors per month
• Over 60M registered users
• 45 portals in 19 languages
• Casual games
• Social games
• Real time multiplayer games
• Mobile (html5) games
• 40+ MySQL clusters
• 65k queries per second
• 10 Sphinx servers
• 8k queries per second
Facts
Geographic Reach
150 Million Monthly Active Users(*)
Source: (*) Google Analytics, August 2012
Girls, Teens and Family
spielen.com
juegos.com
gamesgames.com
games.co.uk
Brands
Sphinx Search
Advanced searching
• MyISAM / InnoDB (5.6.4 or higher)
CREATE TABLE articles (
id int(11) not null auto_increment,
author varchar(40) not null,
title varchar(50) not null,
body text,
PRIMARY KEY (id),
FULLTEXT idx (title, body)
) ENGINE=InnoDB;
• SELECT id, author FROM articles WHERE MATCH (title,body)
AGAINST ('somephrase');
• Complex queries
• SELECT id, author, MATCH (title,body) AGAINST ('somephrase' IN
BOOLEAN MODE) as score FROM articles ORDER BY score DESC,
id ASC;
• Drawbacks:
• Slow response times
Full text search in MySQL
• PostgreSQL tsquery
• Elasticsearch
• Apache Lucene
• Sphinx Search
• Many other alternatives
Alternatives to MySQL full text search
• Sphinx
• SELECT author FROM articles WHERE
MATCH('@(title,body) database');
• Complex queries
• SELECT author FROM articles WHERE
MATCH('@(title,body) database') ORDER BY
WEIGHT(), id ASC;
• Drawbacks:
• Not straightforward swap
• Specialized knowledge is needed
Full text search in Sphinx
• Generic (site) search
• Document search
• Logdata analysis
• Geo-distance calculation
Sphinx Search typical use cases
• Consists of two components
• Indexer
• Index (textual) data
• Search daemon
• Search indexes and return matched items
• Three types of indexes:
• Disk indexes
• Real Time indexes
• Distributed indexes
Sphinx is a full text search engine
• Comparable to archive tables
• Indexer indexes data and updates full index
• Index is “written once”
• Only attributes can be changed (run time)
• Use --rotate to reload new indexes
• Fewer resources needed (RAM/CPU)
• Not dependent on a specific database engine
• MySQL
• PostgreSQL
• MSSQL
• ODBC
• XML/TSV pipes
Disk indexes
• Comparable to normal tables
• Online indexes
• Will be (eventually) written to disk
• Dynamically alter the indexes
• Insert/replace/delete operations
• Consume more memory
• Changes are generally updated within milliseconds
• Sometimes stalls for seconds, so not guaranteed
• High update rate influences the performance
Real time indexes
• Comparable to federated tables in MySQL
• Distribute the search over multiple nodes
• Many smaller indexes
• Sends queries to all defined nodes/indexes
• Aggregates and merges results
• Slowest node slows down responses
• Setting timeouts can keep this lower
Distributed indexes
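The fan-out, merge, and timeout behaviour listed for distributed indexes can be sketched in Python. This is only an illustration of the idea, not how Sphinx implements it: the node callables, the node names, and the 0.1 second budget are assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

def distributed_search(nodes, query, timeout_s=0.1, limit=5):
    """Send `query` to every node and merge what arrives within `timeout_s`.

    `nodes` maps a (hypothetical) node name to a callable returning a list
    of (doc_id, weight) tuples. Nodes that miss the deadline are skipped,
    so the slowest node cannot stall the whole response.
    """
    pool = ThreadPoolExecutor(max_workers=len(nodes))
    futures = {pool.submit(fn, query): name for name, fn in nodes.items()}
    done, not_done = wait(futures, timeout=timeout_s)
    pool.shutdown(wait=False)  # do not block on stragglers
    merged = []
    for fut in done:
        if fut.exception() is None:
            merged.extend(fut.result())
    # Highest weight first, document id as tie-breaker
    merged.sort(key=lambda hit: (-hit[1], hit[0]))
    skipped = sorted(futures[f] for f in not_done)
    return merged[:limit], skipped

# Two stand-in nodes: one answers at once, one is too slow for the budget.
def fast_node(query):
    return [(1, 10), (2, 30)]

def slow_node(query):
    time.sleep(0.5)
    return [(3, 99)]

hits, skipped = distributed_search({"node1": fast_node, "node2": slow_node}, "puzzle")
print(hits, skipped)
```

The trade-off is the one the slide names: a tighter timeout keeps response times low, at the cost of dropping results from nodes that answer late.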
• Two types of data:
• Fields
• Textual data to be indexed
• Attributes
• Data to sort/filter upon
• Special: unique identifier
• Special: (last update) timestamp
• Example:
+-------+----------------+---------------+-----------------+
| id | author | title | publishing_date |
+-------+----------------+---------------+-----------------+
| 12345 | Linus Torvalds | Just for fun | 2002-06-04 |
+-------+----------------+---------------+-----------------+
Indexing: attributes and fields
• Support for stopwords
• Ignore common words like “and”, “the” and “to”
• Ignore specific words like “game” and “juego”
• Still affects the keyword position
• Language and characters
• Morphology
• Similar words
• Lemmatization
• Run/ran/running
• Character folding
• U+FF10..U+FF19->0..9
Indexing: stopwords and stemmers
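Two of the points above, stopwords that still occupy a keyword position and character folding of U+FF10..U+FF19 onto 0..9, can be shown with a small Python sketch. It mimics the effect only; the stopword set and the whitespace tokenizer are assumptions for the example and not Sphinx's tokenizer.

```python
# Common words plus site-specific ones, as on the slide
STOPWORDS = {"and", "the", "to", "game", "juego"}

# Fold fullwidth digits U+FF10..U+FF19 onto ASCII 0..9
FOLD = {0xFF10 + i: ord("0") + i for i in range(10)}

def tokenize(text):
    """Return (position, keyword) pairs.

    Stopwords are dropped from the output but still consume a position,
    so distances between the remaining keywords are unchanged.
    """
    folded = text.lower().translate(FOLD)
    return [
        (pos, word)
        for pos, word in enumerate(folded.split(), start=1)
        if word not in STOPWORDS
    ]

# "The Puzzle Game" followed by 2048 written in fullwidth digits
print(tokenize("The Puzzle Game \uff12\uff10\uff14\uff18"))
```

Because "the" and "game" keep positions 1 and 3, a phrase or proximity match still sees a gap between "puzzle" and "2048".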
• Search daemon has three interfaces:
• SphinxAPI: Native Sphinx binary protocol
• SphinxQL: MySQL protocol
• SphinxSE: MySQL/MariaDB integration
• Example native:
<?php
$s = new SphinxClient;
$s->setServer("localhost", 6712);
$s->setMatchMode(SPH_MATCH_ANY);
$s->setMaxQueryTime(3);
$result = $s->query("somephrase", "articles");
var_dump($result);
?>
• Example SphinxQL:
echo "SELECT author FROM articles WHERE MATCH('@(title,body)
somephrase') ORDER BY WEIGHT(), id ASC;" | mysql -h 127.0.0.1 -P 6713
Searching: the interfaces
• Supports various ranking algorithms:
• None
• Any
• Phrase proximity
• Okapi BM25 (probabilistic)
• Wordcount
• Many more
• User weighting
• Boost columns with a multiplier
Searching: Search daemon
mysql> SELECT title, id, publication_date FROM articles WHERE
MATCH('@(title,body) database') ORDER BY WEIGHT(), publication_date ASC
LIMIT 0,5 OPTION field_weights=(title=10,body=3);
+-----------------------------+-------+------------------+
| title | id | publication_date |
+-----------------------------+-------+------------------+
| MySQL Cookbook | 75532 | 2014-07-01 |
| High performance MySQL | 94325 | 2012-04-02 |
| MySQL Administrator’s Bible | 63627 | 2009-05-11 |
| MySQL (4th Edition) | 39922 | 2008-09-08 |
| MySQL in a nutshell | 58793 | 2008-04-01 |
+-----------------------------+-------+------------------+
5 rows in set (0.01 sec)
Returned data
Making Sphinx Highly Available
• Application handles:
• Connections
• Failovers
• Timeouts
• Distribution scheme
• Random
• Round robin
• Weighted
• Be creative!
Client side HA
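The distribution schemes above can be combined with failover in the application itself. The sketch below, in Python, draws a weighted random order and tries nodes until one accepts the connection. The node names, the weights, and the `connect` callable are hypothetical; any function that raises `OSError` on a dead or timed-out node would fit.

```python
import random

def weighted_order(weights, rng=random):
    """Return all node names in a weighted random order (no repeats)."""
    remaining = dict(weights)
    order = []
    while remaining:
        names = list(remaining)
        pick = rng.choices(names, weights=[remaining[n] for n in names])[0]
        order.append(pick)
        del remaining[pick]
    return order

def connect_with_failover(weights, connect, rng=random):
    """Try nodes in weighted order; return (node, connection) or None."""
    for node in weighted_order(weights, rng):
        try:
            return node, connect(node)
        except OSError:
            continue  # node is down or timed out, try the next one
    return None

# Stand-in connector: node1 is unreachable, every other node answers.
def fake_connect(node):
    if node == "node1":
        raise ConnectionError("node1 is down")
    return "conn:" + node

print(connect_with_failover({"node1": 3, "node2": 1}, fake_connect))
```

Giving every node the same weight reduces this to the plain random scheme; round robin would replace `weighted_order` with a rotating start index.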
[Diagram: Client side HA. Boxes: Server-1, Server-2 ... Server-n; Sphinx Node 1, Sphinx Node 2 ... Sphinx Node n.]
[Diagram: Client side HA, same layout, annotated "Timeouts".]
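<?php
// Since PHP 8.1 mysqli throws on connection errors by default;
// switch that off so a dead node is skipped instead of aborting the loop.
mysqli_report(MYSQLI_REPORT_OFF);

// Try each node in turn and return the first connection that succeeds.
function mysql_ha_connect(array $servers) {
    foreach ($servers as $server) {
        // SphinxQL speaks the MySQL protocol, here on port 9306
        $mysqli = new mysqli($server, 'user', 'pass', '', 9306);
        if (is_null($mysqli->connect_error)) {
            return $mysqli;
        }
    }
    return false;
}

$servers = array('node1.domain.com', 'node2.domain.com');
shuffle($servers); // random distribution over the nodes
$connection = mysql_ha_connect($servers);
if ($connection === false) {
    die('Could not connect to any node');
}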
…
Client side HA Example
• Application connects to one single host
• LB / Proxy handles:
• Connections
• Failovers
• Timeouts
• Solutions:
• HAProxy
• MySQL Proxy
• MaxScale(?)
• Distribution scheme
• Random
• Round robin
• Weighted
• Least connections
• Fastest response
Load balancer / Proxy
[Diagram: Load Balancer / Proxy. Boxes: Server-1, Server-2 ... Server-n; Load balancer; Sphinx Node 1, Sphinx Node 2 ... Sphinx Node n.]
[Diagram: Load Balancer / Proxy, same layout, annotated "Removed from load balancer".]
• Application connects to Sphinx on localhost
• Sphinx agent mirroring handles:
• Connections
• Failovers
• Timeouts
• Distribution scheme
• Random
• Round robin
• Nodeads (removes dead mirrors)
• Noerrors (removes worse performing mirrors)
Sphinx agent mirroring
[Diagram: Sphinx agent mirroring. Boxes: a Sphinx instance on Server-1, Server-2 ... Server-n; Sphinx Node 1, Sphinx Node 2 ... Sphinx Node n.]
[Diagram: Sphinx agent mirroring, same layout, annotated "Removed from Sphinx".]
Sphinx agent mirroring example
index dist {
    type = distributed
    ha_strategy = nodeads
    agent_query_timeout = 100
    agent = node1:9312|node2:9312|node3:9312:game_index
}
How do we use Sphinx Search?
Not only search
• Started using Sphinx in 2009
• Simple game search
• Replaced our MySQL / MyISAM search
• Added search for multiple columns
• Change weight per column
• Distributed mirrored indexes
• Index rebuilds performed per node
• Updates happen more frequently
Game search
[Diagram: Distributed mirrored indexes. Boxes: Application Server, Sphinx Node 1, Sphinx Node 2; index labels: Brand A, Brand B, Brand A+, Brand B+.]
[Screenshot: Game search]
• Profile service
• Friends function
• Searches friends on
• username
• firstname / lastname
• Find friends across portals (within brands)
• Distributed partitioned index
Friends search
[Diagram: Distributed partitioned index. Boxes: Application Server, Sphinx Node 1, Sphinx Node 2; partition labels: ">= today", ">= this month, <= today", ">= 3 months, <= this month", "<= 3 months".]
[Screenshot: Friends search]
• ROAR is a database abstraction layer
• See Percona Live Santa Clara 2014 presentation
• Sphinx complementary to MySQL and Couchbase
• Translate a title to a gamepage
• Search url parts to fetch the application id
• Translate keywords to lists of games
• Search url parts to fetch a list of application ids
• Filter applications on portal and brand
• Filter applications on browser capabilities
• Sort on publishing date, popularity and rating
ROAR storage layer
• Legacy:
• Url without identifiers
• There can only be one game with the same url
• Sphinx does a fast lookup of (existing) game to id
• Example:
http://www.agame.com/game/rig-bmx
Translates into application id 123456
• Future improvements:
• Correct non-existing pages (404)
http://www.agame.com/game/rig-bmxx
with a redirect (301) to:
http://www.agame.com/game/rig-bmx
Translating a title to a gamepage
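The future improvement mentioned above, answering a mistyped game URL with a 301 to the closest existing page, could look roughly like the following Python sketch. It is not the approach used at Spil Games: an in-memory dictionary stands in for the Sphinx lookup, `difflib` does the fuzzy matching, and the second game, its id, and the 0.8 cutoff are made up for the example.

```python
import difflib

def resolve_slug(slug, slug_to_appid, cutoff=0.8):
    """Return (http_status, slug, appid) for a requested game slug.

    200: exact match, 301: redirect to the closest known slug,
    404: nothing similar enough.
    """
    if slug in slug_to_appid:
        return 200, slug, slug_to_appid[slug]
    close = difflib.get_close_matches(slug, list(slug_to_appid), n=1, cutoff=cutoff)
    if close:
        return 301, close[0], slug_to_appid[close[0]]
    return 404, slug, None

# "rig-bmx" and its id come from the slide; the second entry is hypothetical.
GAMES = {"rig-bmx": 123456, "bubble-shooter": 654321}

print(resolve_slug("rig-bmx", GAMES))
print(resolve_slug("rig-bmxx", GAMES))
```

Since each URL maps to exactly one game, a redirect target is unambiguous once a close match is found; the cutoff decides how aggressive the correction is.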
• Filter on url parts
• One or multiple
• Complex filtering on capabilities
• Blacklist incompatible games (Flash/Unity)
Translating keywords to game listings
• Example 1 url part:
http://www.agame.com/games/puzzle
Sends this query to Sphinx:
SELECT title, appid FROM game_index WHERE brandid=1 AND portalid=88 AND
MATCH('@url "puzzle"') ORDER BY date_onsite desc LIMIT 0,10 OPTION
max_matches=10000;
• Example 2 url parts:
http://www.agame.com/games/puzzle/match-3
Sends this query to Sphinx:
SELECT title, appid FROM game_index WHERE brandid=1 AND portalid=88 AND
MATCH('@url "puzzle" && "match-3" ') ORDER BY date_onsite desc LIMIT 0,10
OPTION max_matches=10000;
Search on url parts
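How a category URL turns into the queries above can be sketched in Python. The function name, the whitelist pattern for URL parts, and the assumption that the first path segment is always "games" are illustrative choices, not taken from the slides; the query text itself follows the two examples.

```python
import re
from urllib.parse import urlparse

# Only accept lowercase letters, digits and dashes, so nothing from the
# URL can break out of the quoted MATCH() expression.
SAFE_PART = re.compile(r"[a-z0-9-]+")

def build_listing_query(url, brandid, portalid, limit=10):
    """Build the SphinxQL listing query for a /games/<part>[/<part>...] URL."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    parts = segments[1:]  # drop the leading "games" segment
    if not parts or not all(SAFE_PART.fullmatch(p) for p in parts):
        raise ValueError("unsupported url: %r" % url)
    phrases = " && ".join('"%s"' % p for p in parts)
    return (
        "SELECT title, appid FROM game_index "
        "WHERE brandid=%d AND portalid=%d AND MATCH('@url %s') "
        "ORDER BY date_onsite DESC LIMIT 0,%d OPTION max_matches=10000;"
        % (brandid, portalid, phrases, limit)
    )

print(build_listing_query("http://www.agame.com/games/puzzle/match-3", 1, 88))
```

Rejecting unexpected characters up front is a deliberately strict choice; a production version would more likely escape them according to the query syntax in use.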
• Blacklisting is performed on a bitmask that encodes the capabilities
• Example normal desktop browser (no filter):
http://www.agame.com/games/puzzle
Opening the puzzle category on a desktop sends this query to Sphinx:
SELECT title, appid,(bitmask1 & 0) AS bitcheck, (bitmask1 & 0) AS bitfilter FROM
game_index WHERE brandid=1 AND portalid=88 AND MATCH('@url "puzzle"') AND
bitcheck = 0 ORDER BY date_onsite desc LIMIT 0,10 OPTION max_matches=10000;
• Example Chrome on Android 4.4 (filter out 11):
http://www.agame.com/games/puzzle
Opening the puzzle category on a Nexus 7 sends this query to Sphinx:
SELECT title, appid,(bitmask1 & 11) AS bitcheck, (bitmask1 & 11) AS bitfilter
FROM game_index WHERE brandid=1 AND portalid=88 AND MATCH('@url "puzzle"') AND
bitcheck = 0 ORDER BY date_onsite desc LIMIT 0,10 OPTION max_matches=10000;
Filter on browser capabilities
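The `bitcheck = 0` condition in the queries above is a plain bitwise test: a game is listed only if none of its capability bits appear in the browser's blacklist mask. A minimal Python sketch follows; the bit assignments are hypothetical, since the slides give only the mask values 0 and 11.

```python
# Hypothetical bit assignments; the slides do not say which bit means what.
FLASH = 1
UNITY = 2
CAP_C = 4
CAP_D = 8

def is_playable(game_bitmask, blacklist_mask):
    """True when the game needs none of the blacklisted capabilities."""
    return (game_bitmask & blacklist_mask) == 0

desktop_mask = 0                      # no filter: everything passes
android_mask = FLASH | UNITY | CAP_D  # 11, as in the second example

print(is_playable(FLASH, desktop_mask))   # Flash game on a desktop browser
print(is_playable(FLASH, android_mask))   # Flash game on the Android example
print(is_playable(CAP_C, android_mask))   # game needing only the bit not in 11
```

Encoding capabilities as bits keeps the filter to a single integer attribute and one expression per query, whatever the number of capabilities.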
• Real time indexes decreased performance
• Make the indexing process “nicer”
/bin/taskset 0x00000001 /usr/bin/indexer --all --config /etc/sphinx.conf
• Send statistics to Graphite
http://engineering.spilgames.com/tamed-sphinx-search/
What we encountered
Benchmarking Sphinx
• Sysbench 0.5
• Custom lua scripts
• Disabled caching
• Openstack virtuals:
• Benchmark driver: 4 core CPU, 4GB memory
• Sphinx nodes: 4 core CPU, 16GB memory
• MySQL nodes: 4 core CPU, 16GB memory
• At least 3 runs per test
• The average over the runs is what counts
• Repeat tests when outliers were found
Sphinx Benchmark specifications
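The run policy above (several runs per test, averaged, repeated when outliers appear) together with the 95th percentile metric used in the charts can be sketched in Python. The nearest-rank percentile and the "more than twice the median" outlier rule are assumptions for the example; the slides do not state how outliers were identified.

```python
import statistics

def percentile_95(samples_ms):
    """Nearest-rank 95th percentile of a list of response times."""
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integers
    return ordered[rank - 1]

def summarize_runs(runs_ms, outlier_factor=2.0):
    """Average the per-run 95th percentiles and flag runs to repeat.

    A run counts as an outlier when its 95th percentile exceeds
    `outlier_factor` times the median over all runs (assumed rule).
    """
    p95s = [percentile_95(run) for run in runs_ms]
    median = statistics.median(p95s)
    outliers = [i for i, value in enumerate(p95s) if value > outlier_factor * median]
    return statistics.mean(p95s), outliers

# Three synthetic runs of 100 samples each; the third is three times slower.
normal_run = list(range(1, 101))
slow_run = [3 * ms for ms in normal_run]
print(summarize_runs([normal_run, normal_run, slow_run]))
```

Flagged runs would be repeated before the average is reported, which matches the "repeat tests when outliers were found" step.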
• InnoDB discrete match
SELECT l.url, gd.title, g.appid, bitmask1, date_onsite FROM games g LEFT
JOIN game_capabilities gc ON g.appid=gc.app INNER JOIN game_cat c ON
g.appid = c.appid AND g.portalid = c.portalid AND g.brandid = c.brandid
INNER JOIN cat_data cd ON c.portalid = cd.portalid AND c.brandid =
cd.brandid AND c.catname = cd.catname WHERE g.brandid=1 AND g.portalid=88
AND cd.url='puzzle' ORDER BY date_onsite desc LIMIT 0,10;
• Sphinx single phrase
SELECT title, appid, (bitmask1 & 0) AS bitcheck, (bitmask1 & 0) AS
bitfilter FROM game_index WHERE brandid=1 AND portalid=88 AND bitcheck = 0
AND MATCH('@url "puzzle"') ORDER BY date_onsite desc LIMIT 0,10 OPTION
max_matches=10000;
InnoDB vs Sphinx
[Chart: InnoDB vs Sphinx. 95th percentile response time in ms (y-axis, 0 to 300) against threads (x-axis: 4, 8, 16, 24, 32, 48, 64). Series: Sphinx single phrase, InnoDB discrete match.]
• MyISAM single match-against
Select title, appid, (bitmask1 & 0) AS bitfilter, MATCH(`url`)
AGAINST('puzzle') AS score FROM game_index WHERE MATCH(`url`)
AGAINST('puzzle') AND portalid=88 AND brandid=1 AND (bitmask1 & 0) = 0
ORDER BY score DESC, date_onsite DESC LIMIT 0,10;
• Sphinx single phrase
SELECT title, appid, (bitmask1 & 0) AS bitcheck, (bitmask1 & 0) AS
bitfilter FROM game_index WHERE brandid=1 AND portalid=88 AND bitcheck = 0
AND MATCH('@url "puzzle"') ORDER BY date_onsite desc LIMIT 0,10 OPTION
max_matches=10000;
MyISAM full text vs Sphinx 1
[Chart: MyISAM full text vs Sphinx 1. 95th percentile response time in ms (y-axis, 0 to 2000) against threads (x-axis: 4, 8, 16, 24, 32, 48, 64). Series: Sphinx single phrase, MyISAM single match-against.]
• MyISAM multiple match-against
SELECT title, appid, (bitmask1 & 0) AS bitfilter, MATCH(`url`)
AGAINST('+puzzle +sudoku' IN BOOLEAN MODE) AS score FROM game_index WHERE
MATCH(`url`) AGAINST('+puzzle +sudoku' IN BOOLEAN MODE) AND portalid=88 AND
brandid=1 AND (bitmask1 & 0) = 0 ORDER BY score DESC, date_onsite DESC
LIMIT 0,10;
• Sphinx multiple phrases
SELECT title, appid, (bitmask1 & 0) AS bitcheck, (bitmask1 & 0) AS
bitfilter FROM game_index WHERE brandid=1 AND portalid=88 AND bitcheck = 0
AND MATCH('@url "puzzle" && "sudoku"') ORDER BY date_onsite desc LIMIT 0,10
OPTION max_matches=10000;
MyISAM full text vs Sphinx 2
[Chart: MyISAM full text vs Sphinx 2. 95th percentile response time in ms (y-axis, 0 to 250) against threads (x-axis: 4, 8, 16, 24, 32, 48, 64). Series: MyISAM multiple match-against, Sphinx multiple phrases.]
[Chart: MyISAM full text vs Sphinx 2, all four series. 95th percentile response time in ms (y-axis, 0 to 2000) against threads (x-axis: 4, 8, 16, 24, 32, 48, 64). Series: Sphinx single phrase, MyISAM multiple match-against, Sphinx multiple phrases, MyISAM single match-against.]
[Chart: InnoDB vs MyISAM vs Sphinx. 95th percentile response time in ms (y-axis, 0 to 4000) against threads (x-axis: 4, 8, 16, 24, 32, 48, 64). Series: Sphinx single phrase, InnoDB single match-against, MyISAM single match-against.]
• Sphinx on localhost
• Talks MySQL on localhost
• One or two remote agent(s)
• Sphinx behind loadbalancer
• Proxies MySQL
Sphinx HA solutions
[Chart: Sphinx HA solutions. Average response time in ms (y-axis, 0 to 160) against threads (x-axis: 4, 8, 16, 24, 32, 48, 64). Series: Direct connection single host, Localhost 2 nodes, Localhost 1 node, Load Balancer 2 nodes.]
• Sphinx Search is faster than MySQL full text search
• Smaller result sets increase performance
• Due to sorting by relevance
• Smaller temporary tables
• InnoDB performs worse than MyISAM
• Sphinx agent mirroring performs better
• Probably due to Sphinx native protocol
• The load balancer seems to perform better
• Probably due to dedicated (better) hardware
Conclusion
Questions?
• This presentation can be found at:
http://spil.com/pluk2014sphinx
• Sphinx Search:
http://www.sphinxsearch.com
• Sending Sphinx Search metrics to Graphite:
http://engineering.spilgames.com/tamed-sphinx-search/
• About the ROAR storage layer:
http://spil.com/plsc2014storage
• If you wish to contact me:
Email: art@spilgames.com
Twitter: @banpei
Blog: http://engineering.spilgames.com
Twitter Spil Engineering: @spilengineering
Thank you!
Google Snail Search:
Boomerang Cards
http://data.boomerang.nl/b/boomerang/image/google-classic/s600/3.jpg
Jean-Claude van Damme
Volvo Trucks
http://www.volvotrucks.com/trucks/UAE-market/en-ae/newsmedia/pressreleases/Pages/pressreleases.aspx?pubid=17613
Bench mates
Craig Sunter
https://www.flickr.com/photos/16210667@N02/12381776985
Photo sources

Percona Live London 2014: Serve out any page with an HA Sphinx environment

  • 1. Serve out any page with an HA Sphinx environment Art van Scheppingen Head of Database Engineering
  • 2. 2 1. Who is Spil Games? 2. What is Sphinx Search? 3. Make Sphinx highly available 4. How does Spil Games use Sphinx? 5. Sphinx benchmarks 6. Questions? Overview
  • 3. Who are we? Who is Spil Games?
  • 4. 4 • Game publishers & distributors • Company founded in 2001 • 130+ employees • 150M+ unique visitors per month • Over 60M registered users • 45 portals in 19 languages • Casual games • Social games • Real time multiplayer games • Mobile (html5) games • 40+ MySQL clusters • 65k queries per second • 10 Sphinx servers • 8k queries per second Facts
  • 5. 5 Geographic Reach 150 Million Monthly Active Users(*) Source: (*) Google Analytics, August 2012
  • 6. 6 Girls, Teens and Family spielen.com juegos.com gamesgames.com games.co.uk Brands
  • 8. 8 • MyISAM / InnoDB (5.6.4 or higher) CREATE TABLE articles ( id int(11) not null auto_increment, author varchar(40) not null, title varchar(50) not null, body text, PRIMARY KEY (id), FULLTEXT idx (title, body) ) ENGINE=InnoDB; • SELECT id, author FROM articles WHERE MATCH (title,body) AGAINST (’somephrase'); • Complex queries • SELECT id, author, MATCH (title,body) AGAINST (’somephrase' IN BOOLEAN MODE) as score FROM articles ORDER BY score DESC, id ASC; • Drawbacks: • Slow response times Full text search in MySQL
  • 9. Alternatives to MySQL full text search
      • PostgreSQL tsquery
      • Elasticsearch
      • Apache Lucene
      • Sphinx Search
      • Many other alternatives
  • 10. Full text search in Sphinx
      • Sphinx
      • SELECT author FROM articles WHERE MATCH('@(title,body) database');
      • Complex queries
      • SELECT author FROM articles WHERE MATCH('@(title,body) database') ORDER BY WEIGHT(), id ASC;
      • Drawbacks:
          • Not a straightforward swap
          • Specialized knowledge is needed
  • 11. Sphinx Search typical use cases
      • Generic (site) search
      • Document search
      • Log data analysis
      • Geo-distance calculation
  • 12. Sphinx is a full text search engine
      • Consists of two components
          • Indexer: indexes (textual) data
          • Search daemon: searches indexes and returns matched items
      • Three types of indexes:
          • Disk indexes
          • Real time indexes
          • Distributed indexes
  • 13. Disk indexes
      • Comparable to archive tables
      • Indexer indexes data and updates the full index
      • Index is "written once"
      • Only attributes can be changed (at run time)
      • Use --rotate to reload new indexes
      • Fewer resources needed (RAM/CPU)
      • Not dependent on a specific database engine
          • MySQL
          • PostgreSQL
          • MSSQL
          • ODBC
          • XML/TSV pipes
  • 14. Real time indexes
      • Comparable to normal tables
      • Online indexes
      • Will be (eventually) written to disk
      • Dynamically alter the indexes
          • Insert/replace/delete operations
      • Consume more memory
      • Changes are generally applied within milliseconds
          • Sometimes stalls for seconds, so this is not guaranteed
      • A high update rate affects performance
  • 15. Distributed indexes
      • Comparable to federated tables in MySQL
      • Distribute the search over multiple nodes
          • Many smaller indexes
      • Sends queries to all defined nodes/indexes
      • Aggregates and merges results
      • The slowest node slows down responses
          • Setting timeouts can limit this
  • 16. Indexing: attributes and fields
      • Two types of data:
          • Fields
              • Textual data to be indexed
          • Attributes
              • Data to sort/filter upon
              • Special: unique identifier
              • Special: (last update) timestamp
      • Example:
        +-------+----------------+--------------+-----------------+
        | id    | author         | title        | publishing_date |
        +-------+----------------+--------------+-----------------+
        | 12345 | Linus Torvalds | Just for fun | 2002-06-04      |
        +-------+----------------+--------------+-----------------+
  • 17. Indexing: stopwords and stemmers
      • Support for stopwords
          • Ignore common words like "and", "the" and "to"
          • Ignore specific words like "game" and "juego"
          • Still affects the keyword position
      • Language and characters
          • Morphology
              • Similar words
          • Lemmatization
              • Run/ran/running
          • Character folding
              • U+FF10..U+FF19->0..9
  • 18. Searching: the interfaces
      • Search daemon has three interfaces:
          • SphinxAPI: Native Sphinx binary protocol
          • SphinxQL: MySQL protocol
          • SphinxSE: MySQL/MariaDB integration
      • Example native:
        <?php
        $s = new SphinxClient;
        $s->setServer("localhost", 6712);
        $s->setMatchMode(SPH_MATCH_ANY);
        $s->setMaxQueryTime(3);
        $result = $s->query("somephrase", "articles");
        var_dump($result);
        ?>
      • Example SphinxQL (-h 127.0.0.1 forces a TCP connection; with the default localhost the mysql client uses the Unix socket and ignores -P):
        echo "SELECT author FROM articles WHERE MATCH('@(title,body) somephrase') ORDER BY WEIGHT(), id ASC;" | mysql -h 127.0.0.1 -P 6713
  • 19. Searching: Search daemon
      • Supports various ranking algorithms:
          • None
          • Any
          • Phrase proximity
          • Okapi BM25 (probabilistic)
          • Wordcount
          • Many more
      • User weighting
          • Boost columns with a multiplier
  • 20. Returned data
        mysql> SELECT title, id, publication_date FROM articles
               WHERE MATCH('@(title,body) database')
               ORDER BY WEIGHT(), publication_date ASC LIMIT 0,5
               OPTION field_weights=(title=10,body=3);
        +-----------------------------+-------+------------------+
        | title                       | id    | publication_date |
        +-----------------------------+-------+------------------+
        | MySQL Cookbook              | 75532 | 2014-07-01       |
        | High performance MySQL      | 94325 | 2012-04-02       |
        | MySQL Administrator's Bible | 63627 | 2009-05-11       |
        | MySQL (4th Edition)         | 39922 | 2008-09-08       |
        | MySQL in a nutshell         | 58793 | 2008-04-01       |
        +-----------------------------+-------+------------------+
        5 rows in set (0.01 sec)
  • 22. Client side HA
      • Application handles:
          • Connections
          • Failovers
          • Timeouts
      • Distribution scheme
          • Random
          • Round robin
          • Weighted
          • Be creative!
  • 23. Client side HA [diagram: Server-1, Server-2, Server-n; Sphinx Node 1, Sphinx Node 2, Sphinx Node n]
  • 24. Client side HA [diagram: Server-1, Server-2, Server-n; Sphinx Node 1, Sphinx Node 2, Sphinx Node n; label: Timeouts]
  • 25. Client side HA Example
        <?php
        // Try each server in order and return the first successful connection.
        function mysql_ha_connect(array $servers) {
            foreach ($servers as $server) {
                $mysqli = new mysqli($server, 'user', 'pass', '', 9306);
                // connect_error is null when the connection succeeded
                if (is_null($mysqli->connect_error)) {
                    return $mysqli;
                }
            }
            return false;
        }
        $servers = array('node1.domain.com', 'node2.domain.com');
        shuffle($servers); // random distribution over the nodes
        $connection = mysql_ha_connect($servers);
        if ($connection === false) {
            die('Could not connect to any node');
        }
        …
  • 26. Load balancer / Proxy
      • Application connects to one single host
      • LB / Proxy handles:
          • Connections
          • Failovers
          • Timeouts
      • Solutions:
          • HAProxy
          • MySQL Proxy
          • MaxScale(?)
      • Distribution scheme
          • Random
          • Round robin
          • Weighted
          • Least connections
          • Fastest response
  • 27. Load Balancer / Proxy [diagram: Server-1, Server-2, Server-n; Load balancer; Sphinx Node 1, Sphinx Node 2, Sphinx Node n]
  • 28. Load Balancer / Proxy [diagram: Server-1, Server-2, Server-n; Load balancer; Sphinx Node 1, Sphinx Node 2, Sphinx Node n; label: Removed from load balancer]
  • 29. Sphinx agent mirroring
      • Application connects to Sphinx on localhost
      • Sphinx agent mirroring handles:
          • Connections
          • Failovers
          • Timeouts
      • Distribution scheme
          • Random
          • Round robin
          • Nodeads (removes dead mirrors)
          • Noerrors (removes worse performing mirrors)
  • 30. Sphinx agent mirroring [diagram: Server-1, Server-2, Server-n, each with a local Sphinx; Sphinx Node 1, Sphinx Node 2, Sphinx Node n]
  • 31. Sphinx agent mirroring [diagram: Server-1, Server-2, Server-n, each with a local Sphinx; Sphinx Node 1, Sphinx Node 2, Sphinx Node n; label: Removed from Sphinx]
  • 32. Sphinx agent mirroring example
        index dist
        {
            type = distributed
            ha_strategy = nodeads
            agent_query_timeout = 100
            agent = node1:9312|node2:9312|node3:9312:game_index
        }
  • 33. How do we use Sphinx Search? Not only search
  • 34. Game search
      • Started using Sphinx in 2009
      • Simple game search
          • Replaced our MySQL / MyISAM search
      • Added search for multiple columns
          • Change weight per column
      • Distributed mirrored indexes
          • Index rebuilds performed per node
          • Updates happen more frequently
  • 35. Distributed mirrored indexes [diagram: Application Server; Sphinx Node 1 with Brand A and Brand B; Sphinx Node 2 with Brand A+ and Brand B+]
  • 37. Friends search
      • Profile service
      • Friends function
      • Searches friends on
          • username
          • firstname / lastname
      • Find friends across portals (within brands)
      • Distributed partitioned index
  • 38. Distributed partitioned index [diagram: Application Server; Sphinx Node 1 with partitions ">= today", ">= this month <= today", ">= 3 months <= this month" and "<= 3 months"; Sphinx Node 2 with partition ">= today"]
  • 40. ROAR storage layer
      • ROAR is a database abstraction layer
          • See the Percona Live Santa Clara 2014 presentation
      • Sphinx is complementary to MySQL and Couchbase
      • Translate a title to a gamepage
          • Search url parts to fetch the application id
      • Translate keywords to lists of games
          • Search url parts to fetch a list of application ids
          • Filter applications on portal and brand
          • Filter applications on browser capabilities
          • Sort on publishing date, popularity and rating
  • 41. Translating a title to a gamepage
      • Legacy:
          • Url without identifiers
          • There can only be one game with the same url
      • Sphinx does a fast lookup of (existing) game to id
      • Example: http://www.agame.com/game/rig-bmx translates into application id 123456
      • Future improvements:
          • Correct non-existing pages (404) such as http://www.agame.com/game/rig-bmxx with a redirect (301) to http://www.agame.com/game/rig-bmx
  • 42. Translating a title to a gamepage
  • 43. Translating a title to a gamepage
  • 44. Translating keywords to game listings
      • Filter on url parts
          • One or multiple
      • Complex filtering on capabilities
          • Blacklist incompatible games (Flash/Unity)
  • 45. Search on url parts
      • Example with 1 url part: http://www.agame.com/games/puzzle
        Sends this query to Sphinx:
        SELECT title, appid FROM game_index
        WHERE brandid=1 AND portalid=88 AND MATCH('@url "puzzle"')
        ORDER BY date_onsite desc LIMIT 0,10 OPTION max_matches=10000;
      • Example with 2 url parts: http://www.agame.com/games/puzzle/match-3
        Sends this query to Sphinx:
        SELECT title, appid FROM game_index
        WHERE brandid=1 AND portalid=88 AND MATCH('@url "puzzle" && "match-3" ')
        ORDER BY date_onsite desc LIMIT 0,10 OPTION max_matches=10000;
  • 48. Filter on browser capabilities
      • Blacklisting is performed on a bitmask that encodes capabilities
      • Example normal desktop browser (no filter): http://www.agame.com/games/puzzle
        Opening the puzzle category on a desktop sends this query to Sphinx:
        SELECT title, appid, (bitmask1 & 0) AS bitcheck, (bitmask1 & 0) AS bitfilter FROM game_index
        WHERE brandid=1 AND portalid=88 AND MATCH('@url "puzzle"') AND bitcheck = 0
        ORDER BY date_onsite desc LIMIT 0,10 OPTION max_matches=10000;
      • Example Chrome on Android 4.4 (filter out 11): http://www.agame.com/games/puzzle
        Opening the puzzle category on a Nexus 7 sends this query to Sphinx:
        SELECT title, appid, (bitmask1 & 11) AS bitcheck, (bitmask1 & 11) AS bitfilter FROM game_index
        WHERE brandid=1 AND portalid=88 AND MATCH('@url "puzzle"') AND bitcheck = 0
        ORDER BY date_onsite desc LIMIT 0,10 OPTION max_matches=10000;
  • 49. Filter on browser capabilities
  • 50. Filter on browser capabilities
  • 51. What we encountered
      • Real time indexes decreased performance
      • Make the indexing process "nicer"
        /bin/taskset 0x00000001 /usr/bin/indexer --all --config /etc/sphinx.conf
      • Send statistics to Graphite: http://engineering.spilgames.com/tamed-sphinx-search/
  • 53. Sphinx Benchmark specifications
      • Sysbench 0.5
          • Custom lua scripts
      • Disabled caching
      • OpenStack virtuals:
          • Benchmark driver: 4 core CPU, 4GB memory
          • Sphinx nodes: 4 core CPU, 16GB memory
          • MySQL nodes: 4 core CPU, 16GB memory
      • At least 3 runs per test
          • The average of the runs counts
          • Tests were repeated when outliers were found
  • 54. InnoDB vs Sphinx
      • InnoDB discrete match
        SELECT l.url, gd.title, g.appid, bitmask1, date_onsite FROM games g
        LEFT JOIN game_capabilities gc ON g.appid=gc.app
        INNER JOIN game_cat c ON g.appid = c.appid AND g.portalid = c.portalid AND g.brandid = c.brandid
        INNER JOIN cat_data cd ON c.portalid = cd.portalid AND c.brandid = cd.brandid AND c.catname = cd.catname
        WHERE g.brandid=1 AND g.portalid=88 AND cd.url='puzzle'
        ORDER BY date_onsite desc LIMIT 0,10;
      • Sphinx single phrase
        SELECT title, appid, (bitmask1 & 0) AS bitcheck, (bitmask1 & 0) AS bitfilter FROM game_index
        WHERE brandid=1 AND portalid=88 AND bitcheck = 0 AND MATCH('@url "puzzle"')
        ORDER BY date_onsite desc LIMIT 0,10 OPTION max_matches=10000;
  • 55. InnoDB vs Sphinx [chart: 95th percentile response time in ms (0 to 300) against threads (4, 8, 16, 24, 32, 48, 64); series: Sphinx single phrase, InnoDB discrete match]
  • 56. MyISAM full text vs Sphinx 1
      • MyISAM single match-against
        SELECT title, appid, (bitmask1 & 0) AS bitfilter, MATCH(`url`) AGAINST('puzzle') AS score FROM game_index
        WHERE MATCH(`url`) AGAINST('puzzle') AND portalid=88 AND brandid=1 AND (bitmask1 & 0) = 0
        ORDER BY score DESC, date_onsite DESC LIMIT 0,10;
      • Sphinx single phrase
        SELECT title, appid, (bitmask1 & 0) AS bitcheck, (bitmask1 & 0) AS bitfilter FROM game_index
        WHERE brandid=1 AND portalid=88 AND bitcheck = 0 AND MATCH('@url "puzzle"')
        ORDER BY date_onsite desc LIMIT 0,10 OPTION max_matches=10000;
  • 57. MyISAM full text vs Sphinx 1 [chart: 95th percentile response time in ms (0 to 2000) against threads (4, 8, 16, 24, 32, 48, 64); series: Sphinx single phrase, MyISAM single match-against]
  • 58. MyISAM full text vs Sphinx 2
      • MyISAM multiple match-against
        SELECT title, appid, (bitmask1 & 0) AS bitfilter, MATCH(`url`) AGAINST('+puzzle +sudoku' IN BOOLEAN MODE) AS score FROM game_index
        WHERE MATCH(`url`) AGAINST('+puzzle +sudoku' IN BOOLEAN MODE) AND portalid=88 AND brandid=1 AND (bitmask1 & 0) = 0
        ORDER BY score DESC, date_onsite DESC LIMIT 0,10;
      • Sphinx multiple phrases
        SELECT title, appid, (bitmask1 & 0) AS bitcheck, (bitmask1 & 0) AS bitfilter FROM game_index
        WHERE brandid=1 AND portalid=88 AND bitcheck = 0 AND MATCH('@url "puzzle" && "sudoku"')
        ORDER BY date_onsite desc LIMIT 0,10 OPTION max_matches=10000;
  • 59. MyISAM full text vs Sphinx 2 [chart: 95th percentile response time in ms (0 to 250) against threads (4, 8, 16, 24, 32, 48, 64); series: MyISAM multiple match-against, Sphinx multiple phrases]
  • 60. MyISAM full text vs Sphinx 2 [chart: 95th percentile response time in ms (0 to 2000) against threads (4, 8, 16, 24, 32, 48, 64); series: Sphinx single phrase, MyISAM multiple match-against, Sphinx multiple phrases, MyISAM single match-against]
  • 61. InnoDB vs MyISAM vs Sphinx [chart: 95th percentile response time in ms (0 to 4000) against threads (4, 8, 16, 24, 32, 48, 64); series: Sphinx single phrase, InnoDB single match-against, MyISAM single match-against]
  • 62. Sphinx HA solutions
      • Sphinx on localhost
          • Talks MySQL on localhost
          • One or two remote agent(s)
      • Sphinx behind load balancer
          • Proxies MySQL
  • 63. Sphinx HA solutions [chart: average response time in ms (0 to 160) against threads (4, 8, 16, 24, 32, 48, 64); series: Direct connection single host, Localhost 2 nodes, Localhost 1 node, Load Balancer 2 nodes]
  • 64. Conclusion
      • Sphinx Search is faster than MySQL full text search
      • Smaller result sets increase performance
          • Due to sorting by relevance
          • Smaller temporary tables
      • InnoDB performs worse than MyISAM
      • Sphinx agent mirroring performs better
          • Probably due to the Sphinx native protocol
      • The load balancer seems to perform better
          • Probably due to dedicated (better) hardware
  • 66. Thank you!
      • This presentation can be found at: http://spil.com/pluk2014sphinx
      • Sphinx Search: http://www.sphinxsearch.com
      • Sending Sphinx Search metrics to Graphite: http://engineering.spilgames.com/tamed-sphinx-search/
      • About the ROAR storage layer: http://spil.com/plsc2014storage
      • If you wish to contact me:
          • Email: art@spilgames.com
          • Twitter: @banpei
          • Blog: http://engineering.spilgames.com
          • Twitter Spil Engineering: @spilengineering
  • 67. Photo sources
      • Google Snail Search: Boomerang Cards, http://data.boomerang.nl/b/boomerang/image/google-classic/s600/3.jpg
      • Jean-Claude van Damme: Volvo Trucks, http://www.volvotrucks.com/trucks/UAE-market/en-ae/newsmedia/pressreleases/Pages/pressreleases.aspx?pubid=17613
      • Bench mates: Craig Sunter, https://www.flickr.com/photos/16210667@N02/12381776985
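
The client side HA approach from slides 22 and 25 (shuffle the node list, try each node with a timeout, fall through to the next on failure) can be sketched in Python as follows. This is a minimal illustration and not the code Spil Games runs: the function name, the injectable `connect` parameter and the host names in the usage note are assumptions made for the example, and it only opens a TCP connection rather than speaking SphinxQL.

```python
import random
import socket


def connect_first_available(nodes, timeout=0.1,
                            connect=socket.create_connection, rng=random):
    """Return (node, connection) for the first reachable node, or None.

    nodes   -- iterable of (host, port) tuples
    timeout -- per-node connect timeout in seconds
    connect -- connection factory; injectable so the logic can be tested
               without a network (hypothetical parameter for this sketch)
    """
    candidates = list(nodes)
    rng.shuffle(candidates)  # "random" distribution scheme
    for node in candidates:
        try:
            return node, connect(node, timeout)
        except OSError:
            # Refused, unreachable or timed out: fail over to the next node.
            continue
    return None
```

A caller would pass something like `[("node1.domain.com", 9306), ("node2.domain.com", 9306)]` and treat a `None` result the way the PHP example treats `false`. Keeping the timeout short matters here, because a slow node delays every request that happens to try it first.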

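The capability blacklist from slide 48 keeps a game only when `bitmask1 & mask` equals 0, where the mask describes what the visiting browser cannot run. The sketch below replays that check in Python on a handful of made-up games. The bit assignments (1 for Flash, 2 for Unity, 4 and 8 for other plugins) are invented for illustration; the slides only give the mask values 0 and 11.

```python
# Hypothetical bit assignments; the slides do not say which bit means what.
FLASH, UNITY, OTHER_A, OTHER_B = 1, 2, 4, 8


def visible_games(games, blacklist_mask):
    """Return titles of games that need none of the blacklisted capabilities.

    Mirrors the Sphinx filter "(bitmask1 & mask) AS bitcheck ... bitcheck = 0".
    """
    return [g["title"] for g in games
            if (g["bitmask1"] & blacklist_mask) == 0]


games = [
    {"title": "HTML5 puzzle", "bitmask1": 0},
    {"title": "Flash puzzle", "bitmask1": FLASH},
    {"title": "Unity racer", "bitmask1": UNITY},
    {"title": "Plugin A game", "bitmask1": OTHER_A},
]

desktop = visible_games(games, 0)    # mask 0: nothing is filtered out
mobile = visible_games(games, 11)    # mask 11 = 1 + 2 + 8
```

With mask 0 every game passes, which matches the desktop query on the slide. With mask 11 the games that require bit 1 or bit 2 are dropped, while a game that only requires bit 4 stays visible because 4 & 11 is 0.
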
Editor's notes

  1. The three main brands: Girls, aimed at girls aged 8 to 12; Teens, aimed at boys and girls aged 10 to 15; and Family, basically mothers playing with their children. Strong domains, localized in 19 different languages: spielen.com, juegos.com, gamesgames.com, games.co.uk, oyunonya.com. All content is localized.