Session slide at db tech showcase 2012
How Rakuten Reduced Database Management Spending by 90% through Clustrix implementation
- About Rakuten
- Rakuten database environment and operational issues
- What is Clustrix?
- Clustrix verification results and implementation effectiveness
- Summary
1. Rakuten Case Study: Reducing DB Management Costs through Clustrix Implementation
How Rakuten Reduced Database Management Spending by 90% through Clustrix Implementation
October 17th, 2012
Ryutaro Yada (矢田 龍太郎)
Database Platform Group
Global Infrastructure Development Dept.
Rakuten, Inc.
2. Introduction
Ryutaro Yada
Joined Rakuten in 2008
Present job
Development of the database platform that supports Rakuten
Evaluation and discussion of new technologies and architectures with a view to adopting them
Previous roles
Promotion of Oracle business with specific customers
Building a collaborative network with Oracle, developing and verifying new solutions, etc.
LinkedIn profile: http://www.linkedin.com/pub/ryutaro-yada/32/368/4b0
3. Agenda
About Rakuten
Rakuten database environment and
operational issues
What is Clustrix?
Clustrix verification results and
implementation effectiveness
Summary
4. Introduction to Rakuten
About 3,000 employees (Group: approx. 7,000)
More than 40 services provided, including the Ichiba marketplace and Travel
More than 120,000 contracted merchants; more than 80,000,000 registered products
Group gross transaction volume: 3.2 trillion yen (2011)
[Image: Rakuten Ichiba (marketplace)]
5. Rakuten Global Expansion
Our Goal is to become the No. 1 Internet Service in the World
[Map: star markers showing the locations of Rakuten group services worldwide, with labels including LS (UK) and Taiwan. Legend: Ichiba (EC), Travel, Performance marketing. An asterisk marks services to be opened soon.]
6. Rakuten’s Global Position
Rakuten is aiming to be the world’s largest internet firm.
A robust and highly flexible infrastructure is required to achieve this goal.
Retail / auction site global ranking, 2011, by number of unique visitors
[Bar chart: Amazon, eBay, Alibaba, Apple, Rakuten, Walmart; vertical axis from 0 to 300,000]
Source: comScore Media Metrix
7. Rakuten Database
Breakdown by number of databases: approx. 80% MySQL (more than 1,100 databases)
More than 350 MySQL database servers
MySQL has the largest share
[Pie chart: number of databases by RDBMS in the production environment: MySQL, Oracle, PostgreSQL, Teradata, Informix]
STG and DEV each have the same number of databases
8. MySQL Database Issue (1)
Data Sharding Operations
Required for scaling
Splitting instances, databases, and tables; redistributing data
Modifying application code; controlling database access
Data Protection, Ensuring HA
Replication cannot guarantee zero data loss on failure
Managing switchover and switchback takes a lot of effort
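To make the sharding burden concrete, here is a minimal sketch of the kind of routing logic an application has to carry once data is sharded, and of why adding a shard forces data redistribution. It is an illustration only, not Rakuten's implementation: the shard names, the hash-modulo scheme, and the helper functions are all assumptions.

```python
import hashlib
from typing import List

# Hypothetical shard names, for illustration only.
SHARDS = ["mysql-shard-0", "mysql-shard-1", "mysql-shard-2", "mysql-shard-3"]


def shard_for(user_id: int, shards: List[str]) -> str:
    """Pick a shard from a stable hash of the key, modulo the shard count."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def moved_keys(keys: List[int], old: List[str], new: List[str]) -> List[int]:
    """Return the keys whose shard changes when the shard list changes."""
    return [k for k in keys if shard_for(k, old) != shard_for(k, new)]


if __name__ == "__main__":
    keys = list(range(1000))
    grown = SHARDS + ["mysql-shard-4"]  # scale out by one shard
    moved = moved_keys(keys, SHARDS, grown)
    print(f"{len(moved)} of {len(keys)} keys must be redistributed")
```

Every query path in the application has to call something like shard_for, and with a simple modulo scheme most keys change shard whenever the shard count changes. That is the code correction and data redistribution work listed above.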
9. MySQL Database Issue (2)
Online Maintainability
Schema changes, index additions, and rebuilds
Locking and access concentration
Number of Servers Tends to Increase
Slaves for load distribution; redundant slave configurations
Servers tend to be provisioned per service (to accommodate differing service levels and avoid coordinating maintenance windows)
CPU utilization drops; data center costs rise
10. Clustrix Characteristics
What is Clustrix?
Appliance-style database server
Cluster database
NewSQL = LegacySQL + NoSQL
LegacySQL: SQL access, transaction consistency
NoSQL: Scalability, high performance
Built-in fault tolerance
MySQL compatibility
Access is normally through the MySQL protocol
23. Complex and Heavy SQL Comparison
Query                             Clustrix         IA with SSD        SPARC with SAN
J) Count+GroupBy+OrderBy+Limit    1.9s (3.4s)      2.1s (8.5s)        3.4s (409.32s)
K) Count+GroupBy+OrderBy+Limit    0.7s (1.13s)     5.9s (7.49s)       13.0s (39.41s)
L) 2000 of IN+GroupBy             3.8s (8.97s)     106.5s (103.77s)   193.0s (321.68s)
M) Case+OrderBy                   31.0s (45.66s)   47.3s (60.9s)      90.5s (112.24s)
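The comparison is easier to read as ratios. The sketch below divides each platform's time by the Clustrix time for the same query. It uses only the first figure in each cell, because the slide does not say what the parenthesized figure measures; the dictionary layout and the function name are illustrative assumptions.

```python
from typing import Dict

# First figure from each cell of the comparison table, in seconds.
TIMINGS: Dict[str, Dict[str, float]] = {
    "J": {"Clustrix": 1.9, "IA with SSD": 2.1, "SPARC with SAN": 3.4},
    "K": {"Clustrix": 0.7, "IA with SSD": 5.9, "SPARC with SAN": 13.0},
    "L": {"Clustrix": 3.8, "IA with SSD": 106.5, "SPARC with SAN": 193.0},
    "M": {"Clustrix": 31.0, "IA with SSD": 47.3, "SPARC with SAN": 90.5},
}


def speedups(row: Dict[str, float], baseline: str = "Clustrix") -> Dict[str, float]:
    """Return how many times longer each other platform took than the baseline."""
    base = row[baseline]
    return {name: t / base for name, t in row.items() if name != baseline}


if __name__ == "__main__":
    for query, row in TIMINGS.items():
        ratios = speedups(row)
        print(query, ", ".join(f"{name}: {r:.1f}x" for name, r in ratios.items()))
```

On these figures the gap is widest for query L (the 2000-element IN with GroupBy) and narrowest for query J.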
24. Example of Performance Improvements
Example improvements regarding a particular service
Before: 116.8ms
After: 21.4ms
25. Fault-Tolerance Inspection
No.  Failure test item                        Downtime
1    Front network (port1)                    None
2    Front network (port2)                    None
3    Internal network (primary)               < 12s
4    Internal network (standby)               None
5    MySQL instance                           < 4s
6    Node OS                                  < 4s
7    Online data disk (SSD) failure           < 5s
8    Log/work data disk (SATA) failure        None
9    InfiniBand switch (primary)              < 12s
10   InfiniBand switch (standby)              None
11   Front network (port1&2)                  < 18s
12   Internal network (primary & standby)     < 12s
[Diagram: four DB nodes connected to Front SW1 and Front SW2 on the front network and to InfiniBand SW1 and InfiniBand SW2 on the internal network; the numbers 1 to 12 mark the failure points tested]
26. Time Required for Online Maintenance
Table Rows and Size
                 Small          Medium          Large
Rows             50,000         500,000         5,000,000
Size (bytes)     113,639,424    1,063,190,528   10,696,130,560
Implementation Time
                 Small          Medium          Large
Create Column    1.6s           13.5s           149.8s
Create Index     1.6s           13.0s           172.7s
Drop Column      1.5s           13.8s           125.5s
Drop Index       0.5s           0.5s            0.5s
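One way to read the timing table is to compare how much longer each operation takes with how much larger the table is. The sketch below does that arithmetic on the figures from the slide; the data layout and the function name are assumptions made for illustration.

```python
from typing import Dict, Tuple

# Row counts and timings (in seconds) copied from the tables above.
ROWS: Dict[str, int] = {"Small": 50_000, "Medium": 500_000, "Large": 5_000_000}
SECONDS: Dict[str, Dict[str, float]] = {
    "Create Column": {"Small": 1.6, "Medium": 13.5, "Large": 149.8},
    "Create Index": {"Small": 1.6, "Medium": 13.0, "Large": 172.7},
    "Drop Column": {"Small": 1.5, "Medium": 13.8, "Large": 125.5},
    "Drop Index": {"Small": 0.5, "Medium": 0.5, "Large": 0.5},
}


def growth(op: str, smaller: str, larger: str) -> Tuple[float, float]:
    """Return (row-count ratio, time ratio) between two table sizes for one operation."""
    row_ratio = ROWS[larger] / ROWS[smaller]
    time_ratio = SECONDS[op][larger] / SECONDS[op][smaller]
    return row_ratio, time_ratio


if __name__ == "__main__":
    for op in SECONDS:
        rows, time = growth(op, "Medium", "Large")
        print(f"{op}: {rows:.0f}x rows -> {time:.1f}x time")
```

On these figures, creating or dropping a column and creating an index take roughly in proportion to the table size, while dropping an index takes 0.5s at every size.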
27. Impacts During Online Schema Modification
No impact on access performance for tables other than the one being modified
Some impact on access performance for the table being modified (schedule the work for periods when the impact is small, such as at night)
Online execution: table with 5 million rows, 10 GB in total
29. Clustrix Implementation Impacts: Release from Sharding (2)
No need to modify applications
No need to split data across databases
Sharding effort reduced by over 90% for both application engineers and DBAs
[Bar chart: effort in man-months (axis 0 to 14) for a large-scale sharding project, as-is vs. to-be, broken down into DBA and APP; actual effort compared with the original]
30. Clustrix Implementation Impacts: Cost Reductions due to Consolidation (1)
Sufficient performance scalability
Fault tolerance suitable for mission-critical use
No data loss
High online maintainability that doesn’t affect other services
Existing MySQL databases can be consolidated onto Clustrix
31. Clustrix Implementation Impacts: Cost Reductions due to Consolidation (2)
Consolidation of all existing MySQL databases onto Clustrix
Number of servers will be reduced to 10% of the current count
Monthly system costs will be reduced to 40% of the current level
32. Back-up Structure
[Diagram: the Clustrix cluster (DB nodes Node 1, Node 2, Node 3, …) replicates to a MySQL DB slave, which serves as the first backup; the slave is backed up by mysqldump over NFS to a NAS]
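The mysqldump step in this structure could be scripted along the following lines. This is a sketch under stated assumptions, not the procedure Rakuten used: the host name, user, database name, and NFS mount path are hypothetical, and credentials are assumed to come from a MySQL option file rather than the command line.

```python
import shlex
from typing import List


def build_dump_command(host: str, user: str, database: str, out_file: str) -> List[str]:
    """Build a mysqldump argument list that dumps one database from the slave."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--single-transaction",       # consistent snapshot for InnoDB tables
        f"--result-file={out_file}",  # write the dump directly to the target file
        database,
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    cmd = build_dump_command(
        host="mysql-slave.example.internal",
        user="backup",
        database="shop",
        out_file="/mnt/nas/backup/shop.sql",
    )
    # Print the command for review; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Dumping from the slave rather than from the cluster keeps the backup load off the nodes that serve production traffic, which matches the role of the slave as the first backup in the structure described here.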
33. Data Migration Procedure
Replicate to DEV for verification
Replicate to PRO for migration
Switch the application access point to PRO
[Diagram: the existing MySQL DB replicates to both the Clustrix DEV cluster and the Clustrix PRO cluster]
34. Other Advantages of Clustrix
Auto-Defrag
Attentive support service
Advice on configuration
Troubleshooting
Tuning advice
Etc.
35. Operational Issues Resolved with Clustrix
Issue                               With Clustrix
Data sharding operations            Unnecessary; operational cost reduction
Data protection, ensuring HA        Possible
Online maintenance                  Possible
Tendency toward many servers        Consolidation possible; cost reduction
36. Clustrix at Rakuten
An important database platform
Provided as Database-as-a-Service
No lead time
Usage-based pricing