Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Shah, Bloomreach
1. October 11-14, 2016 • Boston, MA
2. Rebalance API for SolrCloud
Nitin Sharma - Senior Software Engineer, Netflix
Suruchi Shah - Software Engineer, Bloomreach
3.
Agenda
➢ Motivation
➢ Introduction to Rebalance API
➢ Scaling Scenarios
○ Query Performance
○ Redistribution of Data
○ Removing/Replacing Nodes
○ Indexing Performance
➢ Allocation Strategies
➢ Open Source
➢ Summary
4.
Motivation
● Harder to scale and guarantee SLAs (Availability, Query Performance, Data Freshness) for a multi-tenant, cross-datacenter search architecture based on SolrCloud
● Scaling issues related to Query Performance:
■ Inability to auto-scale Solr serving as index sizes grow
■ No dynamic shard setup based on index size
● SLA issues due to Availability:
■ Nightmare to manipulate cluster/collection setup with expanding clusters and frequent node replacements (AWS issues)
■ Need flexible replica allocation based on custom strategies to guarantee 99.995% availability
● Data Freshness (aka Indexing Performance) hiccups:
■ Need to break the tight coupling between Indexing and Serving latency (Tp95) SLAs
● Goal: a generic framework for Solr SLA management that can be open-sourced
5.
Rebalance API
● Fine-grained SLA management in Solr
● Smarter Index, Cluster & Data Management for SolrCloud
● Forms the basis for Solr Auto Scale
● Admin handler in Solr
● 2 levels of abstraction
● Scaling Strategies
○ Aid with shard, replica and cluster manipulation
techniques to guarantee SLA
○ Zero Downtime operations
○ Tunable for Availability, Performance or Cost
● Allocation Strategies
○ Decides Core Placement
○ Tunable for Availability, Performance or Cost
Rebalance API
● Scaling Strategies: Auto Shard, Redistribute, Smart Merge, Replace, Scale Up, Scale Down
● Allocation Strategies: Least Used, Unused, AZ Aware
7. Query Performance Issues
● Indexing doubles the documents - per-shard latency goes up
● Tp95 shoots up
● No way to change shards dynamically
● Only option: delete, recreate, re-index
● Availability goes down
[Diagram: Solr Collection A with Shards 1-4 on Node 1 and Node 2 (Zk Ensemble); indexing 2x documents drives up query Tp95]
8. Rebalance Auto Shard
● Re-shards an existing collection to any number of destination shards (e.g., can help with reducing latency)
● Includes re-distributing the index and configs consistently
● Zero downtime - no query failures
● Avoids any heavy re-indexing processes
● Can be size-based as well
● Sample API call:
/admin/collections?action=REBALANCE&scaling_strategy=AUTO_SHARD&collection=collection_name&num_shards=<number_of_shards>&size_cap=1G&allocation_strategy=least_used
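For illustration, a call like the one above can be assembled with a small helper. This is a sketch, not code from the patch; the host and parameter values are placeholders:

```python
from urllib.parse import urlencode

def rebalance_url(base, strategy, **params):
    """Build a Rebalance API URL for the Solr Collections admin handler."""
    query = {"action": "REBALANCE", "scaling_strategy": strategy, **params}
    return f"{base}/admin/collections?{urlencode(query)}"

# Re-shard 'collection_name' into 4 shards, capping shard size at 1G,
# and place the new cores with the least_used allocation strategy.
url = rebalance_url(
    "http://localhost:8983/solr",
    "AUTO_SHARD",
    collection="collection_name",
    num_shards=4,
    size_cap="1G",
    allocation_strategy="least_used",
)
```

The resulting URL can then be issued with any HTTP client against a live cluster.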
10. Auto Shard (1) - Internals - Simple Strategy
● Merge documents from all shards - Lucene library based
● Split the merged shard into the desired number of shards
● Automatic Zk update for shard ranges
● Heavyweight, but works
● Depending on size, can take 20-30 minutes to complete
[Diagram: Shards 1 and 2 of Solr Collection A merged into a temporary shard, then evenly split into Shards 1-4 of Solr Collection A']
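The "automatic Zk update for shard ranges" step needs evenly split hash ranges for the new shards. A rough illustration (not the patch's actual code) of splitting Solr's signed 32-bit hash ring into N contiguous ranges:

```python
def even_shard_ranges(num_shards):
    """Split the signed 32-bit hash ring evenly into num_shards
    contiguous (start, end) ranges, the kind written back to Zk."""
    lo, hi = -(1 << 31), (1 << 31) - 1
    step = (hi - lo + 1) // num_shards
    ranges = []
    start = lo
    for i in range(num_shards):
        # The last shard absorbs any remainder so the ring is fully covered.
        end = hi if i == num_shards - 1 else start + step - 1
        ranges.append((start, end))
        start = end + 1
    return ranges

ranges = even_shard_ranges(4)
```

Each document's route key hashes into exactly one of these ranges, so the four new shards partition the collection without gaps or overlap.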
11. Auto Shard (2) - Internals - Smart Split Strategy
● Identify the minimum number of splits
● Split shards in parallel into the desired setup
● Relatively high performance
○ 2 TB index from 2 to 4 shards - 2.5 mins
● Automatic Zk update
[Diagram: Shards 1 and 2 of Solr Collection A smart-split into Shards 1_1, 1_2, 2_1, 2_2, then renamed to Shards 1-4 of Solr Collection A]
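The "minimum number of splits" idea can be sketched as follows; this is a simplification (the real strategy also handles the Zk updates and the renaming shown in the diagram), assuming the target count is a multiple of the source count:

```python
def split_plan(src_shards, target):
    """Plan the minimum parallel splits to go from len(src_shards)
    shards to `target` shards: each source shard splits into
    target / len(src_shards) pieces, and all splits run in parallel."""
    factor, rem = divmod(target, len(src_shards))
    if rem or factor < 1:
        raise ValueError("target must be a positive multiple of the source count")
    return {s: [f"{s}_{i}" for i in range(1, factor + 1)] for s in src_shards}

plan = split_plan(["shard1", "shard2"], 4)
# {'shard1': ['shard1_1', 'shard1_2'], 'shard2': ['shard2_1', 'shard2_2']}
```

Because each source shard only splits once (into its own sub-shards), the work parallelizes cleanly, which is why 2 TB can go from 2 to 4 shards in minutes rather than the 20-30 minutes of the merge-then-split strategy.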
12. Solution: Auto Sharding
● Dynamically increase/decrease shards
● E.g., increase shards from 2 to 4 to reduce latency
● E.g., Tp95 reduced from > 1 sec to 250 ms
[Diagram: Solr Collection A now spans Shards 1-4 across Node 1 and Node 2 (Zk Ensemble)]
14. Data Distribution Issues
● Adding a new node does nothing
● No redistribution of Solr cores
● Machines run out of disk space - heavier collections need to be moved off
● Problem amplifies at large scale - 100s of nodes, 1000s of collections
● Manual management of core placement becomes an issue
[Diagram: Default Solr behavior - Cores A1-D2 packed onto Nodes 1-3 (Zk Ensemble) while newly added Node 4 sits empty]
15. Re-distribute Strategy - Internals
● Internal topology construction from Zk
● Desired core placement computation - external or trigger based
● Migration of cores within the cluster
● Knobs to control min/max moves to reduce cluster load
● Zero downtime - no query failures, and resilient to node failures
● API call:
/admin/collections?action=REBALANCE&scaling_strategy=REDISTRIBUTE
[Diagram: Compute & redistribute - Cores A1-D1 rebalanced evenly across Nodes 1-3 (Zk Ensemble)]
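The compute-and-migrate step above can be sketched with a greedy balancer. This is purely illustrative (the patch's placement computation is pluggable and external- or trigger-based), with `max_moves` standing in for the min/max knobs that limit cluster load:

```python
def redistribute(topology, max_moves=None):
    """Greedy rebalance sketch: repeatedly move a core from the
    most-loaded node to the least-loaded one until core counts
    differ by at most one, or max_moves is reached."""
    moves = []
    while max_moves is None or len(moves) < max_moves:
        busiest = max(topology, key=lambda n: len(topology[n]))
        idlest = min(topology, key=lambda n: len(topology[n]))
        if len(topology[busiest]) - len(topology[idlest]) <= 1:
            break  # balanced
        core = topology[busiest].pop()
        topology[idlest].append(core)
        moves.append((core, busiest, idlest))
    return moves

# Newly added node3 starts empty, as in the "adding a new node" scenario.
topology = {
    "node1": ["coreA1", "coreA2", "coreB1"],
    "node2": ["coreB2", "coreC1", "coreD1"],
    "node3": [],
}
moves = redistribute(topology)
```

Capping `max_moves` trades balance for lower migration traffic, which matters when redistributing 1000s of collections across 100s of nodes.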
18. Replace Solr Nodes
Default Solr Behavior
● A node might die, need replacement, or be decommissioned
● Default behavior - do nothing
● Can cause downtime - heavy cores on the node
● Problem exacerbated with 1000s of nodes/collections
[Diagram: Cores A1-D1 spread across Nodes 1-3 (Zk Ensemble)]
19. Replace Nodes with Rebalance
● Read the topology of the cluster
● Migrate replicas from the node about to die to a new node
● Zero downtime
● API call:
○ /admin/collections?action=REBALANCE&scaling_strategy=REPLACE&collection=collectionName&source_node=source_host&dest_node=dest_host
[Diagram: Cores A1 and B1 migrated to new Node 4 (the replaced node); Cores A2, B2, C1, D1 remain on Nodes 1-3 (Zk Ensemble)]
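The replica migration behind the REPLACE strategy amounts to moving every core off the source node. A toy model (node and core names are hypothetical, and the real operation copies index data and updates Zk rather than mutating a dict):

```python
def replace_node(topology, source, dest):
    """Move every core hosted on `source` onto `dest`, returning the
    migrated cores -- the essence of the REPLACE scaling strategy."""
    migrated = topology.pop(source, [])
    topology.setdefault(dest, []).extend(migrated)
    return migrated

cluster = {
    "node1": ["coreA2", "coreB2"],
    "node3": ["coreA1", "coreB1"],  # node about to die
}
moved = replace_node(cluster, "node3", "node4")
```

After the call, node3 can be decommissioned with no downtime, since its replicas are already serving from node4.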
21. Indexing Performance
● The more shards, the faster the indexing (parallelism)
● Faster indexing - better data freshness SLA
● In Solr, the number of shards is the same for indexing and serving
● Shard setup is tweaked for serving query performance
● E.g.:
○ Indexing 100M docs in 2 shards - 2 hours
○ Serving 100M docs in 2 shards - Tp95 < 100 ms
[Diagram: Indexing 500M documents into Shards 1-4 while the same shards serve queries - serving takes a performance hit]
22. Indexing Performance - Smart Merge
● Separate shard setup for indexing vs. serving
● More shards for indexing - merged into fewer shards for serving
● Post-indexing, issue the API call to merge into the serving collection
● API call:
○ /admin/collections?action=REBALANCE&scaling_strategy=SMART_MERGE_DISTRIBUTED&collection=collectionName&num_shards=numRequiredShards
● Parallel merge
● Zero downtime
23. Indexing Performance - Smart Merge
● Indexing and serving use different collections
● The indexed collection is merged into serving using the smart-merge call
● Indexing can be tuned independently for performance
● Serving SLA unaffected
[Diagram: Collection A_Indexing with 16 shards; groups of 4 shards are parallel-merged into Shards 1-4 of serving Collection A, which continues serving queries while indexing 500M documents]
25. Allocation Strategies
● Abstract out the core placement methodology
● Least Used strategy - pick the node with the fewest cores
● AZ Aware strategy - pick a node in a different availability zone from the other cores of the given collection
● Unused strategy - pick a node that does not host any cores for the given collection
● All of them are compatible with all scaling strategies
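The three strategies can be modeled as simple selection functions over the cluster topology. A toy model, not the patch's interfaces: core names here are assumed to be prefixed with their collection, and the AZ labels are placeholders:

```python
def least_used(cluster):
    """Least Used: the node hosting the fewest cores."""
    return min(cluster, key=lambda n: len(cluster[n]["cores"]))

def unused(cluster, collection):
    """Unused: a node with no core of this collection, if any."""
    for node, info in cluster.items():
        if not any(c.startswith(collection) for c in info["cores"]):
            return node
    return None

def az_aware(cluster, collection):
    """AZ Aware: a node in an availability zone where this
    collection has no core yet, if any."""
    used_azs = {info["az"] for info in cluster.values()
                if any(c.startswith(collection) for c in info["cores"])}
    for node, info in cluster.items():
        if info["az"] not in used_azs:
            return node
    return None

cluster = {
    "node1": {"az": "us-east-1a", "cores": ["A_shard1", "B_shard1"]},
    "node2": {"az": "us-east-1a", "cores": ["B_shard2"]},
    "node3": {"az": "us-east-1b", "cores": []},
}
```

For collection "A", Least Used picks node3 (empty), Unused picks node2 (no "A" core), and AZ Aware picks node3 (the only node outside us-east-1a, where "A" already lives) - three different answers from the same topology, which is why the strategy is a tunable knob.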
26. Open Source
● Fully open-sourced - SOLR-9241 (4.6.1)
● Contributed patch works on top of 4.6.1 and is tested up to 4.10
● SOLR-9241 (epic) - patches/features on master, with sub-patches:
○ SOLR-93{16-21}
○ SOLR-9407
● Actively working with the community to make the rest of the API 6+ compatible
27. Summary
● Harder to scale and guarantee SLAs (Availability, Query Performance, Data Freshness) for a multi-tenant, cross-datacenter search architecture based on SolrCloud
● Rebalance API
○ Scaling Strategies - how to scale
○ Allocation Strategies - where to place cores
● Forms the basis for Solr Auto Scale
● Zero-downtime operations for:
○ Dynamically changing the shard setup
○ Decoupling the indexing SLA from serving
○ Replacing nodes
○ Auto-redistributing data on cluster expansion
● Open source