Unified read-only cache proposal
Design goals
• A standalone SSD caching library that can be reused by both librbd and RGW
• Use cases:
• librbd read-only cache: caching block contents on SSD
• librbd parent/clone images: caching parent RBD contents on SSD, so that all cloned images can read from the parent image cache before COW happens
• RGW immutable caching: caching RADOS objects on SSD
• A small CDN farm behind the RGW cluster
General architecture
• Libcachestore: common library that does reads/writes on the SSD
• Sparse-file-based cache
• Cache daemon: controls cache promotion/demotion and the sizing of the cache
• Simple LRU-based policy
• librbd/librgw hooks: call the API from libcachefile
[Diagram: RBD_0, RBD_1 and RBD_2 (librbd with FileImageCache hooks) and three RGW_civetweb instances (librgw with RGW_DataCache hooks) sit on top of libCacheStore and the SSD; a cache daemon with a policy manages the cache; librbd and librados connect to RADOS.]
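The sparse-file idea can be illustrated with a minimal sketch. It is not the libcachestore API: the class name, method names and the 4 KiB block size are assumptions made for the example. The file gets its full logical size up front, and blocks that were never written read back as zeros (whether the holes actually save disk space depends on the filesystem).

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumed cache block size, for illustration only


class SparseFileStore:
    """Toy block store backed by a single file (hypothetical names)."""

    def __init__(self, path, capacity_blocks):
        self.path = path
        # Set the logical size without writing data; unwritten ranges stay as holes
        with open(path, "wb") as f:
            f.truncate(capacity_blocks * BLOCK_SIZE)

    def write_block(self, index, data):
        assert len(data) == BLOCK_SIZE
        with open(self.path, "r+b") as f:
            f.seek(index * BLOCK_SIZE)
            f.write(data)

    def read_block(self, index):
        with open(self.path, "rb") as f:
            f.seek(index * BLOCK_SIZE)
            return f.read(BLOCK_SIZE)


tmpdir = tempfile.mkdtemp()
store = SparseFileStore(os.path.join(tmpdir, "cache.img"), capacity_blocks=1024)
store.write_block(700, b"x" * BLOCK_SIZE)
hit = store.read_block(700)   # the block that was written
hole = store.read_block(3)    # never written, so it reads back as zeros
logical_size = os.path.getsize(store.path)
```

A real store would also need metadata to tell a zero-filled hole from a cached block of zeros, which is the role the slides give to the meta file.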
Shared read-only RBD SSD cache
PR #16788
• A generic file-based persistent cache store
• Sparse-file-based cache
• Sync interfaces provided
• A generic read-only caching framework
• Cache promotion on reads
• Cache invalidation on writes; write requests go directly to RADOS
• A simple shared read-only cache implementation (“happy” data path)
• The shared cache is fully promoted when the first child is opened
• Still missing:
• A standalone cache daemon that controls the cache state
• A configurable policy to control promotion/demotion on the shared cache
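The "promote on read, invalidate on write" behaviour can be sketched as follows. This is only an illustration of the semantics listed above, not the code in the PR: the class and attribute names are hypothetical, and plain dicts stand in for RADOS and for the SSD cache file.

```python
class ReadOnlyCache:
    """Toy read-only cache: promote on read miss, invalidate on write."""

    def __init__(self, backend):
        self.backend = backend   # dict standing in for RADOS
        self.cache = {}          # dict standing in for the SSD cache file
        self.backend_reads = 0   # counts reads that had to go to the backend

    def read(self, block):
        if block in self.cache:
            return self.cache[block]
        self.backend_reads += 1
        data = self.backend[block]
        self.cache[block] = data   # promotion happens on the read path
        return data

    def write(self, block, data):
        self.cache.pop(block, None)   # drop any cached copy
        self.backend[block] = data    # the write itself bypasses the cache


rados = {0: b"old"}
cache = ReadOnlyCache(rados)
first = cache.read(0)    # miss: fetched from the backend and promoted
second = cache.read(0)   # hit: served from the cache
cache.write(0, b"new")   # invalidates block 0
third = cache.read(0)    # miss again, sees the new data
```

Because writes never populate the cache, the cache can only ever hold data that is also in the backend, which keeps crash recovery simple.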
Initial results
4k random I/O
Case                                Op_Size  Op_Type    QD  Runtime(sec)  IOPS   BW(MB/s)  Latency(ms)  99.99% Latency(ms)
1 osd 1 replica baseline            4k       randread   32  600           12927  50.5      2.437        8.889
Read-only cache                     4k       randread   32  600           51436  200       0.563        4.832
Shared read-only cache (2 volumes)  4k       randread   32  600           69370  270       0.868        5.024
Cache 1G, volume 10G                4k       randread   32  600           12219  47.7      2.571        8.256
Cache 2G, volume 10G                4k       randread   32  600           14203  55.5      2.207        10.56
Cache 4G, volume 10G                4k       randread   32  600           19099  74.6      1.630        6.944
Cache 8G, volume 10G                4k       randread   32  600           46633  182       0.641        5.088
1 osd 1 replica baseline            4k       randwrite  32  600           8920   34.8      3.49         125
1 osd 1 replica with cache          4k       randwrite  32  600           8895   34.7      3.51         195.584
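As a quick sanity check on the numbers above, bandwidth should be roughly IOPS times the 4k block size. The sketch below assumes the bandwidth column uses 2^20-byte units, which is what makes the baseline row come out at 50.5; the helper name is made up for the example.

```python
BLOCK_BYTES = 4 * 1024  # 4k operations, as in the table


def bandwidth_mib_s(iops):
    """Bandwidth implied by an IOPS figure, in units of 2**20 bytes per second."""
    return iops * BLOCK_BYTES / 2**20


baseline_iops = 12927   # "1 osd 1 replica baseline", randread
cached_iops = 51436     # "Read-only cache", randread

speedup = cached_iops / baseline_iops          # about 3.98x
baseline_bw = bandwidth_mib_s(baseline_iops)   # about 50.5
```

So the fully cached read case delivers close to four times the baseline IOPS, while the random-write rows are essentially unchanged, as expected for a cache that writes bypass.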
Shared read-only cache for RBD – rbd clone flow
[Diagram: RBD_0 (parent image) → RBD_0@snap1 (protected snapshot; this is the shared image content) → RBD_1, RBD_2, … RBD_N (cloned images).]
Shared read-only cache for RBD – Cache Daemon
• Read-only blocks from parent image(s) are cached in a shared area on compute node(s)
• Reads are served from the shared cache until the first COW request
• A cache daemon
• Runs on each compute node to control the shared cache state
• Policy thread: owns a policy to control promotion/demotion of the shared cache
• RBD instances use IPC with the daemon to take read/write locks on a shared cache block
• Upon recovery from a crash or reboot, the daemon tries to rebuild the shared cache state from persistent metadata
• The rebuild process is simple: read the persistent metafile, then check that each image and its corresponding cachefile exist
• If the rebuild fails (for example, on a meta/cachefile read error), reset to an empty cache
[Diagram: on the compute node, RBD_instance processes send write I/O and post-COW read I/O to RADOS (OSDs) and read I/O to the shared RO cache on SSD; the cache daemon (policy, meta file) promotes/demotes blocks and talks to the RBD instances over IPC.]
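The rebuild-on-restart step could look roughly like the sketch below. The metafile format is not given in the slides, so a JSON object mapping image names to cachefile paths is assumed here, and the function name is hypothetical. The check that the image still exists in RADOS is left out of the sketch.

```python
import json
import os
import tempfile


def rebuild_cache_state(metafile_path):
    """Return {image: cachefile}; an empty dict means 'reset to empty cache'."""
    try:
        with open(metafile_path) as f:
            entries = json.load(f)
    except (OSError, ValueError):
        return {}   # unreadable or corrupt metafile: start from an empty cache
    if not isinstance(entries, dict):
        return {}
    state = {}
    for image, cachefile in entries.items():
        # Keep only entries whose cachefile is still present on the SSD
        if os.path.isfile(cachefile):
            state[image] = cachefile
    return state


tmp = tempfile.mkdtemp()
cachefile = os.path.join(tmp, "rbd_0.cache")
open(cachefile, "wb").close()
meta = os.path.join(tmp, "meta.json")
with open(meta, "w") as f:
    json.dump({"rbd_0": cachefile,
               "rbd_gone": os.path.join(tmp, "missing.cache")}, f)

state = rebuild_cache_state(meta)
empty = rebuild_cache_state(os.path.join(tmp, "no-such-meta.json"))
```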
Shared Read-only cache for RBD – Promote flow
In shared cache but missing now:
- WriterLock()
- Promote from RADOS
- Notify cloned image ready to read
[Diagram: on the compute node, cloned images RBD_1 and RBD_2 (librbd, FileImageCache, each with a COW cache mapping) perform a cache lookup via the cache daemon (policy); the daemon manages the shared cache file and meta on SSD and promotes from RADOS. Steps are numbered 1 to 4 in the figure.]
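The three promote steps can be sketched with standard threading primitives. This is an assumption-laden illustration: in the proposal the lock is taken through IPC with the cache daemon, whereas here a `threading.Lock` stands in for WriterLock() and a `threading.Event` stands in for the "ready to read" notification. All names are hypothetical.

```python
import threading


class SharedBlock:
    """One block of the shared cache (toy model)."""

    def __init__(self):
        self.lock = threading.Lock()     # stand-in for WriterLock()
        self.ready = threading.Event()   # set once the block can be read
        self.data = None


def promote(block, fetch_from_rados):
    with block.lock:
        if not block.ready.is_set():     # promote at most once
            block.data = fetch_from_rados()
            block.ready.set()            # notify waiting cloned images


def read_when_ready(block, timeout=1.0):
    if not block.ready.wait(timeout):
        raise TimeoutError("block was not promoted in time")
    return block.data


fetches = []


def fetch():
    fetches.append(1)
    return b"parent data"


block = SharedBlock()
worker = threading.Thread(target=promote, args=(block, fetch))
worker.start()
data = read_when_ready(block)   # a cloned image waiting for the block
worker.join()
promote(block, fetch)           # a second promote request does not refetch
```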
Shared Read-only cache for RBD – Demote flow
In shared cache but cold:
- Demote the block
[Diagram: on the compute node, cloned images RBD_1 and RBD_2 (librbd, FileImageCache, each with a COW cache mapping) and the cache daemon (policy), which manages the shared cache file and meta on SSD in front of RADOS.]
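A "simple LRU based" policy for picking cold blocks to demote might look like the sketch below. The class name and the capacity-in-blocks parameter are assumptions for the example; the slides do not specify how the policy tracks accesses.

```python
from collections import OrderedDict


class LruPolicy:
    """Toy LRU tracker: the least recently touched block is demoted first."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()   # ordered from coldest to hottest

    def touch(self, block):
        """Record an access and return the blocks that should be demoted."""
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        demoted = []
        while len(self.blocks) > self.capacity:
            cold, _ = self.blocks.popitem(last=False)
            demoted.append(cold)
        return demoted


policy = LruPolicy(capacity_blocks=2)
policy.touch("a")
policy.touch("b")
policy.touch("a")             # "a" becomes the hottest block
demoted = policy.touch("c")   # over capacity: the coldest block, "b", goes
remaining = list(policy.blocks)
```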
Shared Read-only cache for RBD – IO flow (read)
In shared cache now:
- ReaderLock()
- Check Meta
- Read from the shared cache
On COW:
- Read from RADOS
[Diagram: on the compute node, cloned images RBD_1 and RBD_2 (librbd, FileImageCache, each with a COW cache mapping) perform a cache lookup via the cache daemon (policy); reads are served from the shared cache file on SSD (step 2) or from RADOS (step 2’).]
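The read decision above can be sketched as a small function. It is a simplified model under stated assumptions: dicts stand in for the shared cache file and for RADOS, a set stands in for the COW cache mapping, and a plain mutex stands in for ReaderLock(). Promotion on a miss is omitted. All names are hypothetical.

```python
import threading

shared_lock = threading.Lock()   # stand-in for ReaderLock()


def read_block(image, block, cow_mapping, shared_cache, rados_parent, rados_clone):
    """Return (data, source) for a read issued by a cloned image."""
    if (image, block) in cow_mapping:
        # The clone has overwritten this block, so the shared copy is stale for it
        return rados_clone[(image, block)], "rados"
    with shared_lock:
        data = shared_cache.get(block)
    if data is not None:
        return data, "shared-cache"
    return rados_parent[block], "rados"   # miss; promotion not shown here


rados_parent = {0: b"parent-0", 1: b"parent-1"}
rados_clone = {("rbd_1", 1): b"clone-1"}
shared_cache = {0: b"parent-0", 1: b"parent-1"}
cow_mapping = {("rbd_1", 1)}

r_hit = read_block("rbd_1", 0, cow_mapping, shared_cache, rados_parent, rados_clone)
r_cow = read_block("rbd_1", 1, cow_mapping, shared_cache, rados_parent, rados_clone)
r_other = read_block("rbd_2", 1, cow_mapping, shared_cache, rados_parent, rados_clone)
```

Note that rbd_2 still gets block 1 from the shared cache: the COW mapping is per cloned image, so one clone's write does not affect what other clones read.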
Shared Read-only cache for RBD – IO flow (write)
- Cache lookup
- Write to RADOS
[Diagram: on the compute node, cloned images RBD_1 and RBD_2 (librbd, FileImageCache, each with a COW cache mapping) and the cache daemon (policy) with the shared cache file and meta on SSD; writes go to RADOS. Steps are numbered 1 and 2 in the figure.]
Issues/Corner cases
• VM migration or VM crash?
• We could rebuild the cache state on RBD re-open
• RBD removed on other nodes?
• The policy thread in the cache daemon periodically checks the local cache and eventually removes stale cache entries
• Cache daemon crash?
• The shared cache state is persisted to a local metafile
• The daemon is stateless: we only need to restart the daemon process and rebuild the cache state
• Cache file inconsistent?
• We rely on the filesystem to do the check; if a read error happens, we simply re-issue the read to RADOS
Shared RGW read-only SSD cache
Shared Read-only cache for RGW
Example mapping entry:
chunk_id:         7e21a6b2-89b9-4de6-869e-1ddc0198a82b.5228.1__shadow_.TzkbVV_syqJ2vumnFe8uAaiL9j6ghtC_34
RGW instance id:  Rgw_1
Cache_chunk_id:   7e21a6b2-89b9-4de6-869e-1ddc0198a82b.5228.1__shadow_.TzkbVV_syqJ2vumnFe8uAaiL9j6ghtC_34
• A CDN cluster behind the RGW clusters
• L1 cache: reads are served from the SSD cache of the local RGW instance
• L2 cache (configurable): reads can be served from the SSD cache of other, remote RGW instances
• Each object/chunk has a unique ID
• A centralized, distributed K/V store is needed to hold the mapping, because chunks may be spread across different RGW instances
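The L1/L2 lookup order can be sketched as below. Dicts stand in for the per-instance SSD caches, for the K/V store that maps a chunk ID to the RGW instance holding it, and for RADOS. The function and parameter names are hypothetical and the remote read is modelled as a plain dict access.

```python
def lookup_chunk(chunk_id, local_id, caches, directory, rados):
    """Return (data, tier) trying L1 (local), then L2 (remote), then RADOS."""
    local_cache = caches[local_id]
    if chunk_id in local_cache:
        return local_cache[chunk_id], "L1"
    owner = directory.get(chunk_id)   # K/V lookup: which instance caches it?
    if owner is not None and owner != local_id and chunk_id in caches[owner]:
        return caches[owner][chunk_id], "L2"
    return rados[chunk_id], "RADOS"


caches = {"rgw_1": {"c1": b"A"}, "rgw_2": {}}
directory = {"c1": "rgw_1"}
rados = {"c1": b"A", "c2": b"B"}

from_l1 = lookup_chunk("c1", "rgw_1", caches, directory, rados)
from_l2 = lookup_chunk("c1", "rgw_2", caches, directory, rados)
from_rados = lookup_chunk("c2", "rgw_2", caches, directory, rados)
```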
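Shared Read-only cache for RGW
[Diagram: rgw_1 and rgw_2 each expose the S3 and Swift APIs through rgw_frontend, with rgw_rados, rgw_cache and a datacache policy; each has an Immutable Cache backed by a local cache (L1) and can reach the other instance's cache (L2); librados connects them to RADOS.]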
Issues
• Different caching semantics for block and object?
• Promoting at block level (default 8k) for librbd
• Promoting at object level for RGW
• PR #13144 does not compile
• https://github.com/maniaabdi/engage1.git
• Jewel-based; needs to be rebased against master
• Currently the logic is inside rgw_rados; it needs to be decoupled to fit our design (libcachefile + policy)
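To make the block-level granularity concrete, the sketch below maps a byte extent to the 8k blocks that would have to be promoted. Only the 8k default comes from the slide; the function name is made up for the example.

```python
BLOCK_SIZE = 8 * 1024  # default promotion granularity mentioned above


def blocks_for_extent(offset, length):
    """Indices of the cache blocks touched by the byte range [offset, offset+length)."""
    if length <= 0:
        return []
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    return list(range(first, last + 1))


aligned = blocks_for_extent(8192, 4096)      # inside block 1 only
straddling = blocks_for_extent(6144, 4096)   # crosses the block 0/1 boundary
far = blocks_for_extent(1048576, 4096)       # 1 MiB offset falls in block 128
```

An object-level cache for RGW has no such arithmetic: the whole object (or chunk) is the unit, which is one reason the two use cases need different policies on top of a common store.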
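RGW datacache (PR #13144)
[Diagram: rgw_1 and rgw_2 each expose the S3 and Swift APIs through rgw_frontend, with rgw_rados, rgw_cache and a datacache policy; each has an Immutable Cache backed by a local cache (L1) and can reach the other instance's cache (L2); librados connects them to RADOS.]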
backup
Shared read-only cache for RBD – overview
• Read-only blocks from parent image(s) are cached in a shared area on compute node(s)
• A cloned image reads from the shared cache unless COW has happened
[Diagram: each compute node has local caches and a shared cache on an SSD backend, managed by a cache daemon; write I/O and read I/O from the compute nodes reach RADOS (OSDs).]
Shared read-only cache for RBD – fast cache warmup
• The state of the shared cache is persisted to a local metadata file alongside the cache file
• The state of the local cache is persisted to RBD metadata
• On restart, the cache controller loads the cache metadata file and reuses the shared cache file
• On RBD instance restart, the cache state is loaded as an in-memory map that identifies the COWed parts
Each cloned image has its own COW cache mapping:
- For each read hit, it tells whether the data is in the shared cache or in the image's own cache
- Cache mapping bits for COWed data
- Updated when COW happens
Shared Read-only cache for RBD – IO flow (private cache), read
Cache lookup:
- In shared cache: read from shared cache
- In COW cache: read from COW cache
COW cache mapping (RBD_1):
rbd_id  lba      length
Rbd_1   8192     4096
Rbd_1   1048576  4096
COW cache mapping (RBD_2):
rbd_id  lba      length
Rbd_2   8192     4096
Rbd_2   1048576  4096
[Diagram: on the compute node, cloned images RBD_1 and RBD_2 (librbd, FileImageCache) each have a private cache file holding COW data on SSD; the cache daemon (image_store, policy, GC) manages the shared cache file in front of RADOS.]
Shared Read-only cache for RBD – IO flow (private cache), write
Cache lookup:
- In COW cache: invalidate the chunk in the cache file, then write to RADOS
- In shared cache: create an entry in the COW mapping table, then write to RADOS
COW cache mapping (RBD_1):
rbd_id  lba      length
rbd_1   8192     4096
rbd_1   1048576  4096
COW cache mapping (RBD_2):
rbd_id  lba      length
rbd_2   81920    4096
rbd_2   1048576  4096
[Diagram: on the compute node, cloned images RBD_1 and RBD_2 (librbd, FileImageCache) each have a private cache file holding COW data on SSD; the cache daemon (image_store, policy, GC) manages the shared cache file in front of RADOS.]
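The two write cases can be sketched as one function. The (rbd_id, lba, length) key mirrors the COW cache mapping columns on the slide; everything else is an assumption for illustration: a set stands in for the mapping table, dicts stand in for the private cache file and for RADOS, and the names are hypothetical.

```python
def handle_write(rbd_id, lba, length, data, cow_mapping, cow_cache, rados):
    """Apply one write from a cloned image and report which path was taken."""
    key = (rbd_id, lba, length)
    if key in cow_mapping:
        # Already COWed: the privately cached chunk is now stale
        cow_cache.pop(key, None)
        action = "invalidated"
    else:
        # First write to data that was served from the shared cache
        cow_mapping.add(key)
        action = "mapped"
    rados[key] = data   # in both cases the write itself goes to RADOS
    return action


cow_mapping = {("rbd_1", 8192, 4096)}
cow_cache = {("rbd_1", 8192, 4096): b"stale"}
rados = {}

a1 = handle_write("rbd_1", 8192, 4096, b"new", cow_mapping, cow_cache, rados)
a2 = handle_write("rbd_1", 1048576, 4096, b"x", cow_mapping, cow_cache, rados)
```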

Unified readonly cache for ceph updates sep cdm

  • 2. Design goals • A standalone SSD caching library that can be re-used between librbd RGW • Use cases: • Librbd read-only cache: caching block contents on SSD • Librbd parent/clone images, caching parent rbd contents on SSD, all cloned image can read from parent image cache before COW happen • RGW immutable caching: caching rados objects on SSD • A small CDN farm behind RGW cluster
  • 3. Cache daemon General architecture • Libcachestore: common lib that does read/write on SSD • Sparse-file based cache • Cache Daemon: controlling on the cache promotion/demotion, sizing of the cache • Simple LRU based • librbd/librgw hooks: call API from libcachefile FileImageCache RBD_0 SSD libCacheStore RGW_DataCache librbd librgw RGW_civetweb RBD_1 RBD_2 RGW_civetweb RGW_civetweb RADOS librbd librados hooks hooks policy
  • 5. PR #16788 • A generic file-based persistent cache store • Sparse-file-based cache • Sync interfaces provided • A generic read-only caching framework • Cache promotion on reads • Cache invalidate on writes, write requests will go to RADOS directly • A simple shared read-only cache implementation(“happy” data path) • Shared cache will be fully promoted on the opening of 1st child • The missing: • A standalone cache daemon controls the cache state • A configurable policy to control promotion/demotion on shared cache
  • 6. Initial results 4k Rand Read Op_Size Op_Type QD Runtime(sec) IOPS BW(MB/s) Latency(ms) 99.99% Latency(ms) 1 osd 1 replica baseline 4k randread qd32 600 12927 50.5MB/s 2.437ms 8.889ms Read-only cache 4k randread qd32 600 51436 200MB/s 0.563ms 4.832ms Shared read-only cache(2 volumes) 4k randread qd32 600 69370 270MB/s 0.868ms 5.024ms Cache 1G, volume 10G 4k randread qd32 600 12219 47.7MB/s 2.571ms 8.256ms Cache 2G, volume 10G 4k randread qd32 600 14203 55.5MB/s 2.207ms 10.56ms Cache 4G, volume 10G 4k randread qd32 600 19099 74.6MB/s 1.630ms 6.944ms Cache 8G, volume 10G 4k randread qd32 600 46633 182MB/s 0.641ms 5.088ms 1 osd 1 replica baseline 4k randwrite qd32 600 8920 34.8MB/s 3.49ms 125ms 1 osd 1 replica with cache 4k randwrite qd32 600 8895 34.7MB/s 3.51ms 195.584ms
  • 7. Shared read-only cache for RBD –rbd clone flow RBD_0 RBD_0@snap1 RBD_1 RBD_2 RBD_N … Parent image Protected snapshot Cloned image Cloned image Cloned image This is the shared image content
  • 8. Shared read-only cache for RBD – Cache Daemon • Read-only blocks from parent image(s) are cached in a shared area on compute node(s) • Reads are served from the shared cache until the first COW request • A Cache Daemon • On each compute node to control the shared cache state • Policy thread - owns a policy to control promotion/demotion of the shared cache • RBD instances do IPC with the daemon do read/write lock on a shared cache block • Upon recovery from crash or reboot, the daemon tries to rebuild shared cache state from persistent metadata • Rebuild process is simple – read persistent metafile, check existence of image and corresponding cachefile • If rebuild fails (for example, on a meta/cachefile read error), reset to empty cache RBD_instance Write I/O Read I/O SSD Compute node Shared RO Cache RADOS OSD OSD OSD OSD OSD OSD OSD policy Promote/demote Cache Daemon IPC IPC Read I/O (post-COW) RBD_instance … Meta File
  • 9. RBD_2 (cloned) librbd FileImageCache librbd FileImageCache Shared Read-only cache for RBD – Promote flow Cache_demon Shared Cache file RADOS RBD_1 (cloned) librbd FileImageCache Cache lookup2 Compute node SSD COW Cache mapping policy In shared cache but missing now: - WriterLock() - Promote from RADOS - Notify cloned image ready to read COW Cache mapping 1 3 4 Shared Cache file Meta 3
  • 10. RBD_2 (cloned) librbd FileImageCache librbd FileImageCache Shared Read-only cache for RBD – Demote flow Cache_demon Shared Cache file RADOS RBD_1 (cloned) librbd FileImageCache Compute node SSD COW Cache mapping policy COW Cache mappingIn shared cache but cold - Demote the block Shared Cache file Meta
  • 11. RBD_2 (cloned) librbd FileImageCache librbd FileImageCache Shared Read-only cache for RBD – IO flow(read) Cache_demon Shared Cache file RADOS RBD_1 (cloned) librbd FileImageCache Cache lookup Compute node SSD COW Cache mapping policy In shared cache now: - ReaderLock() - Check Meta - Read from the shared cache COW Cache mapping 1 2 On COW : - Read from RADOS 2’ Shared Cache file Meta
  • 12. RBD_2 (cloned) librbd FileImageCache librbd FileImageCache Shared Read-only cache for RBD – IO flow(write) Cache_demon Shared Cache file RADOS RBD_1 (cloned) librbd FileImageCache Cache lookup 2 Compute node SSD COW Cache mapping policy COW Cache mapping 1 Write to RADOS Shared Cache file Meta
  • 13. Issues/Corner cases • How to do VM migration? VM Crash? • We could rebuild the cache state on RBD re-open • RBD removed on other nodes? • Policy thread in cache daemon will periodically check the local cache, and remove those old cache eventually • Cache daemon crash? • The shared cache state will be persistent to local metafile • The daemon is stateless, we only need to restart the daemon process and rebuild the cache state • Cache file inconsistent? • We’re relying the filesystem to do the check, if some read error happen, we simply re-issue a read from the RADOS
  • 14. Shared RGW read-only SSD cache
  • 15. Shared Read-only cache for RGW
      • A CDN cluster behind the RGW clusters
          • L1 cache: allows reads from the SSD cache of the local RGW instance
          • L2 cache (configurable): allows reads from the SSD cache on other, remote RGW instances
      • Each object/chunk has a unique ID
      • Needs a centralized distributed K/V store to hold the mapping, as the chunks may be spread across different RGW instances
      • Mapping table (one example row):
          chunk_id:        7e21a6b2-89b9-4de6-869e-1ddc0198a82b.5228.1__shadow_.TzkbVV_syqJ2vumnFe8uAaiL9j6ghtC_34
          RGW instance id: Rgw_1
          Cache_chunk_id:  7e21a6b2-89b9-4de6-869e-1ddc0198a82b.5228.1__shadow_.TzkbVV_syqJ2vumnFe8uAaiL9j6ghtC_34
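A rough sketch of the L1/L2 lookup order on slide 15. Plain dictionaries stand in for the per-instance SSD caches and for the centralized K/V mapping; `lookup_chunk` and its parameters are hypothetical names chosen for this example.

```python
def lookup_chunk(chunk_id, local_instance, caches, kv_map, l2_enabled=True):
    """Return (tier, data), where tier is "L1", "L2" or "RADOS"."""
    local = caches.get(local_instance, {})
    if chunk_id in local:
        return "L1", local[chunk_id]          # SSD cache of the local RGW instance
    owner = kv_map.get(chunk_id)              # K/V mapping: chunk_id -> RGW instance id
    if l2_enabled and owner is not None and owner != local_instance:
        remote = caches.get(owner, {})
        if chunk_id in remote:
            return "L2", remote[chunk_id]     # SSD cache of a remote RGW instance
    return "RADOS", None                      # caller falls back to reading from RADOS


caches = {"rgw_1": {"chunk-a": b"A"}, "rgw_2": {"chunk-b": b"B"}}
kv_map = {"chunk-a": "rgw_1", "chunk-b": "rgw_2"}
local_hit = lookup_chunk("chunk-a", "rgw_1", caches, kv_map)
remote_hit = lookup_chunk("chunk-b", "rgw_1", caches, kv_map)
l2_off = lookup_chunk("chunk-b", "rgw_1", caches, kv_map, l2_enabled=False)
```

The `l2_enabled` flag mirrors the slide's note that the L2 tier is configurable.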
  • 16. Shared Read-only cache for RGW
      • Diagram labels: rgw_1 and rgw_2, each with S3 API / Swift API, rgw_frontend, rgw_rados, rgw_cache, datacache policy, Immutable Cache and a Local cache; L1 and L2 cache paths; librados; RADOS
  • 17. Issues
      • Different caching semantics for block and object?
          • Promoting at block level (default 8k) for librbd
          • Promoting at object level for RGW
      • #13144 is not compiling
          • https://github.com/maniaabdi/engage1.git
          • Jewel based, needs to be rebased against master
          • Currently the logic is inside rgw_rados; it needs to be decoupled to fit our design (libcachefile + policy)
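To make the block-level versus object-level difference concrete: with block-level promotion, a read maps to a range of fixed-size blocks, while object-level promotion always takes the whole object. The helper below is a sketch of the block-side arithmetic only; the function name is hypothetical and the 8192-byte default simply follows the "default 8k" stated on slide 17.

```python
def blocks_to_promote(offset, length, block_size=8192):
    """Return the indexes of the cache blocks touched by a read of [offset, offset+length)."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size  # index of the block holding the last byte
    return list(range(first, last + 1))


unaligned = blocks_to_promote(4096, 8192)  # bytes 4096..12287 span two blocks
aligned = blocks_to_promote(0, 8192)       # exactly one block
```

An unaligned read can therefore promote more data than it requested, a cost that does not arise in the same form when whole objects are promoted.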
  • 18. RGW datacache (PR #13144)
      • Diagram labels: rgw_1 and rgw_2, each with S3 API / Swift API, rgw_frontend, rgw_rados, rgw_cache, datacache policy, Immutable Cache and a Local cache; L1 and L2 cache paths; librados; RADOS
  • 20. Shared read-only cache for RBD -- overview
      • Read-only blocks from parent image(s) are cached in a shared area on compute node(s)
      • Cloned images read from the shared cache unless COW has happened
      • Diagram labels: two compute nodes, each with Local Caches, a Shared Cache and an SSD backend; Write I/O and Read I/O paths; Cache daemon; RADOS (OSDs)
  • 21. Shared read-only cache for RBD – fast cache warmup
      • The state of the shared cache is persisted to a local metadata file alongside the cache file
      • The state of the local cache is persisted to RBD metadata
      • On restart, the cache controller loads the cache metadata file and reuses the shared cache file
      • On RBD instance restart, the cache state is loaded as an in-memory map that identifies the COWed parts
      • Each cloned image has its own COW cache mapping:
          - Each read hit is served either from the shared cache or from the image's own cache
          - Cache mapping bits mark COWed data
          - Updated when COW happens
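The "cache mapping bits" on slide 21 could be kept as a one-bit-per-block bitmap that is serialized on update and reloaded on restart. This is only a sketch under that assumption; `CowBitmap` and its methods are hypothetical, and where the bytes are stored (for example in RBD metadata) is left out.

```python
class CowBitmap:
    """One bit per block: set means the clone has COWed that block."""

    def __init__(self, num_blocks, raw=None):
        self.num_blocks = num_blocks
        size = (num_blocks + 7) // 8  # bytes needed, rounded up
        self.bits = bytearray(raw) if raw is not None else bytearray(size)
        if len(self.bits) != size:
            raise ValueError("persisted bitmap does not match the image size")

    def mark_cow(self, block):
        """Set the bit for a block when COW happens."""
        self.bits[block // 8] |= 1 << (block % 8)

    def is_cow(self, block):
        """Tell whether reads of this block must bypass the shared cache."""
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def to_bytes(self):
        """Serialize for persistence; reload by passing the bytes as raw."""
        return bytes(self.bits)


bitmap = CowBitmap(16)
bitmap.mark_cow(2)
bitmap.mark_cow(9)
restored = CowBitmap(16, bitmap.to_bytes())  # simulates an RBD instance restart
```

Loading the bitmap is a single small read, which keeps the warm-up path short compared with rescanning the image for COWed parts.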
  • 22. Shared Read-only cache for RBD – IO flow(private cache)
      • Read path:
          • Cache lookup
          • 2: In shared cache: read from the shared cache
          • 2’: In COW cache: read from the COW cache
      • COW cache mapping, RBD_1 (cloned):
          rbd_id   lba       length
          Rbd_1    8192      4096
          Rbd_1    1048576   4096
      • COW cache mapping, RBD_2 (cloned):
          rbd_id   lba       length
          Rbd_2    8192      4096
          Rbd_2    1048576   4096
      • Diagram labels: RBD_1 (cloned) and RBD_2 (cloned), each with librbd, FileImageCache, COW data and its own cache file; Cache daemon (image_store, policy, GC); Shared Cache file on the compute node SSD; RADOS
  • 23. Shared Read-only cache for RBD – IO flow(private cache)
      • Write path:
          • Cache lookup
          • 2: In shared cache:
              - Create an entry in the COW mapping table
              - Write to RADOS
          • 2’: In COW cache:
              - Invalidate the chunk in the cache file
              - Write to RADOS
      • COW cache mapping, RBD_1 (cloned):
          rbd_id   lba       length
          rbd_1    8192      4096
          rbd_1    1048576   4096
      • COW cache mapping, RBD_2 (cloned):
          rbd_id   lba       length
          rbd_2    81920     4096
          rbd_2    1048576   4096
      • Diagram labels: RBD_1 (cloned) and RBD_2 (cloned), each with librbd, FileImageCache, COW data and its own cache file; Cache daemon (image_store, policy, GC); Shared Cache file on the compute node SSD; RADOS
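The private-cache read and write paths on slides 22 and 23 can be sketched with a mapping table of `(rbd_id, lba, length)` rows. This is a simplified illustration: `CloneCache` is a hypothetical name, extents are matched exactly rather than by overlap, and reading an invalidated chunk from RADOS is an assumption of this sketch.

```python
class CloneCache:
    """Per-clone COW mapping plus a private (COW) cache, both kept in memory."""

    def __init__(self, rbd_id):
        self.rbd_id = rbd_id
        self.cow_mapping = set()  # rows of (rbd_id, lba, length)
        self.cow_cache = {}       # (lba, length) -> cached bytes

    def route_read(self, lba, length):
        """Pick where a read is served from."""
        if (self.rbd_id, lba, length) not in self.cow_mapping:
            return "shared_cache"  # step 2: still parent data
        if (lba, length) in self.cow_cache:
            return "cow_cache"     # step 2': COWed data held in the private cache
        return "rados"             # COWed, but the chunk is not cached locally

    def write(self, lba, length):
        """Return the actions a write triggers, in order."""
        row = (self.rbd_id, lba, length)
        actions = []
        if row in self.cow_mapping:
            # Step 2': drop the stale private copy, if there is one.
            if self.cow_cache.pop((lba, length), None) is not None:
                actions.append("invalidate_cow_chunk")
        else:
            # Step 2: first write to a shared block creates the COW entry.
            self.cow_mapping.add(row)
            actions.append("create_cow_entry")
        actions.append("write_to_rados")
        return actions


clone = CloneCache("rbd_1")
before = clone.route_read(8192, 4096)
first_write = clone.write(8192, 4096)
clone.cow_cache[(8192, 4096)] = b"new-data"  # pretend a later read cached the chunk
cached = clone.route_read(8192, 4096)
second_write = clone.write(8192, 4096)
after = clone.route_read(8192, 4096)
```

In every branch the write ends at RADOS, so the local caches never hold the only copy of written data.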

Editor's Notes

  1. How to maintain the librbd parent/clone image table?
  2. On promotion, a writer lock will be held
  3. On demotion, a writer lock will be held
  4. On read, a reader lock will be held
  5. Maybe we could make the local cache use a writeback policy?
  6. Is it possible to tell the COWed parts quickly from RBD?
  7. When to promote the shared cache file? -> When the first cloned image is opened, the cache is promoted to local; this could be optimized. What data should we promote? -> parent_image@snapshot; librbd caching will promote at block-size level (4k default). What is the cache file format? -> Sparse-file based.
  8. Promote only on reads. Writes go directly to the OSDs and invalidate the cache on a cache hit.