5. GlusterFS5
What is Data Maintenance?
Maintenance tasks performed on data for protection,
performance, and optimum storage utilization
6. GlusterFS6
Challenges in Data Maintenance
Data maintenance has an overhead on CPU, memory,
storage, and network. Therefore it needs:
● Fast search
● Rich metadata
● Distributed
● Load balancing
9. GlusterFS9
Optimized DB for GlusterFS
“Record now, consume later”
Database optimized to record fast
Good Querying Capabilities
Embedded Database
Crash Consistent (Eventually)
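
The "record now, consume later" idea can be sketched with an embedded database. This is only an illustration: the slides do not name the database or its schema, so SQLite (through Python's standard `sqlite3` module), the `io_events` table, and the `record` / `hottest` helpers are all assumptions.

```python
import sqlite3

# Assumption: SQLite stands in for "an embedded database"; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE io_events (gfid TEXT, op TEXT, nbytes INTEGER, ts REAL)"
)


def record(gfid, op, nbytes, ts):
    """Record now: a plain append-only INSERT, no read-modify-update."""
    conn.execute(
        "INSERT INTO io_events VALUES (?, ?, ?, ?)", (gfid, op, nbytes, ts)
    )


def hottest(since_ts, limit=10):
    """Consume later: aggregate raw events into per-file hit and byte counts."""
    rows = conn.execute(
        "SELECT gfid, COUNT(*) AS hits, SUM(nbytes) AS total_bytes "
        "FROM io_events WHERE ts >= ? "
        "GROUP BY gfid ORDER BY hits DESC, gfid LIMIT ?",
        (since_ts, limit),
    )
    return rows.fetchall()


record("file-a", "read", 4096, 100.0)
record("file-a", "write", 8192, 101.0)
record("file-b", "read", 4096, 102.0)
conn.commit()
print(hottest(since_ts=0.0))
```

The write path stays cheap because it never reads existing rows; the cost of aggregation is paid only when a maintenance task actually asks a question.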
14. GlusterFS14
Cache Tiering (Gluster 3.7 feature)
● Tiering
  ● Logical volume composed of diverse storage units
  ● Secure / nonsecure, compressed / uncompressed, etc.
● Cache tiering
  ● Fast storage as cache for slow storage
  ● Fa$t SSD, slow HDD
  ● Fast 2X replicated, slow erasure coded
● What goes in the cache?
  ● DB tracks usage patterns
  ● Files migrate between tiers per usage
  ● Migration is slow
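
As a rough sketch of "files migrate between tiers per usage", the snippet below turns per-file access counts into promotion and demotion lists. The `FileHeat` record, the `plan_migrations` helper, and both thresholds are hypothetical and are not taken from Gluster's tiering code.

```python
from dataclasses import dataclass


@dataclass
class FileHeat:
    name: str
    tier: str  # "hot" or "cold"
    hits: int  # accesses seen in the last measurement window


def plan_migrations(files, promote_at=5, demote_at=1):
    """Return (promotions, demotions) as lists of file names.

    Thresholds are illustrative assumptions: a cold file with at least
    `promote_at` hits moves up, a hot file with at most `demote_at` hits
    moves down.
    """
    promote = [f.name for f in files if f.tier == "cold" and f.hits >= promote_at]
    demote = [f.name for f in files if f.tier == "hot" and f.hits <= demote_at]
    return promote, demote


files = [
    FileHeat("a", "cold", 9),
    FileHeat("b", "hot", 0),
    FileHeat("c", "hot", 7),
    FileHeat("d", "cold", 2),
]
promote, demote = plan_migrations(files)
print("promote:", promote, "demote:", demote)
```

Because migration is slow, a real policy would likely cap how many files are moved per cycle; that cap is left out here to keep the sketch short.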
15. GlusterFS15
Policies for Smart Migration
● File size
● Access rate
● Migration frequency
● Break files into chunks
  ● Gluster “sharding” feature
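
The policy inputs listed above could be combined along these lines. This is a sketch under assumptions: the function names, the default thresholds, and the fixed-size chunking are hypothetical and do not describe how Gluster's sharding feature is implemented.

```python
def eligible_for_promotion(size_bytes, hits_per_hour, secs_since_last_move,
                           max_size=1 << 30, min_rate=10.0, min_rest=3600):
    """Combine the three policy inputs (all thresholds are assumptions).

    - file size: very large files are too costly to move whole
    - access rate: only files that are actually busy are worth moving
    - migration frequency: a recently moved file is left alone for a while
    """
    return (
        size_bytes <= max_size
        and hits_per_hour >= min_rate
        and secs_since_last_move >= min_rest
    )


def shard_ranges(size_bytes, shard_size):
    """Split the byte range [0, size_bytes) into (offset, length) chunks."""
    return [
        (off, min(shard_size, size_bytes - off))
        for off in range(0, size_bytes, shard_size)
    ]


print(eligible_for_promotion(4096, 20.0, 7200))
print(shard_ranges(10, 4))
```

Chunking matters for the size policy: if a large file is split into pieces, only the busy pieces need to be considered for migration rather than the whole file.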
16. GlusterFS16
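Tier Xlator
[Diagram: on the client side, the Tier Xlator sits above a HOT DHT and a COLD DHT, together with the Replication Xlator and other client xlators. On the server side, the HOT tier and the COLD tier each consist of a POSIX Xlator, a CTR Xlator, and other server xlators over brick storage, with a heat data store per tier. Arrows labeled Promotion and Demotion connect the two tiers.]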
Search should be precise and fast
Should have rich metadata filters: modification frequency, I/O sizes, etc.
Should deal with the distributed nature of the data
Should do load balancing
File system crawl: slow
File system log: fast to write, but slow to read and needs more space
Metadata databases: Gluster does not have one
In-memory inode caches: not durable
API abstraction: any DB
Rich search filters: frequency counters, I/O size counters, parts of file metadata, etc.
Non-centralized: local to bricks
Performance optimization options
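
One way to picture the API abstraction with a store local to each brick is shown below. The `HeatStore` interface, the `DictHeatStore` backend, and the brick names are hypothetical; they only illustrate that callers depend on the interface, not on a particular database.

```python
from abc import ABC, abstractmethod


class HeatStore(ABC):
    """Hypothetical abstraction: any database can sit behind these two calls."""

    @abstractmethod
    def record(self, gfid, nbytes):
        """Record one I/O of `nbytes` against the file `gfid`."""

    @abstractmethod
    def query(self, min_hits=0, min_bytes=0):
        """Return file ids matching the frequency and I/O-size filters."""


class DictHeatStore(HeatStore):
    """Toy in-memory backend, used here only to exercise the interface."""

    def __init__(self):
        self._stats = {}  # gfid -> (hit count, total bytes)

    def record(self, gfid, nbytes):
        hits, total = self._stats.get(gfid, (0, 0))
        self._stats[gfid] = (hits + 1, total + nbytes)

    def query(self, min_hits=0, min_bytes=0):
        return sorted(
            gfid
            for gfid, (hits, total) in self._stats.items()
            if hits >= min_hits and total >= min_bytes
        )


# Non-centralized: each brick owns its own store and is queried separately.
bricks = {"brick1": DictHeatStore(), "brick2": DictHeatStore()}
bricks["brick1"].record("f1", 4096)
bricks["brick1"].record("f1", 4096)
bricks["brick2"].record("f2", 100)
hot = {name: store.query(min_hits=2) for name, store in bricks.items()}
print(hot)
```

Swapping the toy backend for a real embedded database would only require another `HeatStore` subclass; the per-brick query loop stays the same.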
Updates can be expensive: read + modify + update
Scalability issues: with a single file and a WAL, complex queries can be slow
Durable metadata: not suited for durable metadata