1. GlusterFS
Distributed Replicated Parallel File System
Martin Alfke <martin.alfke@buero20.org>
Thursday, November 4, 2010
2. Agenda
⢠General Information on GlusterFS
⢠Architecture Overview
⢠GlusterFS Translators
⢠GlusterFS ConďŹguration
4. General Information I
File System Solutions
Shared Disk File System
• SAN or block access
• Mostly used in HA setups (e.g. DRBD)
Distributed File System
• Network File System
• NFS
• SMB/CIFS
• 9P
Distributed Replicated File System
• Replication
• HA and offline operation
• Coda
• MS DFS
• MooseFS
5. General Information II
File System Solutions
Distributed parallel File System
• Setup across multiple servers
• HPC
Distributed Replicated Parallel File System
• HPC and HA
• Cosmos
• MogileFS
• GPFS (IBM)
• GFS (Google)
• Hadoop
• GlusterFS
6. Customer Platform
Shared Storage Issues
Bad performance of NFS kernel stack
Limitations of concurrent NFS accesses
Customer data already on NFS system
Environment and Requirements
Debian GNU/Linux Version 4 (etch) – 3 years old
Most distributed fault-tolerant parallel file systems need complex data migration
Solution has to be expandable
7. Why GlusterFS ?
Decision basics
Possibility to run NFS and GlusterFS in parallel
No data migration necessary
Easy setup
Extendable (e.g. new storage nodes)
Minimum kernel 2.6.3x → optimizations for FUSE context switches
9. GlusterFS Basics
Hardware
Any x86 Hardware
Direct attached storage, RAID
FC, Infiniband or iSCSI SAN
Gigabit or 10 Gigabit network or Infiniband
OS
Linux, Solaris, OpenSolaris, OS X, FreeBSD
ext3 or ext4 Filesystem (tested)
Other POSIX-compliant filesystems should also work
Architecture
No Meta-Data Server (fully distributed architecture - Elastic Hash)
Replication (RAID 1)
Distribution (RAID 0)
FUSE (Standard)
NFS (unfs3 – deprecated)
SMB/CIFS
DAV
12. GlusterFS Architecture Overview III
Recommended Server Setup
GlusterFS daemons
GlusterFS Server
Network
*nix distribution
FileSystem (ext3/ext4 or POSIX)
Volume Manager (LVM2 only)
HW RAID
Disks
Image copyright by Gluster, Inc.
14. GlusterFS Translators
GlusterFS modular extension
storage
• posix – for the underlying filesystem
• bdb – database storage system
protocol
• server – required for server config (e.g. Infiniband or TCP)
• client – required for client config (e.g. Infiniband or TCP)
cluster
• distribute – storage spread across multiple servers
• replicate – mirror content between servers
• stripe – striping across multiple servers
• unify – obsolete, use cluster/distribute
• nufa – HPC: higher preference for a local volume
encryption
• rot-13 – sample code for encryption
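Translators are stacked in a volume file, each one naming the volumes beneath it. A minimal sketch of a client-side cluster/distribute stack over two servers (hostnames and volume names here are hypothetical, following the pattern of the replication example later in this deck):

```
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server1.example.com
  option remote-subvolume brick
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host server2.example.com
  option remote-subvolume brick
end-volume

volume distribute
  type cluster/distribute
  subvolumes remote1 remote2
end-volume
```

Files are hashed across remote1 and remote2 (RAID 0 style distribution); swapping cluster/distribute for cluster/replicate would mirror them instead.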
15. GlusterFS Translators
performance
• readahead – for sequential read performance
• writebehind – aggregate written blocks
• io-threads – number of threads
• io-cache – read cache block size
• quickread – fetch small files in a single call
• stat-prefetch – prefetch stats in readdir
• symlink-cache – undocumented
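Performance translators simply wrap an existing subvolume, so they can be layered freely. A sketch stacking io-threads and read-ahead over a client volume; note that the translator type spelling performance/read-ahead and the volume name "client" are assumptions here, and the thread count follows the example configurations later in this deck:

```
volume iothreads
  type performance/io-threads
  option thread-count 8
  subvolumes client
end-volume

volume readahead
  type performance/read-ahead
  subvolumes iothreads
end-volume
```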
16. GlusterFS Translators
features
• locks – POSIX locking
• filter – filtering on user id or group id
• access-control – undocumented
• path-convertor – internal path converter
• quota – don't grow beyond disk space
• read-only – included in features/filter
• trash – recycle bin (use on server)
debug
• trace – trace glusterfs functions and system calls
• io-stats – collect performance data
• error-gen – undocumented
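The debug translators slot into the stack like any other translator. A sketch inserting debug/trace directly above a storage/posix volume on the server (volume names and the export directory are illustrative), mirroring how debug/io-stats is placed in the customer configuration later in this deck:

```
volume posix
  type storage/posix
  option directory /data/export
end-volume

volume trace
  type debug/trace
  subvolumes posix
end-volume
```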
17. GlusterFS
Configuration
18. GlusterFS Configuration Location
Configuration files (path and names)
Server
• /etc/glusterfs/glusterfsd.vol
Client
• /etc/glusterfs/glusterfs.vol
Possible locations of config files for clients
Local on disk
Remote on the GlusterFS server
19. GlusterFS Configuration Example
2 Servers – n Clients – Replication
Server 1 + 2
volume posix
  type storage/posix
  option directory /data/export
end-volume

volume locks
  type features/locks
  subvolumes posix
end-volume

volume brick
  type performance/io-threads
  option thread-count 8
  subvolumes locks
end-volume

volume server
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow 192.168.0.102
  subvolumes brick
end-volume
20. GlusterFS Configuration Example
2 Servers – n Clients – Replication
Clients
volume remote1
  type protocol/client
  option transport-type tcp
  option remote-host server1.example.com
  option remote-subvolume brick
end-volume

volume remote2
  type protocol/client
  option transport-type tcp
  option remote-host server2.example.com
  option remote-subvolume brick
end-volume

volume replicate
  type cluster/replicate
  subvolumes remote1 remote2
end-volume

volume writebehind
  type performance/write-behind
  option window-size 1MB
  subvolumes replicate
end-volume

volume cache
  type performance/io-cache
  option cache-size 512MB
  subvolumes writebehind
end-volume
21. GlusterFS Configuration @Customer
Servers - both
volume posix
  type storage/posix
  option directory /data
end-volume

volume iostats
  type debug/io-stats
  subvolumes posix
end-volume

volume locks
  type features/locks
  subvolumes iostats
end-volume

volume iothreads
  type performance/io-threads
  option thread-count 16
  subvolumes locks
end-volume

volume writebehind
  type performance/write-behind
  option cache-size 64MB
  option flush-behind off
  subvolumes iothreads
end-volume
22. GlusterFS Configuration @Customer
Servers - both - continued
volume brick
  type performance/io-cache
  option cache-size 2048MB
  option cache-timeout 5
  subvolumes writebehind
end-volume

volume server-tcp
  type protocol/server
  option transport-type tcp
  option auth.addr.brick.allow *
  option transport.socket.listen-port 6996
  option transport.socket.nodelay on
  subvolumes brick
end-volume