If a customer has found that the performance levels are acceptable, but wants to increase capacity by 4x, they could add another 4, 1 TB drives to each server and will not generally experience performance degradation (so long as the number of drives per server remains lower than 12). Note that in this case, they are not adding 2 more low-price servers; they can simply add drives. (See Config. C, above)
If they want to both quadruple performance and quadruple capacity, they could distribute the additional drives across a larger number of servers (i.e., each server would have 12, 1 TB drives). (See Config. D, below)
Note that by the time a solution has approximately 10 drives, the performance bottleneck has moved to the network. (See Config. D, above)
Once that point is reached, further gains come from upgrading the Ethernet network. Note that performance in this example is more than 25x that which we saw in the baseline configuration, as evidenced by the increase over the 200 MB/s of the baseline configuration. (See Config. E, below)
As you will note, the power of the scale-out model is that both capacity and performance can be scaled as needed to meet requirements. It is not necessary to know years in advance what performance levels will be needed; configurations can be easily adjusted as the need demands.
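As a concrete sketch of what such an adjustment looks like in practice, the commands below grow an existing distributed volume by adding bricks on newly probed servers and then rebalancing data onto them. The hostnames (host5, host6), the brick paths, and the volume name 'log' are placeholders chosen to match the examples that follow; the operations themselves are standard gluster CLI commands.
# Grow an existing distributed Volume, then rebalance data onto the new bricks
# (host5/host6 and the brick paths are placeholders)
➜ sudo gluster peer probe host5
➜ sudo gluster peer probe host6
➜ sudo gluster volume add-brick log \
    host5:/mnt/glusterfs/server/log host6:/mnt/glusterfs/server/log
➜ sudo gluster volume rebalance log start
➜ sudo gluster volume rebalance log status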
# Start Gluster management daemon for each server
➜ sudo /etc/init.d/glusterd start
# Adding Servers to Trusted Storage Pool
➜ for HOST in host1 host2 host3; do
    sudo gluster peer probe $HOST; done
#=> Probe successful
Probe successful
Probe successful
➜ sudo gluster peer status
#=> Number of Peers: 3
Hostname: host1
Uuid: 81982001-ba0d-455a-bae8-cb93679dbddd
State: Peer in Cluster (Connected)
Hostname: host2
Uuid: 03945cd4-7487-4b2c-9384-f006a76dfee5
State: Peer in Cluster (Connected)...
# Create a distributed Volume named 'log'
➜ SERVER_LOG_PATH="/mnt/glusterfs/server/log"
➜ sudo gluster volume create log transport tcp \
    host1:$SERVER_LOG_PATH \
    host2:$SERVER_LOG_PATH \
    host3:$SERVER_LOG_PATH
➜ sudo gluster volume start log # start Volume
➜ sudo gluster volume info log
#=> Volume Name: log
Type: Distribute
Status: Started
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: host1:/mnt/glusterfs/server/log
Brick2: host2:/mnt/glusterfs/server/log
Brick3: host3:/mnt/glusterfs/server/log
# Create a distributed-replicated Volume named 'repository'
➜ SERVER_REPO_PATH="/mnt/glusterfs/server/repository"
➜ sudo gluster volume create repository \
    replica 2 transport tcp \
    host1:$SERVER_REPO_PATH host2:$SERVER_REPO_PATH \
    host3:$SERVER_REPO_PATH host4:$SERVER_REPO_PATH
➜ sudo gluster volume start repository # start Volume
➜ sudo gluster volume info repository
#=> Volume Name: repository
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: host1:/mnt/glusterfs/server/repository
Brick2: host2:/mnt/glusterfs/server/repository
Brick3: host3:/mnt/glusterfs/server/repository
Brick4: host4:/mnt/glusterfs/server/repository
# Mount 'log' Volume
➜ CLIENT_LOG_PATH="/mnt/glusterfs/client/log"
➜ sudo mount -t glusterfs \
    -o log-level=WARNING,log-file=/var/log/gluster.log \
    localhost:/log $CLIENT_LOG_PATH # native-client
➜ sudo mount -t nfs -o mountproto=tcp \
    localhost:/log $CLIENT_LOG_PATH # nfs
Multi-site cascading Geo-replication
You can configure GlusterFS Geo-replication to mirror data in a cascading fashion across multiple sites.
Geo-replication over LAN
You can configure GlusterFS Geo-replication to mirror data over a Local Area Network.
Geo-replication over WAN
You can configure GlusterFS Geo-replication to replicate data over a Wide Area Network.
Geo-replication over Internet
You can configure GlusterFS Geo-replication to mirror data over the Internet.
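Regardless of the link (LAN, WAN, or Internet), a Geo-replication session is set up the same way. A minimal sketch using the 3.x CLI is shown below; the master volume name 'repository' and the slave host and directory are placeholders, and the slave must already be reachable (for example over SSH) before the session is started.
# Start and monitor a Geo-replication session (master volume -> remote slave directory)
# 'repository' and the slave host/path below are placeholders
➜ sudo gluster volume geo-replication repository \
    remotehost.example.com:/data/remote_repository start
➜ sudo gluster volume geo-replication repository \
    remotehost.example.com:/data/remote_repository status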
Figure 5, below, illustrates a typical distributed metadata server implementation. It can be seen that this approach also results in considerable overhead processing for file access, and by design has built-in exposure to corruption scenarios. Here again we see a legacy approach to scale-out storage that is not congruent with the requirements of the modern data center or with the burgeoning migration to virtualization and cloud computing.
Figure 5: Decentralized Metadata Approach
In any office that stores physical documents in folders in filing cabinets, a person should be able to find a given document simply by knowing the rule by which documents are assigned to cabinets. Similarly, one could implement an algorithmic approach to data storage that used a similar rule to locate files. For example, in a ten system cluster, one could divide filenames into ten ranges and assign each range to one of the disks, 1 through 10. Figure 6, below, illustrates this concept.
Figure 6: Understanding EHA: Algorithm
Instead, one can take each file's pathname and filename and run it through the hashing algorithm. Each pathname/filename results in a unique numerical result. For the sake of simplicity, one could imagine assigning all files whose hash ends in the number 1 to the first disk, all which end in the number 2 to the second disk, etc. Figure 7, below, illustrates this concept.
Figure 7: Understanding EHA: Hashing
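As a toy illustration of the idea (using cksum purely as a stand-in hash; it is not the hash function GlusterFS itself uses), the snippet below hashes a couple of pathnames and maps each to one of ten disks:
# Hash each pathname and map the result onto one of 10 disks
➜ for path in /home/alice/report.doc /var/log/app/today.log; do
    hash=$(printf '%s' "$path" | cksum | cut -d' ' -f1)
    echo "$path -> disk $(( hash % 10 + 1 ))"
  done
Because the mapping depends only on the pathname, any client can compute a file's location independently, with no metadata server lookup.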
Elasticity is achieved by:
1. Setting up a very large number of virtual volumes
2. Using the hashing algorithm to assign files to virtual volumes
3. Using a separate process to assign virtual volumes to multiple physical devices
Thus, when disks or nodes are added or deleted, the algorithm itself does not need to be changed. However,
virtual volumes can be migrated or assigned to new physical locations as the need arises. Figure 8, below,
illustrates the Gluster approach to elasticity.
Figure 8: Understanding EHA: Elasticity
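To make the indirection concrete, here is a minimal, hypothetical sketch (bash 4+, with invented hostnames and a stand-in hash): the hash always selects one of a fixed set of virtual volumes, and a separate, editable table decides which server currently holds each virtual volume, so hardware can change without changing the hash.
# Hypothetical virtual-volume table: hash -> virtual volume -> physical server
declare -A VVOL_HOME=( [vvol-0]=host1 [vvol-1]=host1 [vvol-2]=host2 [vvol-3]=host2 )

place_file() {
  local path="$1"
  local hash vvol
  hash=$(printf '%s' "$path" | cksum | cut -d' ' -f1)
  vvol="vvol-$(( hash % ${#VVOL_HOME[@]} ))"
  echo "$path -> $vvol on ${VVOL_HOME[$vvol]}"
}

# A new server is absorbed by reassigning virtual volumes, not by changing the hash
VVOL_HOME[vvol-3]=host3

place_file /srv/archive/2011/report.pdf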