Gluster: Where We've Been - A History
1. Gluster: Where We've Been
AB Periasamy
Office of the CTO, Red Hat
John Mark Walker
Gluster Community Guy
2. Topics
The Big Idea
Humble beginnings
From Bangalore to Milpitas
Scale-out + Open source == WINNING
User-space, no metadata server, stackable
Cloud and commoditization
3. A Data Explosion!
74% == Unstructured data annual growth
63,000 PB == Scale-out storage in 2015
40% == storage-related expense for cloud
44x == Unstructured data volume growth by 2020
4. Conference Room, US Head Office
Bengaluru Office
7. What Can You Store?
Media – Docs, Photos, Video
VM Filesystem – VM Disk Images
Big Data – Log Files, RFID Data
Objects – Long Tail Data
8. The big idea: storage should be simple
Simple, scalable, low-cost
9. What is GlusterFS, Really?
Gluster is a unified, distributed storage system
DHT, stackable, POSIX, Swift, HDFS
10. Phase 1: Lego Kit for Storage
“People who think that userspace filesystems are realistic for anything but toys are just misguided” – Linus Torvalds
Goal: create a global namespace
11.
# On-disk storage: one brick backed by a local directory
volume testvol-posix
  type storage/posix
  option directory /media/datastore
  option volume-id 329e31c1-04cc-4386-8bb8-xxxx
end-volume

# Access control, stacked on top of the posix translator
volume testvol-access-control
  type features/access-control
  subvolumes testvol-posix
end-volume

# POSIX locking
volume testvol-locks
  type features/locks
  subvolumes testvol-access-control
end-volume

# I/O thread pool for concurrent requests
volume testvol-io-threads
  type performance/io-threads
  subvolumes testvol-locks
end-volume
12. Versions 1.x – 2.x
Hand-crafted volume definition files
See examples
Simple configuration files
Faster than tape? It's good!
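As a companion to the server-side stack on slide 11, here is a minimal, hand-written sketch of a client-side volfile from that era; the hostnames and brick names are hypothetical, and a generated file would carry more options.
# Hypothetical client-side volfile: three remote bricks joined into one global namespace
volume brick1
  type protocol/client
  option transport-type tcp
  option remote-host server1
  option remote-subvolume testvol-io-threads
end-volume

volume brick2
  type protocol/client
  option transport-type tcp
  option remote-host server2
  option remote-subvolume testvol-io-threads
end-volume

volume brick3
  type protocol/client
  option transport-type tcp
  option remote-host server3
  option remote-subvolume testvol-io-threads
end-volume

# DHT: hash file names across the bricks - no metadata server needed
volume testvol-dht
  type cluster/distribute
  subvolumes brick1 brick2 brick3
end-volume
Stacking cluster/replicate under the distribute translator gives the replicated + distributed combinations mentioned in the notes below.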
19. And now for something completely different
Commoditization and the changing economics of storage
Why we're winning
20. Simple Economics
Simplicity, scalability, lower cost
Virtualized → Scale on Demand
Multi-Tenant → In the Cloud
Automated → Scale Out
Commoditized → Open Source
21. Simplicity Bias
FC, FCoE, iSCSI → HTTP, Sockets
Modified BSD OS → Linux / User Space / C, Python & Java
Appliance based → Application based
23. Thank you!
AB Periasamy
Office of the CTO, Red Hat
ab@redhat.com
John Mark Walker
Gluster Community Guy
johnmark@redhat.com
Editor's Notes
Add examples where complexity has been bad - EMC, Cisco, Brocade et al. made a business out of complexity through certification - if it's too complicated, it doesn't scale
Discuss approach – how GlusterFS is unique and different from other approaches - lessons from GNU Hurd - a user-space distributed storage operating system - reimplemented some parts of the OS: scheduler, POSIX locking, RDMA, memory management, cf. the JVM, Python, etc. - no metadata separation
If you have a bunch of files, it should be as simple as an FTP server - in user space, this required FUSE, a POSIX translator, a NAS protocol, and a cluster translator
Learned about missing features. Found the largest problem and wanted to solve it - patterns emerged - scalable unstructured data storage was the #1 problem people wanted to solve. Had a clearer idea of where we wanted to go – clear direction
Standalone NFS replacement; active-active replicated storage; scalable, distributed storage... and then scalable, replicated, distributed storage + other combos
Elastic features driven by cloud and virt usage - shared storage for virtual guests - flexible, self-service storage - elastic volume management became a requirement - automated provisioning of storage with the CLI (native NFS server? Or 3.2?)
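A rough sketch of the CLI-driven provisioning described here, assuming a GlusterFS 3.x-era install; the volume name, hostnames, and brick paths are made up.
# Create and start a two-brick distributed volume
gluster volume create testvol server1:/export/brick1 server2:/export/brick1
gluster volume start testvol

# Grow the volume on demand, then spread existing data onto the new brick
gluster volume add-brick testvol server3:/export/brick1
gluster volume rebalance testvol start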
Marker framework: - story of why it's necessary - backup of data in other locales - don't need the entire snapshot - users wanted continuous, unlimited replication - don't want sysadmin intervention – on-demand - queries the FS to find which files have changed - manages a queue, telling rsync exactly which files have changed. Inotify – doesn't scale; if the daemon crashes, it stops tracking changes - would have to write a journaling feature to maintain the change queue. Geo-replication – can work over high-latency, flaky networks
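For reference, a hedged sketch of driving geo-replication from the CLI as it looked around the 3.2 era; the volume name and slave URL are hypothetical.
# Start continuous, asynchronous replication of testvol to a remote slave over ssh
gluster volume geo-replication testvol ssh://root@backup-host:/data/backup start

# Inspect the session
gluster volume geo-replication testvol ssh://root@backup-host:/data/backup status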