Ext2 → Ext3 → Ext4
• Ext2 (1993), inspired by UFS; the first popular and stable Linux file system, but its design is plain.
- File systems are getting bigger: how to look up an entry under a big directory, how to reduce fsck time after a crash ...
• Ext3 (2001) added a journal, hash-tree directory indexing, etc.
- File systems keep getting bigger; the remaining limitations had to be eliminated
- Features of various advanced file systems put pressure on ext3 ...
• Ext4 (2008): 48-bit block addressing, no limit on directory entries, extents, multi-block allocation, delayed block allocation, online defragmentation, 256-byte inodes, persistent preallocation, write barriers, etc.
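Persistent preallocation, for example, is exposed to applications through the fallocate family of calls. A minimal sketch in Python, assuming a Linux system and a file system such as ext4 that supports preallocation (the file name is invented for illustration):

```python
import os

# Preallocate 1 MiB up front so the file system can reserve extents
# and later writes into that range cannot fail with ENOSPC.
path = "prealloc.dat"  # hypothetical file name
fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
try:
    os.posix_fallocate(fd, 0, 1024 * 1024)  # offset 0, length 1 MiB
    size = os.fstat(fd).st_size             # file size reflects the reservation
finally:
    os.close(fd)
```

On ext4 the reserved range is backed by real (unwritten) extents, unlike a sparse file created with truncate, which allocates nothing until written.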
Nowadays
• Ext4 will continue to be maintained, thanks to its stability and for historical reasons
• XFS, robust and scalable, with good performance on large storage; it shines at handling big files (e.g. virtual machine images)
• Btrfs, a new design (intended to replace ext4), inspired by ZFS; it includes many enterprise file system features, for example copy-on-write, built-in RAID (volume management), snapshot/clone support, dynamic growing and shrinking, SSD support, etc.
Why CFS
• Independent storage devices (e.g. SAN).
• High availability requirements.
• How to scale out a file system in:
CPU,
memory,
and even network bandwidth.
Background
• Costs: storage arrays, fabric switches, HBA cards, etc. are expensive.
• Unified storage space, linear expansion, commodity
hardware.
• Driven by the Internet industry (e.g. search, picture sharing, big data, etc).
• The Google File System appeared (2003).
DFS common points
• Not strictly compliant with POSIX file system semantics; most implementations run in user space.
• Shared-nothing: meta-data and file data are stored separately, and meta-data access is separated from file-data access.
• Data servers have their own local file systems; a local file represents a logical data block, and each data block has several replica blocks.
• The meta-data server usually loads all meta-data into memory at start-up; a log records incremental changes, and flushing memory to disk (merging the log with the previous meta-data file) produces a new meta-data checkpoint.
• Other mechanisms: heartbeat algorithm, rack awareness, block allocation policy, file lock management, etc.
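The checkpoint/log scheme above can be sketched as a toy model (invented names, not any particular DFS's on-disk format): meta-data lives in a dict, every mutation is appended to a log first, and a checkpoint merges the log into a new full snapshot so the log can be truncated.

```python
import json

class MetaServer:
    """Toy meta-data server: full state in memory, incremental log on the side."""

    def __init__(self, checkpoint=None, log=None):
        # Start-up: load the last checkpoint, then replay the log on top of it.
        self.meta = dict(checkpoint or {})
        self.log = []
        for op, path, value in (log or []):
            self._apply(op, path, value)
            self.log.append((op, path, value))

    def _apply(self, op, path, value):
        if op == "set":
            self.meta[path] = value
        elif op == "del":
            self.meta.pop(path, None)

    def mutate(self, op, path, value=None):
        # Record the change in the log first, then apply it in memory.
        self.log.append((op, path, value))
        self._apply(op, path, value)

    def checkpoint(self):
        # Merging the log into the previous state yields a new checkpoint;
        # afterwards the log can be discarded.
        snapshot = json.loads(json.dumps(self.meta))  # deep copy
        self.log = []
        return snapshot

# Simulated restart: replaying checkpoint + log recovers the full state.
srv = MetaServer()
srv.mutate("set", "/a", {"size": 4096})
ckpt = srv.checkpoint()
srv.mutate("set", "/b", {"size": 8192})
recovered = MetaServer(checkpoint=ckpt, log=[("set", "/b", {"size": 8192})])
```

Real systems follow the same shape (e.g. HDFS's fsimage plus edit log), but with durable on-disk formats and background checkpointing rather than an in-process dict.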
Scale out
• Meta-data server cluster, e.g. GFS2, Ceph.
• Fully symmetric, no central meta-data server, e.g.
GlusterFS.
• Improved cluster management mechanisms:
heartbeat/corosync → ZooKeeper cluster
• I/O flow control, reduced meta-data server dependence, cost control (ECC), etc.
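The heartbeat mechanism the slides keep referring to can be sketched as follows (a simplified model with invented names: nodes report timestamps, and any node silent for longer than a timeout is declared dead):

```python
class HeartbeatMonitor:
    """Declare a node dead if no heartbeat arrives within `timeout` seconds."""

    def __init__(self, timeout=10.0):
        self.timeout = timeout
        self.last_seen = {}  # node id -> timestamp of last heartbeat

    def beat(self, node, now):
        # Each heartbeat refreshes the node's last-seen timestamp.
        self.last_seen[node] = now

    def dead_nodes(self, now):
        # A node is dead once its silence exceeds the timeout.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)

mon = HeartbeatMonitor(timeout=10.0)
mon.beat("ds1", now=0.0)
mon.beat("ds2", now=0.0)
mon.beat("ds1", now=8.0)   # ds1 keeps reporting; ds2 goes silent
suspects = mon.dead_nodes(now=12.0)  # only ds2 has been silent > 10 s
```

Production systems (corosync, ZooKeeper) layer quorum and leader election on top of this basic liveness check, which is why the slides show them replacing plain heartbeat scripts.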