A Fast File System for UNIX
    Marshall K. McKusick, William N. Joy, Samuel J. Leffler, and Robert S. Fabry

    Slides by Aleatha Parker-Wood




Tuesday, April 6, 2010
State of the Art


    •    Bell Labs UNIX file system for the PDP-11 (referred to as “old
         filesystem” or OldFS)

    •    Disks are divided into physical partitions, each of which contains a file system

    •    Linked list of free blocks stored in superblock

    •    inodes point either directly to blocks or to indirect blocks




Inode Layout in OldFS

     [Diagram: disk region with the inodes at the start, followed by the data blocks]




•    All inodes are stored at the beginning of the disk region for the filesystem

      •    Incurs long seek times for every access

•    inodes for files are unlikely to be adjacent to their containing directory’s
     inodes or to each other

      •    More seek time incurred

Data Layout in OldFS

    •    Completely agnostic to physical storage device

    •    Consecutive file blocks unlikely to be on the same cylinder

          •     Even more seeking

    •    512-byte blocks (later increased to 1024 bytes)

          •     Increasing the block size improved performance by a factor of 2

          •     Ergo: room for improvement!


Performance for OldFS


    •    Old system using 4% of disk bandwidth

    •    Performance good initially (175 KB/s), but degraded over time
         (30 KB/s)

    •    Free list became increasingly disorganized as file system was used...

    •    Blocks allocated in increasingly random locations
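
A toy simulation of the free-list effect described above, not the actual OldFS allocator. It assumes (hypothetically) a free list that hands out blocks from its head and that, after heavy use, holds blocks in the arbitrary order in which they were freed. The block counts and the seed are illustrative only.

```python
import random

def mean_gap(blocks):
    """Average distance between consecutively allocated block numbers."""
    return sum(abs(b - a) for a, b in zip(blocks, blocks[1:])) / (len(blocks) - 1)

def simulate(n_blocks=1000, file_len=100, seed=0):
    # Fresh file system: the free list is in disk order.
    free_list = list(range(n_blocks))
    fresh = [free_list.pop(0) for _ in range(file_len)]

    # Aged file system (assumption): blocks were freed in arbitrary order,
    # so the free list no longer follows disk order.
    rng = random.Random(seed)
    free_list = list(range(n_blocks))
    rng.shuffle(free_list)
    aged = [free_list.pop(0) for _ in range(file_len)]

    return mean_gap(fresh), mean_gap(aged)

fresh_gap, aged_gap = simulate()
print(f"fresh: {fresh_gap:.1f} blocks apart, aged: {aged_gap:.1f} blocks apart")
```

A larger gap between consecutive blocks of one file stands in for the extra seeking the slide refers to.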




The Fast File System (FFS)



    •    Disk partitions divided into “cylinder groups”

    •    4K minimum block size

          •     Ensures few levels of indirection (two suffice for files smaller than 4 GB)

    •    Blocks are broken into fragments to accommodate small files
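
A rough worked calculation of the indirection claim above. It assumes 4-byte block pointers and 12 direct pointers per inode; neither figure is stated on this slide, so treat both as assumptions of the sketch.

```python
# Sketch: how much file data each level of indirection can address.
# Assumptions (not from the slide): 4-byte block pointers, 12 direct pointers.
def reach(block_size, levels, direct=12, ptr_size=4):
    """Bytes addressable with the direct pointers plus `levels` levels of indirect blocks."""
    ptrs_per_block = block_size // ptr_size
    blocks = direct + sum(ptrs_per_block ** k for k in range(1, levels + 1))
    return blocks * block_size

for bs in (1024, 4096):
    print(bs, [reach(bs, lv) for lv in (0, 1, 2)])
```

Under these assumptions, two levels of indirection with 4096-byte blocks reach slightly more than 4 GB, while the same two levels with 1024-byte blocks reach only about 64 MB.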



Cylinder Groups


    •    Bookkeeping info stored for each cylinder group

          •     Backup copy of superblock
          •     Space for inodes
          •     A bit map of free blocks/fragments
          •     A static number of inodes allocated at creation time

    •    Bookkeeping info stored at a varying offset for each group (so losing
         the top platter will not result in complete data loss)
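
A minimal sketch of a per-group free map, to make the "bit map of free blocks/fragments" concrete. The class name, methods, and first-fit search are hypothetical illustrations, not the FFS on-disk format; a list of booleans stands in for the bits.

```python
class FragmentMap:
    """Toy free map for one cylinder group: one flag per fragment, True = free."""

    def __init__(self, n_blocks, frags_per_block):
        self.fpb = frags_per_block
        self.free = [True] * (n_blocks * frags_per_block)

    def alloc(self, n_frags):
        """Find n_frags contiguous free fragments inside a single block.

        Returns the index of the first fragment, or None if no block has room.
        """
        for block_start in range(0, len(self.free), self.fpb):
            for off in range(self.fpb - n_frags + 1):
                start = block_start + off
                if all(self.free[start:start + n_frags]):
                    for i in range(start, start + n_frags):
                        self.free[i] = False
                    return start
        return None

    def release(self, start, n_frags):
        """Mark n_frags fragments starting at `start` as free again."""
        for i in range(start, start + n_frags):
            self.free[i] = True

m = FragmentMap(n_blocks=2, frags_per_block=4)
print(m.alloc(3), m.alloc(2), m.alloc(1))
```

Unlike a linked free list, a map like this lets the allocator look for free space near a preferred position rather than taking whatever block happens to be next.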



Fragments


    •    2, 4, or 8 fragments per block (minimum fragment size is one disk sector, 512 bytes)

    •    Files never use more than one fragmented block

    •    Writing to a file which occupies a fragmented block either fills the
         current block (if room is available) or allocates a new block.

    •    Expanding a file one fragment at a time causes frequent copying; writing
         in full blocks is optimal.
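
A small worked example of how a file's size maps onto full blocks plus one fragmented tail block. The 4096/1024 block and fragment sizes are taken from the configurations measured later in the deck; the function itself is an illustrative sketch that ignores indirect blocks.

```python
def layout(size, block=4096, frag=1024):
    """Return (full_blocks, tail_fragments, wasted_bytes) for a file of `size` bytes."""
    full_blocks, tail = divmod(size, block)
    tail_frags = -(-tail // frag)          # ceiling division
    wasted = tail_frags * frag - tail      # unused bytes in the last fragment
    return full_blocks, tail_frags, wasted

print(layout(11000))   # two full blocks, three fragments, 264 bytes unused
print(layout(1))       # a 1-byte file costs one fragment, not one full block
```

With whole 4096-byte blocks only, the 11000-byte file would leave 1288 bytes unused instead of 264.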



Layout Optimizations

    •    Optimize for the processor and mass storage device (usually disk)

    •    Cylinder aware

    •    Chooses rotationally optimal blocks (either consecutive or delayed)

    •    Stores rotational layout tables to find positions with data already
         written nearby

    •    Trade-off between localizing data references and spreading unrelated
         data across cylinder groups.
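
A back-of-the-envelope sketch of what a "delayed" rotationally optimal block means: if the processor needs some time between transfers, the next block should sit far enough around the track that it has not already passed under the head. The disk speed, blocks per track, and processing time below are hypothetical numbers chosen for illustration.

```python
import math

def blocks_to_skip(rpm, blocks_per_track, cpu_ms):
    """Blocks that pass under the head while the CPU prepares the next transfer."""
    ms_per_rev = 60_000 / rpm
    ms_per_block = ms_per_rev / blocks_per_track
    return math.ceil(cpu_ms / ms_per_block)

# Hypothetical drive: 3600 rpm, 8 blocks per track, 4 ms of processing per transfer.
print(blocks_to_skip(3600, 8, 4))   # skip this many blocks before the next one
print(blocks_to_skip(3600, 8, 0))   # a fast enough processor can use consecutive blocks
```

Placing the next block too early costs nearly a full rotation, which is why the allocator is tuned to the processor as well as the disk.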


Layout Policies: Inodes



    •    Inodes of files in a directory often accessed together

          •     For instance, ls reads every inode in the directory

    •    Keep inodes in same cylinder group

    •    When creating new directories, choose cylinder group with few
         current inodes and directories
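
One way to make the directory-placement rule concrete: among cylinder groups with at least the average number of free inodes, pick the one holding the fewest directories. The dictionary fields and the exact tie-breaking are assumptions of this sketch.

```python
def pick_group_for_directory(groups):
    """groups: list of dicts with 'free_inodes' and 'dirs'. Returns the chosen index."""
    avg_free = sum(g["free_inodes"] for g in groups) / len(groups)
    # At least one group always meets the average, so candidates is never empty.
    candidates = [i for i, g in enumerate(groups) if g["free_inodes"] >= avg_free]
    return min(candidates, key=lambda i: groups[i]["dirs"])

groups = [
    {"free_inodes": 10, "dirs": 1},   # few directories, but nearly out of inodes
    {"free_inodes": 50, "dirs": 5},
    {"free_inodes": 60, "dirs": 2},
]
print(pick_group_for_directory(groups))
```

Files created inside the directory would then take inodes from the same group, which keeps the inodes that ls reads close together.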


Layout Policies: Data Blocks


    •    Place all data blocks for a file within the same cylinder group

    •    Preferably at rotationally optimal placements

    •    If file is greater than 48K (i.e., an indirect block is needed), move to
         new cylinder group (you had to seek anyway...)

    •    Likewise for every MB thereafter
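
A sketch of the switching rule above under one reading of "every MB thereafter": a new cylinder group is chosen at 48K and then at each further megabyte past that point. The function names and the 4096-byte block size are assumptions for illustration.

```python
KB, MB = 1024, 1024 * 1024

def starts_new_group(offset, first=48 * KB, step=MB):
    """True if the block at byte `offset` is the first one placed in a new cylinder group."""
    return offset >= first and (offset - first) % step == 0

def groups_used(size, block=4096):
    """Number of cylinder groups a file of `size` bytes is spread over under this reading."""
    return 1 + sum(starts_new_group(off) for off in range(0, size, block))

for size in (48 * KB, 48 * KB + 1, 3 * MB):
    print(size, groups_used(size))
```

Small files stay in a single group, while a large file is spread out so that it cannot fill a group and crowd out the other files placed there.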




So when you say “Fast” File System...




Read Throughput
     Type             Processor/Bus    Speed (KB/s)   Max read bandwidth (KB/s)   % of max   % CPU

     Old 1024         750/UNIBUS             29                  983                   3        11
     New 4096/1024    750/UNIBUS            221                  983                  22        43
     New 8192/1024    750/UNIBUS            233                  983                  24        29
     New 4096/1024    750/MASSBUS           466                  983                  47        73
     New 8192/1024    750/MASSBUS           466                  983                  47        54

Write Throughput
     Type             Processor/Bus    Speed (KB/s)   Max write bandwidth (KB/s)   % of max   % CPU

     Old 1024         750/UNIBUS             48                  983                    5        29
     New 4096/1024    750/UNIBUS            142                  983                   14        43
     New 8192/1024    750/UNIBUS            215                  983                   22        46
     New 4096/1024    750/MASSBUS           323                  983                   33        94
     New 8192/1024    750/MASSBUS           466                  983                   47        95
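
The percentage columns in the read and write tables follow directly from the measured speed and the 983 maximum. The short calculation below reproduces them and also derives the speedup over the old file system. The dictionary layout and function names are illustrative; the numbers are those from the tables.

```python
MAX_BW = 983  # maximum bandwidth from the tables

read = {"Old 1024 UNIBUS": 29, "New 4096/1024 UNIBUS": 221, "New 8192/1024 UNIBUS": 233,
        "New 4096/1024 MASSBUS": 466, "New 8192/1024 MASSBUS": 466}
write = {"Old 1024 UNIBUS": 48, "New 4096/1024 UNIBUS": 142, "New 8192/1024 UNIBUS": 215,
         "New 4096/1024 MASSBUS": 323, "New 8192/1024 MASSBUS": 466}

def utilization(rates):
    """Percentage of the maximum bandwidth, rounded to whole percent."""
    return {name: round(100 * speed / MAX_BW) for name, speed in rates.items()}

def speedup(rates, baseline="Old 1024 UNIBUS"):
    """Speed relative to the old file system, to one decimal place."""
    return {name: round(speed / rates[baseline], 1) for name, speed in rates.items()}

for label, rates in (("read", read), ("write", write)):
    print(label, utilization(rates), speedup(rates))
```

On the same UNIBUS hardware this works out to roughly 8x for reads and 4.5x for writes with 8192-byte blocks.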

Other metrics...


    •    When running ls on large directories containing other directories,
         disk accesses for inodes are cut in half

    •    For large directories containing only files, inode accesses are cut by up
         to a factor of eight

    •    Transfer rates stable over time

    •    Throughput varies with amount of free space maintained (reduced by
         half when system is full)



Other Enhancements

    •    Arbitrary length file names (ok, up to 255 characters)

    •    Advisory file locking

          •     Shared or exclusive

          •     Applied or removed only on open files

    •    Symbolic links, a la Multics

    •    Atomic rename operation

    •    Quotas
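
A minimal sketch of advisory shared and exclusive locks as they can be exercised today through Python's fcntl.flock on a Unix-like system. It illustrates the semantics listed above and is not code from the paper; the temporary file and the function name are hypothetical.

```python
import fcntl
import os
import tempfile

def demo_flock():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    results = []
    # Locks attach to open files, so open the same file twice.
    with open(path) as a, open(path) as b:
        fcntl.flock(a, fcntl.LOCK_EX)                      # exclusive lock via the first open
        try:
            fcntl.flock(b, fcntl.LOCK_SH | fcntl.LOCK_NB)  # non-blocking shared request
            results.append("shared granted")
        except OSError:
            results.append("shared refused")
        fcntl.flock(a, fcntl.LOCK_UN)                      # drop the exclusive lock
        fcntl.flock(a, fcntl.LOCK_SH)
        fcntl.flock(b, fcntl.LOCK_SH | fcntl.LOCK_NB)      # shared locks can coexist
        results.append("both shared")
    os.remove(path)
    return results

print(demo_flock())
```

The locks are advisory: a program that never asks for a lock can still read or write the file, so cooperating processes must all use the same convention.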

Conclusions


    •    Taking advantage of disk geometry and access patterns resulted in up to a
         10-fold improvement in both read and write throughput

    •    Improvements in block layout increased locality while reducing
         wasted space

    •    Hardware matters!




Thank you. Questions?




