Allan Jude: ZFS: Advanced Integration
Personal Background
● Server admin for 16 years
● FreeBSD src/doc committer, Core Team (July 2016-2017)
● Architect of ScaleEngine CDN
● Host of BSDNow.tv podcast
ZFS: What is it?
● built-in volume manager
● pool is thin provisioned to multiple filesystems
● checksums data and metadata
● compression
● COW snapshots, clones
● per fs tunable properties
Snapshots and Clones
● COW means snapshots are instant
● blocks referenced by a snapshot are kept
● no read/write performance impact as the number of snapshots grows
● snapshots take almost no space (~200k of metadata) until blocks change
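
The bullets above can be illustrated with a toy model. This is only a sketch of the copy-on-write idea, not how ZFS stores data: the class `ToyPool` and its methods are hypothetical names, and a real pool rewrites only the changed path of a block tree rather than copying a whole map.

```python
class ToyPool:
    """Toy copy-on-write store: a view maps block index -> block data."""

    def __init__(self):
        self.live = {}       # current view of the filesystem
        self.snapshots = {}  # snapshot name -> frozen view

    def write(self, index, data):
        # Copy-on-write: build a new map instead of mutating the shared one,
        # so any view captured by an earlier snapshot stays untouched.
        new_live = dict(self.live)
        new_live[index] = data
        self.live = new_live

    def snapshot(self, name):
        # A snapshot only records a reference to the current view,
        # which is why it is instant and initially takes no extra space.
        self.snapshots[name] = self.live


pool = ToyPool()
pool.write(0, b"old contents")
pool.write(1, b"never changes")
pool.snapshot("before-upgrade")
pool.write(0, b"new contents")

print(pool.live[0])                         # b'new contents'
print(pool.snapshots["before-upgrade"][0])  # b'old contents'
```

The unchanged block 1 is the same object in both views, so the snapshot costs space only for blocks that have since been overwritten.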
Boot Environments
● Idea from Solaris
● root on ZFS, snapshot before upgrade
● bootloader support enables upgrades without fear
Boot Environment Tooling
● sysutils/beadm (shell script, not as good as it could be)
● GSoC 2017: be(8), libbe(3)
○ better management of fs properties for boot integration
○ “deep” boot env support (more below)
○ a proper C library allows integration with pkg(8) and GUIs
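
As a rough sketch of the "snapshot before upgrade" workflow with beadm, the helper below only builds the command lines and does not run them. The function name `upgrade_plan` and the boot environment name are hypothetical; `beadm create` and `beadm activate` are the subcommands assumed here.

```python
import shlex


def upgrade_plan(be_name):
    """Build (but do not run) the commands around an in-place upgrade."""
    return {
        # Preserve the current system as a boot environment before upgrading.
        "before_upgrade": [["beadm", "create", be_name]],
        # If the upgrade goes badly, boot back into the preserved environment.
        "rollback": [
            ["beadm", "activate", be_name],
            ["shutdown", "-r", "now"],
        ],
    }


plan = upgrade_plan("pre-upgrade")
for step, commands in plan.items():
    for argv in commands:
        print(step, "->", shlex.join(argv))
```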
What it looks like
The Rest of the Pool
z: root of pool
z/tmp: /tmp
z/audit: not versioned
Example: Laptop
● with boot envs, no need to fear upgrading your laptop before a presentation anymore
Example: Deep Boot Envs
● /usr/src and /usr/obj should be separate datasets with extra properties for increased performance
● /usr/src and /usr/obj should match running OS in the boot env
Deep Boot Environments
newest,{/usr/src,/usr/obj}
cloned,{/usr/src,/usr/obj}
(startup scripts figure out which child datasets to mount)
BEs as Golden Images
● ScaleEngine uses boot envs on all servers
● start with stock FreeBSD with security patches
● zfs send
● zfs recv
● temporarily mount to /mnt
● copy select config files
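
The send/recv steps above might look roughly like the following. This is a sketch under assumptions: the dataset names, snapshot name, and host are made up, transport over `ssh` is an assumption (the talk only names `zfs send` and `zfs recv`), and the helper only prints command lines rather than running them.

```python
import shlex


def golden_image_commands(src_dataset, snap, host, dst_dataset):
    """Build shell command lines that ship a boot environment to a server."""
    snapshot = f"{src_dataset}@{snap}"
    send = ["zfs", "send", snapshot]
    # -u: receive without mounting, so the running system is not disturbed.
    recv = ["ssh", host, "zfs", "recv", "-u", dst_dataset]
    return [
        shlex.join(["zfs", "snapshot", snapshot]),
        shlex.join(send) + " | " + shlex.join(recv),
        # Temporarily mount the new image so config files can be copied in.
        shlex.join(["ssh", host, "mount", "-t", "zfs", dst_dataset, "/mnt"]),
    ]


for line in golden_image_commands(
    "z/ROOT/golden", "2017-11", "server1", "z/ROOT/golden-2017-11"
):
    print(line)
```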
Persist Config Across Firmwares
● Builds further on the golden image process above
● New /cfg dataset holds persistent configs
● Images symlink those files from /etc, no need to merge /etc anymore
● zfs recv updated image
● zfsbootcfg (ZFS nextboot)
○ if the new image doesn’t work, reboot to old
● upgrade takes seconds
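
The symlink step can be sketched as below. The function name and file names are hypothetical, and the demo uses temporary directories standing in for the /cfg dataset and the image's /etc, so it touches nothing on the real system.

```python
import os
import tempfile


def link_persistent_configs(cfg_dir, etc_dir, names):
    """Replace etc_dir/<name> with a symlink to cfg_dir/<name>."""
    for name in names:
        target = os.path.join(cfg_dir, name)
        link = os.path.join(etc_dir, name)
        if os.path.lexists(link):
            os.remove(link)  # drop the stock copy shipped in the image
        os.symlink(target, link)


with tempfile.TemporaryDirectory() as root:
    cfg = os.path.join(root, "cfg")
    etc = os.path.join(root, "etc")
    os.mkdir(cfg)
    os.mkdir(etc)
    with open(os.path.join(cfg, "rc.conf"), "w") as f:
        f.write('hostname="edge1"\n')
    with open(os.path.join(etc, "rc.conf"), "w") as f:
        f.write("# stock file from the image\n")

    link_persistent_configs(cfg, etc, ["rc.conf"])
    with open(os.path.join(etc, "rc.conf")) as f:
        print(f.read(), end="")  # contents now come from the cfg directory
```

Because the image only holds links, receiving a new image never overwrites the site-specific files.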
Replace NanoBSD
● Replace nanobsd in appliance with ZFS
● FreeNAS and pfSense have already done so
● Space efficient
● Still get firmware style (whole system image style) updates
● Reliability of ZFS checksums
● Enhanced nextboot: 3 consecutive boot failures in under 5 minutes boots a rescue system, to allow control of the appliance or AWS instance
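
The rescue rule in the last bullet can be sketched as a small decision function. This is a hypothetical illustration, not the bootloader's code: it assumes `failure_times` holds timestamps (in seconds) of failed boots since the last successful one, and the names and defaults are made up to match the "3 failures in under 5 minutes" rule.

```python
def choose_boot_target(failure_times, now, limit=3, window=300):
    """Return "rescue" after `limit` failed boots within `window` seconds."""
    recent = [t for t in failure_times if now - t < window]
    return "rescue" if len(recent) >= limit else "normal"


# Three failures within the last five minutes: fall back to the rescue system.
print(choose_boot_target([0, 60, 120], now=180))  # rescue
# Only two failures so far: keep trying the normal image.
print(choose_boot_target([0, 60], now=180))       # normal
```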
Encryption Option #1: GELI
● AES-XTS or AES-CBC
● Full block device encryption (key per disk) support in bootloader
● In gptzfsboot since 2016
● EFI support for booting encrypted pools before end of 2017
● Need console access to enter passphrase
● No keyfile support in bootloader
Encryption Option #2: ZFS Native
● AES-GCM or AES-CCM
● Not all metadata is encrypted, and encryption is optional per dataset
● Datasets can be unmounted and their keys unloaded, so the data is protected as it is “at rest”
● Scrub & Resilver without keys loaded
● Different keys for different datasets
● Spring 2018
● (ZFS will checksum both ciphertext and cleartext)
GELI Enhancements
● BSDCan and BSDCam GELI working groups produced enhancements to new metadata structures
● Support USB keys, separate partitions for storing keys
Appliances: Channel Programs
● sequences of multiple ZFS admin operations are not atomic and are often slow
● New ZCP feature allows short Lua scripts to perform operations with locks held
● Instruction count and memory limited for safety
● Integrated last month. More scripts coming
● See https://www.bsdcan.org/2017/schedule/events/854.en.html
Appliances: Checkpoints (2017Q4)
● upgrades involve more than the OS and tools
● checkpoint preserves ALL data (kind of like a snapshot but different)
● can undo operations that snapshot cannot, like destroying or renaming datasets
● if upgrade fails midway, rollback to checkpoint
● preserve checkpoint until upgrade confirmed good
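
A sketch of the checkpoint workflow, using the `zpool checkpoint` commands as they appear in later OpenZFS releases (the feature was still upcoming at the time of the talk). The pool name and helper name are hypothetical, the helper only builds command lines, and it assumes a data pool that can be exported; rewinding a root pool needs to be done from outside the running system.

```python
import shlex


def checkpoint_commands(pool, upgrade_ok):
    """Build the zpool commands for an upgrade guarded by a checkpoint."""
    take = ["zpool", "checkpoint", pool]
    if upgrade_ok:
        # Upgrade confirmed good: discard the checkpoint to free its space.
        finish = [["zpool", "checkpoint", "-d", pool]]
    else:
        # Upgrade failed: re-import the pool as it was at the checkpoint.
        finish = [
            ["zpool", "export", pool],
            ["zpool", "import", "--rewind-to-checkpoint", pool],
        ]
    return [take] + finish


for argv in checkpoint_commands("tank", upgrade_ok=False):
    print(shlex.join(argv))
```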
What Would Make ZFS Better For You?
● just came from ZFS development summit
● cross-platform community, very active, interested in features that benefit users
● Would love user input
● FreeBSD Foundation & Delphix are partnering to bring RAID-Z vdev expansion
● What do you need?
Near Future Features
● ZSTD Compression (4x-10x faster than gzip, comparable ratio)
● Adaptive Compression (compress as much as possible without slowing system)
● Faster Resilver (sequential I/O)
○ ZFS, with integrated volume manager, can avoid writing whole disk and only write blocks used
○ This enhancement makes resilver use sequential instead of random I/O
● Smarter Resilver (prefetch)
○ combined, 2x-8x faster when replacing disks
● ZIL performance
○ Intent logs: how ZFS supports synchronous writes
● MMP: safe “zpool import” for clusters
○ Lawrence Livermore National Laboratory, pool sharing in a cluster
● Device Removal (Evacuation)
○ Delphix: moves allocations to other disks, doesn’t work for RAID-Z
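
To see why the sequential resilver bullet above matters, here is a toy calculation. It is not ZFS code: the block offsets are made-up numbers and "seek distance" is a crude stand-in for the cost of random I/O on a spinning disk.

```python
def total_seek_distance(offsets):
    """Sum of head movements when blocks are visited in the given order."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))


# Order in which blocks are discovered by walking the block tree (assumed).
logical_order = [900, 10, 500, 20, 880, 30]
# The same blocks, sorted by on-disk offset before issuing the I/O.
sequential_order = sorted(logical_order)

print(total_seek_distance(logical_order))     # 3570
print(total_seek_distance(sequential_order))  # 890
```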
Further Future Features
● ZIL performance enhancements
● Fast clone deletion
○ Use live instead of dead list
● Spacemap log (faster alloc/free)
○ faster crash recovery, map + log for free space
● ashift policy: replace 512-byte-sector disks with 4Kn disks
○ time dependent geometry: e.g. “all new allocs will be 4k aligned from now on”
● Distributed Parity (DRAID)
○ e.g. 100 disks: broken into 10x10, throughput of a single disk can still be the bottleneck when resilvering
○ a virtual disk, made up of chunks from all disks, overcomes this
○ virtual spares, minimizes “reduced parity” situations
● VDEV Classes (metadata, block size)
○ e.g. metadata on SSD, data on disk
● 1000x dedup performance using dedup log
○ problem: hash table on disk causes lots of random I/O, write amplification
○ even with successful dedup of data block, metadata still needs to be written
○ solution: make room in hash table by deleting blocks with a single ref ...
■ (scott: don’t trust this description as I didn’t understand this part)
○ store changes to hash table in log form like with spacelog
○ Proposal looking for funding
ZFSBook.com
● beginner and advanced books
BSDNow.tv
● weekly video podcast
Q&A
● Do channel programs imply a Lua interpreter in the kernel? A: yes, customized for ZFS.
● Would you use boot envs for major FreeBSD version upgrades? A: yes
○ Do newer config file formats cause trouble on rollback? A: /etc generally lives with the env, and config file formats in FreeBSD are pretty stable
● Memory requirement for ZFS? A: depends on the working set; minimum 512M; the ZFS cache maximum can be set via sysctl
○ avoid compression feature if memory limited
● HammerFS vs ZFS? A: HammerFS 1 & 2 still rely on hardware RAID and are mainly meant as a cluster fs. ZFS doesn’t trust the disk or RAID hardware.
● Is there a tool for the zfs send, recv way of applying updates? A: currently they’re one-liners
● Add 2nd disk, cannot boot problem? A: depends on how the disk is connected.
○ Tip: Don’t run completely out of space. Create an empty dataset with 1% of total storage, to avoid bad things from happening.
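
One way to follow the tip above is an empty dataset with a reservation. This is a sketch under assumptions: using `refreservation` is one possible reading of "an empty dataset with 1% of total storage", the dataset name `reserved` and the helper name are hypothetical, and the helper only builds the command line.

```python
import shlex


def reserve_command(pool, pool_size_bytes, percent=1, name="reserved"):
    """Build a zfs create command that sets aside `percent` of the pool."""
    reserve_bytes = pool_size_bytes * percent // 100  # integer arithmetic
    return [
        "zfs", "create",
        "-o", f"refreservation={reserve_bytes}",
        "-o", "mountpoint=none",  # the dataset is never meant to hold files
        f"{pool}/{name}",
    ]


one_tib = 2 ** 40
print(shlex.join(reserve_command("z", one_tib)))
```

If the pool ever fills up, shrinking or clearing the reservation (for example with `zfs set refreservation=none`) gives back enough room to delete files and snapshots.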

BSDTW17: Allan Jude: ZFS: Advanced Integration
