SlideShare ist ein Scribd-Unternehmen logo
1 von 29
Downloaden Sie, um offline zu lesen
'FUSE'ing Python for rapid development of
storage efficient file-system
PyCon APAC ‘12
Singapore, Jun 07-09, 2012
Chetan Giridhar, Vishal Kanaujia
File Systems
• Provides way to organize, store, retrieve and manage
information
• Abstraction layer
• File system:
– Maps name to an object
– Objects to file contents
• File system types (format):
– Media based file systems (FAT, ext2)
– Network file systems (NFS)
– Special-purpose file systems (procfs)
• Services
– open(), read(), write(), close()…
User space
Kernel space
User
File System
Hardware
Virtual File-system
• To support multiple FS in *NIX
• VFS is an abstract layer in kernel
• Decouples file system implementation from the
interface (POSIX API)
– Common API serving different file system types
• Handles user calls related to file systems.
– Implements generic FS actions
– Directs request to specific code to handle the request
• Associate (and disassociate) devices with instances of
the appropriate file system.
File System: VFS
ext2 NFS procfs
User
VFS
glibc.so
Kernel space
Block layer/ Device
drivers/Hardware
System call
interface
glibc.so: open(), read()
Myapp.py: open(), read()
System call: sys_open(),
sys_read()
VFS: vfs_open(),
vfs_read()
Developing FS in *NIX
• In-kernel file-systems (traditionally)
• It is a complex task
• Understanding kernel libraries and modules
• Development experience in kernel space
• Managing disk i/o
• Time consuming and tedious development
– Frequent kernel panic
– Higher testing efforts
• Kernel bloating and side effects like security
Solution: User space
• In user space:
– Shorter development cycle
– Easy to update fixes, test and distribute
– More flexibility
• Programming tools, debuggers, and libraries as you
have if you were developing standard *NIX applications
• User-space file-systems
– File systems become regular applications (as
opposed to kernel extensions)
FUSE (Filesystem in USErspace)
• Implement a file system in user-space
– no kernel code required!
• Secure, non-privileged mounts
• User operates on a mounted instance of FS:
- Unix utilities
- POSIX libraries
• Useful to develop “virtual” file-systems
– Allows you to imagine “anything” as a file ☺
– local disk, across the network, from memory, or any other
combination
FUSE: Diving deeper
Myfile.py:
open()
glibc.so
VFS: vfs_open() fuse.ko /dev/fuse
libfuse.so
CustomFS: open()
FUSE | develop
• Choices of development in C, C++, Java, … and of course
Python!
• Python interface for FUSE
– (FusePython: most popularly used)
• Open source project
– http://fuse.sourceforge.net/
• For ubuntu systems:
$sudo apt-get instatall python-fuse
$mkdir ./mount_point
$python myfuse.py ./mount_point
$fusermount -u ./mount_point
FUSE API Overview
• File management
– open(path)
– create(path, mode)
– read(path, length, offset)
– write(path, data, offset)
• Directory and file system management
– unlink(path)
– readdir(path)
• Metadata operations
– getattr(path)
– chmod(path, mode)
– chown(path, uid, gid)
seFS – storage efficient FS
• A prototype, experimental file system with:
– Online data de-duplication (SHA1)
– Compression (text based)
• SQLite
• Ubuntu 11.04, Python-Fuse Bindings
• Provides following services:
open() write() chmod()
create() readdir() chown()
read() unlink()
seFS Architecture
Your
application
<<FUSE code>>
myfuse.py
File
System
Operations
seFS DB
storage
Efficiency =
De-duplication
+
Compression<<pylibrary>>
SQLiteHandler.py
Compression.py
Crypto.py
<<SQLite DB
Interface>>
seFS.py
seFS: Database
seFS
Schema
CREATE TABLE data(
"id" INTEGER
PRIMARY KEY
AUTOINCREMENT,
"sha" TEXT,
"data" BLOB,
"length" INTEGER,
"compressed" BLOB);
CREATE TABLE metadata(
"id" INTEGER,
"abspath" TEXT,
"length" INTEGER,
"mtime" TEXT,
"ctime" TEXT,
"atime" TEXT,
"inode" INTEGER);
data table
metadata table
seFS API flow
$touch abc
getattr()
create()
open()
release()
$rm abc $cat >> abc
getattr()
access()
getattr()
open()
flush()
release()
unlink()
create()
seFS DB
User
Operations
seFS APIs
storage
Efficiency
write()
seFS: Code
#!/usr/bin/python
import fuse
import stat
import time
from seFS import seFS
fuse.fuse_python_api = (0, 2)
class MyFS(fuse.Fuse):
def __init__(self, *args, **kw):
fuse.Fuse.__init__(self, *args, *kw)
# Set some options required by the
# Python FUSE binding.
self.flags = 0
self.multithreaded = 0
self.fd = 0
self.sefs = seFS()
ret = self.sefs.open('/')
self.sefs.write('/', "Root of the seFS")
t = int(time.time())
mytime = (t, t, t)
ret = self.sefs.utime('/', mytime)
self.sefs.setinode('/', 1)
seFS: Code (1)
def getattr(self, path):
sefs = seFS()
stat = fuse.stat()
context = fuse.FuseGetContext()
#Root
if path == '/':
stat.stat_nlink = 2
stat.stat_mode = stat.S_IFDIR | 0755
else:
stat.stat_mode = stat.S_IFREG | 0777
stat.stat_nlink = 1
stat.stat_uid, stat.stat_gid =
(context ['uid'], context
['gid'])
# Search for this path in DB
ret = sefs.search(path)
# If file exists in DB, get its times
if ret is True:
tup = sefs.getutime(path)
stat.stat_mtime =
int(tup[0].strip().split('.')[0])
stat.stat_ctime =
int(tup[1].strip().split('.')[0])
stat.stat_atime =
int(tup[2].strip().split('.')[0])
stat.stat_ino =
int(sefs.getinode(path))
# Get the file size from DB
if sefs.getlength(path) is not None:
stat.stat_size =
int(sefs.getlength(path))
else:
stat.stat_size = 0
return stat
else:
return - errno.ENOENT
seFS: Code (2)
def create(self, path,flags=None,mode=None):
sefs = seFS()
ret = self.open(path, flags)
if ret == -errno.ENOENT:
#Create the file in database
ret = sefs.open(path)
t = int(time.time())
mytime = (t, t, t)
ret = sefs.utime(path, mytime)
self.fd = len(sefs.ls())
sefs.setinode(path, self.fd)
return 0
def write(self, path, data, offset):
length = len(data)
sefs = seFS()
ret = sefs.write(path, data)
return length
seFS: Learning
• Design your file system and define the
objectives first, before development
– skip implementing functionality your file system
doesn’t intend to support
• Database schema is crucial
• Knowledge on FUSE API is essential
– FUSE APIs are look-alike to standard POSIX APIs
– Limited documentation of FUSE API
• Performance?
Conclusion
• Development of FS is very easy with FUSE
• Python aids RAD with Python-Fuse bindings
• seFS: Thought provoking implementation
• Creative applications – your needs and
objectives
• When are you developing your own File
system?! ☺
Further Read
• Sample Fuse based File systems
– Sshfs
– YoutubeFS
– Dedupfs
– GlusterFS
• Python-Fuse bindings
– http://fuse.sourceforge.net/
• Linux internal manual
Contact Us
• Chetan Giridhar
– http://technobeans.com
– cjgiridhar@gmail.com
• Vishal Kanaujia
– http://freethreads.wordpress.com
– vishalkanaujia@gmail.com
Backup
Agenda
• The motivation
• Intro to *NIX File Systems
• Trade off: code in user and kernel space
• FUSE?
• Hold on – What’s VFS?
• Diving into FUSE internals
• Design and develop your own File System with
Python-FUSE bindings
• Lessons learnt
• Python-FUSE: Creative applications/ Use-cases
User-space and Kernel space
• Kernel-space
– Kernel code including device drivers
– Kernel resources (hardware)
– Privileged user
• User-space
– User application runs
– Libraries dealing with kernel space
– System resources
Virtual File-system
• To support multiple FS in *NIX
• VFS is an abstract layer in kernel
• Decouples file system implementation from the
interface (POSIX API)
– Common API serving different file system types
• Handles user calls related to file systems.
– Implements generic FS actions
– Directs request to specific code to handle the request
• Associate (and disassociate) devices with instances of
the appropriate file system.
FUSE: Internals
• Three major components:
– Userspace library (libfuse.*)
– Kernel module (fuse.ko)
– Mount utility (fusermount)
• Kernel module hooks in to VFS
– Provides a special device “/dev/fuse”
• Can be accessed by a user-space process
• Interface: user-space application and fuse kernel
module
• Read/ writes occur on a file descriptor of /dev/fuse
FUSE Workflow
User space
Kernel space
User (file I/O)
FUSE: kernel lib
Custatom File Systatem
Virtual File Systatem
User-space FUSE lib
Facts and figures
• seFS – online storage efficiency
• De-duplication/ compression
– Managed catalogue information (file meta-data rarely changes)
– Compression encoded information
• Quick and easy prototyping (Proof of concept)
• Large dataset generation
– Data generated on demand
Creative applications: FUSE based File
systems
• SSHFS: Provides access to a remote file-system
through SSH
• WikipediaFS: View and edit Wikipedia articles as
if they were real files
• GlusterFS: Clustered Distributed Filesystem
having capability to scale up to several petabytes.
• HDFS: FUSE bindings exist for the open
source Hadoop distributed file system
• seFS: You know it already ☺

Weitere ähnliche Inhalte

Was ist angesagt?

11 linux filesystem copy
11 linux filesystem copy11 linux filesystem copy
11 linux filesystem copyShay Cohen
 
FUSE Developing Fillesystems in userspace
FUSE Developing Fillesystems in userspaceFUSE Developing Fillesystems in userspace
FUSE Developing Fillesystems in userspaceelliando dias
 
Course 102: Lecture 16: Process Management (Part 2)
Course 102: Lecture 16: Process Management (Part 2) Course 102: Lecture 16: Process Management (Part 2)
Course 102: Lecture 16: Process Management (Part 2) Ahmed El-Arabawy
 
FUSE (Filesystem in Userspace) on OpenSolaris
FUSE (Filesystem in Userspace) on OpenSolarisFUSE (Filesystem in Userspace) on OpenSolaris
FUSE (Filesystem in Userspace) on OpenSolariselliando dias
 
Unix file systems 2 in unix internal systems
Unix file systems 2 in unix internal systems Unix file systems 2 in unix internal systems
Unix file systems 2 in unix internal systems senthilamul
 
The sysfs Filesystem
The sysfs FilesystemThe sysfs Filesystem
The sysfs FilesystemJeff Yana
 
Course 102: Lecture 28: Virtual FileSystems
Course 102: Lecture 28: Virtual FileSystems Course 102: Lecture 28: Virtual FileSystems
Course 102: Lecture 28: Virtual FileSystems Ahmed El-Arabawy
 
Chapter 10 - File System Interface
Chapter 10 - File System InterfaceChapter 10 - File System Interface
Chapter 10 - File System InterfaceWayne Jones Jnr
 
Python & FUSE
Python & FUSEPython & FUSE
Python & FUSEJoseph Scott
 
Course 101: Lecture 2: Introduction to Operating Systems
Course 101: Lecture 2: Introduction to Operating Systems Course 101: Lecture 2: Introduction to Operating Systems
Course 101: Lecture 2: Introduction to Operating Systems Ahmed El-Arabawy
 
Linux for Security Professionals (Tips and Tricks) - Init 6 10/2012
Linux for Security Professionals (Tips and Tricks) - Init 6 10/2012Linux for Security Professionals (Tips and Tricks) - Init 6 10/2012
Linux for Security Professionals (Tips and Tricks) - Init 6 10/2012Jose L. QuiĂąones-Borrero
 
Course 102: Lecture 26: FileSystems in Linux (Part 1)
Course 102: Lecture 26: FileSystems in Linux (Part 1) Course 102: Lecture 26: FileSystems in Linux (Part 1)
Course 102: Lecture 26: FileSystems in Linux (Part 1) Ahmed El-Arabawy
 
(120513) #fitalk an introduction to linux memory forensics
(120513) #fitalk   an introduction to linux memory forensics(120513) #fitalk   an introduction to linux memory forensics
(120513) #fitalk an introduction to linux memory forensicsINSIGHT FORENSIC
 
Memory forensics
Memory forensicsMemory forensics
Memory forensicsSunil Kumar
 
AOS Lab 7: Page tables
AOS Lab 7: Page tablesAOS Lab 7: Page tables
AOS Lab 7: Page tablesZubair Nabi
 
AOS Lab 12: Network Communication
AOS Lab 12: Network CommunicationAOS Lab 12: Network Communication
AOS Lab 12: Network CommunicationZubair Nabi
 
4. linux file systems
4. linux file systems4. linux file systems
4. linux file systemsMarian Marinov
 

Was ist angesagt? (20)

11 linux filesystem copy
11 linux filesystem copy11 linux filesystem copy
11 linux filesystem copy
 
FUSE Developing Fillesystems in userspace
FUSE Developing Fillesystems in userspaceFUSE Developing Fillesystems in userspace
FUSE Developing Fillesystems in userspace
 
Course 102: Lecture 16: Process Management (Part 2)
Course 102: Lecture 16: Process Management (Part 2) Course 102: Lecture 16: Process Management (Part 2)
Course 102: Lecture 16: Process Management (Part 2)
 
FUSE (Filesystem in Userspace) on OpenSolaris
FUSE (Filesystem in Userspace) on OpenSolarisFUSE (Filesystem in Userspace) on OpenSolaris
FUSE (Filesystem in Userspace) on OpenSolaris
 
Unix file systems 2 in unix internal systems
Unix file systems 2 in unix internal systems Unix file systems 2 in unix internal systems
Unix file systems 2 in unix internal systems
 
The sysfs Filesystem
The sysfs FilesystemThe sysfs Filesystem
The sysfs Filesystem
 
Course 102: Lecture 28: Virtual FileSystems
Course 102: Lecture 28: Virtual FileSystems Course 102: Lecture 28: Virtual FileSystems
Course 102: Lecture 28: Virtual FileSystems
 
why we need ext4
why we need ext4why we need ext4
why we need ext4
 
Chapter 10 - File System Interface
Chapter 10 - File System InterfaceChapter 10 - File System Interface
Chapter 10 - File System Interface
 
Python & FUSE
Python & FUSEPython & FUSE
Python & FUSE
 
Vfs
VfsVfs
Vfs
 
Course 101: Lecture 2: Introduction to Operating Systems
Course 101: Lecture 2: Introduction to Operating Systems Course 101: Lecture 2: Introduction to Operating Systems
Course 101: Lecture 2: Introduction to Operating Systems
 
Linux for Security Professionals (Tips and Tricks) - Init 6 10/2012
Linux for Security Professionals (Tips and Tricks) - Init 6 10/2012Linux for Security Professionals (Tips and Tricks) - Init 6 10/2012
Linux for Security Professionals (Tips and Tricks) - Init 6 10/2012
 
Course 102: Lecture 26: FileSystems in Linux (Part 1)
Course 102: Lecture 26: FileSystems in Linux (Part 1) Course 102: Lecture 26: FileSystems in Linux (Part 1)
Course 102: Lecture 26: FileSystems in Linux (Part 1)
 
(120513) #fitalk an introduction to linux memory forensics
(120513) #fitalk   an introduction to linux memory forensics(120513) #fitalk   an introduction to linux memory forensics
(120513) #fitalk an introduction to linux memory forensics
 
Memory forensics
Memory forensicsMemory forensics
Memory forensics
 
AOS Lab 7: Page tables
AOS Lab 7: Page tablesAOS Lab 7: Page tables
AOS Lab 7: Page tables
 
005 skyeye
005 skyeye005 skyeye
005 skyeye
 
AOS Lab 12: Network Communication
AOS Lab 12: Network CommunicationAOS Lab 12: Network Communication
AOS Lab 12: Network Communication
 
4. linux file systems
4. linux file systems4. linux file systems
4. linux file systems
 

Ähnlich wie Fuse'ing python for rapid development of storage efficient

Fuse'ing python for rapid development of storage efficient FS
Fuse'ing python for rapid development of storage efficient FSFuse'ing python for rapid development of storage efficient FS
Fuse'ing python for rapid development of storage efficient FSChetan Giridhar
 
Part 03 File System Implementation in Linux
Part 03 File System Implementation in LinuxPart 03 File System Implementation in Linux
Part 03 File System Implementation in LinuxTushar B Kute
 
Javase7 1641812
Javase7 1641812Javase7 1641812
Javase7 1641812Vinay H G
 
UNIT III.pptx
UNIT III.pptxUNIT III.pptx
UNIT III.pptxYogapriyaJ1
 
Writing file system in CPython
Writing file system in CPythonWriting file system in CPython
Writing file system in CPythondelimitry
 
Chapter 8 distributed file systems
Chapter 8 distributed file systemsChapter 8 distributed file systems
Chapter 8 distributed file systemsAbDul ThaYyal
 
The Linux Kernel Implementation of Pipes and FIFOs
The Linux Kernel Implementation of Pipes and FIFOsThe Linux Kernel Implementation of Pipes and FIFOs
The Linux Kernel Implementation of Pipes and FIFOsDivye Kapoor
 
Introduction to file system and OCFS2
Introduction to file system and OCFS2Introduction to file system and OCFS2
Introduction to file system and OCFS2Gang He
 
Building File Systems with FUSE
Building File Systems with FUSEBuilding File Systems with FUSE
Building File Systems with FUSEelliando dias
 
Development of the irods rados plugin @ iRODS User group meeting 2014
Development of the irods rados plugin @ iRODS User group meeting 2014Development of the irods rados plugin @ iRODS User group meeting 2014
Development of the irods rados plugin @ iRODS User group meeting 2014mgrawinkel
 
Xfs file system for linux
Xfs file system for linuxXfs file system for linux
Xfs file system for linuxAjay Sood
 
Ch10 file system interface
Ch10   file system interfaceCh10   file system interface
Ch10 file system interfaceWelly Dian Astika
 
All'ombra del Leviatano: Filesystem in Userspace
All'ombra del Leviatano: Filesystem in UserspaceAll'ombra del Leviatano: Filesystem in Userspace
All'ombra del Leviatano: Filesystem in UserspaceRoberto Reale
 
Poking The Filesystem For Fun And Profit
Poking The Filesystem For Fun And ProfitPoking The Filesystem For Fun And Profit
Poking The Filesystem For Fun And Profitssusera432ea1
 
De-Anonymizing Live CDs through Physical Memory Analysis
De-Anonymizing Live CDs through Physical Memory AnalysisDe-Anonymizing Live CDs through Physical Memory Analysis
De-Anonymizing Live CDs through Physical Memory AnalysisAndrew Case
 
2nd unit part 1
2nd unit  part 12nd unit  part 1
2nd unit part 1Pavan Illa
 

Ähnlich wie Fuse'ing python for rapid development of storage efficient (20)

Fuse'ing python for rapid development of storage efficient FS
Fuse'ing python for rapid development of storage efficient FSFuse'ing python for rapid development of storage efficient FS
Fuse'ing python for rapid development of storage efficient FS
 
Part 03 File System Implementation in Linux
Part 03 File System Implementation in LinuxPart 03 File System Implementation in Linux
Part 03 File System Implementation in Linux
 
Javase7 1641812
Javase7 1641812Javase7 1641812
Javase7 1641812
 
UNIT III.pptx
UNIT III.pptxUNIT III.pptx
UNIT III.pptx
 
Writing file system in CPython
Writing file system in CPythonWriting file system in CPython
Writing file system in CPython
 
Chapter 8 distributed file systems
Chapter 8 distributed file systemsChapter 8 distributed file systems
Chapter 8 distributed file systems
 
Os6
Os6Os6
Os6
 
The Linux Kernel Implementation of Pipes and FIFOs
The Linux Kernel Implementation of Pipes and FIFOsThe Linux Kernel Implementation of Pipes and FIFOs
The Linux Kernel Implementation of Pipes and FIFOs
 
Introduction to file system and OCFS2
Introduction to file system and OCFS2Introduction to file system and OCFS2
Introduction to file system and OCFS2
 
Building File Systems with FUSE
Building File Systems with FUSEBuilding File Systems with FUSE
Building File Systems with FUSE
 
Development of the irods rados plugin @ iRODS User group meeting 2014
Development of the irods rados plugin @ iRODS User group meeting 2014Development of the irods rados plugin @ iRODS User group meeting 2014
Development of the irods rados plugin @ iRODS User group meeting 2014
 
Xfs file system for linux
Xfs file system for linuxXfs file system for linux
Xfs file system for linux
 
Ch10 file system interface
Ch10   file system interfaceCh10   file system interface
Ch10 file system interface
 
Systems Programming - File IO
Systems Programming - File IOSystems Programming - File IO
Systems Programming - File IO
 
009709863.pdf
009709863.pdf009709863.pdf
009709863.pdf
 
5.distributed file systems
5.distributed file systems5.distributed file systems
5.distributed file systems
 
All'ombra del Leviatano: Filesystem in Userspace
All'ombra del Leviatano: Filesystem in UserspaceAll'ombra del Leviatano: Filesystem in Userspace
All'ombra del Leviatano: Filesystem in Userspace
 
Poking The Filesystem For Fun And Profit
Poking The Filesystem For Fun And ProfitPoking The Filesystem For Fun And Profit
Poking The Filesystem For Fun And Profit
 
De-Anonymizing Live CDs through Physical Memory Analysis
De-Anonymizing Live CDs through Physical Memory AnalysisDe-Anonymizing Live CDs through Physical Memory Analysis
De-Anonymizing Live CDs through Physical Memory Analysis
 
2nd unit part 1
2nd unit  part 12nd unit  part 1
2nd unit part 1
 

KĂźrzlich hochgeladen

complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgsaravananr517913
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction managementMariconPadriquez1
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...121011101441
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)dollysharma2066
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - GuideGOPINATHS437943
 
Class 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm SystemClass 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm Systemirfanmechengr
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substationstephanwindworld
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfAsst.prof M.Gokilavani
 
Piping Basic stress analysis by engineering
Piping Basic stress analysis by engineeringPiping Basic stress analysis by engineering
Piping Basic stress analysis by engineeringJuanCarlosMorales19600
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleAlluxio, Inc.
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncssuser2ae721
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile servicerehmti665
 

KĂźrzlich hochgeladen (20)

complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfgUnit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
Unit7-DC_Motors nkkjnsdkfnfcdfknfdgfggfg
 
computer application and construction management
computer application and construction managementcomputer application and construction management
computer application and construction management
 
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Serviceyoung call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
young call girls in Rajiv Chowk🔝 9953056974 🔝 Delhi escort Service
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
Call Us ≽ 8377877756 ≼ Call Girls In Shastri Nagar (Delhi)
 
Transport layer issues and challenges - Guide
Transport layer issues and challenges - GuideTransport layer issues and challenges - Guide
Transport layer issues and challenges - Guide
 
Class 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm SystemClass 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm System
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substation
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdfCCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
CCS355 Neural Network & Deep Learning UNIT III notes and Question bank .pdf
 
Piping Basic stress analysis by engineering
Piping Basic stress analysis by engineeringPiping Basic stress analysis by engineering
Piping Basic stress analysis by engineering
 
Correctly Loading Incremental Data at Scale
Correctly Loading Incremental Data at ScaleCorrectly Loading Incremental Data at Scale
Correctly Loading Incremental Data at Scale
 
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsyncWhy does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
Why does (not) Kafka need fsync: Eliminating tail latency spikes caused by fsync
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 

Fuse'ing python for rapid development of storage efficient

  • 1. 'FUSE'ing Python for rapid development of storage efficient file-system PyCon APAC ‘12 Singapore, Jun 07-09, 2012 Chetan Giridhar, Vishal Kanaujia
  • 2. File Systems • Provides way to organize, store, retrieve and manage information • Abstraction layer • File system: – Maps name to an object – Objects to file contents • File system types (format): – Media based file systems (FAT, ext2) – Network file systems (NFS) – Special-purpose file systems (procfs) • Services – open(), read(), write(), close()… User space Kernel space User File System Hardware
  • 3. Virtual File-system • To support multiple FS in *NIX • VFS is an abstract layer in kernel • Decouples file system implementation from the interface (POSIX API) – Common API serving different file system types • Handles user calls related to file systems. – Implements generic FS actions – Directs request to specific code to handle the request • Associate (and disassociate) devices with instances of the appropriate file system.
  • 4. File System: VFS ext2 NFS procfs User VFS glibc.so Kernel space Block layer/ Device drivers/Hardware System call interface glibc.so: open(), read() Myapp.py: open(), read() System call: sys_open(), sys_read() VFS: vfs_open(), vfs_read()
  • 5. Developing FS in *NIX • In-kernel file-systems (traditionally) • It is a complex task • Understanding kernel libraries and modules • Development experience in kernel space • Managing disk i/o • Time consuming and tedious development – Frequent kernel panic – Higher testing efforts • Kernel bloating and side effects like security
  • 6. Solution: User space • In user space: – Shorter development cycle – Easy to update fixes, test and distribute – More flexibility • Programming tools, debuggers, and libraries as you have if you were developing standard *NIX applications • User-space file-systems – File systems become regular applications (as opposed to kernel extensions)
  • 7. FUSE (Filesystem in USErspace) • Implement a file system in user-space – no kernel code required! • Secure, non-privileged mounts • User operates on a mounted instance of FS: - Unix utilities - POSIX libraries • Useful to develop “virtual” file-systems – Allows you to imagine “anything” as a file ☺ – local disk, across the network, from memory, or any other combination
  • 8. FUSE: Diving deeper Myfile.py: open() glibc.so VFS: vfs_open() fuse.ko /dev/fuse libfuse.so CustomFS: open()
  • 9. FUSE | develop • Choices of development in C, C++, Java, … and of course Python! • Python interface for FUSE – (FusePython: most popularly used) • Open source project – http://fuse.sourceforge.net/ • For ubuntu systems: $sudo apt-get instatall python-fuse $mkdir ./mount_point $python myfuse.py ./mount_point $fusermount -u ./mount_point
  • 10. FUSE API Overview • File management – open(path) – create(path, mode) – read(path, length, offset) – write(path, data, offset) • Directory and file system management – unlink(path) – readdir(path) • Metadata operations – getattr(path) – chmod(path, mode) – chown(path, uid, gid)
  • 11. seFS – storage efficient FS • A prototype, experimental file system with: – Online data de-duplication (SHA1) – Compression (text based) • SQLite • Ubuntu 11.04, Python-Fuse Bindings • Provides following services: open() write() chmod() create() readdir() chown() read() unlink()
  • 12. seFS Architecture Your application <<FUSE code>> myfuse.py File System Operations seFS DB storage Efficiency = De-duplication + Compression<<pylibrary>> SQLiteHandler.py Compression.py Crypto.py <<SQLite DB Interface>> seFS.py
  • 13. seFS: Database seFS Schema CREATE TABLE data( "id" INTEGER PRIMARY KEY AUTOINCREMENT, "sha" TEXT, "data" BLOB, "length" INTEGER, "compressed" BLOB); CREATE TABLE metadata( "id" INTEGER, "abspath" TEXT, "length" INTEGER, "mtime" TEXT, "ctime" TEXT, "atime" TEXT, "inode" INTEGER); data table metadata table
  • 14. seFS API flow $touch abc getattr() create() open() release() $rm abc $cat >> abc getattr() access() getattr() open() flush() release() unlink() create() seFS DB User Operations seFS APIs storage Efficiency write()
  • 15. seFS: Code #!/usr/bin/python import fuse import stat import time from seFS import seFS fuse.fuse_python_api = (0, 2) class MyFS(fuse.Fuse): def __init__(self, *args, **kw): fuse.Fuse.__init__(self, *args, *kw) # Set some options required by the # Python FUSE binding. self.flags = 0 self.multithreaded = 0 self.fd = 0 self.sefs = seFS() ret = self.sefs.open('/') self.sefs.write('/', "Root of the seFS") t = int(time.time()) mytime = (t, t, t) ret = self.sefs.utime('/', mytime) self.sefs.setinode('/', 1)
  • 16. seFS: Code (1) def getattr(self, path): sefs = seFS() stat = fuse.stat() context = fuse.FuseGetContext() #Root if path == '/': stat.stat_nlink = 2 stat.stat_mode = stat.S_IFDIR | 0755 else: stat.stat_mode = stat.S_IFREG | 0777 stat.stat_nlink = 1 stat.stat_uid, stat.stat_gid = (context ['uid'], context ['gid']) # Search for this path in DB ret = sefs.search(path) # If file exists in DB, get its times if ret is True: tup = sefs.getutime(path) stat.stat_mtime = int(tup[0].strip().split('.')[0]) stat.stat_ctime = int(tup[1].strip().split('.')[0]) stat.stat_atime = int(tup[2].strip().split('.')[0]) stat.stat_ino = int(sefs.getinode(path)) # Get the file size from DB if sefs.getlength(path) is not None: stat.stat_size = int(sefs.getlength(path)) else: stat.stat_size = 0 return stat else: return - errno.ENOENT
  • 17. seFS: Code (2) def create(self, path,flags=None,mode=None): sefs = seFS() ret = self.open(path, flags) if ret == -errno.ENOENT: #Create the file in database ret = sefs.open(path) t = int(time.time()) mytime = (t, t, t) ret = sefs.utime(path, mytime) self.fd = len(sefs.ls()) sefs.setinode(path, self.fd) return 0 def write(self, path, data, offset): length = len(data) sefs = seFS() ret = sefs.write(path, data) return length
  • 18. seFS: Learning • Design your file system and define the objectives first, before development – skip implementing functionality your file system doesn’t intend to support • Database schema is crucial • Knowledge on FUSE API is essential – FUSE APIs are look-alike to standard POSIX APIs – Limited documentation of FUSE API • Performance?
  • 19. Conclusion • Development of FS is very easy with FUSE • Python aids RAD with Python-Fuse bindings • seFS: Thought provoking implementation • Creative applications – your needs and objectives • When are you developing your own File system?! ☺
  • 20. Further Read • Sample Fuse based File systems – Sshfs – YoutubeFS – Dedupfs – GlusterFS • Python-Fuse bindings – http://fuse.sourceforge.net/ • Linux internal manual
  • 21. Contact Us • Chetan Giridhar – http://technobeans.com – cjgiridhar@gmail.com • Vishal Kanaujia – http://freethreads.wordpress.com – vishalkanaujia@gmail.com
  • 23. Agenda • The motivation • Intro to *NIX File Systems • Trade off: code in user and kernel space • FUSE? • Hold on – What’s VFS? • Diving into FUSE internals • Design and develop your own File System with Python-FUSE bindings • Lessons learnt • Python-FUSE: Creative applications/ Use-cases
  • 24. User-space and Kernel space • Kernel-space – Kernel code including device drivers – Kernel resources (hardware) – Privileged user • User-space – User application runs – Libraries dealing with kernel space – System resources
  • 25. Virtual File-system • To support multiple FS in *NIX • VFS is an abstract layer in kernel • Decouples file system implementation from the interface (POSIX API) – Common API serving different file system types • Handles user calls related to file systems. – Implements generic FS actions – Directs request to specific code to handle the request • Associate (and disassociate) devices with instances of the appropriate file system.
  • 26. FUSE: Internals • Three major components: – Userspace library (libfuse.*) – Kernel module (fuse.ko) – Mount utility (fusermount) • Kernel module hooks in to VFS – Provides a special device “/dev/fuse” • Can be accessed by a user-space process • Interface: user-space application and fuse kernel module • Read/ writes occur on a file descriptor of /dev/fuse
  • 27. FUSE Workflow User space Kernel space User (file I/O) FUSE: kernel lib Custatom File Systatem Virtual File Systatem User-space FUSE lib
  • 28. Facts and figures • seFS – online storage efficiency • De-duplication/ compression – Managed catalogue information (file meta-data rarely changes) – Compression encoded information • Quick and easy prototyping (Proof of concept) • Large dataset generation – Data generated on demand
  • 29. Creative applications: FUSE based File systems • SSHFS: Provides access to a remote file-system through SSH • WikipediaFS: View and edit Wikipedia articles as if they were real files • GlusterFS: Clustered Distributed Filesystem having capability to scale up to several petabytes. • HDFS: FUSE bindings exist for the open source Hadoop distributed file system • seFS: You know it already ☺