Why donāt we just use Hadoop as block level storage What are the advantages over iSCSI ā RAID + SW for backup restore and WADB When will it be available for OpenStack ā 4Q2012? The limitations? Volume / total space, size of metadata, network, ā¦ Performance? 12 MB/sec or 49 MB/sec (multiple concurrent clients) How does it compare to EMC, Amazon S3? S3: Object Store; AWS Import/Export Amazon Elastic Block Storage (EBS): 1GB~1TB; ss saved in S3 Cinder, NetApp, Nexenta and SolidFire CIFS, GlusterFS, OCSF2 XCP
CRITICAL CHANGES NO HDFS In-kernal driver ā Nativeā networking disk driver (not xen back-end driver) Key design Why we Why not use iscii protocol Datanode prioritization Datanode hardware dependency
Provides a global storage abstraction on a large number of distributed disks Each storage block remains available despite failure in switch, server, or disk drive Scales to a very large number of concurrent accesses Exploits distributed object caching Initial thought: Use HDFS as the base for global storage management Support concurrency control among accesses to common storage blocks Support QoS guarantee or performance isolation among virtual clusters