© Hortonworks Inc. 2011
Interactive Hadoop via Flash and Memory
Arpit Agarwal
aagarwal@hortonworks.com
@aagarw
Chris Nauroth
cnauroth@hortonworks.com
@cnauroth
Page 1
HDFS Short-Circuit Reads
Architecting the Future of Big Data
• A client co-located with the data reads block files directly from the local filesystem, bypassing the DataNode's network data path.
Shortcomings of Existing RAM Utilization
• Lack of Control
  – Kernel decides what to retain in cache and what to evict based on observations of access patterns.
• Sub-optimal RAM Utilization
  – Tasks for multiple jobs are interleaved on the same node, and one task's activity could trigger eviction of data that would have been valuable to retain in cache for the other task.
Centralized Cache Management
• Provides users with explicit control of which HDFS file paths to keep resident in memory.
• Allows clients to query location of cached block replicas, opening possibility for job scheduling improvements.
• Utilizes off-heap memory, not subject to GC overhead or JVM tuning.
Using Centralized Cache Management
• Prerequisites
  – Native Hadoop library required; currently supported on Linux only.
  – Set the process ulimit for maximum locked memory.
  – Configure dfs.datanode.max.locked.memory in hdfs-site.xml to the amount of memory to dedicate to caching.
• New Concepts
  – Cache Pool
    - Contains and manages a group of cache directives.
    - Has Unix-style permissions.
    - Can constrain resource utilization by defining a maximum number of cached bytes or a maximum time to live.
  – Cache Directive
    - Specifies a file system path to cache.
    - Specifying a directory caches all files in that directory (not recursive).
    - Can specify number of replicas to cache and time to live.
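A minimal sketch of the configuration prerequisite above; the 1 GB value is illustrative, not a recommendation:

```xml
<!-- hdfs-site.xml: bytes of RAM each DataNode may lock for caching -->
<property>
  <name>dfs.datanode.max.locked.memory</name>
  <value>1073741824</value>
</property>
```

The DataNode user's locked-memory ulimit (ulimit -l) must be at least this value, or the DataNode will fail to start.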
Using Centralized Cache Management
• CLI: Adding a Cache Pool
> hdfs cacheadmin -addPool common-pool
Successfully added cache pool common-pool.
> hdfs cacheadmin -listPools
Found 1 result.
NAME OWNER GROUP MODE LIMIT MAXTTL
common-pool cnauroth cnauroth rwxr-xr-x unlimited never
Using Centralized Cache Management
• CLI: Adding a Cache Directive
> hdfs cacheadmin -addDirective -path /hello-amsterdam -pool common-pool
Added cache directive 1
> hdfs cacheadmin -listDirectives
Found 1 entry
ID POOL REPL EXPIRY PATH
1 common-pool 1 never /hello-amsterdam
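The pool and directive options described earlier (byte limit, cached replica count, time to live) are passed as flags on the same commands. A sketch; the pool name and values are illustrative:

```
> hdfs cacheadmin -addPool limited-pool -limit 1073741824 -maxTtl 7d
> hdfs cacheadmin -addDirective -path /hello-amsterdam -pool limited-pool -replication 2 -ttl 1d
```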
Using Centralized Cache Management
• CLI: Removing a Cache Directive
> hdfs cacheadmin -removeDirective 1
Removed cached directive 1
> hdfs cacheadmin -removeDirectives -path /hello-amsterdam
Removed cached directive 1
Removed every cache directive with path /hello-amsterdam
Using Centralized Cache Management
• API: DistributedFileSystem Methods
public void addCachePool(CachePoolInfo info)
public RemoteIterator<CachePoolEntry> listCachePools()
public long addCacheDirective(CacheDirectiveInfo info)
public RemoteIterator<CacheDirectiveEntry> listCacheDirectives(CacheDirectiveInfo filter)
public void removeCacheDirective(long id)
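The methods above can be driven as in the following sketch, the programmatic equivalent of the CLI session. The pool and path names are illustrative; it assumes a running cluster, a Configuration object conf, and imports from org.apache.hadoop.fs and org.apache.hadoop.hdfs:

```java
// Sketch: cache a path programmatically (not a complete program).
DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

dfs.addCachePool(new CachePoolInfo("common-pool"));

long id = dfs.addCacheDirective(
    new CacheDirectiveInfo.Builder()
        .setPath(new Path("/hello-amsterdam"))
        .setPool("common-pool")
        .build());

// List the directives in the pool, then remove ours.
RemoteIterator<CacheDirectiveEntry> it = dfs.listCacheDirectives(
    new CacheDirectiveInfo.Builder().setPool("common-pool").build());
while (it.hasNext()) {
  System.out.println(it.next().getInfo());
}
dfs.removeCacheDirective(id);
```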
Centralized Cache Management Behind the Scenes
• Block files are memory-mapped into the DataNode process.
> pmap `jps | grep DataNode | awk '{ print $1 }'` | grep blk
00007f92e4b1f000 124928K r--s- /data/dfs/data/current/BP-1740238118-127.0.1.1-1395252171596/current/finalized/blk_1073741827
00007f92ecd21000 131072K r--s- /data/dfs/data/current/BP-1740238118-127.0.1.1-1395252171596/current/finalized/blk_1073741826
• Pages of each block file are 100% resident in memory.
> vmtouch /data/dfs/data/current/BP-1740238118-127.0.1.1-1395252171596/current/finalized/blk_1073741826
Files: 1
Directories: 0
Resident Pages: 32768/32768 128M/128M 100%
Elapsed: 0.001198 seconds
> vmtouch /data/dfs/data/current/BP-1740238118-127.0.1.1-1395252171596/current/finalized/blk_1073741827
Files: 1
Directories: 0
Resident Pages: 31232/31232 122M/122M 100%
Elapsed: 0.00172 seconds
HDFS Zero-Copy Reads
• Applications read straight from direct byte buffers, backed by the memory-mapped block file.
• Eliminates overhead of intermediate copy of bytes to a buffer in user space.
• Applications must change code to use a new read API on DFSInputStream:
public ByteBuffer read(ByteBufferPool factory, int maxLength, EnumSet<ReadOption> opts)
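A hedged sketch of the new read API in use. The file path and buffer size are illustrative; it assumes an open FileSystem fs whose streams support enhanced byte-buffer access:

```java
// Sketch: zero-copy read of a cached file (not a complete program).
FSDataInputStream in = fs.open(new Path("/hello-amsterdam/part-0"));
ByteBufferPool pool = new ElasticByteBufferPool();
ByteBuffer buf = in.read(pool, 4 * 1024 * 1024,
    EnumSet.of(ReadOption.SKIP_CHECKSUMS)); // checksums must be skipped for zero-copy
try {
  // process buf ...
} finally {
  in.releaseBuffer(buf); // return the mmap-backed buffer to the pool
}
```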
Heterogeneous Storages for HDFS
Goals
• Extend HDFS to support a variety of Storage Media
• Applications can choose their target storage
• Use existing APIs wherever possible
Interesting Storage Media

Medium                  Cost        Example Use Case
Spinning Disk (HDD)     Low         High-volume batch data
Solid State Disk (SSD)  ~10x HDD    HBase tables
RAM                     ~100x HDD   Hive materialized views
Your custom media       ?           ?
HDFS Storage Architecture - Before
HDFS Storage Architecture - Now
Storage Preferences
• Introduce Storage Type per Storage Medium
• Storage Hint from application to HDFS
  – Specifies application's preferred Storage Type
• Advisory
• Subject to available space/quotas
• Fallback Storage is HDD
  – May be configurable in the future
Storage Preferences (continued)
• Specify preference when creating a file
  – Write replicas directly to the Storage Medium of choice
• Change preference for an existing file
  – E.g. to migrate existing file replicas from HDD to SSD
Quota Management
• Extend existing Quota Mechanisms
• Administrators ensure fair distribution of limited resources
File Creation with Storage Types
Move Existing Replicas to Target Storage Type
Transient Files (Planned feature)
• Target storage type is Memory
  – Writes go to RAM
  – Allow short-circuit writes to local in-memory block replicas, analogous to short-circuit reads
• Checkpoint files to disk by changing the storage type
• Or discard them
• High-performance writes for low-volume transient data
  – E.g. Hive materialized views
References
• http://hortonworks.com/blog/heterogeneous-storages-hdfs/
• HDFS-2832 – Heterogeneous Storages phase 1 – DataNode as a collection of storages
• HDFS-5682 – Heterogeneous Storages phase 2 – APIs to expose Storage Types
• HDFS-4949 – Centralized cache management in HDFS