참여기관_발표자료-국민대학교 201301 정기회의

에너지 30% 이상 절감 가능한
범용 운영체제 핵심 원천 기술 개발 과제
1월 정기회의
국민대학교
김영만, 한재일

Contents
 Research Activities
Part 1 : 발표 배정논문 (3편)
1. A new model for the system and devices latency
2. Cross-Layer Frameworks for Constrained Power and Resources
Management of Embedded Systems
3. Automatic Run-Time Selection of Power Polices for Operating Systems
Part 2 : 국민대 연구내용
1. Performance Evaluation of Parallel Applications on Next Generation
Memory Architecture with Power-Aware Paging Method
2. PPFS : A Scalable Flash Memory File System for the Hybrid
Architecture of Phase-change RAM and NAND Flash
3. Address Translation Technique for Large NAND Flash Memory using
Page Level Mapping
4. Performance Optimization Techniques for Legacy File Systems on
Flash Memory

1. Research Activities
Research Contents
에너지 절감과 관련된 논문 읽기
시뮬레이터 분석
Papers
A new model for the system and devices latency
Automatic Run-Time Selection of Power Policies for Operating Systems
Cross-Layer Frameworks for Constrained Power and Resources Management of Embedded Systems
Address Translation Technique for Large NAND Flash Memory
Performance Optimization Techniques for Legacy File Systems on Flash Memory
A Scalable Flash Memory File System for the Hybrid Architecture of Phase-change RAM and NAND Flash
An Energy Efficient Cache Design Using Spin Torque Transfer (STT) RAM
Performance Evaluation of Parallel Applications on Next Generation Memory Architecture with Power-
Aware Paging Method

Part 1
발표 배정논문 (3편)

1.Anewmodelforthesystemanddeviceslatency

WHAT IS LATENCY ?
• “ In a computer system, latency is often used to mean any del
ay or waiting that increases real or perceived response time b
eyond the response time desired. “
• “ Specific contributors to computer latency include mismatche
s in data speed between the microprocessor and input/output
devices and inadequate data buffers.”
• “ Within a computer, latency can be removed or "hidden" by s
uch techniques as prefetching (anticipating the need for data i
nput requests) and multithreading, or using parallelism across
multiple execution threads.“
• Source: http://searchciomidmarket.techtarget.com/definition/
latency

TERMINOLOGY.
(Texas Instrument)
• Latency: time to react to an external event, e.g. time spent to execut
e the handler code after an IRQ, time spent to execute driver code fr
om an external wake-up event.
• HW latency: latency introduced by the HW to transition between po
wer states.
• SW latency: time for the SW to execute low power transition code, e
.g. IP block save & restore, caches flush/invalidate etc.
• System: ‘everything needed to execute the kernel code', e.g. on OM
AP3, system = CPU0 + CORE (main memory, caches, IRQ controller...).
• Per-device latency: latency of a device (or peripheral). The per-devic
e PM QoS framework allows to control the devices states from the al
lowed devices latency.
• Cpuidle: framework that controls the CPUs low power states (=C-stat
es), from the allowed system latency. Note : Is being abused to contr
ol the system state.
• PM runtime: framework that allows the dynamic switching of resour
ces.

HOW TO SPECIFY THE ALLOWED LATE
NCY.
• The PM QoS framework allows the kernel and user to specify
the allowed latency.
• The framework calculates the aggregated constraint value and
calls the registered platform-specific handlers in order to apply
the constraints at lower level.

PM QoS FRAMEWORK.
• PM QoS is a framework developed by Intel.
• It allows kernel code and applications to set their requirement
s in terms of:
• CPU DMA latency.
• Network latency.
• According to these requirements, PM QoS allows kernel driver
s to adjust their power management.
• See Documentation/power/pm_qos_interface.txt.
• http://free-electrons.com/kerneldoc/latest/power/pm_qos_interface.txt
• Still in very early deployment (only 4 drivers in 2.6.36).

What is the key point of control
ling the latency ?
• The point is to dynamically optimize the power consumption o
f all system components.
• Knowing the allowed latency (from the constraints) and the ex
pected worst-case latency allows to choose the optimum pow
er state.

OMAP.
• OMAP (Open Multimedia Applications Platform) developed b
y Texas Instruments is a category of proprietary system on chi
ps (SoCs) for portable and mobile multimedia applications.
• OMAP devices generally include a general-purpose ARM archi
tecture processor core plus one or more specialized co-proces
sors.
• Earlier OMAP variants commonly featured a variant of the Tex
as Instruments TMS320 series digital signal processor.
• The OMAP family consists of three product groups classified b
y performance and intended application:
• High-performance applications processors
• Basic multimedia applications processors
• Integrated modem and applications processors

OMAP.
TI OMAP3530 on BeagleBoard described

OMAP.
TI OMAP4430 on PandaBoard described

PROBLEM.
• There is no concept of ‘overall latency’.
• No interdependency between PM frameworks
 Ex. on OMAP3 : cpuidle manages only a subset of the power domains
(MPU, CORE).
 Ex. on OMAP3 per-device PM QoS manages the other power domain
s.
 No relation between the frameworks, each framework has its own lat
ency numbers.
• Some system settings are not included in the model
 Mainly because of the (lack of) SW support at the time of the measur
ement session.
 Ex. On OMAP3 : voltage scaling in low power modes, sys_clkreq, sys_
offmode and the interaction with the PowerIC.
• Dynamic nature of the system settings
 The measured numbers are for a fixed setup, with predefined system
settings.
 The measured numbers are constant.

SOLUTION PROPOSAL.
• Overall latency calculation.
• We need a model which breaks down the overall latency into t
he latencies from every contributor:
Latency = latencySW + latencyHW
Latency = latencySW + latencySoC + latencyExternal HW
• LatencySW : time for the SW to save/restore the context of an I
P block.
• LatencySoC : time for the SoC HW to change an IP block state.
• LatencyExternal HW : time to stop/restart external HW. (Ex: extern
al crystal oscillator, external power supply …)
• Note: every latency factor maybe be divided into smaller facto
rs. E.g: On OMAP a DPLL can feed multiple power domains.

3.AutomaticRun-TimeSelectionofPowerPolicesfor
OperatingSystems
20130108
이재열

Problems
• Existing studies one power management make an implicit
assumption
• Only one policy can be used to save power
• Hence, those studies focus on finding the best polices for
unique request patterns

HAPPI(Homogeneous Architecture
for Power Policy Integration)
• HAPPI is currently capable of supporting power policies for
disk, DVD-ROM, and network devices
• But it can easily be extended to support other I/O devices
• Must provide
• A function that predicts idleness and controls a device’s power
state.
• A function that accepts a trace of device accesses, determines
the actions the control function would take, and returns the
energy consumption and access delay from the actions.

• If policy is selected to manage the power state of a specific
device by HAPPI, it is considered activity
• Each device is assigned only one active policy at anytime
• Whenever the device is accessed, HAPPI captures the size and
time of the access
• Also records the energy and delay for each device

• Policy Selection

Implementation
• Linux 2.6.5
• Policies and evaluators are implemented as kernel module
• Experimental hardware is not fully ACPI compliant
• So they implement a function that returns the power,
transition energy and transition delay for each state of each
device
• Policies need these values to compute the power consumed in
each state

Experiments
• Fujitsu laptop hard disk(HDD)
• Samsung DVD drive(DVD)
• NetXtreme integrated wired network card(NIC)
Power states for devices

Experiments
• Workload
1. Web browsing + buffered media playback from DVD
2. Download video and buffered media playback from disk
3. CVS checkout from remote repository
4. E-mail synchronization + unbuffered media playback from DVD
5. Kernel compile

Experiments
• Policies
• Null
• 2-competitive timeout
• Exponential prediction
• Adaptive timeout

Exponential Prediction
• Formulation
• In : the last predicted value
• in : the latest idle period
• a : a constant attenuation factor in the range between 0 to 1
• If a = 0, then In+1 = In
• If a = 1, then In = in
• So, typically a = 1/2

Exponential Prediction
In
in
Actual Idle(in) 6 4 6 4 13 13 13 …
Prediction(In) 10 8 6 6 5 9 11 12 …

Experiments
• Result
Estimated energy consumption for each policy on
devices for experimental workload
Selected policies for devices at each evaluation
Workload 1 2 3 4 5 Workload 1 2 3 4 5

Conclusion
• experiments indicate that policy selection is highly adaptive to
workload and hardware types, supporting our claim that
automatic policy selection is necessary to achieve better
energy savings

1.PerformanceEvaluationofParallelApplicationsonNext
GenerationMemoryArchitecturewithPower-AwarePaging
Method

Problems
• This paper propose solution (architecture and low power
paging algorithm) to reduce energy consumption in HPC (High
Performance Computing) systems
• To demonstrate low power paging algorithm can improve HPC
performance and reduce energy consumption

SOLUTION
• Replace a part of DRAM with MRAM.
• Conduct simulation to evaluate the performance and energy
consumption of several application benchmark.
• Make a trace file of memory access in each application
benchmark by using the Valgrind profiling tool.
• For each memory access that incurs miss, we collect memory
address and profiling results, which are access count on all the
memory pages.
• With the trace files, they replay behavior of application with
our event-driven simulator.

HOW CAN THEY SOLVE ?
• They propose hybrid memory architecture and power aware s
wapping.
• Use MRAM as main memory beside DRAM due to its higher ac
cess speed and low power consumption.
• Use FLASH as fast random-access swap device due to its faster
random access read speed.
• Use MRAM hit rate and threshold in Low Power Paging Algorit
hm to mange the swapping interaction between DRAM/MRA
M and FLASH. Therefore can improve performance and reduce
energy.

Proposition– HybridMemoryArchitecture
andPower AwareSwapping
CPUs
L1 CACHE
L2 CACHE
MRAM DRAM
HOTTER PAGE
COLDER PAGE
FLASH SWAP
Overview of Proposed Low Power Memory Architecture
MainMemory
Larger number of access.
SWAP SWAP SWAP
CACHE

Low-Power Paging Algorithm
Allocate Hot Pages on
MRAM
Profiling Result
Application Running
Page Fault
Memory
Access on
MRAM
MRAM Hit++
Memory Access++
MRAM Hit Rate <-- MRAM Hit / Memory Access
MRAM Hit Rate >
Threshold
Swap Out the Last
Recently Used Page on
DRAM
Swap Out the Last
Recently Used Page on
DRAM or MRAM
L2 Cache Miss
No
Yes
No
No
Yes
Yes
Figure 2 – Algorithmic Flow of Proposed Paging Algorithm
- A trace file also includes profiling results, which are access
counts on all the memory pages.
- Profiling: the per page memory access frequency of a given application
throughout its execution.
- Pre execution trial or sampling with HW assist.
With the trace file, we replay behavior of application with our event-driven simulator.
- Memory access
 L2 Cache Miss
 Collect
 Profiling

Why they need that algorith
m?
• First simple algorithm works as follows:
• We pin downs the hottest pages so that they are never swap out
and allocated on MRAM.
• The remaining pages are allocated onto DRAM and use LRU based
swapping with flash memory.
• This simple algorithm in some case increased application exec
ution time with LRU swapping algorithm.
• Excessive swaps, slowing down the application considerably.

Why they need that algorith
m?
• To resolve this situation, we extend our algorithm by introduce
a metric called MRAM hit rate and its threshold so that applica
tions exhibiting lower locality may use both MRAM and DRAM
as swappable main memory.
• Thr = α x MRAM_SIZE / TOTAL_SIZE.
• α (≈1) is a configurable parameter to be used to determine the
threshold.
• Several preliminary experiments have shown that a threshold
value of 0.9 seems to work for the NAS and other HPC applicat
ions.

CORE IDEAS OF LOW POWER
PAGING ALGORITHM.
• MRAM hit rate is a dynamic value that indicates the ratio of th
e access counts onto MRAM versus memory access to all the
memory at each point in execution time.
• If the ratio is large, we can decide that accesses to MRAM has
sufficient locality such that pages should be pinned down.
• On the other hand, if the ratio is small, the application lacks of
locality and thus the entire main memory should be seen as s
wappable.

CONCLUSION.
• Reduce DRAM capacity aggressively can reduce energy consu
mption, even with swapping.
• The energy consumption can be reduced to 25% by reducing D
RAM capacity.

2.PPFS:AScalableFlashMemoryFileSystemfortheHybrid
ArchitectureofPhase-changeRAMandNANDFlash

NAND Flash Memory
• NAND flash memory structure
• Page (2KB) : Read and Write Unit
• Block (64 pages = 128KB) : Erase Unit
• NAND flash memory is beauty
• Non-volatility
• Fast access time (No seek latency)
• Low power consumption
• Relatively large capacity
• Shock-resistance
• NAND flash memory is beast
• Erase before write : The page should be erased first in
order to update data on that page
• Slow write : Support only page-level write and 10x slower
than read
• Limited life time : Ensure 100K ~ 1M erase cycles
Ref) K9F1G08X0A Datasheet

Feature of PRAM
Source: Motoyuki Ooishi, Nikkei Electronics Asia, Oct. 2007
 PRAM memory
 Random access memory
 Non-volatile memory
 Low leakage energy
 High density: 4x denser than DRAM
 Limited endurance

NAND flash memory VS. PRAM
1. KPS5615EZM Data Sheet, 2. K9G8G08U0M Data Sheet
PRAM1 NOR SLC NAND MLC NAND2
Volatility Non-volatile Non-volatile Non-volatile Non-volatile
Random access Yes Yes No No
Unit of write Word (2byte) Word (2byte) Page (2Kbyte) Page (2Kbyte)
Read speed 50ns/word 100ns/word 25us/page 60us/page
Write speed 5us/word 11.5us/word 200us/page 800us/page
Erase speed N/A 0.7s/64KB 2ms/128KB 1.5ms/128KB
Program
endurance
108 105 106 105
Size 32MByte 32MByte ~1GB 4GB+
Others • Serial program
• Serial program
• Paired page
damage

JFFS2(Journaling Flash File System)
• Developed by Redhat eCos in 2001
• Designed for NOR flash memory at the first time
• Supporting data compression
– Good for reducing total page write
– Additional computational overhead
• Log-structured File system
– Any file system modification is appended to the log
• Scalability problem
– Need full scan at a mount time
– Manage all metadata in main memory
• Directory structure, File indexing structure
Scan area
JFFS
Ref. D. Woodhouse, “JFFS: The journaling flash file system,” presented at the Ottawa Linux
Symposium, 2001.

YAFFS2 (Yet Another Flash File System)
• Developed by Aleph One in 2003
• Designed specifically for NAND flash memory
– Use spare region to store the file metadata
– Need to scan entire spare region
• Reduced mounting time comparing with JFFS2
– Manage all metadata in main memory
• Directory structure, File indexing structure
Scan area
YAFFS
Ref. http://www.yaffs.net/

CFFS (Core Flash File System)
• Developed By CORE Lab in 2006
• Metadata separation
– Metadata and data is written to different blocks in NAND flash
– Scanning only the metadata blocks  Reduced mounting time
• Store file indexing structure in NAND flash memory
– Reduce the main memory usage
– Manage directory structure in main memory
• CFFS2 limitation
– Need extra metadata write operation
• Updating file index in NAND flash memory
– Wear-leveling problem
• Metadata block is updated more frequently Scan area
CFFS2
Ref. S. H. Lim and K. H. Park, “An efficient nand flash file system for flash memory storage,”
IEEE Transactions on Computers, vol. 55, no. 7, pp. 906–912, 2006.

Previous flash file systems
Feature Pros. Cons.
JFFS2
[2001]
• LFS approach
• Data compression
• Node management
• Reliable
• Metadata update overhead
• Node management overhead
YAFFS2
[2003]
• LFS approach
• Using spare region
• Reduced mounting time
CFFS
[2006]
• LFS approach
• Metadata separation
• File indexing in NAND
• Reduced mounting time
• Reduced GC overhead
• Scalability problem remaining
• Extra write overhead
• Wear-leveling problem

Metadata update problems
Write 512 or 2KiB

Scalability problems
2. Use of main memory
Scan area Non-scan area
1. Scan area comparison
>>
Open(“/dir/a.txt”)
i-number
Location of inode
Location of data
Accessing a file ‘/dir/a.txt’
Type of Index JFFS, YAFFS CFFS
1. Find i-number using
path name
In memory
directory
In memory
directory
2. Find inode using i-
number
In memory inode
map
In memory inode
map
3. Find file data
In memory file
index
In NAND file
index
YAFFS CFFSJFFS

Solution of Metadata update
Write 2Btyte

PFFS Scalability: Mounting
time
• PFFS has minimized and fixed mounting time
– All metadata are connected from root directory in PRAM
– PFFS does not need to scan the NAND flash memory
YAFFS CFFSJFFS PFFS
Scan area Non-scan area
>> >
Scan area comparison

PFFS Scalability: Memory use
• PFFS use no DRAM main memory for metadata structure
– Most of metadata structures of PFFS are contained in PRAM
Type of Index
JFFS,
YAFFS
CFFS PFFS
1. Find i-number
using path name
In memory
directory
In memory
directory
In-PRAM
directory
2. Find inode using
i-number
In memory
inode map
In memory
inode map
Simple
calculation
3. Find file data
In memory
file index
In NAND file
index
In-PRAM data
pointers
Open(“/dir/a.txt”)
i-number
Location of inode
Location of data
Accessing a file ‘/dir/a.txt’
Main memory use

Evaluation
• CPU: Samsung S3C2413 (ARM 926EJ)
• Mem : 64MB DRAM
• 1GB MLC NAND, 32MB PRAM
• NAND flash memory characteristics
• Benchmark: PostMark
• Benchmark for short-lived, small file
read/write performance
• Comparison with YAFFS2

Conclusion
• PFFS solves the scalability problems of previous flash file
systems by using the hybrid architecture of PRAM and NAND
flash memory
• Mounting time and memory usage of PFFS are O(1)
• The performance of PFFS is 25% better than YAFFS2 for small
file writes

3.AddressTranslationTechniqueforLargeNANDFlash
MemoryusingPageLevelMapping

Problems
• In page level mapping scheme, relocation of data is possible as
page size
• But, disadvantage is the large size of mapping table
• Ex) In 64GB SSD
• If using block level mapping, size of mapping table is 512KB
• If using page level mapping, size of mapping table is 64MB
• So most of the actual commercial SSD uses a hybrid scheme
based on block level mapping scheme

Page Level Mapping scheme
address translation techniques
• The entire mapping table is maintained in the NAND
• caching frequently used mapping table to DRAM
• Use FTL-TLB and FTL mapping directory structure

Page table management in Demand
Paging Memory System
• Using page level mapping in the NAND Flash memory and
using Demand Paging scheme on the memory system are
similar

FTL-TLB
• manage mapping table as a section
• A section stored mapping table about NAND Flash memory
one block
• Ex) If a block has 128 pages, size of a section is 128 * 4B
• The number of sections is the same that entire block numbers
in NAND

FTL Mapping Directory
• FTL Mapping Directory is allocated in DRAM
• FTL Mapping Directory has as much as the number of sections
Whether cashed or not in FTL-TLB
Whether updated or not in NAND

Evaluation
• Workload
• Goal is 64GB SSD
• Use Virtual Box and 64GB HDD and Windows XP
• Collect access trace
• Daily_usage and multi_program are typical environment
• Install_update is Windows update and program install
• Large_file is copy large file
Trace Requests Data size [MB]
Daily_usage 545031 10270.78
Multi_program 309262 3070.669
Install_update 1022856 14072.22
Large_file 45593 2810.333

Evaluation
• Replacement Algorithm
• OPT(The Optimal Algorithm)
• LRU
• LRFU(Last Recently/Frequently Used Replacement)
• LRU2(LRU-K)
• LIRS(Low Inter-Reference recency Set)
• CFLRU(Clean First LRU)
• LRU-WSR(LRU-Write Sequence Reordering)

CFLRU(Clean First LRU)
• If all page frames are clean pages or dirty pages then CFLRU is
the same LRU algorithm.
• But Example: all page frames have dirty pages and clean pages.
• CFLRU divides the LRU list into two regions.
• The working region consists of recently used pages and most of
cache hits are generated in this region.
• The clean-first region consists of pages which are candidates for
eviction.
• CFLRU selects a clean page to evict in the clean-first region
first.
• If there is no clean page in this region, a dirty page at the end
of the LRU list is evicted.

CFLRU(Clean First LRU)
P1 P2 P3 P4 P5 P6 P7 P8
5(D) 2(C) 3(D) 7(C) 1(C) 4(D) 6(C) 8(D)
Working Region
C : Clean Page
D : Dirty Page Clean First Region
P7 P1 P2 P3 P4 P5 P6 P8
9(C) 5(D) 2(C) 3(D) 7(C) 1(C) 4(D) 8(D)
P5 P7 P1 P2 P3 P4 P6 P8
10(C) 9(C) 5(D) 2(C) 3(D) 7(C) 4(D) 8(D)
P4 P5 P7 P1 P2 P3 P6 P8
11(C) 10(C) 9(C) 5(D) 2(C) 3(D) 4(D) 8(D)
P2 P4 P5 P7 P1 P3 P6 P8
12(C) 11(C) 10(C) 9(C) 5(D) 3(D) 4(D) 8(D)
Access 9
Access 10
Access 11
Access 12
Evict P7
Evict P5
Evict P4
Evict P2

LRU-WSR(LRU-WriteSequenceReordering)
• We have some concepts: Cold dirty page, And Cold flag.
• If the page is dirty and cold-flag is set, this page regarded as a
cold dirty page.
• We have example: LRU-WSR uses a page list L and an
additional flag - Cold flag.

LRUWSR(LRU-WriteSequenceReordering)
P1 P2 P3 P4 P5 P6 P7 P8
5(D) 2(C) 3(D) 7(C) 1(C) 4(D) 6(C) 8(D)
Cf=0 Cf=0 Cf=0 Cf=0 Cf=0 Cf=1 Cf=0 Cf=0
Cf : Cold flag
P7 P8 P1 P2 P3 P4 P5 P6
9(C) 8(D) 5(D) 2(C) 3(D) 7(C) 1(C) 4(D)
Access 9 Evict P7
P6 P7 P8 P1 P2 P3 P4 P5
10(C) 9(C) 8(D) 5(D) 2(C) 3(D) 7(C) 1(C)
Access 10 Evict P6
P5 P6 P7 P8 P1 P2 P3 P4
11(C) 10(C) 9(C) 8(D) 5(D) 2(C) 3(D) 7(C)
Access 11 Evict P5
P4 P5 P6 P7 P8 P1 P2 P3
12(C) 11(C) 10(C) 9(C) 8(D) 5(D) 2(C) 3(D)
Access 12 Evict P4

Evaluation
• Cache hit ratio
Daily_usage Multi_program
Large_fileInstall_update
Show cache hit ratio of
over 95% in all cases
except Large_file.

Evaluation
• Overhead
Daily_usage Multi_program
Large_fileInstall_update
In most of workload
when cache size is
more than 512KB
overhead is less than
2%.

Evaluation
• Memory usage
• 64GB SSD has 131072 blocks of 512 KB size
• A entry use 6B in FTL mapping directory
• So size is 768KB
Page mapping
table
512KB
FTL-TLB
1024KB
FTL-TLB
64 GB SSD 64MB 1280KB 1.9% 1792KB 2.7%

Conclusion
• Although FTL-TLB uses only 512KB, cache hit ratio is over 90%
• Cache over head is under 2%
• Memory usage is only 1.9% rather than full mapping table

4.PerformanceOptimizationTechniquesforLegacyFile
SystemsonFlashMemory

Problems
• No research about File system optimization on Flash
• Legacy cluster allocation scheme for hard disk is not suitable
• Hard disk can in place update
• But Flash can not do

Solutions
• AFCA(Anti-Fragmentation Cluster Allocation)
• New Fragmentation for Flash
• Data invalidation scheme
• If data is not used any more, file system announce to FTL for
reduce unnecessary overhead

AFCA(Anti-Fragmentation Cluster
Allocation)
• File fragmentation
• The number of logical blocks to save the file : N
• The number of logical blocks to actually used : n
• If n>N, file is fragmented
• Free space fragmentation
• Minimum number of logical blocks with free space : M
• The number of logical blocks in free space is located : m
• If m>M, Free space is fragmented

Allocation)
M : 4
m : 5
N : 2
n : 2
M : 2
m : 3
N : 2
n : 3
N : 2
n : 2
M : 2
m : 2
M : 2
m : 2
N : 2
n : 3

Allocation)
basic cluster allocation(BCA) AFCA

Allocation)
• Considerations
• If file is larger than logical block, allocate as a logical block. It’s
good to reduce file fragmentation
• After allocate all clusters in a block, allocate next logical block. it
is good to reduce free space fragmentation
• File is considered as small file and if file exceeds the threshold,
file is considered as large file

Allocation)
• Free logical blocks(F-logical block)
• All clusters are unused state in logical block
• Logical blocks for small file(S-logical block)
• Logical blocks for large file(L-logical block)

Allocation)
F-logical block
S-logical block L-logical block
cluster allocation
at small file
Return some cluster

Data invalidation scheme
• If sector is not used any more, file system announce to FTL
• FTL checks sector that is invalid data on page mapping table

Evaluation
• Use Ext2, Kernel 2.4 and NAND Flash Emulator
• Page size is 2KB
• Block size is 128KB
• FTL is Z-FTL based on block mapping

Conclusion
• When we use AFCA
• Fragmentation is reduced up to 53%
• Performance is improved up to 46%
• When we use data invalidation
• Write performance is improved up to 22%

참여기관_발표자료-국민대학교 201301 정기회의

Empfohlen

Empfohlen

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

Andere mochten auch

Andere mochten auch (9)

Ähnlich wie 참여기관_발표자료-국민대학교 201301 정기회의

Ähnlich wie 참여기관_발표자료-국민대학교 201301 정기회의 (20)

참여기관_발표자료-국민대학교 201301 정기회의