1. AN ENERGY-EFFICIENT CACHE MEMORY
DESIGN USING VERILOG HDL
NAME: DHRITIMAN HALDER
USN: 1RE12LVS05
DEPARTMENT: M.TECH (VLSI DESIGN AND EMBEDDED
SYSTEMS)
SEMESTER: 4TH
COURSE CODE: 12LVS43
SUBJECT CODE: 12EC943
UNDER GUIDANCE OF: PROF. PRASAD S.N
2. CONTENTS
• Introduction to Cache Memory
• Types of Cache Memory
• Cache Read Operation
• Different Cache Mapping Techniques
• Cache Write Operation
• Different Write Policies
• Problem Definition
• Literature Survey
• Proposed Work
• Results
• Conclusion
• Publication
• References
3. Introduction to Cache Memory
• The processor requires data and instructions while performing a
specific task.
• Data and instructions are stored in main memory.
• A cache memory keeps frequently needed data and instructions close
to the processor to accelerate the speed of operation.
Cache and Main Memory
4. Types of Cache Memory
• Data Cache: A data cache speeds up data fetches and stores; it is
usually organized as a hierarchy of levels (L1, L2, etc.).
• Instruction Cache: An instruction cache speeds up fetches of
executable instructions.
• Translation Lookaside Buffer: A translation lookaside buffer
(TLB) speeds up the translation of virtual addresses into the physical
addresses of requested data and instructions.
6. Different Cache Mapping Techniques
• Direct Mapping: Any main memory location can be loaded
into one fixed location in cache.
– Advantage- No search is required as there is only one location in
cache for each main memory location.
– Disadvantage- Hit ratio is poor as there is only one fixed location
for each main memory location.
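The fixed mapping above can be sketched in a few lines of Python (the project itself targets Verilog HDL; the block size and line count here are illustrative assumptions, not values from the presentation):

```python
# Direct-mapped lookup: each memory block maps to exactly one cache line.
# Illustrative parameters: 16-byte blocks, 64 cache lines.
BLOCK_SIZE = 16
NUM_LINES = 64

def direct_map(address: int):
    """Return (line index, tag) for a byte address."""
    block = address // BLOCK_SIZE      # which memory block
    index = block % NUM_LINES          # the one fixed cache line
    tag = block // NUM_LINES           # identifies the block held in that line
    return index, tag

# Two addresses NUM_LINES * BLOCK_SIZE bytes apart collide on the same line,
# which is exactly why the hit ratio suffers:
print(direct_map(0x0000))   # (0, 0)
print(direct_map(0x0400))   # (0, 1) -- same line, different tag
```

No search is needed on lookup: the index selects the single candidate line, and one tag comparison decides hit or miss.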
7. Different Cache Mapping Techniques
• Fully Associative Mapping: Any main memory location can be
placed in any location in the cache.
– Advantage- Hit ratio is increased because a main memory location can
occupy any location in the cache.
– Disadvantage- Consumes a lot of energy because the controller unit
must search all tag patterns to determine whether the data is present.
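The energy cost of the full tag search can be illustrated with a small Python sketch (a 4-line cache with hypothetical tag values; the comparison counter stands in for the energy spent by the parallel tag comparators):

```python
# Fully associative lookup: a block may reside in any line, so every
# stored tag must be compared. Each entry is (valid, tag).
cache = [(True, 0x12), (True, 0x34), (False, 0x00), (True, 0x56)]

def fa_lookup(tag: int):
    """Compare the tag against every line; return (hit line or None, compares)."""
    comparisons = 0
    for i, (valid, stored_tag) in enumerate(cache):
        comparisons += 1                 # every line costs one comparison
        if valid and stored_tag == tag:
            return i, comparisons
    return None, comparisons

print(fa_lookup(0x34))   # hit in line 1 after 2 comparisons
print(fa_lookup(0x99))   # miss only after searching all 4 lines
```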
8. Different Cache Mapping Techniques
• Set Associative Mapping: Any main memory location can be
loaded into any one of a small set of locations (ways) in the cache.
– Advantage- Hit ratio is higher than in a direct-mapped cache, and
energy consumption is lower than in a fully associative cache because
only a limited number of tag patterns needs to be searched.
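The compromise can be sketched in Python (illustrative parameters: 16-byte blocks, 32 sets, 2 ways; only the ways of one set are ever searched):

```python
# 2-way set-associative mapping: a block maps to one set but may occupy
# either way within that set.
BLOCK_SIZE, NUM_SETS, WAYS = 16, 32, 2

def set_map(address: int):
    """Return (set index, tag) for a byte address."""
    block = address // BLOCK_SIZE
    set_index = block % NUM_SETS       # which set to search
    tag = block // NUM_SETS            # compared against WAYS stored tags
    return set_index, tag

# Only WAYS tag comparisons per access, versus NUM_SETS * WAYS
# comparisons in an equally sized fully associative cache.
print(set_map(0x0200))   # (0, 1): 0x200 // 16 = block 32 -> set 0, tag 1
```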
10. Different Write Policies
• Write-back Policy: In a write-back cache only the cache memory is
updated during a write operation, and the line is marked with a dirty bit.
Main memory is updated later, when the data block is replaced.
– Advantage- Consumes less energy during write operations because
main memory is not updated simultaneously.
– Disadvantage- The write algorithm is complex and can result in
data inconsistency.
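A minimal Python sketch of the dirty-bit mechanism (class and function names are illustrative), showing the window in which cache and main memory disagree:

```python
# Write-back sketch: a write updates only the cache line and sets its
# dirty bit; main memory is written only when the line is evicted.
class WriteBackLine:
    def __init__(self):
        self.data = 0
        self.dirty = False

def write(line, value):
    line.data = value
    line.dirty = True            # main memory is now stale

def evict(line, memory, addr):
    if line.dirty:               # write back only if the line was modified
        memory[addr] = line.data
        line.dirty = False

memory = {0x10: 0}
line = WriteBackLine()
write(line, 42)
print(memory[0x10])              # 0: memory not yet updated (inconsistency)
evict(line, memory, 0x10)
print(memory[0x10])              # 42: memory updated only at eviction
```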
11. Different Write Policies
• Write-through Policy: In a write-through cache both the cache
memory and the main memory are updated simultaneously during a
write operation.
– Advantage- Maintains data consistency through the memory
hierarchy.
– Disadvantage- Consumes a lot of energy due to the increased number
of accesses at the lower memory level.
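For contrast with the write-back sketch, the write-through behavior in the same illustrative style:

```python
# Write-through sketch: every write updates cache and main memory together,
# keeping them consistent at the cost of one lower-level access per write.
def write_through(cache, memory, addr, value):
    cache[addr] = value
    memory[addr] = value   # simultaneous lower-level update (the energy cost)

cache, memory = {}, {}
write_through(cache, memory, 0x20, 7)
print(cache[0x20], memory[0x20])   # 7 7 -- always consistent
```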
12. Different Write Policies
• Write-around Policy: In a write-around cache only main
memory is updated during a write operation.
– Advantage- Consumes less energy and maintains data consistency.
– Disadvantage- Very application specific; suitable only where
recently written data will not be needed again soon.
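Completing the trio, a write-around sketch in the same illustrative style, showing why a read soon after a write will miss:

```python
# Write-around sketch: a write bypasses the cache and updates only main
# memory, so a later read of that address misses in the cache.
def write_around(cache, memory, addr, value):
    memory[addr] = value   # the cache is not filled on a write

cache, memory = {}, {}
write_around(cache, memory, 0x30, 9)
print(0x30 in cache)   # False: a subsequent read will miss
print(memory[0x30])    # 9
```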
13. Problem Definition
• Cache Coherence: In a multiprocessor system or a multi-core
processor, data inconsistency may occur between adjacent levels or
within the same level because of data sharing with main memory, process
migration, etc.
• Soft Error: Due to radiation effects, data stored in memory can become
erroneous; this is known as a soft error.
• Solution: Write-through is preferred because it updates data in main
memory simultaneously and maintains data consistency.
• Problem: A write-through cache consumes a lot of energy due to the
increased number of accesses at the lower memory level.
14. Literature Survey
• Partitioning Cache Data Array into Sub-banks
– The cache data array is partitioned horizontally into several segments.
– Each segment can be powered up individually.
– Only the segment that contains the required data / instruction is
powered up.
– Power consumption is reduced by eliminating unnecessary
accesses.
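The saving can be modeled with a toy Python sketch (bank count, lines per bank, and the unit energy cost are illustrative assumptions, not figures from the cited work):

```python
# Sub-banking sketch: the data array is split into banks and only the
# bank that holds the requested line is powered for an access.
NUM_BANKS = 4
LINES_PER_BANK = 16

def target_bank(line: int) -> int:
    """Bank that holds the given cache line."""
    return line // LINES_PER_BANK

def access_energy(line: int, sub_banked: bool) -> int:
    """Energy units for one access: 1 unit per powered bank."""
    return 1 if sub_banked else NUM_BANKS

print(target_bank(37))                       # line 37 lives in bank 2
print(access_energy(37, sub_banked=False))   # 4: all banks powered
print(access_energy(37, sub_banked=True))    # 1: only the target bank
```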
15. Literature Survey
• Division of Bit-Lines into Small Segments
– Each column of bit-lines is split into several segments.
– All segments are connected to a common line (pre-charged high).
– The address decoder identifies the segment targeted by the row address
and isolates all but the targeted segment.
– Power consumption is lower because the capacitive loading is reduced.
16. Literature Survey
• Way Concatenation Technique
– The memory address is split into a line-offset field, an index field,
and a tag field.
– The cache decodes the index field of the address and compares the
stored tag with the address tag field.
– If there is a match, the multiplexor routes the cache data to the output.
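The field split described above can be sketched with bit operations in Python (the field widths are illustrative assumptions: 16-byte lines and 128 sets within a 32-bit address):

```python
# Split a byte address into (tag, index, line offset) using shifts and masks,
# mirroring how the fields would be wired in hardware.
OFFSET_BITS = 4    # log2(16-byte line)
INDEX_BITS = 7     # log2(128 sets)

def split_address(addr: int):
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

print(split_address(0x12345))   # (36, 52, 5)
```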
17. Literature Survey
• Location Cache
– Works in parallel with the TLB and the L1 cache.
– On a read miss the way information is available because the physical
address has already been translated by the TLB.
– The L2 cache is then accessed like a direct-mapped cache.
29. Conclusion
• Several new components have been introduced in the L1 cache,
such as the way-tag array, way-tag buffer, and way decoder.
• The L1 cache is built into the processor, and area overhead is
a major drawback of this architecture.
• Layout designers have to handle the place-and-route process very
carefully.
31. References
• [1]. J. Dai and L. Wang, “An energy-efficient L2 cache architecture using way tag information under
write-through policy,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 21, no. 1,
Jan. 2013.
• [2]. G. Konstadinidis, K. Normoyle, S. Wong, S. Bhutani, H. Stuimer, T. Johnson, A. Smith, D. Cheung, F.
Romano, S. Yu, S. Oh, V. Melamed, S. Narayanan, D. Bunsey, C. Khieu, K. J. Wu, R. Schmitt, A. Dumlao,
M. Sutera, J. Chau, and K. J. Lin, “Implementation of a third-generation 1.1-GHz 64-bit microprocessor,”
IEEE J. Solid-State Circuits, vol. 37, no. 11, pp. 1461–1469, Nov. 2002.
• [3]. S. Rusu, J. Stinson, S. Tam, J. Leung, H. Muljono, and B. Cherkauer, “A 1.5-GHz 130-nm itanium 2
processor with 6-MB on-die L3 cache,” IEEE J. Solid-State Circuits, vol. 38, no. 11, pp. 1887–1895, Nov.
2003.
• [4]. D. Wendell, J. Lin, P. Kaushik, S. Seshadri, A. Wang, V. Sundararaman, P. Wang, H. McIntyre, S. Kim,
W. Hsu, H. Park, G. Levinsky, J. Lu, M. Chirania, R. Heald, and P. Lazar, “A 4 MB on-chip L2 cache for a
90 nm 1.6 GHz 64 bit SPARC microprocessor,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech.
Papers, 2004, pp. 66–67.
• [5]. http://en.wikipedia.org/wiki/CPU_cache
• [6]. C. Su and A. Despain, “Cache design tradeoffs for power and performance optimization: A case study,”
in Proc. Int. Symp. Low Power Electron. Design, 1997, pp. 63–68.
• [7]. K. Ghose and M. B. Kamble, “Reducing power in superscalar processor caches using subbanking,
multiple line buffers and bit-line segmentation,” in Proc. Int. Symp. Low Power Electron. Design, 1999, pp.
70–75.
• [8]. C. Zhang, F. Vahid, and W. Najjar, “A highly-configurable cache architecture for embedded systems,”
in Proc. Int. Symp. Comput. Arch., 2003, pp. 136–146.
• [9]. K. Inoue, T. Ishihara, and K. Murakami, “Way-predicting set-associative cache for high performance
and low energy consumption,” in Proc. Int. Symp. Low Power Electron. Design, 1999, pp. 273–275.
• [10]. A. Ma, M. Zhang, and K. Asanović, “Way memoization to reduce fetch energy in instruction caches,” in
Proc. ISCA Workshop Complexity Effective Design, 2001, pp. 1–9.
• [11]. T. Ishihara and F. Fallah, “A way memorization technique for reducing power consumption of caches
in application specific integrated processors,” in Proc. Design Autom. Test Euro. Conf., 2005, pp. 358–363.
• [12]. R. Min, W. Jone, and Y. Hu, “Location cache: A low-power L2 cache system,” in Proc. Int. Symp.
Low Power Electron. Design, 2004, pp. 120–125.