Algorithmic Memory Increases Memory
Performance By an Order of Magnitude


                        Sundar Iyer
              Co-Founder & CTO Memoir Systems
       Track F, Lecture 2: Intellectual Property for SoC & Cores




                                May 2, 2012
Problem: Processor-Embedded Memory Performance Gap

Performance degradation can be more significant and is getting worse!

[Chart: "Normalized Growth", annotated "Processor Embedded Memory Performance Gap"]
*Source: Hennessy and Patterson, 5th Edition




Why is Embedded Memory Slow?

[Timing diagram: clk cycles 1 to 15; read addresses A through H on addr, with data A through H returned on data in the same order]
One operation per memory clock cycle




        How can we increase memory performance
         without increasing memory clock speed?

Solution: Algorithmic Memory® = Memory Macros + Algorithms

                    Physical Memory            Algorithmic Memory
  Built from        1P macros @ 500 MHz        1P macros @ 500 MHz + Extra Memory
  Presented as      1P @ 500 MHz               4P @ 500 MHz
  Throughput        500 Million MOPS*          2000 Million MOPS*
                                               More Ports, Same Clock

  * MOPS: Memory Operations Per Second
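
The throughput figures on this slide follow from ports times clock rate. A minimal sketch of that arithmetic, assuming each port completes one operation per clock cycle; the function name `peak_mops_millions` is illustrative and does not come from the slides:

```python
def peak_mops_millions(ports: int, clock_mhz: int) -> int:
    """Peak memory operations per second, in millions.

    Assumes every port completes one operation per clock cycle.
    """
    return ports * clock_mhz


# Figures from the slide: both memories run at 500 MHz.
physical = peak_mops_millions(ports=1, clock_mhz=500)     # 1P @ 500 MHz
algorithmic = peak_mops_millions(ports=4, clock_mhz=500)  # 4P @ 500 MHz

print(physical, algorithmic, algorithmic // physical)
```

The clock is unchanged between the two cases; the 4x gain comes entirely from the port count.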




Solution Overview
2X Performance for ~15% area overhead

  • Works with any embedded physical memory
  • RTL based: no circuit or layout changes
  • Simultaneous accesses to the same address, row, column, or bank (no exceptions)
  • Each port can access the entire memory address space
  • Exhaustively formally verified and transparent to end-user

[Figure: Algorithmic Memory built from an array of 1P physical memories plus Extra Memory, presenting four Addr/Data port pairs]

   Using Physical 1-Port Memory to Build any Multiport Functionality
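
The slides do not describe the algorithms themselves. As a rough illustration of the general idea only, the sketch below uses a widely known technique, an XOR parity bank, to serve two reads per cycle from single-port banks even when both reads hit the same bank. It is not claimed to be Memoir's method, and the class and method names (`TwoReadMemory`, `read2`) are hypothetical. Writes are simplified: each takes its own cycle and updates the parity in place.

```python
class TwoReadMemory:
    """2-read memory modeled on single-port banks plus one XOR parity bank."""

    def __init__(self, num_banks: int, rows_per_bank: int):
        self.num_banks = num_banks
        self.rows = rows_per_bank
        self.banks = [[0] * rows_per_bank for _ in range(num_banks)]
        # The "extra memory": parity[row] is the XOR of that row in every bank.
        self.parity = [0] * rows_per_bank

    def _split(self, addr: int):
        return addr // self.rows, addr % self.rows

    def write(self, addr: int, value: int) -> None:
        # Simplification: a write is modeled as its own cycle.
        bank, row = self._split(addr)
        self.parity[row] ^= self.banks[bank][row] ^ value
        self.banks[bank][row] = value

    def read2(self, addr_a: int, addr_b: int):
        """Serve two reads in one cycle.

        Returns (value_a, value_b, accesses), where accesses counts how often
        each bank was touched; the last slot is the parity bank.
        """
        accesses = [0] * (self.num_banks + 1)
        bank_a, row_a = self._split(addr_a)
        bank_b, row_b = self._split(addr_b)

        accesses[bank_a] += 1
        value_a = self.banks[bank_a][row_a]

        if bank_b != bank_a:
            accesses[bank_b] += 1
            value_b = self.banks[bank_b][row_b]
        else:
            # Bank conflict: rebuild the word from the parity bank and the
            # same row of every other bank, leaving the busy bank alone.
            accesses[self.num_banks] += 1
            value_b = self.parity[row_b]
            for other in range(self.num_banks):
                if other != bank_a:
                    accesses[other] += 1
                    value_b ^= self.banks[other][row_b]
        return value_a, value_b, accesses


mem = TwoReadMemory(num_banks=4, rows_per_bank=8)
for addr in range(32):
    mem.write(addr, addr * 3 + 1)

# Addresses 5 and 6 both live in bank 0, yet no bank is accessed twice.
a, b, accesses = mem.read2(5, 6)
print(a, b, max(accesses))
```

In this toy layout the parity bank costs one extra bank per four data banks; the overhead figure quoted on the slide refers to Memoir's product, not to this sketch.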

Usage & Adoption
 Easily Interface
  • Presents standard memory interface
  • Adds no clock cycle latency
  • Used as a drop-in replacement

 Readily Integrate
  • Fits seamlessly in SoC design flow
  • Used in SoCs - ASICs, ASSPs, GPPs

 Rapidly Implement
  • Supports any process, node or foundry

[Figure: Physical Memory (128 Width × 8K Depth) wrapped by Memoir IP, with four A/D port pairs; identical pinout to standard memory]

Increases Density

[Chart: density comparison, axis labeled "Denser"; entries: Physical 1P Memory, Algorithmic 2P Memory, Physical 2P Memory]
Normalized for 1P = 1 Mb/mm²


Reduces Total Power




                      Based on 40nm example



Configurable Performance
[Chart: three axes: Performance (MOPS), Memory Density (Mb/mm²), Power Efficiency (Mb/mW)]
  • Regions: higher performance algorithmic memories, higher density algorithmic memories, power efficient algorithmic memories
  • Points labeled: 4P, 2P, Algorithmic 2P, SP
  • Legend: Physical Memory, Higher Performance Algorithmic Memory, Area Efficient Algorithmic Memory, Power Efficient Algorithmic Memory


Increases Portfolio of Available Memories
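[Figure: wheel of memory port configurations; the legend distinguishes Physical Memory from Algorithmic Memory]
Configurations shown: 1RW, 2RW, 1R1W, 2R1W, 3R1W, 1R2W, 1R3W, 2R2W, 2R/1W, 3R/1W, 4R/1W, 1R/2W, 1R/4W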



Rapid Memory Analysis & Generation
 Specify Memory (feed inputs)
  • Acceleration: 2X, 3X, 4X
  • Ports: # Read, # Write
  • Capacity: # Width, # Depth
  • Latency: Reduced, Standard
  • Optimization: Power, Area

 Push Button Analysis
  • Stages: GUI, SYN, GEN, CHK
  • Real-time feedback

 Generate Algorithmic Memory

 Library & Building Blocks
  • SRAM, Register File, eDRAM, Standard Cell
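
The inputs listed above (read ports, write ports, width, depth) can be thought of as a small specification record. The sketch below is purely illustrative: `MemorySpec` and its fields are hypothetical names and do not reflect Memoir's actual tool interface. The example values reuse the 128 Width × 8K Depth memory from the Usage & Adoption slide, assuming 8K means 8192 words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemorySpec:
    """Hypothetical record of the inputs a designer would specify."""
    read_ports: int
    write_ports: int
    width_bits: int
    depth_words: int

    @property
    def label(self) -> str:
        # Port notation as used in the slides, e.g. "2R1W".
        return f"{self.read_ports}R{self.write_ports}W"

    @property
    def capacity_bits(self) -> int:
        return self.width_bits * self.depth_words


# Assumption: "8K Depth" is taken to mean 8 * 1024 words.
spec = MemorySpec(read_ports=2, write_ports=1, width_bits=128, depth_words=8 * 1024)
print(spec.label, spec.capacity_bits)
```

With these values the capacity works out to 1,048,576 bits, that is 1 Mb under the binary convention.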

Multiport Memory Usages
Memory types              Usages
3R1W, 2R1W                Descriptor and Free Lists, Ingress Buffers
                          L2 MAC Lookups, Shared Caches

1R2W, 1R3W                Descriptor and Free Lists, Egress Buffers
                          Cache Coherency Arrays for L2/L3 Caches

2R2W, 1R1W                Netflow, Counters
                          State Tables, Linked Lists

4Ror1W, 3Ror1W, 2Ror1W    Data and Tag Arrays for L2, L3 Caches
                          Route Lookup Tables
                          ACL Tables

Exhaustive Formal Verification Reduces Risk

 Independently Verify Logic
  • Mathematically proven algorithms
  • Formally, exhaustively verified RTL

 Separately Test Physical Memories
  • Supports 3rd party DFT methodology
  • Transparent customer BIST, BISR
  • Doesn't need complex multiport BIST

[Figure: Algorithmic Memory (Memoir IP) wrapping Physical Memory; the SRAM sits inside a BIST wrapper with SCAN; four A/D port pairs]




Tier-1 OEM Evaluation: Performance, Area and Power Benefits

 Large ASIC (24mm × 24mm)
  Area 576 mm²
   • 800 Mb of total memory
   • 165 Memory Instances
  Versatile memories required
   • 4R/1W, 2R1W, 1R2W memories

 Algorithmic Memory Solution (21mm × 21mm)
  Area 441 mm²
   • Area Savings of 135 mm² (23% die)
   • 136 Memory Instances Accelerated
  Power Savings > 12W
  4X MOPS for select memories
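
The area figures can be checked with a few lines of arithmetic. This sketch assumes both dies are square with the side lengths shown on the slide; the variable names are illustrative:

```python
# Side lengths from the slide, in mm (assumed to describe square dies).
baseline_side_mm = 24
algorithmic_side_mm = 21

baseline_area = baseline_side_mm ** 2        # 576 mm^2
algorithmic_area = algorithmic_side_mm ** 2  # 441 mm^2

savings = baseline_area - algorithmic_area             # 135 mm^2
savings_pct = round(100 * savings / baseline_area)     # share of the original die

print(baseline_area, algorithmic_area, savings, savings_pct)
```

The 23% on the slide is the saving relative to the original 576 mm² die.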


Summary
1.   Increases Port and Clock Performance
2.   Lowers Area and Power
3.   Easy Interface, Integration and Implementation
4.   Creates Versatile Memory Portfolio
5.   Reduces Cost, Risk and Time to Market


       Algorithmic Memories are not a panacea, but present a new solution to
            alleviate the processor embedded memory performance gap




Q&A




        Sundar Iyer
sundaes@memoir-systems.com
    Come Visit Our Booth!
      Memoir Systems



Algorithmic Memory Increases Memory Performance by an Order of Magnitude

  • 1. Algorithmic Memory Increases Memory Performance By an Order of Magnitude Sundar Iyer Co-Founder & CTO Memoir Systems Track F, Lecture 2: Intellectual Property for SoC & Cores May 2, 2012
  • 2. Problem: Processor-Embedded Memory Performance Gap Performance degradation can be more significant more significant and is getting worse! Processor Embedded Memory Performance Gap Normalized Growth *Source: Hennessy and Patterson, 5th Edition May 2, 2012
  • 3. Why is Embedded Memory Slow? 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 clk read One operation per addr A B C D E F G H memory clock cycle data A B C D E F G H How can we increase memory performance without increasing memory clock speed? May 2, 2012
  • 4. Solution: Algorithmic Memory®= Memory Macros + Algorithms Physical Memory Algorithmic Memory 1P @ 500 MHz 1P @ 500 MHz 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P 1P Extra Memory 1P @ 500 MHz 4P @ 500 MHz Allows 500 Million MOPS1 Allows 2000 Million MOPS (1 Memory Operations Per Second) More Ports, Same Clock May 2, 2012
  • 5. Solution Overview: 2X Performance for ~15% area overhead. Any Embedded Physical Memory. RTL Based: No Circuit or Layout changes. Simultaneous Accesses to the same Address, Row, Column, or Bank (no exceptions). Exhaustively Formally Verified & Transparent to end-user. Each Port can access the entire Memory Address. Using Physical 1-Port Memory to Build any Multiport Functionality. [Diagram: 1P macros plus Extra Memory form an Algorithmic Memory with four Addr/Data port pairs.]
  • 6. Usage & Adoption. Easily Interface: presents standard memory interface; adds no clock cycle latency; used as a drop-in replacement. Readily Integrate: fits seamlessly in SoC design flow; used in SoCs (ASICs, ASSPs, GPPs). Rapidly Implement: supports any process, node or foundry. [Diagram: Physical Memory, 128 Width by 8K Depth, wrapped by Memoir IP with four A/D port pairs; Identical Pinout to Standard Memory.]
  • 7. Increases Density. [Chart comparing Physical 1P Memory, Algorithmic 2P Memory, and Physical 2P Memory, with "Denser" marking the direction of higher density.] Normalized for 1P = 1 Mb/mm²
  • 8. Increases Density. Normalized for 1P = 1 Mb/mm²
  • 9. Reduces Total Power. Based on 40nm example
  • 10. Reduces Total Power. Based on 40nm example
  • 11. Configurable Performance. [Chart with axes Performance (MOPS), Memory Density (Mb/mm²), and Power Efficiency (Mb/mW). Labeled regions: Physical Memory, Higher Performance Algorithmic Memory, Area Efficient Algorithmic Memory, Power Efficient Algorithmic Memory. Port labels: SP, 2P, 4P.] Higher performance algorithmic memories; higher density algorithmic memories; power efficient algorithmic memories.
  • 12. Increases Portfolio of Available Memories. [Diagram contrasting Physical Memory with Algorithmic Memory. Port configurations shown: 1R1W, 1R/4W, 2R/1W, 4R/1W, 1R/2W, 3R1W, 1RW, 2RW, 2R2W, 1R2W, 1R3W, 2R1W, 3R/1W.]
  • 13. Rapid Memory Analysis & Generation. Push Button Analysis. Specify Memory: # Read Ports, # Write, # Width, # Depth, Capacity, … Acceleration: 2X, 3X, 4X. Optimization: Reduced Latency, Power, Area. Flow: Feed Inputs, then GUI, SYN, GEN, CHK, with Real-time Feedback; Generate Algorithmic Memory. Standard Cell Library & Building Blocks: Standard SRAM, Register File, eDRAM.
  • 14. Multiport Memory Usages. Descriptor and Free Lists, Ingress Buffers; L2 MAC Lookups, Shared Caches; Descriptor and Free Lists, Egress Buffers; Cache Coherency Arrays for L2/L3 Caches; Netflow, Counters; State Tables, Linked Lists; Data and Tag Arrays for L2, L3 Caches; Route Lookup Tables; ACL Tables. Port configurations shown: 3R1W, 2R1W, 1R2W, 1R3W, 2R2W, 1R1W, 4Ror1W, 3Ror1W, 2Ror1W.
  • 15. Exhaustive Formal Verification Reduces Risk. Independently Verify Logic: mathematically proven algorithms; formally, exhaustively verified RTL. Separately Test Physical Memories: supports 3rd party DFT methodology; transparent customer BIST, BISR; doesn't need complex multiport BIST. [Diagram: Algorithmic Memory made of Physical Memory (SRAM) inside a BIST Wrapper and Memoir IP logic tested by SCAN, with four A/D port pairs.]
  • 16. Tier-1 OEM Evaluation: Performance, Area and Power Benefits. Large ASIC: 24mm x 24mm, Area 576 mm²; 800 Mb of total memory; 165 Memory Instances; versatile memories required (4R/1W, 2R1W, 1R2W memories). Algorithmic Memory Solution (4X MOPS Memories): 21mm x 21mm, Area 441 mm²; Area Savings of 135 mm² (23% die); 136 Memory Instances Accelerated; Power Savings > 12W; 4X MOPS for select memories.
  • 17. Summary: 1. Increases Port and Clock Performance. 2. Lowers Area and Power. 3. Easy Interface, Integration and Implementation. 4. Creates Versatile Memory Portfolio. 5. Reduces Cost, Risk and Time to Market. Algorithmic Memories are not a panacea, but present a new solution to alleviate the processor-embedded memory performance gap.
  • 18. Q&A. Sundar Iyer, sundaes@memoir-systems.com. Come Visit Our Booth! Memoir Systems
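
The headline figures on slides 4 and 16 follow from simple arithmetic, which the short Python sketch below reproduces. The function and variable names are illustrative assumptions and do not come from the talk; the numbers are the ones printed on the slides, and "peak" assumes every port completes one operation on every clock cycle.

```python
def peak_ops_per_second(ports: int, clock_hz: float) -> float:
    """Peak memory operations per second: one operation per port per cycle."""
    return ports * clock_hz


CLOCK_HZ = 500e6  # 500 MHz memory clock (slide 4)

physical_1p = peak_ops_per_second(1, CLOCK_HZ)     # physical single-port memory
algorithmic_4p = peak_ops_per_second(4, CLOCK_HZ)  # algorithmic four-port memory

# Die dimensions from the Tier-1 OEM evaluation (slide 16).
baseline_area_mm2 = 24 * 24      # large ASIC, 24 mm x 24 mm
algorithmic_area_mm2 = 21 * 21   # with algorithmic memories, 21 mm x 21 mm
area_saved_mm2 = baseline_area_mm2 - algorithmic_area_mm2
area_saved_fraction = area_saved_mm2 / baseline_area_mm2

print(f"1P @ 500 MHz: {physical_1p / 1e6:.0f} million ops/s")
print(f"4P @ 500 MHz: {algorithmic_4p / 1e6:.0f} million ops/s")
print(f"Area saved: {area_saved_mm2} mm^2 ({area_saved_fraction:.1%} of the die)")
```

The area saving works out to 135 mm², about 23% of the original 576 mm² die, matching the slide.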

Editor's Notes

  1. Today, a single-port embedded memory can perform one memory operation per clock cycle. Embedded memory performance has therefore traditionally been tied closely to memory clock speed and is ultimately limited by it. Because embedded memory IP providers, responding to application needs for more on-chip memory, made early design trade-offs that favored high density over high speed, memory clock speeds lag behind processor clock speeds. With its Algorithmic Memory technology, Memoir Systems tackles a fundamental question: can we increase memory performance without increasing memory clock speeds? Historically, each generation has relied on circuit improvements and advances in lithography to enhance memory performance. Unfortunately, these approaches alone do not deliver enough improvement and are not keeping up with applications that require higher memory performance. The problem is that we have limited our thinking about embedded memories to a purely circuit- and process-oriented approach. Our focus has thus been on maximizing the number of transistors on a chip and cranking up the clock speed. This has been successful up to a point, but as transistors approach atomic dimensions, we are running into fundamental physical barriers. For this reason, we need to rethink our approach to embedded memory design.
  2. Algorithmic memory technology increases the density (lowers area) of physical memories. This also reduces the leakage power consumption.
  3. Algorithmic memory technology allows system designers to treat memory performance as a configurable entity with its own set of tradeoffs with respect to speed, area and power.
  4. Algorithmic memories can be generated from a small set of base physical memories and provide a broad portfolio of customized memories with any combination of read and write interfaces.
  5. An algorithmic memory synthesis platform can analyze and estimate the resulting area, power and speed of custom memory configurations in seconds, and generate them in a matter of days.
  6. Orange: applications?? Compare sizes, area/power.
  7. Logic is scan-inserted. A scan chain is the way to test normal logic. The flops are scan-enabled flops in the scan chain.
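
The talk does not disclose the algorithms that turn single-port macros plus extra memory into multiport memories. As a hedged illustration of the general idea only, the sketch below uses a well-known textbook technique, an XOR parity bank, to serve two reads per cycle even when both target the same bank. The class name, the address interleaving, and the access counters are assumptions made for this example and should not be read as Memoir's method. Writes are modeled functionally, without accounting for the extra accesses needed to keep parity current in hardware.

```python
from typing import List, Tuple


class XorTwoReadMemory:
    """Two reads per cycle from single-port banks plus one XOR parity bank.

    Illustrative only: a generic technique, not the algorithm from the talk.
    """

    def __init__(self, num_banks: int, rows_per_bank: int) -> None:
        self.num_banks = num_banks
        self.banks = [[0] * rows_per_bank for _ in range(num_banks)]
        # The "extra memory": parity[row] is the XOR of that row in every bank.
        self.parity = [0] * rows_per_bank

    def _split(self, addr: int) -> Tuple[int, int]:
        # Low-order bank interleaving is an arbitrary choice for this sketch.
        return addr % self.num_banks, addr // self.num_banks

    def write(self, addr: int, value: int) -> None:
        # Functional model only: keeps parity consistent, ignores cycle cost.
        bank, row = self._split(addr)
        self.parity[row] ^= self.banks[bank][row] ^ value
        self.banks[bank][row] = value

    def read2(self, addr_a: int, addr_b: int) -> Tuple[int, int, List[int]]:
        """Serve two reads; also return per-bank access counts (parity last)."""
        accesses = [0] * (self.num_banks + 1)
        bank_a, row_a = self._split(addr_a)
        bank_b, row_b = self._split(addr_b)

        # The first read always goes straight to its bank.
        accesses[bank_a] += 1
        value_a = self.banks[bank_a][row_a]

        if bank_b != bank_a:
            # No conflict: the second read uses its own bank's port.
            accesses[bank_b] += 1
            value_b = self.banks[bank_b][row_b]
        else:
            # Bank conflict: rebuild the word from parity and the other banks,
            # so the busy bank is still touched only once this cycle.
            accesses[self.num_banks] += 1
            value_b = self.parity[row_b]
            for other in range(self.num_banks):
                if other != bank_b:
                    accesses[other] += 1
                    value_b ^= self.banks[other][row_b]
        return value_a, value_b, accesses


mem = XorTwoReadMemory(num_banks=4, rows_per_bank=8)
for addr in range(32):
    mem.write(addr, addr * 3 + 1)

# Addresses 0 and 4 both map to bank 0, so the second read is reconstructed.
value_a, value_b, accesses = mem.read2(0, 4)
print(value_a, value_b, max(accesses))
```

Every physical bank, including the parity bank, is accessed at most once in the cycle, which is the constraint a single-port macro imposes. The cost is one extra bank of storage and some XOR logic, in the same spirit as the trade of extra memory for ports that the slides describe.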