SlideShare ist ein Scribd-Unternehmen logo
1 von 20
© 2013 IBM Corporation
PSS
A Prototype Storage Subsystem based on PCM
IBM Research – Zurich
Ioannis Koltsidas, Roman Pletka, Peter Mueller,
Thomas Weigold, Evangelos Eleftheriou
University of Patras
Maria Varsamou, Athina Ntalla, Elina Bougioukou,
Aspasia Palli, Theodore Antonakopoulos
© 2013 IBM Corporation
Phase Change Memory (PCM)
 Based on the thermal threshold switching effect of chalcogenidic meterials
 Two Phases:
2
Set
Reset
Amorphous Phase Crystalline Phase
 Phases have very different electrical resistances (ratio of 1:100 to 1:1000)
 Transition between phases by controlled heating and cooling
 Read time: 100-300 nsec
 Program time: 10-150 μsec
 PCM cells can be reprogrammed at least 106 times
 Performance and price characteristics between DRAM and Flash
© 2013 IBM Corporation
Placing PCM in Servers and Storage Systems
Server System Storage System
RAID ctrl
HBA
CPUs
DRAM
Cache
I/O
bus
RAID ctrl
CPUs
I/O
bus
HA
PCM
PCM
Hybrid PCIe
attached SSD
Hybrid
SAS/SATA
attached SSD
PCM
Hybrid PCIe
attached SSD
Hybrid
SAS/SATA
attached SSD
PCM
DRAM DRAM
PCM
DRAM
Cache PCMPCM
PCM PCM
3
PCMPCM
HDDSSD
All-Flash
Array
Metadata
Tables
© 2013 IBM Corporation
PSS: A PCM-based PCI-e Prototype Card
4
Goal:
Architect a PCM-based device and implement
a fully-functional, high-performance PCI-e card
 Take advantage of the characteristics of PCM and
mitigate its limitations
 Target is workloads dominated by 4kB requests
 Simple, lightweight hardware design
 System integration of multiple cards through
software
 Emphasis on consistently low, predictable latency
PCM PCI-e cardHost
PCM
chips
PCM
Channels
Lean
PCM
Controller
Multi-lane
PCI-e bus
Block
Device
Driver
PCM
pipe #1
PCM
pipe #n
512B
4kB
 Limited in capacity due to the density of commercially available PCM parts (as of early 2013)
 Use cases:
– Caching device
– Metadata store
– Backend for low-latency Key-Value store
– Tiered storage device in a hybrid configuration with Flash
© 2013 IBM Corporation
PCM Parts: Micron P5Q
 90nm technology node
 128 Mbit devices (NP5Q128AE3ESFC0E)
 SPI bus compatible serial interface
 Maximum clock frequency: 66 MHz
 64-byte write buffer
– 120 μsec average program time
– about 0.5 MB/s write bandwidth
5
Block transfer time: 8.24 usecs (64+4 Bytes)
Sector I/O (512B + 64B):
Write: 1.15 msecs (0.86 kIOPs)
Read: 75.24 usecs (13.29 kIOPs)
Very asymmetric
read / write performance
© 2013 IBM Corporation
2D PCM Channel Architecture
6
PCM
(1,NI)
PCM
(1, NI -1)
PCM
(1,1)
k
PCM
(2, NI)
PCM
(2, NI -1)
PCM
(2,1)
k
PCM
(NS, NI)
PCM
(NS, NI -1)
PCM
(NS,1)
k
FSM
Data Buffer
FSM FSMFSM
Sub-channel
PCM
Channel
Controller
Sub-bank
© 2013 IBM Corporation
Read vs. Write Performance Trade-Offs
 A high degree of pipelining:
– Increases the write performance
• By having the long programming times overlap
– May reduce the read performance
• Read times are anyway very short
• Less parallelism due to fewer I/O pins
 For a given budget of I/O pins:
– More sub-channels  Better read performance
– More sub-banks  Better write performance
 Application needs should drive the configuration
– Channels with different geometry in the same device
possible
 We chose a configuration that minimizes write latency
without severely penalizing reads
7
© 2013 IBM Corporation
3x3 Channel Architecture
8
PCM
(1,3)
PCM
(1, 2)
PCM
(1,1)
k
PCM
(2, 3)
PCM
(2, 2)
PCM
(2,1)
k
PCM
(3, 3)
PCM
(3, 2)
PCM
(3,1)
k
FSM
Data Buffer
FSM FSMFSM
PCM Channel Controller
1 Block = 64 bytes
3 blocks
3 blocks
3 blocks
For each user sector (512b= 8x64), we store 64bytes of metadata, i.e., 9 blocks in total
P
C
M
B
a
n
k
© 2013 IBM Corporation
PSS Channel Card
One PCM channel per side
Two PCM channels per card
Channel card
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
[1,1]
[1,3]
[3,1]
[3,3]
[1,2]
[2,1]
[2,3]
[2,2] [3,2]
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
CS/ Din
CLK Dout
[1,1]
[1,3]
[3,1]
[3,3]
[1,2]
[2,1]
[2,3]
[2,2] [3,2]
CS[3] CS[2] CS[1]
CLK D1[1:0] D2[1:0] D3[1:0]DIR
C B A Y0
Y1
Y2Y3Y4Y5
3 to 8
DECODER
PCM channel specs
Data transfer Rate: 49.5 MBps
Sector read time: 13.8 usecs
Sector read rate: 61.6 ksectors/sec
Sector write time: 133.8 usecs
Sector write rate: 14.8 ksectors/sec
1 PCM Channel = 2 Banks (2x3x3)
© 2013 IBM Corporation
PSS PCI-e Card
10
PCMChannelmodule
Xilinx Zynq-7045 FPGA Board
2XHost-attachedPSSCards
 Error correction based on simple BCH codes
o 6 BCH codewords per 512 bytes sector with 4 bits error correction capability per codeword.
 Wear leveling using a Start-Gap scheme
 512MB of DRAM, mostly used as a write cache
 Support for cached writes, direct writes with early completion, direct writes with late completion
8 PCM channels with pipeline support per card
© 2013 IBM Corporation
PSS Experimental Results
Throughput versus Offered Load (4kB pages)
© 2013 IBM Corporation
Random Read
PSS Experimental Results
I/O Completion Latency Distribution
PCM technology programming timeRandom Write (cached)
© 2013 IBM Corporation
Latency distribution comparison to Flash-based devices
 Devices
– PSS PCI-e Card
– MLC Flash PCI-e SSD 1
– MLC Flash PCI-e SSD 2
– TLC Flash SATA SSD
 Experiment
Per-I/O latency measurements for 2 hours of uniformly random 4kB writes at QD=1
(after 12 hours of preconditioning with the same workload)
13
© 2013 IBM Corporation
Latency Profile up to 1msec
PSS MLC1
MLC2 TLC
© 2013 IBM Corporation
Latency Profile up to 10msec
PSS MLC1
MLC2 TLC
© 2013 IBM Corporation
Total Latency Profile
PSS MLC1
MLC2 TLC
© 2013 IBM Corporation
PCM Endurance Measurements
17
minimum write time per sector
maximum write time per sector
mean write time per sector
The effect of PCM aging on
write time and BER
Experimental parameters:
• Random data
• 32 sectors per write cycle
• 4 PCM channels
• 8 PCM banks
• Pipeline is active
Experiment
- Perform 10K write cycles
with random data (x32
sectors)
- Write, read and compare a
set of 32 sectors (single
write cycle)
PCM Write Latency Distribution
Experimental parameters:
• Random data
• 32 sectors per write cycle
• 4 PCM channels
• 8 PCM banks/controllers
• Pipeline is active
Experiment
- Perform 10K write cycles with random data
(x32 sectors)
- Write, read and compare a set of 32 sectors
(single write cycle)
© 2013 IBM Corporation
Conclusions
 PCM is a promising new memory technology
 PSS is a PCI-e attached subsystem that mitigates the limitations of current PCM technology
 The 2D Channel Architecture allows the designer to trade-off read performance for write
performance and vice-versa
 PSS achieved good performace
– 65k Read IOPS @ 35 μsec
– 15k Write IOPS @ 61 μsec
 PSS achieved consistently low write latency
– 99.9% of the requests completed within 240 μsec
• 12x and 275x lower than MLC and TLC Flash SSDs, respectively
– Highest observed latency was 2 msec
• 7x and 61x lower than MLC and TLC Flash SSDs, respectively
19
© 2013 IBM Corporation20

Weitere ähnliche Inhalte

Was ist angesagt?

Track e low voltage sram - adam teman bgu
Track e   low voltage sram - adam teman bguTrack e   low voltage sram - adam teman bgu
Track e low voltage sram - adam teman bguchiportal
 
2012 benjamin klenk-future-memory_technologies-presentation
2012 benjamin klenk-future-memory_technologies-presentation2012 benjamin klenk-future-memory_technologies-presentation
2012 benjamin klenk-future-memory_technologies-presentationSaket Vihari
 
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...IBM Research
 
Multiprocessor Architecture for Image Processing
Multiprocessor Architecture for Image ProcessingMultiprocessor Architecture for Image Processing
Multiprocessor Architecture for Image Processingmayank.grd
 
Hybrid Memory Cube: Developing Scalable and Resilient Memory Systems
Hybrid Memory Cube: Developing Scalable and Resilient Memory SystemsHybrid Memory Cube: Developing Scalable and Resilient Memory Systems
Hybrid Memory Cube: Developing Scalable and Resilient Memory SystemsMicronTechnology
 
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive AdvantageHybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive AdvantageBill Kohnen
 
An introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale ComputersAn introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale ComputersAlessio Villardita
 
Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...rahul kumar verma
 
Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, LatticeMemory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, LatticeFPGA Central
 
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...In-Memory Computing Summit
 
Multicore processor by Ankit Raj and Akash Prajapati
Multicore processor by Ankit Raj and Akash PrajapatiMulticore processor by Ankit Raj and Akash Prajapati
Multicore processor by Ankit Raj and Akash PrajapatiAnkit Raj
 
Reliability and yield
Reliability and yield Reliability and yield
Reliability and yield rohitladdu
 
customization of a deep learning accelerator, based on NVDLA
customization of a deep learning accelerator, based on NVDLAcustomization of a deep learning accelerator, based on NVDLA
customization of a deep learning accelerator, based on NVDLAShien-Chun Luo
 
Multi-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O SynchronizationMulti-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O Synchronizationrtsljekim
 

Was ist angesagt? (20)

Track e low voltage sram - adam teman bgu
Track e   low voltage sram - adam teman bguTrack e   low voltage sram - adam teman bgu
Track e low voltage sram - adam teman bgu
 
SRAM
SRAMSRAM
SRAM
 
2012 benjamin klenk-future-memory_technologies-presentation
2012 benjamin klenk-future-memory_technologies-presentation2012 benjamin klenk-future-memory_technologies-presentation
2012 benjamin klenk-future-memory_technologies-presentation
 
Multicore Processors
Multicore ProcessorsMulticore Processors
Multicore Processors
 
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...
IBM and ASTRON 64-Bit Microserver Prototype Prepares for Big Bang's Big Data,...
 
Multiprocessor Architecture for Image Processing
Multiprocessor Architecture for Image ProcessingMultiprocessor Architecture for Image Processing
Multiprocessor Architecture for Image Processing
 
Hybrid Memory Cube: Developing Scalable and Resilient Memory Systems
Hybrid Memory Cube: Developing Scalable and Resilient Memory SystemsHybrid Memory Cube: Developing Scalable and Resilient Memory Systems
Hybrid Memory Cube: Developing Scalable and Resilient Memory Systems
 
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive AdvantageHybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage
Hybrid Memory Cubes offer Smart Designers and Buyers a Competitive Advantage
 
An introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale ComputersAn introduction to the Design of Warehouse-Scale Computers
An introduction to the Design of Warehouse-Scale Computers
 
Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...Memory map selection of real time sdram controller using verilog full project...
Memory map selection of real time sdram controller using verilog full project...
 
Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, LatticeMemory Interfaces & Controllers - Sandeep Kulkarni, Lattice
Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice
 
Sram technology
Sram technologySram technology
Sram technology
 
Lecture2
Lecture2Lecture2
Lecture2
 
DDR2 SDRAM
DDR2 SDRAMDDR2 SDRAM
DDR2 SDRAM
 
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
IMCSummit 2015 - Day 1 Developer Session - The Science and Engineering Behind...
 
Multicore processor by Ankit Raj and Akash Prajapati
Multicore processor by Ankit Raj and Akash PrajapatiMulticore processor by Ankit Raj and Akash Prajapati
Multicore processor by Ankit Raj and Akash Prajapati
 
2020 icldla-updated
2020 icldla-updated2020 icldla-updated
2020 icldla-updated
 
Reliability and yield
Reliability and yield Reliability and yield
Reliability and yield
 
customization of a deep learning accelerator, based on NVDLA
customization of a deep learning accelerator, based on NVDLAcustomization of a deep learning accelerator, based on NVDLA
customization of a deep learning accelerator, based on NVDLA
 
Multi-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O SynchronizationMulti-IMA Partition Scheduling for Global I/O Synchronization
Multi-IMA Partition Scheduling for Global I/O Synchronization
 

Andere mochten auch

PHASE CHANGE MEMORY
PHASE CHANGE MEMORYPHASE CHANGE MEMORY
PHASE CHANGE MEMORYSrinivas H V
 
Opening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing toolsOpening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing toolsJacek Laskowski
 
Multi Model Machine Learning by Maximo Gurmendez and Beth Logan
Multi Model Machine Learning by Maximo Gurmendez and Beth LoganMulti Model Machine Learning by Maximo Gurmendez and Beth Logan
Multi Model Machine Learning by Maximo Gurmendez and Beth LoganSpark Summit
 
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...Spark Summit
 
Beneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiBeneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiSpark Summit
 
A Scaleable Implementation of Deep Learning on Spark -Alexander Ulanov
A Scaleable Implementation of Deep Learning on Spark -Alexander UlanovA Scaleable Implementation of Deep Learning on Spark -Alexander Ulanov
A Scaleable Implementation of Deep Learning on Spark -Alexander UlanovSpark Summit
 
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaDeep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaSpark Summit
 
CaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark ClusterCaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark ClusterJen Aman
 
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...Helena Edelson
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Helena Edelson
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsAnyscale
 

Andere mochten auch (11)

PHASE CHANGE MEMORY
PHASE CHANGE MEMORYPHASE CHANGE MEMORY
PHASE CHANGE MEMORY
 
Opening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing toolsOpening slides to Warsaw Scala FortyFives on Testing tools
Opening slides to Warsaw Scala FortyFives on Testing tools
 
Multi Model Machine Learning by Maximo Gurmendez and Beth Logan
Multi Model Machine Learning by Maximo Gurmendez and Beth LoganMulti Model Machine Learning by Maximo Gurmendez and Beth Logan
Multi Model Machine Learning by Maximo Gurmendez and Beth Logan
 
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...
Reactive Streams, linking Reactive Application to Spark Streaming by Luc Bour...
 
Beneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek LaskowskiBeneath RDD in Apache Spark by Jacek Laskowski
Beneath RDD in Apache Spark by Jacek Laskowski
 
A Scaleable Implementation of Deep Learning on Spark -Alexander Ulanov
A Scaleable Implementation of Deep Learning on Spark -Alexander UlanovA Scaleable Implementation of Deep Learning on Spark -Alexander Ulanov
A Scaleable Implementation of Deep Learning on Spark -Alexander Ulanov
 
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves MabialaDeep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
Deep Recurrent Neural Networks for Sequence Learning in Spark by Yves Mabiala
 
CaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark ClusterCaffeOnSpark: Deep Learning On Spark Cluster
CaffeOnSpark: Deep Learning On Spark Cluster
 
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
Leveraging Kafka for Big Data in Real Time Bidding, Analytics, ML & Campaign ...
 
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
Streaming Big Data with Spark, Kafka, Cassandra, Akka & Scala (from webinar)
 
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning ModelsApache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
Apache ® Spark™ MLlib 2.x: How to Productionize your Machine Learning Models
 

Ähnlich wie A Prototype Storage Subsystem based on Phase Change Memory

Microelectronics U4.pptx.ppt
Microelectronics U4.pptx.pptMicroelectronics U4.pptx.ppt
Microelectronics U4.pptx.pptPavikaSharma3
 
Decoupling Compute from Memory, Storage and IO with OMI
Decoupling Compute from Memory, Storage and IO with OMIDecoupling Compute from Memory, Storage and IO with OMI
Decoupling Compute from Memory, Storage and IO with OMIAllan Cantle
 
Ics21 workshop decoupling compute from memory, storage & io with omi - ...
Ics21 workshop   decoupling compute from memory, storage & io with omi - ...Ics21 workshop   decoupling compute from memory, storage & io with omi - ...
Ics21 workshop decoupling compute from memory, storage & io with omi - ...Vaibhav R
 
Kiến trúc máy tính - COE 301 - Memory.ppt
Kiến trúc máy tính - COE 301 - Memory.pptKiến trúc máy tính - COE 301 - Memory.ppt
Kiến trúc máy tính - COE 301 - Memory.pptTriTrang4
 
Automotive network and gateway simulation
Automotive network and gateway simulationAutomotive network and gateway simulation
Automotive network and gateway simulationDeepak Shankar
 
Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...Joshua Mora
 
Heterogeneous Integration with 3D Packaging
Heterogeneous Integration with 3D PackagingHeterogeneous Integration with 3D Packaging
Heterogeneous Integration with 3D PackagingAMD
 
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and Switching
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and SwitchingCXL Memory Expansion, Pooling, Sharing, FAM Enablement, and Switching
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and SwitchingMemory Fabric Forum
 
A Simplied Bit-Line Technique for Memory Optimization
A Simplied Bit-Line Technique for Memory OptimizationA Simplied Bit-Line Technique for Memory Optimization
A Simplied Bit-Line Technique for Memory Optimizationijsrd.com
 
Q1 Memory Fabric Forum: Breaking Through the Memory Wall
Q1 Memory Fabric Forum: Breaking Through the Memory WallQ1 Memory Fabric Forum: Breaking Through the Memory Wall
Q1 Memory Fabric Forum: Breaking Through the Memory WallMemory Fabric Forum
 
Microchip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling EcosystemMicrochip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling EcosystemMemory Fabric Forum
 
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...Codemotion
 
PIC32MX Microcontroller Family
PIC32MX Microcontroller FamilyPIC32MX Microcontroller Family
PIC32MX Microcontroller FamilyPremier Farnell
 
VJITSk 6713 user manual
VJITSk 6713 user manualVJITSk 6713 user manual
VJITSk 6713 user manualkot seelam
 
CPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCoburn Watson
 
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptxQ1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptxMemory Fabric Forum
 
Semiconductor overview
Semiconductor overviewSemiconductor overview
Semiconductor overviewNabil Chouba
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5Steen Larsen
 

Ähnlich wie A Prototype Storage Subsystem based on Phase Change Memory (20)

7 eti pres
7 eti pres7 eti pres
7 eti pres
 
Microelectronics U4.pptx.ppt
Microelectronics U4.pptx.pptMicroelectronics U4.pptx.ppt
Microelectronics U4.pptx.ppt
 
Decoupling Compute from Memory, Storage and IO with OMI
Decoupling Compute from Memory, Storage and IO with OMIDecoupling Compute from Memory, Storage and IO with OMI
Decoupling Compute from Memory, Storage and IO with OMI
 
Ics21 workshop decoupling compute from memory, storage & io with omi - ...
Ics21 workshop   decoupling compute from memory, storage & io with omi - ...Ics21 workshop   decoupling compute from memory, storage & io with omi - ...
Ics21 workshop decoupling compute from memory, storage & io with omi - ...
 
Kiến trúc máy tính - COE 301 - Memory.ppt
Kiến trúc máy tính - COE 301 - Memory.pptKiến trúc máy tính - COE 301 - Memory.ppt
Kiến trúc máy tính - COE 301 - Memory.ppt
 
Automotive network and gateway simulation
Automotive network and gateway simulationAutomotive network and gateway simulation
Automotive network and gateway simulation
 
Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...
 
Heterogeneous Integration with 3D Packaging
Heterogeneous Integration with 3D PackagingHeterogeneous Integration with 3D Packaging
Heterogeneous Integration with 3D Packaging
 
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and Switching
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and SwitchingCXL Memory Expansion, Pooling, Sharing, FAM Enablement, and Switching
CXL Memory Expansion, Pooling, Sharing, FAM Enablement, and Switching
 
A Simplied Bit-Line Technique for Memory Optimization
A Simplied Bit-Line Technique for Memory OptimizationA Simplied Bit-Line Technique for Memory Optimization
A Simplied Bit-Line Technique for Memory Optimization
 
Q1 Memory Fabric Forum: Breaking Through the Memory Wall
Q1 Memory Fabric Forum: Breaking Through the Memory WallQ1 Memory Fabric Forum: Breaking Through the Memory Wall
Q1 Memory Fabric Forum: Breaking Through the Memory Wall
 
Microchip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling EcosystemMicrochip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling Ecosystem
 
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...
Yufeng Guo - Tensor Processing Units: how TPUs enable the next generation of ...
 
PIC32MX Microcontroller Family
PIC32MX Microcontroller FamilyPIC32MX Microcontroller Family
PIC32MX Microcontroller Family
 
VJITSk 6713 user manual
VJITSk 6713 user manualVJITSk 6713 user manual
VJITSk 6713 user manual
 
CPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performanceCPN302 your-linux-ami-optimization-and-performance
CPN302 your-linux-ami-optimization-and-performance
 
Zynq ultrascale
Zynq ultrascaleZynq ultrascale
Zynq ultrascale
 
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptxQ1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
 
Semiconductor overview
Semiconductor overviewSemiconductor overview
Semiconductor overview
 
Steen_Dissertation_March5
Steen_Dissertation_March5Steen_Dissertation_March5
Steen_Dissertation_March5
 

Mehr von IBM Research

IBM Research - Zurich Celebrates 60 Years of Science and Innovation
IBM Research - Zurich Celebrates 60 Years of Science and InnovationIBM Research - Zurich Celebrates 60 Years of Science and Innovation
IBM Research - Zurich Celebrates 60 Years of Science and InnovationIBM Research
 
IBM/ASTRON DOME 64-bit Hot Water Cooled Microserver
IBM/ASTRON DOME  64-bit Hot Water Cooled MicroserverIBM/ASTRON DOME  64-bit Hot Water Cooled Microserver
IBM/ASTRON DOME 64-bit Hot Water Cooled MicroserverIBM Research
 
IBM and ASTRON 64bit μServer for DOME
IBM and ASTRON 64bit μServer for DOMEIBM and ASTRON 64bit μServer for DOME
IBM and ASTRON 64bit μServer for DOMEIBM Research
 
The Dilemmas of Innovation Management
The Dilemmas of Innovation ManagementThe Dilemmas of Innovation Management
The Dilemmas of Innovation ManagementIBM Research
 
Big Data and the Future of Storage
Big Data and the Future of StorageBig Data and the Future of Storage
Big Data and the Future of StorageIBM Research
 
The New Era of Cognitive Computing
The New Era of Cognitive ComputingThe New Era of Cognitive Computing
The New Era of Cognitive ComputingIBM Research
 
Das IBM Forschungslabor als Arbeitgeber
Das IBM Forschungslabor als ArbeitgeberDas IBM Forschungslabor als Arbeitgeber
Das IBM Forschungslabor als ArbeitgeberIBM Research
 
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - Zurich
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - ZurichNano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - Zurich
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - ZurichIBM Research
 
Dechema Conference: Istanbul
Dechema Conference: IstanbulDechema Conference: Istanbul
Dechema Conference: IstanbulIBM Research
 
IBM Research: IBM 2010 Investor Briefing
IBM Research: IBM 2010 Investor BriefingIBM Research: IBM 2010 Investor Briefing
IBM Research: IBM 2010 Investor BriefingIBM Research
 

Mehr von IBM Research (11)

IBM Research - Zurich Celebrates 60 Years of Science and Innovation
IBM Research - Zurich Celebrates 60 Years of Science and InnovationIBM Research - Zurich Celebrates 60 Years of Science and Innovation
IBM Research - Zurich Celebrates 60 Years of Science and Innovation
 
IBM/ASTRON DOME 64-bit Hot Water Cooled Microserver
IBM/ASTRON DOME  64-bit Hot Water Cooled MicroserverIBM/ASTRON DOME  64-bit Hot Water Cooled Microserver
IBM/ASTRON DOME 64-bit Hot Water Cooled Microserver
 
IBM and ASTRON 64bit μServer for DOME
IBM and ASTRON 64bit μServer for DOMEIBM and ASTRON 64bit μServer for DOME
IBM and ASTRON 64bit μServer for DOME
 
The Dilemmas of Innovation Management
The Dilemmas of Innovation ManagementThe Dilemmas of Innovation Management
The Dilemmas of Innovation Management
 
Big Data and the Future of Storage
Big Data and the Future of StorageBig Data and the Future of Storage
Big Data and the Future of Storage
 
The New Era of Cognitive Computing
The New Era of Cognitive ComputingThe New Era of Cognitive Computing
The New Era of Cognitive Computing
 
Das IBM Forschungslabor als Arbeitgeber
Das IBM Forschungslabor als ArbeitgeberDas IBM Forschungslabor als Arbeitgeber
Das IBM Forschungslabor als Arbeitgeber
 
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - Zurich
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - ZurichNano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - Zurich
Nano, SuperMUC and Photovoltaics:A Day in the Life of IBM Research - Zurich
 
Meet IBM Research
Meet IBM ResearchMeet IBM Research
Meet IBM Research
 
Dechema Conference: Istanbul
Dechema Conference: IstanbulDechema Conference: Istanbul
Dechema Conference: Istanbul
 
IBM Research: IBM 2010 Investor Briefing
IBM Research: IBM 2010 Investor BriefingIBM Research: IBM 2010 Investor Briefing
IBM Research: IBM 2010 Investor Briefing
 

Kürzlich hochgeladen

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 

Kürzlich hochgeladen (20)

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 

A Prototype Storage Subsystem based on Phase Change Memory

  • 1. © 2013 IBM Corporation PSS A Prototype Storage Subsystem based on PCM IBM Research – Zurich Ioannis Koltsidas, Roman Pletka, Peter Mueller, Thomas Weigold, Evangelos Eleftheriou University of Patras Maria Varsamou, Athina Ntalla, Elina Bougioukou, Aspasia Palli, Theodore Antonakopoulos
  • 2. © 2013 IBM Corporation Phase Change Memory (PCM)  Based on the thermal threshold switching effect of chalcogenidic meterials  Two Phases: 2 Set Reset Amorphous Phase Crystalline Phase  Phases have very different electrical resistances (ratio of 1:100 to 1:1000)  Transition between phases by controlled heating and cooling  Read time: 100-300 nsec  Program time: 10-150 μsec  PCM cells can be reprogrammed at least 106 times  Performance and price characteristics between DRAM and Flash
  • 3. © 2013 IBM Corporation Placing PCM in Servers and Storage Systems Server System Storage System RAID ctrl HBA CPUs DRAM Cache I/O bus RAID ctrl CPUs I/O bus HA PCM PCM Hybrid PCIe attached SSD Hybrid SAS/SATA attached SSD PCM Hybrid PCIe attached SSD Hybrid SAS/SATA attached SSD PCM DRAM DRAM PCM DRAM Cache PCMPCM PCM PCM 3 PCMPCM HDDSSD All-Flash Array Metadata Tables
  • 4. © 2013 IBM Corporation PSS: A PCM-based PCI-e Prototype Card 4 Goal: Architect a PCM-based device and implement a fully-functional, high-performance PCI-e card  Take advantage of the characteristics of PCM and mitigate its limitations  Target is workloads dominated by 4kB requests  Simple, lightweight hardware design  System integration of multiple cards through software  Emphasis on consistently low, predictable latency PCM PCI-e cardHost PCM chips PCM Channels Lean PCM Controller Multi-lane PCI-e bus Block Device Driver PCM pipe #1 PCM pipe #n 512B 4kB  Limited in capacity due to the density of commercially available PCM parts (as of early 2013)  Use cases: – Caching device – Metadata store – Backend for low-latency Key-Value store – Tiered storage device in a hybrid configuration with Flash
  • 5. © 2013 IBM Corporation PCM Parts: Micron P5Q  90nm technology node  128 Mbit devices (NP5Q128AE3ESFC0E)  SPI bus compatible serial interface  Maximum clock frequency: 66 MHz  64-byte write buffer – 120 μsec average program time – about 0.5 MB/s write bandwidth 5 Block transfer time: 8.24 usecs (64+4 Bytes) Sector I/O (512B + 64B): Write: 1.15 msecs (0.86 kIOPs) Read: 75.24 usecs (13.29 kIOPs) Very asymmetric read / write performance
  • 6. © 2013 IBM Corporation 2D PCM Channel Architecture 6 PCM (1,NI) PCM (1, NI -1) PCM (1,1) k PCM (2, NI) PCM (2, NI -1) PCM (2,1) k PCM (NS, NI) PCM (NS, NI -1) PCM (NS,1) k FSM Data Buffer FSM FSMFSM Sub-channel PCM Channel Controller Sub-bank
  • 7. © 2013 IBM Corporation Read vs. Write Performance Trade-Offs  A high degree of pipelining: – Increases the write performance • By having the long programming times overlap – May reduce the read performance • Read times are anyway very short • Less parallelism due to fewer I/O pins  For a given budget of I/O pins: – More sub-channels  Better read performance – More sub-banks  Better write performance  Application needs should drive the configuration – Channels with different geometry in the same device possible  We chose a configuration that minimizes write latency without severely penalizing reads 7
  • 8. © 2013 IBM Corporation 3x3 Channel Architecture 8 PCM (1,3) PCM (1, 2) PCM (1,1) k PCM (2, 3) PCM (2, 2) PCM (2,1) k PCM (3, 3) PCM (3, 2) PCM (3,1) k FSM Data Buffer FSM FSMFSM PCM Channel Controller 1 Block = 64 bytes 3 blocks 3 blocks 3 blocks For each user sector (512b= 8x64), we store 64bytes of metadata, i.e., 9 blocks in total P C M B a n k
  • 9. © 2013 IBM Corporation PSS Channel Card One PCM channel per side Two PCM channels per card Channel card CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout [1,1] [1,3] [3,1] [3,3] [1,2] [2,1] [2,3] [2,2] [3,2] CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout CS/ Din CLK Dout [1,1] [1,3] [3,1] [3,3] [1,2] [2,1] [2,3] [2,2] [3,2] CS[3] CS[2] CS[1] CLK D1[1:0] D2[1:0] D3[1:0]DIR C B A Y0 Y1 Y2Y3Y4Y5 3 to 8 DECODER PCM channel specs Data transfer Rate: 49.5 MBps Sector read time: 13.8 usecs Sector read rate: 61.6 ksectors/sec Sector write time: 133.8 usecs Sector write rate: 14.8 ksectors/sec 1 PCM Channel = 2 Banks (2x3x3)
  • 10. © 2013 IBM Corporation PSS PCI-e Card 10 PCMChannelmodule Xilinx Zynq-7045 FPGA Board 2XHost-attachedPSSCards  Error correction based on simple BCH codes o 6 BCH codewords per 512 bytes sector with 4 bits error correction capability per codeword.  Wear leveling using a Start-Gap scheme  512MB of DRAM, mostly used as a write cache  Support for cached writes, direct writes with early completion, direct writes with late completion 8 PCM channels with pipeline support per card
  • 11. © 2013 IBM Corporation PSS Experimental Results Throughput versus Offered Load (4kB pages)
  • 12. © 2013 IBM Corporation Random Read PSS Experimental Results I/O Completion Latency Distribution PCM technology programming timeRandom Write (cached)
  • 13. © 2013 IBM Corporation Latency distribution comparison to Flash-based devices  Devices – PSS PCI-e Card – MLC Flash PCI-e SSD 1 – MLC Flash PCI-e SSD 2 – TLC Flash SATA SSD  Experiment Per-I/O latency measurements for 2 hours of uniformly random 4kB writes at QD=1 (after 12 hours of preconditioning with the same workload) 13
  • 14. © 2013 IBM Corporation Latency Profile up to 1msec PSS MLC1 MLC2 TLC
  • 15. © 2013 IBM Corporation Latency Profile up to 10msec PSS MLC1 MLC2 TLC
  • 16. © 2013 IBM Corporation Total Latency Profile PSS MLC1 MLC2 TLC
  • 17. © 2013 IBM Corporation PCM Endurance Measurements 17 minimum write time per sector maximum write time per sector mean write time per sector The effect of PCM aging on write time and BER Experimental parameters: • Random data • 32 sectors per write cycle • 4 PCM channels • 8 PCM banks • Pipeline is active Experiment - Perform 10K write cycles with random data (x32 sectors) - Write, read and compare a set of 32 sectors (single write cycle)
  • 18. PCM Write Latency Distribution Experimental parameters: • Random data • 32 sectors per write cycle • 4 PCM channels • 8 PCM banks/controllers • Pipeline is active Experiment - Perform 10K write cycles with random data (x32 sectors) - Write, read and compare a set of 32 sectors (single write cycle)
  • 19. © 2013 IBM Corporation Conclusions  PCM is a promising new memory technology  PSS is a PCI-e attached subsystem that mitigates the limitations of current PCM technology  The 2D Channel Architecture allows the designer to trade-off read performance for write performance and vice-versa  PSS achieved good performace – 65k Read IOPS @ 35 μsec – 15k Write IOPS @ 61 μsec  PSS achieved consistently low write latency – 99.9% of the requests completed within 240 μsec • 12x and 275x lower than MLC and TLC Flash SSDs, respectively – Highest observed latency was 2 msec • 7x and 61x lower than MLC and TLC Flash SSDs, respectively 19
  • 20. © 2013 IBM Corporation20

Hinweis der Redaktion

  1. PCM is emerging as one of the most promising NVM technologies and is becoming more and more matureThe active material in PCM is a chalcogenidic alloy which is placed between two electrically conducting electrodes in each cell of the memory array.A low current is then applied that causes a phase transition.
  2. We focus on a PCM-based system that uses PCM in a persistent storage device attached on the I/O bus of the system
  3. One of the main concerns regarding the use of PCM for high-performance applications is its relatively high write latency. The purpose of this work has been to architect a PCM-based device and implement a fully functional, high-performance PCI-e card that takes advantage of the characteristics of PCM and mitigates its limitations. \The design of PSS targets storage workloads that are dominated by small (e.g., 4 kB) random I/O operations and aims to achieve low, predictable latency with reads and writes
  4. Was commercially available when the project started (end of 2012)