An Overview of Flash Storage
                          for Databases
                               Morgan Tocker
                           <morgan@percona.com>




     1
Wednesday, March 9, 2011
Introduction

                                [ Me]                       [Percona]

                   Director of Training. Previously    Consulting, Training,
                       worked at MySQL, Sun           Support & Development
                            Microsystems.                  for MySQL.




     ★   No vested interest in which hardware I recommend.
         ✦
             [Disclaimer] Some hardware vendors have engaged in our
             services to evaluate and improve performance of their
             products.

What this talk is about
     ★   Flash technologies (NAND, NOR).
     ★   Server Usage.
         ✦
              Not USB thumb drives.
         ✦
              Not Consumer usage.
     ★   “For Database” == MySQL.
         ✦
              Should be more or less applicable for all databases.




Agenda
     ★   Introduction.
     ★   A look at the current market.
     ★   Applications.




Revolutionary
     ★   Change in technology -
         ✦
              From spinning disk to solid state.
     ★   No mechanical moving parts.
     ★   Jump in performance.
     ★   Requires changes in the Application.
     ★   Hard not to predict SSDs quickly replacing most hard
         disks in the next 5-10 years*

             * However, at the moment hard disks are still getting
               cheaper (per GB) faster than SSDs!
“Numbers everyone should know”
      L1 cache reference                                                              0.5 ns
      Branch mispredict                                                               5 ns
      L2 cache reference                                                              7 ns
      Mutex lock/unlock                                                              25 ns
      Main memory reference                                                         100 ns
      Compress 1K bytes with Zippy                                                3,000 ns
      Send 2K bytes over 1 Gbps network                                          20,000 ns
      NAND Flash (my estimate)                                                   50,000 ns
      Read 1 MB sequentially from memory                                        250,000 ns
      Round trip within same datacenter                                         500,000 ns
      Disk seek                                                              10,000,000 ns
      Read 1 MB sequentially from disk                                       20,000,000 ns
      Send packet CA->Netherlands->CA                                       150,000,000 ns

              See: http://www.linux-mag.com/cache/7589/1.html and Google: http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf
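To put the table in perspective, the ratios between three of its rows can be worked out directly. This is only arithmetic on the figures quoted above; the dictionary keys are names chosen for this sketch, and the NAND figure is the speaker's own estimate.

```python
# Latencies in nanoseconds, copied from the "Numbers everyone should know" table.
LATENCY_NS = {
    "main_memory_reference": 100,
    "nand_flash": 50_000,        # the speaker's estimate, not a measured value
    "disk_seek": 10_000_000,
}


def ratio(slower: str, faster: str) -> float:
    """How many times slower one operation is than another."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


disk_vs_flash = ratio("disk_seek", "nand_flash")
flash_vs_memory = ratio("nand_flash", "main_memory_reference")

print(f"disk seek is {disk_vs_flash:.0f}x slower than NAND flash")
print(f"NAND flash is {flash_vs_memory:.0f}x slower than main memory")
```

On these numbers flash sits between memory and disk, but much further from disk than from memory.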
Physics Behind
     ★   “Floating Gate Transistors”
         ✦
              Non volatile memory.
     ★   One bit per cell (two states) - Single Level Cell (SLC)
         ✦
              Faster, more reliable, expensive.
     ★   Many States - Multi Level Cell (MLC)
         ✦
              Usually 4 states.
         ✦
              Slower, less reliable, cheaper.




Classification
     ★   NOR
         ✦
              Speeds like memory for reads.
         ✦
              Much, much slower for erase/writing data.
         ✦
              Practical use: storing firmware.
     ★   NAND
         ✦
              Faster writes.
         ✦
              Only block-level read access (4K).
         ✦
              Idea is to compact as many cells in limited space - to make it
              competitive with hard drives.



Erasing (NAND)
     ★   Erase is to set all bits to “1111...”
         ✦
              The erase process is similar to the “flash” in a camera - this
              is where the name FLASH comes from.
         ✦
              Erase is slow, done in batch operations (up to 1MB).
     ★   Changing “1” -> “0” is fast.
     ★   Changing “0” -> “1” is possible only by erase.
         ✦
              1st write: “1111” -> “1110”. Block marked as “written”
         ✦
              2nd write: even “1110” -> “1010” is not possible.
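The erase and write rules above can be modelled with a toy class. This is a minimal sketch, assuming a 4-bit block and a simple "written" flag as on the slide; `NandBlock` and its methods are hypothetical names for illustration, not a real device interface.

```python
class NandBlock:
    """Toy model of a NAND block: erase sets all bits to 1, a write can only clear bits."""

    def __init__(self, bits: int = 4):
        self.bits = bits
        self.mask = (1 << bits) - 1
        self.value = self.mask      # erased state: "1111"
        self.written = False

    def erase(self) -> None:
        """Slow batch operation: the only way to turn a 0 back into a 1."""
        self.value = self.mask
        self.written = False

    def write(self, data: int) -> None:
        """Fast operation: clears bits (1 -> 0) in an erased block, once."""
        if self.written:
            raise RuntimeError("block already written; erase it first")
        self.value &= data & self.mask
        self.written = True


block = NandBlock()
block.write(0b1110)              # 1st write: "1111" -> "1110"
try:
    block.write(0b1010)          # 2nd write is refused, even though it only clears a bit
except RuntimeError as err:
    print("second write rejected:", err)
block.erase()                    # back to "1111"
block.write(0b1010)              # now the new value can be stored
print(format(block.value, "04b"))
```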




Erase Challenges
     ★   Erase is slow
         ✦
              You want to erase many blocks in a single “flash”.
         ✦
              Block Management.
     ★   [via software] When you write, the card never overwrites
         the same block in place; it writes to a fresh one.
     ★   A background process runs garbage collection.
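The block management described above can be sketched as a tiny remapping layer. This is an illustration under simplifying assumptions, not any vendor's firmware: `ToyFTL` is a hypothetical name, and its garbage collector only reclaims blocks whose pages are all stale, whereas a real one would also relocate live pages.

```python
class ToyFTL:
    """Toy flash translation layer: every write goes to a fresh physical page."""

    def __init__(self, num_blocks: int = 4, pages_per_block: int = 4):
        self.pages_per_block = pages_per_block
        # State of each physical page: "free", "live" or "stale".
        self.state = ["free"] * (num_blocks * pages_per_block)
        self.mapping = {}                    # logical page -> physical page
        self.erase_count = [0] * num_blocks

    def write(self, logical: int) -> int:
        """Redirect the write to the next free page and mark the old copy stale."""
        physical = self.state.index("free")  # raises ValueError if the device is full
        old = self.mapping.get(logical)
        if old is not None:
            self.state[old] = "stale"
        self.state[physical] = "live"
        self.mapping[logical] = physical
        return physical

    def garbage_collect(self) -> int:
        """Erase every block that holds only stale pages; return how many were erased."""
        reclaimed = 0
        for block in range(len(self.erase_count)):
            start = block * self.pages_per_block
            end = start + self.pages_per_block
            if all(page == "stale" for page in self.state[start:end]):
                self.state[start:end] = ["free"] * self.pages_per_block
                self.erase_count[block] += 1
                reclaimed += 1
        return reclaimed


ftl = ToyFTL(num_blocks=2, pages_per_block=2)
locations = [ftl.write(0) for _ in range(3)]   # same logical page written three times
print("physical pages used:", locations)
print("blocks reclaimed:", ftl.garbage_collect())
```

The fuller the device, the fewer all-stale blocks there are to reclaim, which is one reason write performance degrades as it fills up.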




Erase Lifecycle
     ★   SLC ~100K times per cell (may vary).
     ★   MLC ~10K times per cell (may vary).
     ★   For many this is a major point of discussion.
         ✦
              How big of an issue depends a lot on firmware.
         ✦
              Many cells and even distribution of writes (“wear levelling”)
              give a lifetime of a couple of years under heavy workload.
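A back-of-envelope lifetime estimate shows how the erase-cycle figures translate into years. This is a rough sketch that assumes perfect wear levelling. The daily write volume and the write amplification factor are made-up inputs for illustration, not figures from the talk.

```python
def lifetime_days(capacity_gb: float, erase_cycles: int,
                  daily_writes_gb: float, write_amplification: float = 1.0) -> float:
    """Days until every cell has used up its erase cycles, with perfect wear levelling."""
    total_writable_gb = capacity_gb * erase_cycles / write_amplification
    return total_writable_gb / daily_writes_gb


# Hypothetical workload: a 320 GB MLC device (~10K cycles per cell),
# 2 TB written per day, and firmware that writes 2 bytes per host byte.
days = lifetime_days(capacity_gb=320, erase_cycles=10_000,
                     daily_writes_gb=2_000, write_amplification=2.0)
print(f"{days:.0f} days, about {days / 365:.1f} years")
```

With these assumed inputs the estimate lands at a little over two years. Better firmware (lower write amplification) or a lighter write load stretches it proportionally.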




Write degradation
     ★   Expected.
         ✦
              The fuller the device, the harder it is to garbage collect.
     ★   Graph for Fusion-io 320G MLC card:




Firmware Really Matters (1)
     ★   I would expect even less flat performance on a
         cheaper, non-enterprise class of hardware.
         ✦
              Come to my talk on Friday.
         ✦
              I will tell you consistency of performance is more important
              than anything else.




Firmware Really Matters (2)
     ★   Many revisions of firmware for each vendor.
         ✦
              Important to compare apples-to-apples in any comparisons.
         ✦
              I heard a rumour one large SSD vendor is on their 4th
              successful complete ground up implementation ;)




Agenda
     ★   Introduction.
     ★   A look at the current market.
     ★   Applications.




The current market (1)
     ★   Fusion-IO.
         ✦
              Established player with a large product line.
         ✦
              Enjoyed a near-monopoly for a while as the only PCI card
              vendor.
     ★   Virident.
         ✦
              Previously a MySQL Appliance vendor.
         ✦
              Switched business model in ~2010 to just ship PCI Flash
              cards.
         ✦
              Very good, consistent results.



The current market (2)
     ★   Intel/OCZ/other.
         ✦
              Typically aims for pro-desktop market.
         ✦
              Does not necessarily offer the same features/promises as the
              “enterprise hardware”...




You pay more for...
     ★   Greater amount of over-provisioning (more consistent).
     ★   Internal redundancy (aka RAID).
     ★   More complex firmware (more consistent).
     ★   Guarantee of durability (such as a capacitor).
     ★   Greater life-span (more write cycles).
     ★   Better Performance (much more IOPS).




Fusion-io




Performance Specification
     ★   160G SLC
         ✦
              110K read IOPS (4K)
         ✦
              26us read latency.
     ★   320G MLC
         ✦
              71K read IOPS.
         ✦
              41us read latency.
     ★   “Duo” Range (not covered).
     ★   Lifetime:
         ✦
              SLC flash @ 40% write duty | 25 calendar years
         ✦
              MLC flash @ 20% write duty | 10 calendar years
         ✦
              MLC flash @ 40% write duty | 5 calendar years
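The IOPS and latency figures above are linked by Little's law (requests in flight = throughput x latency). The sketch below assumes the quoted latency is the per-request service time and that it still holds under load, which the specification does not state.

```python
def required_concurrency(iops: float, latency_us: float) -> float:
    """Average number of requests that must be in flight to sustain `iops`."""
    return iops * latency_us / 1_000_000


def single_stream_iops(latency_us: float) -> float:
    """Upper bound on IOPS when only one request is outstanding at a time."""
    return 1_000_000 / latency_us


slc_in_flight = required_concurrency(110_000, 26)   # 160G SLC figures
mlc_in_flight = required_concurrency(71_000, 41)    # 320G MLC figures

print(f"SLC: ~{slc_in_flight:.2f} requests in flight, "
      f"~{single_stream_iops(26):,.0f} IOPS from a single stream")
print(f"MLC: ~{mlc_in_flight:.2f} requests in flight, "
      f"~{single_stream_iops(41):,.0f} IOPS from a single stream")
```

Under these assumptions a single synchronous reader cannot reach the headline IOPS figure. Several requests have to be outstanding at once.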
Fusion-io Overview
     ★   Fast. Very fast.
         ✦
              Cheaper than disks in terms of $ per IOPS.
     ★   PCI-E - closest to CPU.
     ★   Durability.
     ★   Shares host memory / CPU
     ★   Most complex part - firmware.
     ★   Large amount of space reservation for heavy writes.




Fusion-io drawbacks
     ★   Expensive. Let’s say “$6000+” (retail; your price may be
         less).
         ✦
              For full performance, requires additional 25% space
              reservation.
         ✦
              DRAM is actually probably cheaper per GB.
     ★   PCI-E is not hot-swappable.
         ✦
              Also has potential for errors (when the host fails, garbage
              keeps being sent; Fusion-io handles this well).




Fusion-io durability
     ★   Cache is located on host system.
     ★   “Transaction log” to prevent lost data.
         ✦
              Crash recovery.




Fusion-io read performance
         160GB SLC card
         8 threads: 33K IOPS (525MB/sec), 0.28 ms 95% response time




                           RAID 10 is a Dell PERC 6/i
                           on 8 x 2.5” 15K RPM SAS disks



Fusion-io write performance
     ★   8 threads: 20K IOPS (314MB/sec), 0.26 ms 95%
         response time.
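Dividing throughput by IOPS gives the average request size behind the read and write results. This is only arithmetic on the rounded figures quoted on the slides, taking 1 MB as 1000 KB.

```python
def avg_io_size_kb(throughput_mb_s: float, iops: float) -> float:
    """Average request size in KB implied by a throughput and an IOPS figure."""
    return throughput_mb_s * 1000 / iops


read_kb = avg_io_size_kb(525, 33_000)    # read test: 33K IOPS at 525MB/sec
write_kb = avg_io_size_kb(314, 20_000)   # write test: 20K IOPS at 314MB/sec

print(f"reads:  ~{read_kb:.1f} KB per IO")
print(f"writes: ~{write_kb:.1f} KB per IO")
```

Both come out close to 16 KB, which matches InnoDB's default page size. These results are therefore not directly comparable with the 4K IOPS figures in the vendor specification.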




Fusion-io databases
     ★   Many read / write threads to utilize throughput.
     ★   “MySQL” is not able to fully use it.
         ✦
              Better in 5.5, MySQL 5.1 with the InnoDB Plugin, XtraDB.
     ★   InnoDB IO path “needs work”.




Virident TachIOn




Virident
     ★   PCI interface.
     ★   Has NAND flash upgrade modules.
     ★   Good stable results.
     ★   Advertised 300,000 IOPS in 75:25 (read:write).




Virident Options
     ★   300G, 400G, 600G, 800G SLC cards.
         ✦
              400G is $13,600
     ★   (More or less the same price range as Fusion-io).




2010 Benchmarks:




            http://www.mysqlperformanceblog.com/2010/06/15/virident-tachion-new-player-on-flash-pci-e-cards-market/
Intel SSDs




Intel SSDs
     ★   Were awesome in 2008.
         ✦
              Many accolades, first SSDs that probably made sense for a
              lot of pro-desktop users.
     ★   A couple of iterations of firmware, but mostly Intel
         treated customers like mushrooms for 2 years.
         ✦
              No clear advance warning of road map.
         ✦
              Finally a replacement 510 series announced last month.
                     • Slides don’t feature these. Have not used them.




Intel Overview
     ★   SATA form factor.
     ★   Intel X25-M Gen I (50nm) & Gen II (34nm).
         ✦
              MLC
     ★   Intel X25-E (50nm)
         ✦
              SLC
         ✦
              “Enterprise”.
     ★   New 510 series - just released last month.




X25-E
     ★   32G / 64G
     ★   Throughput: 35K IOPS reads, 3.5K IOPS writes.
     ★   Latency: 75us reads, 85us writes.
     ★   64G - $725
         ✦
              $11/GB
     ★   Write endurance:
         ✦
              1 petabyte of random writes (32G)
         ✦
              2 petabytes of random writes (64G)



X25-M Gen II
     ★   80G / 160G
     ★   Throughput: 35K IOPS reads, 6.5 / 8.5K IOPS writes.
     ★   Latency: 65us reads, 85us writes.
     ★   160GB - $415
         ✦
              ~$3 / GB
     ★   Write Endurance.
         ✦
              Not mentioned in official specification.




X25-E and X25-M
     ★   Even if “E” is enterprise - power loss means data loss.
         ✦
              Loss of transactions.
     ★   You can disable write cache, but performance is woeful.




X25 Deployments
     ★   RAID
         ✦
              Software / hardware?
         ✦
              Level 0? 1? 10? 5? 50?
     ★   Engineering process could be complicated and
         expensive.
         ✦
              There are/were ready solutions (Schooner[1], Gear6[2], Cisco
              servers).




             [1] Changed business model recently.
             [2] Went broke.
Agenda
     ★   Introduction.
     ★   A look at the current market.
     ★   Applications.




MySQL Specific (1)
     ★   SSD is very good at Random reads.
         ✦
              Not so good at sequential writes!
     ★   Data files on SSD.
         ✦
              Table files (*.ibd).
         ✦
              Rollback segments (ibdata1).
     ★   Logs on RAID with BBU.
         ✦
              Binary logs.
         ✦
              Transaction logs.
         ✦
              Double write buffer.
         ✦
              Insert buffer.
         ✦
              Slow log, error log, general log.
              See: http://yoshinorimatsunobu.blogspot.com/2009/05/tables-on-ssd-redobinlogsystem.html
MySQL Specific (2)
     ★   Buy memory, or buy SSDs?
         ✦
              [Usually] Buy memory when it’s possible.




Other Reasons to use Flash (1)
     ★   Server Consolidation.
         ✦
              Hard drives do ~100-200 IOPS*
         ✦
              Now one card can get 100K (theoretical)!
         ✦
              ~2x-10x reduction in server count in many cases (see
              craigslist).




              * Assuming no RAID controller performing additional merging.
Other Reasons to use Flash (2)
     ★   Power consumption reduction.
         ✦
              “Transactions per watt” dramatically higher.
                     • See: http://www.percona.com/files/percona-live/jeremy-Craigslist.pptx.pdf
         ✦
              Important for a large number of people. Even if power is
              cheap, colo facilities often limit availability per-rack.




Other Reasons to use Flash (3)
     ★   Limit variance / risk of operational issues from cold
         starts.
         ✦
              Easy to see something like an advertising network miss
              response time goals when aim is 50ms/page.
                     • Each IO is ~10ms.
                     • Following a few secondary keys to a primary key and you miss it.
     ★   Good for throughput too.
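The response-time argument above is a simple budget calculation. The sketch below uses the 50ms goal and ~10ms per disk IO from this slide and the 50,000 ns NAND estimate from the “Numbers everyone should know” slide. It ignores CPU, network and queueing time, so the real headroom is smaller.

```python
def max_serial_ios(budget_us: int, io_latency_us: int) -> int:
    """How many IOs fit into the budget if they must be issued one after another."""
    return budget_us // io_latency_us


BUDGET_US = 50_000     # 50 ms per page
DISK_IO_US = 10_000    # ~10 ms per IO on a cold hard disk
FLASH_IO_US = 50       # the speaker's NAND flash estimate

disk_ios = max_serial_ios(BUDGET_US, DISK_IO_US)
flash_ios = max_serial_ios(BUDGET_US, FLASH_IO_US)

print(f"hard disk: at most {disk_ios} dependent IOs per page")
print(f"flash:     at most {flash_ios} dependent IOs per page")
```

With only five dependent disk reads available, a couple of secondary-index lookups that each lead to a primary-key lookup are enough to blow the budget on a cold cache.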




Applications must change




Short Term (1)
     ★   Multi-threaded IO is required to exploit all throughput
         offered.
         ✦
              InnoDB Plugin, MySQL 5.5 ready.
         ✦
              Many other databases are not ready.
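What “multi-threaded IO” means in practice can be sketched with a thread pool issuing random page reads. This is an illustration only: the 16 KB page size and the function names are assumptions, `os.pread` is only available on Unix-like systems, and the reads go through the OS page cache, so this is not a benchmark.

```python
import os
import random
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 16 * 1024   # assumed page size for this sketch


def read_page(fd: int, page_no: int) -> bytes:
    """Positional read: safe to call from many threads on the same descriptor."""
    return os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE)


def random_reads(path: str, num_pages: int, num_reads: int, threads: int) -> int:
    """Issue `num_reads` random page reads from `threads` workers; return bytes read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        pages = [random.randrange(num_pages) for _ in range(num_reads)]
        with ThreadPoolExecutor(max_workers=threads) as pool:
            return sum(len(buf) for buf in pool.map(lambda p: read_page(fd, p), pages))
    finally:
        os.close(fd)
```

Calling `random_reads(path, num_pages, num_reads, threads=8)` on a file of at least `num_pages * PAGE_SIZE` bytes keeps up to eight reads outstanding at a time. A single-threaded loop would leave most of a flash device's parallelism idle.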




Short Term (2)
     ★   Opportunities for multi-level caches when data exceeds
         the SSD's size.
         ✦
              See Flashcache (Facebook), ZFS L2ARC, Veritas.
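The idea of a multi-level cache can be sketched as a small RAM tier that demotes evicted entries to a larger SSD tier, with the disk as the final source. This is a conceptual illustration only and not the design of any of the products named above; `TwoLevelCache` and `load_from_disk` are hypothetical names.

```python
from collections import OrderedDict


class TwoLevelCache:
    """RAM tier (LRU) in front of an SSD tier (LRU) in front of a slow loader."""

    def __init__(self, ram_slots: int, ssd_slots: int, load_from_disk):
        self.ram = OrderedDict()
        self.ssd = OrderedDict()     # stands in for data kept on the SSD
        self.ram_slots = ram_slots
        self.ssd_slots = ssd_slots
        self.load_from_disk = load_from_disk
        self.hits = {"ram": 0, "ssd": 0, "disk": 0}

    def get(self, key):
        if key in self.ram:
            self.ram.move_to_end(key)            # refresh LRU position
            self.hits["ram"] += 1
            return self.ram[key]
        if key in self.ssd:
            value = self.ssd.pop(key)            # promote back into RAM
            self.hits["ssd"] += 1
        else:
            value = self.load_from_disk(key)     # slowest path
            self.hits["disk"] += 1
        self._put_ram(key, value)
        return value

    def _put_ram(self, key, value) -> None:
        self.ram[key] = value
        if len(self.ram) > self.ram_slots:
            old_key, old_value = self.ram.popitem(last=False)
            self.ssd[old_key] = old_value        # demote instead of discarding
            if len(self.ssd) > self.ssd_slots:
                self.ssd.popitem(last=False)


cache = TwoLevelCache(ram_slots=1, ssd_slots=1, load_from_disk=lambda k: k * 2)
for key in (1, 2, 1, 3, 3):
    cache.get(key)
print(cache.hits)
```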




Long Term
     ★   Decades of hard drive assumptions about random IO
         cost need to be unwound.
         ✦
              For example, InnoDB, Oracle, PostgreSQL work like this...




Basic Operation (High Level)

     SELECT * FROM City
     WHERE CountryCode='AUS'

     [Diagram: Log Files, Buffer Pool, Tablespace]

Basic Operation (cont.)

     UPDATE City SET
     name = 'Morgansville'
     WHERE name = 'Brisbane'
     AND CountryCode='AUS'

     [Diagram: Log Files, Buffer Pool, Tablespace]

Basic Operation (cont.)

     UPDATE City SET
     name = 'Morgansville'
     WHERE name = 'Brisbane'
     AND CountryCode='AUS'

     [Diagram: “01010” at Log Files; Buffer Pool; Tablespace]

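The diagram builds themselves are not reproduced in this transcript. As a rough sketch of the general write-ahead-logging pattern their labels suggest (an assumption about what the slides showed, not a description of InnoDB's actual code), the toy engine below changes a page in the buffer pool, appends a record to the log, and only later writes the page back to the tablespace. `ToyEngine` and its methods are hypothetical names.

```python
class ToyEngine:
    """Toy buffer pool + log + tablespace, for illustration only."""

    def __init__(self, tablespace: dict):
        self.tablespace = {k: dict(v) for k, v in tablespace.items()}  # data files
        self.buffer_pool = {}   # in-memory copies of pages
        self.dirty = set()      # pages changed in memory but not yet in the tablespace
        self.log = []           # append-only list stands in for the log files

    def _page(self, page_id):
        if page_id not in self.buffer_pool:
            # Cache miss: a random read from the tablespace.
            self.buffer_pool[page_id] = dict(self.tablespace[page_id])
        return self.buffer_pool[page_id]

    def read(self, page_id) -> dict:
        return self._page(page_id)

    def update(self, page_id, column, value) -> None:
        self._page(page_id)[column] = value
        self.dirty.add(page_id)
        # Durability comes from a sequential append, not from rewriting the page.
        self.log.append((page_id, column, value))

    def checkpoint(self) -> None:
        # Deferred random writes: dirty pages go back to the tablespace later.
        for page_id in sorted(self.dirty):
            self.tablespace[page_id] = dict(self.buffer_pool[page_id])
        self.dirty.clear()


engine = ToyEngine({1: {"name": "Brisbane", "CountryCode": "AUS"}})
engine.update(1, "name", "Morgansville")
print(engine.log)                      # the change is already in the log
print(engine.tablespace[1]["name"])    # the tablespace still holds the old value
engine.checkpoint()
print(engine.tablespace[1]["name"])    # now the page has been written back
```

The split between sequential log writes and deferred random page writes is the kind of hard-drive-era assumption the “Long Term” slide refers to.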
Long Term (cont.)
     ★   Examples of “the database is the log” for MySQL are the
         PBXT and RethinkDB storage engines.




Storage Hardware also changes
     ★   Most of us are used to buying RAID controllers and placing
         disks behind them.
         ✦
              Only a very limited number of RAID controllers understand
              SSDs.
         ✦
              RAID controllers are used to optimizing IO for devices
              capable of 100-200 IOPS.
         ✦
              If we look at Fusion-io, the devices also do RAID internally
              (~RAID4).




Technologies to look at
     ★   More PCI express cards.
         ✦
              Potential to lower barrier to entry - only ~2-3 players,
              competition not as hot as it could be (yet).
     ★   More Enterprise focused MLC.
         ✦
              Better software (firmware) means more wear levelling,
              improved performance, etc.
         ✦
              More storage in fewer cells = lower cost.
     ★   Violin Memory
         ✦
              I am not hands-on familiar with their technology, but they
              have some very high end offerings.
         ✦
              Expect more awesome high end offerings (all vendors).
Questions
     ★   Thank you to ConFoo for letting me speak about such a
         niche topic!
     ★   If I’m out of time, please feel free to catch me around.




    53
Wednesday, March 9, 2011

Weitere ähnliche Inhalte

Ähnlich wie An Overview of Flash Storage for Databases

Fusion-io SSD and SQL Server 2008
Fusion-io SSD and SQL Server 2008Fusion-io SSD and SQL Server 2008
Fusion-io SSD and SQL Server 2008Mark Ginnebaugh
 
Fusion Iossdandsqlserver2008 091022013943 Phpapp02
Fusion Iossdandsqlserver2008 091022013943 Phpapp02Fusion Iossdandsqlserver2008 091022013943 Phpapp02
Fusion Iossdandsqlserver2008 091022013943 Phpapp02eddiesauvao
 
Demystifying SSD, Mark Smith, S3
Demystifying SSD, Mark Smith, S3Demystifying SSD, Mark Smith, S3
Demystifying SSD, Mark Smith, S3subtitle
 
NAND-Flash-Data-Recovery-Cookbook-igor.pdf
NAND-Flash-Data-Recovery-Cookbook-igor.pdfNAND-Flash-Data-Recovery-Cookbook-igor.pdf
NAND-Flash-Data-Recovery-Cookbook-igor.pdfsheikhfarhanm6948
 
OSBConf 2015 | Contemporary and cost efficient backups to to tape by josef we...
OSBConf 2015 | Contemporary and cost efficient backups to to tape by josef we...OSBConf 2015 | Contemporary and cost efficient backups to to tape by josef we...
OSBConf 2015 | Contemporary and cost efficient backups to to tape by josef we...NETWAYS
 
SSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLSSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLYoshinori Matsunobu
 
Flash Storage Technology 101
Flash Storage Technology 101Flash Storage Technology 101
Flash Storage Technology 101Unitiv
 
Development to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB ClustersDevelopment to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB ClustersSeveralnines
 
Ceph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash StorageCeph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash StorageCeph Community
 
Dell whitepaper busting solid state storage myths
Dell whitepaper busting solid state storage mythsDell whitepaper busting solid state storage myths
Dell whitepaper busting solid state storage mythsNatalie Cerullo
 
Ceph Day Seoul - Ceph on All-Flash Storage
Ceph Day Seoul - Ceph on All-Flash Storage Ceph Day Seoul - Ceph on All-Flash Storage
Ceph Day Seoul - Ceph on All-Flash Storage Ceph Community
 
Application acceleration from the data storage perspective
Application acceleration from the data storage perspectiveApplication acceleration from the data storage perspective
Application acceleration from the data storage perspectiveInterop
 
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...Ernie Souhrada
 

Ähnlich wie An Overview of Flash Storage for Databases (20)

Fusion-io SSD and SQL Server 2008
Fusion-io SSD and SQL Server 2008Fusion-io SSD and SQL Server 2008
Fusion-io SSD and SQL Server 2008
 
Fusion Iossdandsqlserver2008 091022013943 Phpapp02
Fusion Iossdandsqlserver2008 091022013943 Phpapp02Fusion Iossdandsqlserver2008 091022013943 Phpapp02
Fusion Iossdandsqlserver2008 091022013943 Phpapp02
 
S3
S3S3
S3
 
Demystifying SSD, Mark Smith, S3
Demystifying SSD, Mark Smith, S3Demystifying SSD, Mark Smith, S3
Demystifying SSD, Mark Smith, S3
 
NAND-Flash-Data-Recovery-Cookbook-igor.pdf
NAND-Flash-Data-Recovery-Cookbook-igor.pdfNAND-Flash-Data-Recovery-Cookbook-igor.pdf
NAND-Flash-Data-Recovery-Cookbook-igor.pdf
 
OSBConf 2015 | Contemporary and cost efficient backups to to tape by josef we...
OSBConf 2015 | Contemporary and cost efficient backups to to tape by josef we...OSBConf 2015 | Contemporary and cost efficient backups to to tape by josef we...
OSBConf 2015 | Contemporary and cost efficient backups to to tape by josef we...
 
SSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLSSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQL
 
Flash Storage Technology 101
Flash Storage Technology 101Flash Storage Technology 101
Flash Storage Technology 101
 
Five steps perform_2009 (1)
Five steps perform_2009 (1)Five steps perform_2009 (1)
Five steps perform_2009 (1)
 
5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance5 Steps to PostgreSQL Performance
5 Steps to PostgreSQL Performance
 
Development to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB ClustersDevelopment to Production with Sharded MongoDB Clusters
Development to Production with Sharded MongoDB Clusters
 
The Smug Mug Tale
The Smug Mug TaleThe Smug Mug Tale
The Smug Mug Tale
 
Mysql talk
Mysql talkMysql talk
Mysql talk
 
Ceph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash StorageCeph Day Tokyo -- Ceph on All-Flash Storage
Ceph Day Tokyo -- Ceph on All-Flash Storage
 
Dell whitepaper busting solid state storage myths
Dell whitepaper busting solid state storage mythsDell whitepaper busting solid state storage myths
Dell whitepaper busting solid state storage myths
 
SSD-Bondi.pptx
SSD-Bondi.pptxSSD-Bondi.pptx
SSD-Bondi.pptx
 
CLFS 2010
CLFS 2010CLFS 2010
CLFS 2010
 
Ceph Day Seoul - Ceph on All-Flash Storage
Ceph Day Seoul - Ceph on All-Flash Storage Ceph Day Seoul - Ceph on All-Flash Storage
Ceph Day Seoul - Ceph on All-Flash Storage
 
Application acceleration from the data storage perspective
Application acceleration from the data storage perspectiveApplication acceleration from the data storage perspective
Application acceleration from the data storage perspective
 
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
All Your IOPS Are Belong To Us - A Pinteresting Case Study in MySQL Performan...
 

Mehr von ConFoo

Debugging applications with network security tools
Debugging applications with network security toolsDebugging applications with network security tools
Debugging applications with network security toolsConFoo
 
The business behind open source
The business behind open sourceThe business behind open source
The business behind open sourceConFoo
 
Security 202 - Are you sure your site is secure?
Security 202 - Are you sure your site is secure?Security 202 - Are you sure your site is secure?
Security 202 - Are you sure your site is secure?ConFoo
 
OWASP Enterprise Security API
OWASP Enterprise Security APIOWASP Enterprise Security API
OWASP Enterprise Security APIConFoo
 
Opensource Authentication and Authorization
Opensource Authentication and AuthorizationOpensource Authentication and Authorization
Opensource Authentication and AuthorizationConFoo
 
Introduction à la sécurité des WebServices
Introduction à la sécurité des WebServicesIntroduction à la sécurité des WebServices
Introduction à la sécurité des WebServicesConFoo
 
Le bon, la brute et le truand dans les nuages
Le bon, la brute et le truand dans les nuagesLe bon, la brute et le truand dans les nuages
Le bon, la brute et le truand dans les nuagesConFoo
 
The Solar Framework for PHP
The Solar Framework for PHPThe Solar Framework for PHP
The Solar Framework for PHPConFoo
 
Décrire un projet PHP dans des rapports
Décrire un projet PHP dans des rapportsDécrire un projet PHP dans des rapports
Décrire un projet PHP dans des rapportsConFoo
 
Server Administration in Python with Fabric, Cuisine and Watchdog
Server Administration in Python with Fabric, Cuisine and WatchdogServer Administration in Python with Fabric, Cuisine and Watchdog
Server Administration in Python with Fabric, Cuisine and WatchdogConFoo
 
Marrow: A Meta-Framework for Python 2.6+ and 3.1+
Marrow: A Meta-Framework for Python 2.6+ and 3.1+Marrow: A Meta-Framework for Python 2.6+ and 3.1+
Marrow: A Meta-Framework for Python 2.6+ and 3.1+ConFoo
 
Think Mobile First, Then Enhance
Think Mobile First, Then EnhanceThink Mobile First, Then Enhance
Think Mobile First, Then EnhanceConFoo
 
Metaprogramming in Ruby
Metaprogramming in RubyMetaprogramming in Ruby
Metaprogramming in RubyConFoo
 
Scalable Architecture 101
Scalable Architecture 101Scalable Architecture 101
Scalable Architecture 101ConFoo
 
As-t-on encore besoin d'un framework web ?
As-t-on encore besoin d'un framework web ?As-t-on encore besoin d'un framework web ?
As-t-on encore besoin d'un framework web ?ConFoo
 
Pragmatic Guide to Git
Pragmatic Guide to GitPragmatic Guide to Git
Pragmatic Guide to GitConFoo
 
Building servers with Node.js
Building servers with Node.jsBuilding servers with Node.js
Building servers with Node.jsConFoo
 
Android Jump Start
Android Jump StartAndroid Jump Start
Android Jump StartConFoo
 
Develop mobile applications with Flex
Develop mobile applications with FlexDevelop mobile applications with Flex
Develop mobile applications with FlexConFoo
 
WordPress pour le développement d'aplications web
WordPress pour le développement d'aplications webWordPress pour le développement d'aplications web
WordPress pour le développement d'aplications webConFoo
 

An Overview of Flash Storage for Databases

  • 1. An Overview of Flash Storage for Databases Morgan Tocker <morgan@percona.com> 1 Wednesday, March 9, 2011
  • 2. Introduction [Me] Director of Training. Previously worked at MySQL, Sun Microsystems. [Percona] Consulting, Training, Support & Development for MySQL. ★ No vested interest in which hardware I recommend. ✦ [Disclaimer] Some hardware vendors have engaged our services to evaluate and improve the performance of their products.
  • 3. What this talk is about ★ Flash technologies (NAND, NOR). ★ Server usage. ✦ Not USB thumb drives. ✦ Not consumer usage. ★ “For Database” == MySQL. ✦ Should be more or less applicable to all databases.
  • 4. Agenda ★ Introduction. ★ A look at the current market. ★ Applications.
  • 5. Revolutionary ★ Change in technology - ✦ From spinning disk to solid state. ★ No mechanical moving parts. ★ Jump in performance. ★ Requires changes in the application. ★ Hard not to predict a quick move to all-SSD storage in the next 5-10 years.* * However, at the moment hard disks are still becoming cheaper (per unit of capacity) more quickly than SSDs!
  • 6. “Numbers everyone should know”
      L1 cache reference                             0.5 ns
      Branch mispredict                                5 ns
      L2 cache reference                               7 ns
      Mutex lock/unlock                               25 ns
      Main memory reference                          100 ns
      Compress 1K bytes with Zippy                 3,000 ns
      Send 2K bytes over 1 Gbps network           20,000 ns
      NAND Flash (my estimate)                    50,000 ns
      Read 1 MB sequentially from memory         250,000 ns
      Round trip within same datacenter          500,000 ns
      Disk seek                               10,000,000 ns
      Read 1 MB sequentially from disk        20,000,000 ns
      Send packet CA->Netherlands->CA        150,000,000 ns
      See: http://www.linux-mag.com/cache/7589/1.html and Google: http://www.cs.cornell.edu/projects/ladis2009/talks/dean-keynote-ladis2009.pdf
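To put the table above in proportion, here is a small Python sketch that turns a few of its latencies into ratios and into an upper bound on strictly serial operations per second. The figures are copied from the slide (the NAND number is the speaker's own estimate); the dictionary keys and function names are made up for illustration.

```python
# Latencies in nanoseconds, copied from the slide above.
LATENCY_NS = {
    "main_memory_reference": 100,
    "nand_flash_read": 50_000,        # the speaker's own estimate
    "datacenter_round_trip": 500_000,
    "disk_seek": 10_000_000,
}


def ratio(slower: str, faster: str) -> float:
    """How many times longer `slower` takes than `faster`."""
    return LATENCY_NS[slower] / LATENCY_NS[faster]


def serial_ops_per_second(name: str) -> float:
    """Upper bound on back-to-back operations per second at this latency."""
    return 1e9 / LATENCY_NS[name]


print(ratio("disk_seek", "nand_flash_read"))              # 200.0
print(ratio("nand_flash_read", "main_memory_reference"))  # 500.0
print(serial_ops_per_second("disk_seek"))                 # 100.0
print(serial_ops_per_second("nand_flash_read"))           # 20000.0
```

By these numbers flash sits roughly 200x ahead of a disk seek but still about 500x behind main memory, which fits the later advice to buy memory first when that is possible.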
  • 7. Physics Behind ★ “Floating Gate Transistors” ✦ Non-volatile memory. ★ One bit per cell - Single Level Cell (SLC) ✦ Faster, more reliable, expensive. ★ Multiple bits per cell - Multi Level Cell (MLC) ✦ Usually 4 states (2 bits). ✦ Slower, less reliable, cheaper.
  • 8. Classification ★ NOR ✦ Speeds like memory for reads. ✦ Much, much slower for erasing/writing data. ✦ Practical use: storing firmware. ★ NAND ✦ Faster writes. ✦ Only block-level read access (4K). ✦ Idea is to pack as many cells as possible into limited space - to make it competitive with hard drives.
  • 9. Erasing (NAND) ★ Erase sets all bits to “1111...” ✦ The erasing process is similar to the “flash” in cameras - this is where the name FLASH comes from. ✦ Erase is slow, done in batch operations (up to 1MB). ★ Changing “1” -> “0” is fast. ★ Changing “0” -> “1” is possible only by erase. ✦ 1st write: “1111” -> “1110”. Block marked as “written”. ✦ 2nd write: even “1110” -> “1010” is not possible.
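The write-once-then-erase rule on the slide above can be sketched as a toy model in Python. This is an illustration only, not how any real controller is implemented: the class name `NandBlock`, the 4-bit "pages" and the method names are all assumptions made for the example.

```python
class NandBlock:
    """Toy model of one NAND erase block made of small write-once pages."""

    def __init__(self, pages: int = 4, bits_per_page: int = 4):
        self.erased_value = (1 << bits_per_page) - 1   # all ones, e.g. 0b1111
        self.pages = [self.erased_value] * pages
        self.written = [False] * pages
        self.erase_count = 0

    def program(self, page: int, value: int) -> None:
        """Write a page once. Starting from all ones, this only clears bits."""
        if self.written[page]:
            # Matches the slide: even 1110 -> 1010 is refused once written.
            raise ValueError("page already written; erase the block first")
        if not 0 <= value <= self.erased_value:
            raise ValueError("value does not fit in a page")
        self.pages[page] = value
        self.written[page] = True

    def erase(self) -> None:
        """Slow batch operation: reset every page in the block to all ones."""
        self.pages = [self.erased_value] * len(self.pages)
        self.written = [False] * len(self.pages)
        self.erase_count += 1


block = NandBlock()
block.program(0, 0b1110)          # 1st write: 1111 -> 1110
try:
    block.program(0, 0b1010)      # 2nd write to the same page is rejected
    second_write_ok = True
except ValueError:
    second_write_ok = False

block.erase()                     # the only way back to 1111
block.program(0, 0b1010)
print(second_write_ok, bin(block.pages[0]), block.erase_count)  # False 0b1010 1
```

Every rewrite costs a whole-block erase in this model, which is why real devices redirect writes to fresh blocks and clean up in the background.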
  • 10. Erase Challenges ★ Erase is slow. ✦ You want to erase many blocks in a single “flash”. ✦ Block management. ★ [via software] When you write, the card never writes to the same block. ★ A background process runs garbage collection.
  • 11. Erase Lifecycle ★ SLC ~100K times per cell (may vary). ★ MLC ~10K times per cell (may vary). ★ For many this is a major point of discussion. ✦ How big an issue it is depends a lot on firmware. ✦ Many cells and even distribution of writes (“wear levelling”) give a lifetime of a couple of years under heavy workload.
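A back-of-the-envelope version of the lifetime argument above, as a Python sketch. It assumes perfectly even wear levelling, and the workload figures (2,000 GB written per day, a write amplification factor of 2) are hypothetical values chosen for illustration, not measurements from the talk. Only the erase-cycle counts and card capacities come from the slides.

```python
def lifetime_years(capacity_gb: float, erase_cycles: int,
                   writes_gb_per_day: float,
                   write_amplification: float = 1.0) -> float:
    """Rough device lifetime, assuming writes are spread evenly over all cells."""
    total_writable_gb = capacity_gb * erase_cycles
    physical_gb_per_day = writes_gb_per_day * write_amplification
    return total_writable_gb / physical_gb_per_day / 365


# Hypothetical heavy workload: 2,000 GB/day of host writes, amplified 2x.
mlc = lifetime_years(320, 10_000, 2_000, write_amplification=2.0)
slc = lifetime_years(160, 100_000, 2_000, write_amplification=2.0)
print(round(mlc, 2))   # 2.19
print(round(slc, 2))   # 10.96
```

Under these assumed inputs the MLC card lands at "a couple of years", in line with the slide. Real results depend heavily on firmware and on how much spare capacity the device reserves.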
  • 12. Write degradation ★ Expected. ✦ The fuller the device, the harder it is to garbage collect. ★ [Graph: Fusion-io 320G MLC card.]
  • 13. Firmware Really Matters (1) ★ I would expect even less flat performance on a cheaper, non-enterprise class of hardware. ✦ Come to my talk on Friday. ✦ I will tell you consistency of performance is more important than anything else.
  • 14. Firmware Really Matters (2) ★ Many revisions of firmware for each vendor. ✦ Important to compare apples-to-apples in any comparison. ✦ I heard a rumour that one large SSD vendor is on their 4th successful complete ground-up implementation ;)
  • 15. Agenda ★ Introduction. ★ A look at the current market. ★ Applications.
  • 16. The current market (1) ★ Fusion-io. ✦ Established player with a large product line. ✦ Enjoyed a near-monopoly for a while as the only PCI card vendor. ★ Virident. ✦ Previously a MySQL appliance vendor. ✦ Switched business model in ~2010 to just ship PCI flash cards. ✦ Very good, consistent results.
  • 17. The current market (2) ★ Intel/OCZ/other. ✦ Typically aim for the pro-desktop market. ✦ Do not necessarily offer the same features/promises as the “enterprise hardware”...
  • 18. You pay more for... ★ Greater amount of over-provisioning (more consistent). ★ Internal redundancy (aka RAID). ★ More complex firmware (more consistent). ★ Guarantee of durability (such as a capacitor). ★ Greater life-span (more write cycles). ★ Better performance (many more IOPS).
  • 19. Fusion-io
  • 20. Performance Specification ★ 160G SLC ✦ 110K read IOPS (4K). ✦ 26us read latency. ★ 320G MLC ✦ 71K read IOPS. ✦ 41us read latency. ★ “Duo” range (not covered). ★ Lifetime: ✦ SLC flash @ 40% write duty | 25 calendar years ✦ MLC flash @ 20% write duty | 10 calendar years ✦ MLC flash @ 40% write duty | 5 calendar years
  • 21. Fusion-io Overview ★ Fast. Very fast. ✦ Cheaper than disks in terms of $ per IOPS. ★ PCI-E - closest to CPU. ★ Durability. ★ Shares host memory / CPU. ★ Most complex part - firmware. ★ Large amount of space reservation for heavy writes.
  • 22. Fusion-io drawbacks ★ Expensive. Let’s say “$6000+” (retail; your price may be less). ✦ For full performance, requires an additional 25% space reservation. ✦ DRAM is actually probably cheaper per GB. ★ PCI-E is not hot swap. ✦ Also has potential for errors (when the host fails, garbage keeps being sent. Fusion-io handles this well.)
  • 23. Fusion-io durability ★ Cache is located on the host system. ★ “Transaction log” to prevent lost data. ✦ Crash recovery.
  • 24. Fusion-io read performance 160GB SLC card. 8 threads: 33K IOPS (525MB/sec), 0.28 ms 95% response time. RAID 10 is Dell Perc 6i on 8 disks, 2.5” 15K RPM SAS.
  • 25. Fusion-io write performance ★ 8 threads: 20K IOPS (314MB/sec), 0.26 ms 95% response time.
  • 26. Fusion-io databases ★ Many read / write threads to utilize throughput. ★ “MySQL” is not able to fully use it. ✦ Better in 5.5, MySQL-5.1-plugin, XtraDB. ★ InnoDB IO path “needs work”.
  • 27. Virident TachIOn
  • 28. Virident ★ PCI interface. ★ Has NAND flash upgrade modules. ★ Good stable results. ★ Advertised 300,000 IOPS in 75:25 (read:write).
  • 29. Virident Options ★ 300G, 400G, 600G, 800G SLC cards. ✦ 400G is $13,600. ★ (More or less the same price range as Fusion-io).
  • 30. 2010 Benchmarks: http://www.mysqlperformanceblog.com/2010/06/15/virident-tachion-new-player-on-flash-pci-e-cards-market/
  • 31. Intel SSDs
  • 32. Intel SSDs ★ Were awesome in 2008. ✦ Many accolades, probably the first SSDs that made sense for a lot of pro-desktop users. ★ A couple of iterations of firmware, but mostly Intel treated customers like mushrooms for 2 years. ✦ No clear advance warning of road map. ✦ Finally a replacement 510 series announced last month. • Slides don’t feature these. Have not used them.
  • 33. Intel Overview ★ SATA form factor. ★ Intel X25-M Gen I (50nm) & Gen II (34nm). ✦ MLC ★ Intel X25-E (50nm) ✦ SLC ✦ “Enterprise”. ★ New 510 series - just released last month.
  • 34. X25-E ★ 32G / 64G ★ Throughput: 35K IOPS reads, 3.5K IOPS writes. ★ Latency: 75us reads, 85us writes. ★ 64G - $725 ✦ $11/GB ★ Write endurance: ✦ 1 petabyte of random writes (32G) ✦ 2 petabytes of random writes (64G)
  • 35. X25-M Gen II ★ 80G / 160G ★ Throughput: 35K IOPS reads, 6.5K / 8.5K IOPS writes. ★ Latency: 65us reads, 85us writes. ★ 160GB - $415 ✦ ~$3 / GB ★ Write endurance: ✦ Not mentioned in official specification.
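The per-GB prices quoted on the slides can be checked with a few lines of Python. The prices and capacities below are the ones stated in the talk (Virident 400G at $13,600, X25-E 64G at $725, X25-M 160GB at $415); the dictionary layout and the helper name are just for illustration.

```python
# (price in USD, capacity in GB) as quoted on the slides.
DEVICES = {
    "Virident 400G SLC": (13_600, 400),
    "Intel X25-E 64G": (725, 64),
    "Intel X25-M Gen II 160GB": (415, 160),
}


def dollars_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Simple cost per gigabyte."""
    return price_usd / capacity_gb


for name, (price, capacity) in DEVICES.items():
    print(f"{name}: ${dollars_per_gb(price, capacity):.2f}/GB")
# Virident 400G SLC: $34.00/GB
# Intel X25-E 64G: $11.33/GB
# Intel X25-M Gen II 160GB: $2.59/GB
```

The gap of roughly an order of magnitude between the SATA drives and the PCI cards is what the "You pay more for..." slide is accounting for.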
  • 36. X25-E and X25-M ★ Even if “E” is enterprise - power loss means data loss. ✦ Loss of transactions. ★ You can disable the write cache, but performance is woeful.
  • 37. X25 Deployments ★ RAID ✦ Software / hardware? ✦ Level 0? 1? 10? 5? 50? ★ Engineering process could be complicated and expensive. ✦ There are/were ready solutions (Schooner[1], Gear6[2], Cisco servers). [1] Changed business model recently. [2] Went broke.
  • 38. Agenda ★ Introduction. ★ A look at the current market. ★ Applications.
  • 39. MySQL Specific (1) ★ SSD is very good at random reads. ✦ Not so good at sequential writes! ★ Data files on SSD. ✦ Table files (*.ibd). ✦ Rollback segments (ibdata1). ★ Logs on RAID with BBU. ✦ Binary logs. ✦ Transaction logs. ✦ Double write buffer. ✦ Insert buffer. ✦ Slow log, error log, general log. See: http://yoshinorimatsunobu.blogspot.com/2009/05/tables-on-ssd-redobinlogsystem.html
  • 40. MySQL Specific (2) ★ Buy memory, or buy SSDs? ✦ [Usually] Buy memory when it’s possible.
  • 41. Other Reasons to use Flash (1) ★ Server consolidation. ✦ Hard drives do ~100-200 IOPS.* ✦ Now one card can get 100K (theoretical)! ✦ ~2x - 10x reduction in servers in many cases (see Craigslist). * Assuming no RAID controller performing additional merging.
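The consolidation argument above is simple arithmetic. A Python sketch using only the slide's figures (100-200 IOPS per hard drive, a theoretical 100K IOPS per card) follows; the function name is made up for the example.

```python
import math


def disks_to_match(card_iops: int, iops_per_disk: int) -> int:
    """Number of hard drives whose combined random IOPS equal one flash card."""
    return math.ceil(card_iops / iops_per_disk)


print(disks_to_match(100_000, 200))  # 500
print(disks_to_match(100_000, 100))  # 1000
```

The card figure is a theoretical peak, so this is an upper bound. It still shows why replacing spindles with flash can shrink the server count even when real-world gains are closer to the 2x-10x quoted.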
  • 42. Other Reasons to use Flash (2) ★ Power consumption reduction. ✦ “Transactions per watt” dramatically better. • See: http://www.percona.com/files/percona-live/jeremy-Craigslist.pptx.pdf ✦ Important for a large number of people. Even if power is cheap, colo facilities often limit power availability per rack.
  • 43. Other Reasons to use Flash (3) ★ Limit variance / risk of operational issues from cold starts. ✦ Easy to see something like an advertising network miss response time goals when the aim is 50ms/page. • Each IO is ~10ms. • Follow a few secondary keys to a primary key and you miss it. ★ Good for throughput too.
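The cold-start point above can be made concrete with a short Python sketch: how many dependent (one after another) IOs fit into a 50 ms page budget. The 10 ms disk figure is from this slide and the 50 us flash figure is the speaker's estimate from the latency table earlier in the talk; the function name is hypothetical.

```python
def serial_ios_within(budget_us: int, io_latency_us: int) -> int:
    """How many back-to-back IOs complete inside the response-time budget."""
    return budget_us // io_latency_us


BUDGET_US = 50_000  # 50 ms per page

print(serial_ios_within(BUDGET_US, 10_000))  # 5    (disk, ~10 ms per IO)
print(serial_ios_within(BUDGET_US, 50))      # 1000 (flash, ~50 us per IO)
```

With a cold buffer pool, a handful of secondary-key lookups followed by primary-key lookups already exhausts the budget on disk, while flash leaves ample headroom.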
  • 45. Short Term (1) ★ Multi-threaded IO is required to exploit all the throughput offered. ✦ InnoDB Plugin, MySQL 5.5 ready. ✦ Many other databases are not ready.
  • 46. Short Term (2) ★ Opportunities for multi-level caches when data exceeds SSD size. ✦ See Flashcache (Facebook), ZFS L2ARC, Veritas.
  • 47. Long Term ★ Decades of hard drive assumptions about random IO cost need to be unwound. ✦ For example, InnoDB, Oracle, PostgreSQL work like this...
  • 48. Basic Operation (High Level) [Diagram: Log Files, Tablespace, Buffer Pool] SELECT * FROM City WHERE CountryCode='AUS'
  • 54. Basic Operation (cont.) [Diagram: Log Files, Tablespace, Buffer Pool] UPDATE City SET name = 'Morgansville' WHERE name = 'Brisbane' AND CountryCode='AUS'
  • 58. Basic Operation (cont.) [Diagram: 01010, Log Files, Tablespace, Buffer Pool] UPDATE City SET name = 'Morgansville' WHERE name = 'Brisbane' AND CountryCode='AUS'
  • 62. Long Term (cont.) ★ Examples of “the database is the log” for MySQL are the PBXT and RethinkDB storage engines.
  • 63. Storage Hardware also changes ★ Most of us are used to buying RAID controllers and placing disks below them. ✦ Only a very limited number of RAID controllers understand SSDs. ✦ RAID controllers are used to optimizing IO for devices capable of 100-200 IOPS. ✦ If we look at Fusion-io, the devices also do RAID internally (~RAID 4).
  • 64. Technologies to look at ★ More PCI Express cards. ✦ Potential to lower barrier to entry - only ~2-3 players, competition not as hot as it could be (yet). ★ More enterprise-focused MLC. ✦ Better software (firmware) means more wear levelling, improved performance, etc. ✦ More storage in fewer cells = lower cost. ★ Violin Memory ✦ I am not hands-on familiar with their technology, but they have some very high-end offerings. ✦ Expect more awesome high-end offerings (all vendors).
  • 65. Questions ★ Thank you to ConFoo for letting me speak about such a niche topic! ★ If I’m out of time, please feel free to catch me around.