Achieving 100,000
Transactions Per Second
 with a NoSQL Database



         Eric David Bloch
           @eedeebee

          23 May 2012
A bit about me

• I’ve written software used by millions of people.

    Apps, libraries, compilers, device drivers, operating systems

• This is my third Gluecon and my first talk

• I’ve been the Community Director at MarkLogic for the last 2 years.

• I survived having 3 kids in less than 2 years.
Like me, but not me
Actually, me
Storyline

• Why?
• How?
  • Whirlwind tour of database architecture
  • Techniques to get to 100K
 It’s about money.




 Top 5 bank needed to manage trades
 Trades look more like documents than tables
 Schemas for trades change all the time
 Transactions
 Scale and velocity (“Big Data”)
Trades and Positions


 1 million trades per day
 Followed by 1 million position reports at end of day
    Roll up trades of current date for each “book, instrument” pair
    Group-by, with key = “date, book, instrument”

[Figure: timeline of Day 1, Day 2, Day 3, Day 4, Day 5, ..., labeled with 1M trades and 1M positions]
Trades and Positions

<trade>
   <quantity>8540882</quantity>
   <quantity2>1193.71</quantity2>
   <instrument>WASAX</instrument>
   <book>679</book>
   <trade-date>2011-03-13-07:00</trade-date>
   <settle-date>2011-03-17-07:00</settle-date>
</trade>


<position>
  <instrument>EAAFX</instrument>
  <book>679</book>
  <quantity>3</quantity>
  <business-date>2011-03-25Z</business-date>
  <position-date>2011-03-24Z</position-date>
</position>
Requirements

   NoSQL flexibility, performance & scale

   Enterprise-grade transactional guarantees
Now show us

• 15K inserts per second
• Linear scalability
What NoSQL Database out there can we use?




                     A quote from the bank
“We threw everything we had at MarkLogic and it didn’t break a sweat”
We weren’t content with 15K


So we showed them…




Slide 13   Copyright © 2012 MarkLogic® Corporation. All rights reserved.
What is MarkLogic?


 Non-relational, document-oriented, distributed
  database
            Shared nothing clustering, linear scale out
            Multi-Version Concurrency Control (MVCC)
            Transactions (ACID)


 Search Engine
               Web scale (Big Data)
               Inverted indexes (term lists)
               Real-time updates
               Composable queries

Architecture



 Data model
 Indexing
 Clustering
 Query execution
 Transactions



Key (URI)                    Value (Document)

/trade/153748994             <trade>
                              <id>8</id>
                              <time>2012-02-20T14:00:00</time>
                              <instrument>BYME AAA</instrument>
                              <price cur="usd">600.27</price>
                             </trade>



/user/eedeebee               {
                                 "name" : "Eric Bloch",
                                 "age" : 47,
                                 "hair" : "gray",
                                 "kids" : [ "Grace", "Ryan", "Owen" ]
                             }

/book5293                    It was the best of times, it was the worst of times, it
                             was the age of wisdom, ...


/2012-02-20T14:47:53/01445   .mp3
                             .avi
                             [your favorite binary format]
Inverted Index



Term                          Document References

"which"                       123, 127, 129, 152, 344, 791 . . .
"uniquely"                    122, 125, 126, 129, 130, 167 . . .
"identify"                    123, 126, 130, 142, 143, 167 . . .
"each"                        123, 130, 131, 135, 162, 177 . . .
"uniquely identify"           126, 130, 167, 212, 219, 377 . . .
<article>                     ...
article/abstract/@author      ...

<product>IMS</product>
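
The term lists above can be sketched in a few lines of Python. This is a minimal illustration of the general inverted-index idea, not MarkLogic's implementation: the names `build_index` and `docs_with_all` and the sample documents are made up for the example. Note that the slide stores the phrase "uniquely identify" as its own term; intersecting the two single-word lists, as done here, only yields candidate documents that contain both words.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to a sorted list of document ids (a term list)."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            postings[word].add(doc_id)
    return {word: sorted(ids) for word, ids in postings.items()}

def docs_with_all(index, words):
    """AND query: intersect the term lists of every word."""
    lists = [set(index.get(word, [])) for word in words]
    return sorted(set.intersection(*lists)) if lists else []

# Hypothetical documents, keyed by document id.
docs = {
    122: "uniquely shaped",
    123: "which keys identify each row",
    126: "keys uniquely identify rows",
    130: "ids uniquely identify each trade",
}
index = build_index(docs)
both = docs_with_all(index, ["uniquely", "identify"])
```

Here `both` holds the ids of the documents that contain both words, which is the intersection of the "uniquely" and "identify" term lists.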




Range Index

Rows:

    <trade>
     <trader_id>8</trader_id>
     <time>2012-02-20T14:00:00</time>
     <instrument>IBM</instrument>
     …
    </trade>

    <trade>
     <trader_id>13</trader_id>
     <time>2012-02-20T14:30:00</time>
     <instrument>AAPL</instrument>
     …
    </trade>

    <trade>
     <trader_id>0</trader_id>
     <time>2012-02-20T15:30:00</time>
     <instrument>GOOG</instrument>
     …
    </trade>

Range index:

    Value     Docid          Docid     Value
    0         287            287       0
    8         1129           531       13
    13        531            1129      8
    …         …              …         …

    • Column Oriented
    • Memory Mapped
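
As a rough sketch of the two columns above, a range index can be modeled as the same (value, docid) pairs kept in two sort orders. The class name `RangeIndex` and its methods are hypothetical, and plain Python lists stand in for the memory-mapped columns the slide mentions.

```python
from bisect import bisect_left, bisect_right

class RangeIndex:
    """Toy range index: the same pairs, sorted by value and by docid."""

    def __init__(self, value_by_docid):
        # Value -> Docid column, ordered by value.
        self.by_value = sorted((v, d) for d, v in value_by_docid.items())
        # Docid -> Value column, ordered by docid.
        self.by_docid = sorted(value_by_docid.items())
        self._values = [v for v, _ in self.by_value]
        self._docids = [d for d, _ in self.by_docid]

    def docids_in_range(self, low, high):
        """Docids whose value v satisfies low <= v <= high."""
        start = bisect_left(self._values, low)
        end = bisect_right(self._values, high)
        return [d for _, d in self.by_value[start:end]]

    def value_of(self, docid):
        """Look up the indexed value of one document, or None."""
        i = bisect_left(self._docids, docid)
        if i < len(self._docids) and self._docids[i] == docid:
            return self.by_docid[i][1]
        return None

# The trader_id values and docids shown on the slide.
trader_id = RangeIndex({287: 0, 1129: 8, 531: 13})
```

Because both columns are sorted, a range predicate and a per-document value lookup are each a binary search rather than a scan over documents.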


Shared-Nothing Clustering




[Diagram: hosts D1, D2, D3, …, Dj]




Query Evaluation – “Map”


[Diagram: a query Q arrives at one of the hosts E1, E2, E3, …, Ei and is sent to every host D1, D2, D3, …, Dj, each of which evaluates Q against its forests F1 … Fn]
Query Evaluation – “Reduce”

[Diagram: each host D1, D2, D3, …, Dj returns its own Top 10 from forests F1 … Fn; the E host combines them into the overall Top 10]
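
The "Map" and "Reduce" slides describe a scatter-gather pattern, which might look like the following sketch. The function names, host names, scores, and URIs are all invented for illustration; only the shape (each D host returns its own top results, the E host merges them) comes from the slides.

```python
import heapq

def top_k_on_host(scored_docs, k):
    """'Map' step: a D host returns only its k best (score, uri) pairs."""
    return heapq.nlargest(k, scored_docs)

def merge_top_k(partials, k):
    """'Reduce' step: the E host merges the per-host lists into a global top k."""
    return heapq.nlargest(k, (hit for part in partials for hit in part))

# Hypothetical per-host results of the same query Q.
hosts = {
    "D1": [(0.9, "/trade/1"), (0.2, "/trade/2"), (0.5, "/trade/3")],
    "D2": [(0.8, "/trade/4"), (0.7, "/trade/5")],
    "D3": [(0.1, "/trade/6"), (0.95, "/trade/7")],
}
partials = [top_k_on_host(hits, 2) for hits in hosts.values()]
top = merge_top_k(partials, 2)
```

Each host ships at most k results to the E host, so the amount of data merged grows with the number of hosts, not with the number of matching documents.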
Queries/Updates with MVCC


 Every query has a timestamp
 Documents do not change

    Reads are lock-free
    Inserts – see next slide
    Deletes – mark as deleted
    Edits –
            copy
            edit
            insert the copy
            mark the original as deleted
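
A toy model of these rules, assuming a single global counter as the timestamp source. The class `MVCCStore` and its methods are hypothetical; the sketch only shows why a read at a fixed timestamp needs no locks: document versions are never modified, they only gain a deletion timestamp.

```python
import itertools

class MVCCStore:
    """Toy multi-version store: document contents are never changed in place."""

    def __init__(self):
        self._clock = itertools.count(1)   # assumed global timestamp source
        self._versions = []                # dicts: uri, doc, created, deleted

    def _live(self, uri):
        for version in self._versions:
            if version["uri"] == uri and version["deleted"] is None:
                return version
        raise KeyError(uri)

    def insert(self, uri, doc):
        ts = next(self._clock)
        self._versions.append(
            {"uri": uri, "doc": doc, "created": ts, "deleted": None})
        return ts

    def delete(self, uri):
        ts = next(self._clock)
        self._live(uri)["deleted"] = ts    # mark as deleted, keep the data
        return ts

    def edit(self, uri, change):
        original = self._live(uri)
        copy = dict(original["doc"])       # copy
        change(copy)                       # edit
        ts = self.insert(uri, copy)        # insert the copy
        original["deleted"] = ts           # mark the original as deleted
        return ts

    def read(self, uri, timestamp):
        """Return the version visible at `timestamp`; takes no locks."""
        for version in self._versions:
            visible = version["created"] <= timestamp and (
                version["deleted"] is None or version["deleted"] > timestamp)
            if version["uri"] == uri and visible:
                return version["doc"]
        return None

store = MVCCStore()
t1 = store.insert("/trade/1", {"quantity": 10})
t2 = store.edit("/trade/1", lambda doc: doc.update(quantity=25))
t3 = store.delete("/trade/1")
```

A query that started at `t1` keeps seeing quantity 10 even after the edit and the delete have committed.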

Insert Mechanics


1)         New URI+Document arrive at E-node
2)         URI Probe – determine whether URI exists in any forest
3)         URI Lock – write lock is taken on D node(s)
4)         Forest Assignment – URI is deterministically placed in Forest
5)         Indexing
6)         Journaling
7)         Commit – transaction complete
8)         Release URI Locks – D node(s) are notified to release lock




Save-and-merge
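       (Log-Structured Merge Tree)

[Diagram: an Insert/Update is journaled (J) on disk and written to S0 in memory; Save writes the in-memory data to disk; Merge combines on-disk S1, S2 into S3. The database consists of Forest 1 and Forest 2, each with on-disk S1, S2, S3.]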
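
A minimal sketch of the save-and-merge cycle in the diagram, with heavy simplifications: Python dicts and lists stand in for memory and disk, and the class name `Forest`, the `memory_limit` parameter, and the save trigger are assumptions made for the example.

```python
class Forest:
    """Toy save-and-merge forest: one in-memory S0 plus immutable saved units."""

    def __init__(self, memory_limit=2):
        self.memory_limit = memory_limit   # assumed trigger for a save
        self.journal = []                  # J: append-only log of every write
        self.in_memory = {}                # S0
        self.on_disk = []                  # S1, S2, ... oldest first (simulated)

    def insert(self, uri, doc):
        self.journal.append((uri, doc))    # journal first, then apply
        self.in_memory[uri] = doc
        if len(self.in_memory) >= self.memory_limit:
            self.save()

    def save(self):
        """Write S0 out as a new on-disk unit and start a fresh S0."""
        if self.in_memory:
            self.on_disk.append(dict(self.in_memory))
            self.in_memory = {}

    def merge(self):
        """Combine all on-disk units into one; newer entries win."""
        merged = {}
        for unit in self.on_disk:          # oldest first, so newer overwrite
            merged.update(unit)
        self.on_disk = [merged] if merged else []

    def get(self, uri):
        if uri in self.in_memory:
            return self.in_memory[uri]
        for unit in reversed(self.on_disk):  # newest first
            if uri in unit:
                return unit[uri]
        return None

forest = Forest(memory_limit=2)
for uri, doc in [("/a", 1), ("/b", 2), ("/c", 3), ("/a", 4), ("/d", 5)]:
    forest.insert(uri, doc)
units_before_merge = len(forest.on_disk)
forest.merge()
```

Writes only ever append (to the journal and to S0), and the cost of consolidating data is deferred to the background merge.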

Back to
the money
Trades and Positions

 1 million trades per day
 followed by 1 million position reports at end of day
        Roll up trades of the current “date” for each “book:instrument” pair
        Group-by, with key = “book:date:instrument”


[Figure: timeline of Day 1, Day 2, Day 3, Day 4, Day 5, ..., labeled with 1M trades and 1M positions]


Naive Query Pseudocode



for each book
  for each instrument in that book
    position = position(yesterday, book, instrument)
    for each trade of that instrument in this book
      position += trade(today, book, instrument).quantity
    insert(today, book, instrument, position)
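
For reference, the pseudocode can be written as runnable Python over in-memory data. The function name `naive_positions`, the dictionary shapes, and the sample values are assumptions; in the real system each inner step would be a database query or insert.

```python
def naive_positions(books, yesterday, trades_today):
    """Roll up today's trades per (book, instrument), one pair at a time.

    books:        {book: [instrument, ...]}
    yesterday:    {(book, instrument): position}
    trades_today: [{"book": ..., "instrument": ..., "quantity": ...}, ...]
    """
    positions = {}
    for book, instruments in books.items():
        for instrument in instruments:
            position = yesterday.get((book, instrument), 0)
            # One lookup over today's trades per (book, instrument) pair.
            for trade in trades_today:
                if trade["book"] == book and trade["instrument"] == instrument:
                    position += trade["quantity"]
            # One insert per pair.
            positions[(book, instrument)] = position
    return positions

# Hypothetical sample data.
books = {679: ["WASAX", "EAAFX"]}
yesterday = {(679, "WASAX"): 100}
trades_today = [
    {"book": 679, "instrument": "WASAX", "quantity": 5},
    {"book": 679, "instrument": "EAAFX", "quantity": 3},
    {"book": 679, "instrument": "WASAX", "quantity": 7},
]
positions = naive_positions(books, yesterday, trades_today)
```

The number of lookups grows with the number of (book, instrument) pairs, which is the cost the later techniques aim to avoid.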




Initial results




Single node – 19,000 inserts per second




Initial results with cluster


[Chart: Report Query 2 throughput in doc/sec (axis 0 to 70000) versus # of nodes (1DE, 2DE, 3DE)]




Techniques to get to 100K


 Query for Computing New Positions
            Materialized compound key, Co-Occurrence Query and Aggregation
 Insert of New Positions
            Batching
            In-Forest Eval




Materializing a compound key

<trade>
   <quantity2>1193.71</quantity2>
   <instrument>WASAX</instrument>
   <book>679</book>
   <trade-date>2011-03-13-07:00</trade-date>
   <settle-date>2011-03-17-07:00</settle-date>
</trade>




Materializing a compound key

<trade>
   <roll-up book-date-instrument="151333445566782303"/>
   <quantity2>1193.71</quantity2>
   <instrument>WASAX</instrument>
   <book>679</book>
   <trade-date>2011-03-13-07:00</trade-date>
   <settle-date>2011-03-17-07:00</settle-date>
</trade>
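
One way such a key could be materialized at load time is sketched below. The slides do not say how the key value is computed, so the hash (the first 8 bytes of SHA-256 over "book:date:instrument") and the function names are assumptions; only the `roll-up` element and its `book-date-instrument` attribute come from the slide.

```python
import hashlib
import xml.etree.ElementTree as ET

def roll_up_key(book, trade_date, instrument):
    """Hash 'book:date:instrument' into one unsigned 64-bit integer.

    The hash choice is an assumption made for this sketch.
    """
    raw = f"{book}:{trade_date}:{instrument}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(raw).digest()[:8], "big")

def decorate(trade_xml):
    """Insert a <roll-up> element carrying the compound key into a trade."""
    trade = ET.fromstring(trade_xml)
    key = roll_up_key(trade.findtext("book"),
                      trade.findtext("trade-date"),
                      trade.findtext("instrument"))
    trade.insert(0, ET.Element("roll-up", {"book-date-instrument": str(key)}))
    return trade

trade = decorate(
    "<trade>"
    "<quantity>8540882</quantity>"
    "<instrument>WASAX</instrument>"
    "<book>679</book>"
    "<trade-date>2011-03-13-07:00</trade-date>"
    "</trade>"
)
key_text = trade[0].get("book-date-instrument")
```

With the three fields folded into one value, a single range index on the attribute can serve as the group-by key.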




Co-Occurrence and
       Distributed Aggregation
<trade>
 <roll-up book-date-instrument="151373445566703"/>
 <quantity>8540882</quantity>
 <instrument>WASAX</instrument>
 <book>679</book>
 <trade-date>2011-03-13-07:00</trade-date>
 <settle-date>2011-03-17-07:00</settle-date>
</trade>

[Diagram: two range indexes, each shown as value-to-docid (V, D) and docid-to-value (D, V) columns; the roll-up key 15137… and the quantity 8540882 both reference docid 1129]

• Co-occurrences:
     Find pairings of range indexed values
• Aggregate on the D nodes (Map/Reduce):
     Sum up the quantities above
• Similar to a Group-by,
     • in a column-oriented, in-memory database
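
The co-occurrence and aggregation steps can be approximated as follows. This is a sketch of the idea only: the function names, key strings, and all data other than docid 1129 and quantity 8540882 are invented, and dictionaries stand in for the docid-to-value columns of the two range indexes.

```python
from collections import Counter

def local_sums(key_by_docid, quantity_by_docid):
    """On one D node: pair key and quantity values by docid, sum per key."""
    sums = Counter()
    for docid, key in key_by_docid.items():
        if docid in quantity_by_docid:     # both values occur in the same doc
            sums[key] += quantity_by_docid[docid]
    return sums

def merge_sums(partials):
    """On the E node: add up the partial sums from every D node."""
    total = Counter()
    for partial in partials:
        total.update(partial)              # Counter.update adds values
    return total

# Hypothetical index contents on two D nodes.
d1 = local_sums({1129: "k1", 1130: "k1", 1131: "k2"},
                {1129: 8540882, 1130: 100, 1131: 5})
d2 = local_sums({2001: "k1"}, {2001: 18})
totals = merge_sums([d1, d2])
```

No document is read during the roll-up: only index columns are touched, and each D node returns one number per key.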


Initial results with new query




      > 30K inserts/second on a single node




Co-Occurrence + Aggregate versus
Naïve approach

[Chart: Report Query 3 and Report Query 2 throughput in doc/sec (axis 0 to 70000) versus # of nodes (1DE, 2DE, 3DE)]




Techniques


 Computing Positions
            Materialized compound key, Co-Occurrence Query and Aggregation
 Updates
            Batching
            In-Forest Eval




Transaction Size and
Throughput

[Chart: insert throughput in #docs per sec. (axis 0 to 35000) versus # doc inserts per transaction (1, 2, 4, 8, 16, 32, 100, 500, 1000, 10000)]
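
Batching means grouping several document inserts into one transaction so that the per-transaction overhead is paid once per batch. The sketch below shows only the mechanics and makes no claim about throughput: `batches` is a hypothetical helper and `CountingDatabase` is a stand-in that merely counts commits, not a real client API.

```python
from itertools import islice

def batches(docs, size):
    """Yield lists of at most `size` (uri, document) pairs."""
    iterator = iter(docs)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk

class CountingDatabase:
    """Stand-in for a database client; it only counts commits."""

    def __init__(self):
        self.docs = {}
        self.commits = 0

    def insert_many(self, pairs):
        self.docs.update(pairs)   # all documents of one transaction
        self.commits += 1         # one commit per transaction

db = CountingDatabase()
positions = ((f"/position/{i}", {"quantity": i}) for i in range(1000))
for batch in batches(positions, 100):
    db.insert_many(batch)
```

With 100 documents per transaction, 1000 inserts cost 10 commits instead of 1000.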




Techniques


 Computing Positions (Data Warehouse)
            Hash Key Decoration
            Co-Occurrence
 Updates (Transaction)
            Batching
            In-Forest Eval




Insert Mechanics


1)         New URI+Document arrive at E-node
2)         *URI Probe – determine whether URI exists in any forest
3)         *URI Lock – write locks are created
4)         Forest Assignment – URI is deterministically placed in Forest
5)         Indexing
6)         Journaling
7)         Commit – transaction complete
8)         *Release URI Locks – D nodes are notified to release lock



* Overhead of these operations increases with cluster size


Deterministic Placement


4) Forest Assignment – URI is deterministically placed in Forest

  URI → Hash function → 64-bit number → Bucketed into a Forest → Fi




  • Done in C++ within server
  • But…
       • Can also be done in the client
       • Server allows queries to be evaluated against only one
       forest…
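
A client-side version of this placement step might look like the sketch below. The server's actual hash function and bucketing scheme are not given in the slides, so SHA-256 truncated to 64 bits and a simple modulo are stand-ins, and the function and forest names are hypothetical. A real client would have to reproduce the server's algorithm exactly.

```python
import hashlib

def uri_hash64(uri):
    """Hash a URI to an unsigned 64-bit number (stand-in hash function)."""
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def assign_forest(uri, forests):
    """Bucket the 64-bit number into one of the forests (stand-in: modulo)."""
    return forests[uri_hash64(uri) % len(forests)]

forests = ["F1", "F2", "F3", "F4"]
target = assign_forest("/trade/153748994", forests)
```

The property that matters is determinism: the same URI always maps to the same forest, so the client and the server agree on Fi without communicating.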




In-Forest Eval


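[Diagram: query Q passes from E to D2 only, where it is evaluated against a single forest; D1 holds forests F1 and F2, D2 holds forests F3 and F4]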



In-Forest Eval Insert Mechanics


1) New URI+Document arrive at E-node
           a)      Compute Fi using the hash function
           b)      Ask server to evaluate the insert query only against Fi
2)         URI Probe – Fi Only
3)         URI Lock – Fi Only
4)         Forest Assignment – Fi Only
5)         Indexing
6)         Journaling
7)         Commit – transaction complete
8)         Lock Release - Fi Only
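
To make the difference concrete, the toy model below counts how many forests take part in the probe and lock steps of each style of insert. Everything here is an assumption for illustration: the class `ToyCluster`, the stand-in hash, and the simple per-forest counter, which ignores network and disk costs entirely.

```python
import hashlib

def forest_for(uri, names):
    """Stand-in for deterministic placement (not the server's real hash)."""
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    return names[int.from_bytes(digest[:8], "big") % len(names)]

class ToyCluster:
    def __init__(self, names):
        self.names = list(names)
        self.forests = {name: {} for name in self.names}
        self.forests_touched = 0   # forests involved in probe/lock steps

    def regular_insert(self, uri, doc):
        # URI probe and URI lock involve every forest in the database.
        self.forests_touched += len(self.names)
        self.forests[forest_for(uri, self.names)][uri] = doc

    def in_forest_insert(self, uri, doc):
        target = forest_for(uri, self.names)   # computed up front
        self.forests_touched += 1              # probe and lock: Fi only
        self.forests[target][uri] = doc

names = ["F1", "F2", "F3", "F4"]
regular, in_forest = ToyCluster(names), ToyCluster(names)
for i in range(10):
    regular.regular_insert(f"/position/{i}", {"quantity": i})
    in_forest.in_forest_insert(f"/position/{i}", {"quantity": i})
```

Both clusters end up with identical contents, but the regular path touches every forest per insert while the in-forest path touches one, regardless of how many forests the cluster has.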



Regular Insert Vs. In-Forest Eval


[Chart: regular insert and In-Forest Eval throughput in docs/second (axis 0 to 70000) versus # of nodes (1DE, 2DE, 3DE)]




Tada! Achieving 100K Updates




                                                                           In-Forest Eval




NoSQL document-oriented database
with real-time full-text search and transactions
    Doing 100K transactions per second




                                     (Yes that’s marker felt)
Thank You!

@eedeebee
eric.bloch@marklogic.com





Data Cloud, More than a CDP by Matt Robison
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

Real World Big Data, 100,000 transactions a second in a NoSQL Database

  • 1. Achieving 100,000 Transactions Per Second with a NoSQL Database Eric David Bloch @eedeebee 23 May 2012
  • 2. A bit about me • I’ve written software used by millions of people. Apps, libraries, compilers, device drivers, operating systems • This is my third Gluecon and my first talk • I’m the Community Director at MarkLogic, last 2 years. • I survived having 3 kids in less than 2 years.
  • 3. Like me, but not me
  • 5. Storyline
      • Why?
      • How?
          • Whirlwind tour of database architecture
          • Techniques to get to 100K
  • 6.
  • 7. It’s about money.
      • Top 5 bank needed to manage trades
      • Trades look more like documents than tables
      • Schemas for trades change all the time
      • Transactions
      • Scale and velocity (“Big Data”)
  • 8. Trades and Positions
      • 1 million trades per day
      • Followed by 1 million position reports at end of day
          • Roll up trades of current date for each “book, instrument” pair
          • Group-by, with key = “date, book, instrument”
      • [Figure: timeline of Day 1 through Day 5 and onward, labeled “1M trades” and “1M positions”]
  • 9. Trades and Positions
      <trade>
        <quantity>8540882</quantity>
        <quantity2>1193.71</quantity2>
        <instrument>WASAX</instrument>
        <book>679</book>
        <trade-date>2011-03-13-07:00</trade-date>
        <settle-date>2011-03-17-07:00</settle-date>
      </trade>
      <position>
        <instrument>EAAFX</instrument>
        <book>679</book>
        <quantity>3</quantity>
        <business-date>2011-03-25Z</business-date>
        <position-date>2011-03-24Z</position-date>
      </position>
  • 10. Requirements
      • NoSQL flexibility, performance & scale
      • Enterprise-grade transactional guarantees
  • 11. Now show us • 15K inserts per second • Linear scalability
  • 12. What NoSQL Database out there can we use? A quote from the bank “We threw everything we had at MarkLogic and it didn’t break a sweat”
  • 13. We weren’t content with 15K So we showed them… Slide 13 Copyright © 2012 MarkLogic® Corporation. All rights reserved.
  • 14. What is MarkLogic?
      • Non-relational, document-oriented, distributed database
      • Shared nothing clustering, linear scale out
      • Multi-Version Concurrency Control (MVCC)
      • Transactions (ACID)
      • Search Engine
      • Web scale (Big Data)
      • Inverted indexes (term lists)
      • Real-time updates
      • Composable queries
  • 15. Architecture
      • Data model
      • Indexing
      • Clustering
      • Query execution
      • Transactions
  • 16. Key (URI) → Value (Document)
      • /trade/153748994 → <trade> <id>8</id> <time>2012-02-20T14:00:00</time> <instrument>BYME AAA</instrument> <price cur="usd">600.27</price> </trade>
      • /user/eedeebee → { "name" : "Eric Bloch", "age" : 47, "hair" : "gray", "kids" : [ "Grace", "Ryan", "Owen" ] }
      • /book5293 → It was the best of times, it was the worst of times, it was the age of wisdom, ...
      • /2012-02-20T14:47:53/01445 → .mp3 .avi [your favorite binary format]
  • 17. Inverted Index (term → document references)
      • “which” → 123, 127, 129, 152, 344, 791 . . .
      • “uniquely” → 122, 125, 126, 129, 130, 167 . . .
      • “identify” → 123, 126, 130, 142, 143, 167 . . .
      • “each” → 123, 130, 131, 135, 162, 177 . . .
      • “uniquely identify” → 126, 130, 167, 212, 219, 377 . . .
      • Other indexed terms shown: <article> ..., article/abstract/@author ..., <product>IMS</product> (the list 126, 130, 167, 212, 219, 377 . . . is repeated in this part of the figure)
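The term lists on slide 17 can be illustrated with a minimal Python sketch. It is not MarkLogic's implementation: the tokenizer (lowercase, whitespace split), the sample documents, and the names `build_inverted_index` and `docs_with_all` are assumptions made for illustration. It indexes single words and adjacent word pairs, so a phrase such as "uniquely identify" has its own term list, as on the slide.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map every word and every adjacent word pair to a sorted list of doc ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        words = text.lower().split()
        for word in words:
            index[word].add(doc_id)
        # Two-word phrases get their own term list.
        for first, second in zip(words, words[1:]):
            index[f"{first} {second}"].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def docs_with_all(index, terms):
    """AND query: intersect the term lists of all requested terms."""
    lists = [set(index.get(term, ())) for term in terms]
    return sorted(set.intersection(*lists)) if lists else []


# Hypothetical documents; the ids only echo the style of the slide.
docs = {
    123: "which fields identify each book",
    126: "keys uniquely identify each document",
    129: "which keys uniquely name a row",
    130: "uris uniquely identify each trade",
}
index = build_inverted_index(docs)
phrase_hits = index["uniquely identify"]
and_hits = docs_with_all(index, ["which", "each"])
```

A query then resolves to list lookups and intersections over document ids, without touching the documents themselves.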
  • 18. Range Index
      • Example documents:
          • <trade> <trader_id>8</trader_id> <time>2012-02-20T14:00:00</time> <instrument>IBM</instrument> … </trade>
          • <trade> <trader_id>13</trader_id> <time>2012-02-20T14:30:00</time> <instrument>AAPL</instrument> … </trade>
          • <trade> <trader_id>0</trader_id> <time>2012-02-20T15:30:00</time> <instrument>GOOG</instrument> … </trade>
      • Value → Docid rows: 0 → 287, 8 → 1129, 13 → 531, …
      • Docid → Value rows: 287 → 0, 531 → 13, 1129 → 8, …
      • Column Oriented
      • Memory Mapped
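The two columns on slide 18 can be sketched as a pair of sorted arrays, one ordered by value and one ordered by docid. This is a simplified in-memory illustration using the slide's trader_id numbers; the class name `RangeIndex` and its methods are hypothetical, and the real index is memory-mapped rather than a Python list.

```python
from bisect import bisect_left, bisect_right


class RangeIndex:
    """Two sorted columns over the same (docid, value) pairs."""

    def __init__(self, value_by_docid):
        by_value = sorted((v, d) for d, v in value_by_docid.items())
        by_docid = sorted(value_by_docid.items())
        # Value -> Docid column, sorted by value.
        self.values = [v for v, _ in by_value]
        self.value_docids = [d for _, d in by_value]
        # Docid -> Value column, sorted by docid.
        self.docids = [d for d, _ in by_docid]
        self.docid_values = [v for _, v in by_docid]

    def docids_in_range(self, low, high):
        """Docids whose value lies in [low, high], found by binary search."""
        start = bisect_left(self.values, low)
        end = bisect_right(self.values, high)
        return self.value_docids[start:end]

    def value_of(self, docid):
        """Value stored for a docid, or None if the docid is not indexed."""
        i = bisect_left(self.docids, docid)
        if i < len(self.docids) and self.docids[i] == docid:
            return self.docid_values[i]
        return None


# trader_id values from slide 18, keyed by docid.
trader_ids = RangeIndex({287: 0, 1129: 8, 531: 13})
```

Keeping both orderings lets a range predicate and a per-document value lookup each run as a binary search.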
  • 19. Shared-Nothing Clustering [Figure: hosts D1, D2, D3, … Dj]
  • 20. Query Evaluation – “Map” [Figure: query Q arrives at an E host (E1, E2, E3, … Ei) and is sent to every D host (D1, D2, D3, … Dj); each D host runs Q against its forests F1 … Fn]
  • 21. Query Evaluation – “Reduce” [Figure: each D host (D1, D2, D3, … Dj) returns a Top 10 from its forests F1 … Fn; the E host (E1, E2, E3, … Ei) combines them into the overall Top 10]
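The "Map" and "Reduce" slides describe a scatter-gather pattern, which can be sketched in a few lines of Python. The scores, URIs, and function names are hypothetical, and the sketch runs in one process rather than across hosts.

```python
import heapq


def local_top_k(forests, k):
    """Runs on each D host: best k (score, uri) hits across its forests."""
    hits = (hit for forest in forests for hit in forest)
    return heapq.nlargest(k, hits)


def evaluate(d_hosts, k=10):
    """Runs on the E host: fan the query out, then merge the partial results."""
    partials = [local_top_k(forests, k) for forests in d_hosts]  # "map"
    merged = (hit for partial in partials for hit in partial)
    return heapq.nlargest(k, merged)  # "reduce"


# Two hypothetical D hosts with two forests each; every hit is (score, uri).
d_hosts = [
    [[(0.9, "/a"), (0.1, "/b")], [(0.5, "/c")]],
    [[(0.8, "/d")], [(0.7, "/e"), (0.95, "/f")]],
]
top_two = evaluate(d_hosts, k=2)
```

Each D host sends back at most k hits, so the E host merges a bounded amount of data regardless of how many documents matched.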
  • 22. Queries/Updates with MVCC
      • Every query has a timestamp
      • Documents do not change
      • Reads are lock-free
      • Inserts – see next slide
      • Deletes – mark as deleted
      • Edits –
          • copy
          • edit
          • insert the copy
          • mark the original as deleted
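The MVCC rules on slide 22 can be sketched as a tiny versioned store. This is a single-process illustration under simplifying assumptions (a global counter as the timestamp, no concurrency, no locking); the class `MvccStore` and its methods are hypothetical and are not MarkLogic's API.

```python
class MvccStore:
    """Stored versions are never modified; updates only add versions and mark old ones."""

    def __init__(self):
        self.clock = 0
        self.versions = []  # dicts with keys: uri, doc, created, deleted

    def _tick(self):
        self.clock += 1
        return self.clock

    def _live_version(self, uri):
        for version in self.versions:
            if version["uri"] == uri and version["deleted"] is None:
                return version
        raise KeyError(uri)

    def insert(self, uri, doc):
        ts = self._tick()
        self.versions.append({"uri": uri, "doc": doc, "created": ts, "deleted": None})
        return ts

    def delete(self, uri):
        ts = self._tick()
        self._live_version(uri)["deleted"] = ts  # mark as deleted
        return ts

    def edit(self, uri, change):
        ts = self._tick()
        original = self._live_version(uri)
        copy = dict(original["doc"])  # copy
        change(copy)                  # edit
        self.versions.append({"uri": uri, "doc": copy, "created": ts, "deleted": None})  # insert the copy
        original["deleted"] = ts      # mark the original as deleted
        return ts

    def read(self, uri, timestamp):
        """Lock-free read: return the version visible at the query's timestamp."""
        for version in self.versions:
            deleted = version["deleted"]
            if (version["uri"] == uri and version["created"] <= timestamp
                    and (deleted is None or deleted > timestamp)):
                return version["doc"]
        return None


store = MvccStore()
t1 = store.insert("/trade/1", {"quantity": 5})
t2 = store.edit("/trade/1", lambda doc: doc.update(quantity=7))
t3 = store.delete("/trade/1")
```

A query that started at `t1` keeps seeing quantity 5 even after the edit and the delete, which is why reads need no locks.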
  • 23. Insert Mechanics
      1) New URI+Document arrive at E-node
      2) URI Probe – determine whether URI exists in any forest
      3) URI Lock – write lock is taken on D node(s)
      4) Forest Assignment – URI is deterministically placed in Forest
      5) Indexing
      6) Journaling
      7) Commit – transaction complete
      8) Release URI Locks – D node(s) are notified to release lock
  • 24. Save-and-merge (Log-Structured Merge Tree) [Figure: a database with Forest 1 and Forest 2; inserts/updates go to S0 in memory and are journaled (J); S0 is saved to disk as S1, S2, S3, which are then merged]
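The save-and-merge cycle on slide 24 follows the general log-structured merge pattern, sketched below. The details are assumptions for illustration: segments are plain dicts, the save threshold is a document count, and the journal is truncated on save. The class `Forest` and its method names are hypothetical.

```python
class Forest:
    """Journal first, buffer in memory, save to immutable segments, merge later."""

    def __init__(self, memory_limit=2):
        self.journal = []        # append-only record of unsaved inserts (J)
        self.memory = {}         # in-memory segment (S0)
        self.disk_segments = []  # immutable saved segments (S1, S2, ...), oldest first
        self.memory_limit = memory_limit

    def insert(self, uri, doc):
        self.journal.append((uri, doc))
        self.memory[uri] = doc
        if len(self.memory) >= self.memory_limit:
            self.save()

    def save(self):
        """Write the in-memory segment out as a new immutable segment."""
        self.disk_segments.append(dict(self.memory))
        self.memory = {}
        self.journal.clear()  # assumption: saved data no longer needs the journal

    def merge(self):
        """Combine all saved segments into one; newer values win."""
        merged = {}
        for segment in self.disk_segments:
            merged.update(segment)
        self.disk_segments = [merged]

    def get(self, uri):
        if uri in self.memory:
            return self.memory[uri]
        for segment in reversed(self.disk_segments):  # newest first
            if uri in segment:
                return segment[uri]
        return None


forest = Forest(memory_limit=2)
for uri, doc in [("/a", 1), ("/b", 2), ("/c", 3), ("/a", 10), ("/d", 4)]:
    forest.insert(uri, doc)
segments_before_merge = len(forest.disk_segments)
forest.merge()
```

Writes stay sequential (journal append, segment save), and merging keeps the number of segments a read has to consult small.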
  • 26. Trades and Positions
      • 1 million trades per day
      • followed by 1 million position reports at end of day
          • Roll up trades of the current “date” for each “book:instrument” pair
          • Group-by, with key = “book:date:instrument”
      • [Figure: timeline of Day 1 through Day 5 and onward, labeled “1M trades” and “1M positions”]
  • 27. Naive Query Pseudocode
      for each book
        for each instrument in that book
          position = position(yesterday, book, instrument)
          for each trade of that instrument in this book
            position += trade(today, book, instrument).quantity
          insert(today, book, instrument, position)
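A runnable Python rendering of the naive pseudocode may help show where its cost comes from. The in-memory data structures stand in for database queries and are assumptions; the function name `naive_rollup` is hypothetical.

```python
def naive_rollup(books, yesterday_positions, todays_trades, today):
    """books: {book: [instrument, ...]}; positions are keyed by (book, instrument)."""
    new_positions = []
    for book, instruments in books.items():
        for instrument in instruments:
            position = yesterday_positions.get((book, instrument), 0)
            # One pass over the trades per (book, instrument) pair.
            for trade in todays_trades:
                if trade["book"] == book and trade["instrument"] == instrument:
                    position += trade["quantity"]
            new_positions.append((today, book, instrument, position))
    return new_positions


books = {679: ["WASAX", "EAAFX"]}
yesterday_positions = {(679, "WASAX"): 10}
todays_trades = [
    {"book": 679, "instrument": "WASAX", "quantity": 5},
    {"book": 679, "instrument": "EAAFX", "quantity": 3},
    {"book": 679, "instrument": "WASAX", "quantity": -2},
]
positions = naive_rollup(books, yesterday_positions, todays_trades, "2011-03-13")
```

The inner loop corresponds to a separate trade lookup per (book, instrument) pair, which is the per-pair work that a single grouped aggregation avoids.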
  • 28. Initial results: Single node – 19,000 inserts per second
  • 29. Initial results with cluster [Chart: doc/sec (0 to 70,000) by # of nodes (1DE, 2DE, 3DE); series: Report Query 2]
  • 30. Techniques to get to 100K
      • Query for Computing New Positions
          • Materialized compound key, Co-Occurrence Query and Aggregation
      • Insert of New Positions
          • Batching
          • In-Forest Eval
  • 31. Materializing a compound key
      <trade>
        <quantity2>1193.71</quantity2>
        <instrument>WASAX</instrument>
        <book>679</book>
        <trade-date>2011-03-13-07:00</trade-date>
        <settle-date>2011-03-17-07:00</settle-date>
      </trade>
  • 32. Materializing a compound key
      <trade>
        <roll-up book-date-instrument="151333445566782303"/>
        <quantity2>1193.71</quantity2>
        <instrument>WASAX</instrument>
        <book>679</book>
        <trade-date>2011-03-13-07:00</trade-date>
        <settle-date>2011-03-17-07:00</settle-date>
      </trade>
  • 33. Co-Occurrence and Distributed Aggregation
      <trade>
        <roll-up book-date-instrument="151373445566703"/>
        <quantity>8540882</quantity>
        <instrument>WASAX</instrument>
        <book>679</book>
        <trade-date>2011-03-13-07:00</trade-date>
        <settle-date>2011-03-17-07:00</settle-date>
      </trade>
      • [Figure: value-to-docid (V → D) and docid-to-value (D → V) range index columns for the roll-up key (15137…) and for the quantity (8540882), both pointing at docid 1129]
      • Co-occurrences: find pairings of range-indexed values
      • Aggregate on the D nodes (Map/Reduce): sum up the quantities above
      • Similar to a Group-by in a column-oriented, in-memory database
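The co-occurrence and aggregation step can be sketched as a group-by over two docid-keyed columns, with partial sums computed per D node and combined afterwards. The string key format, the column layout, and the function names are assumptions for illustration; the slides show the materialized key as a number whose derivation is not given.

```python
from collections import Counter


def local_sums(key_by_docid, quantity_by_docid):
    """Runs on each D node: pair key and quantity by docid, then sum per key."""
    sums = Counter()
    for docid, key in key_by_docid.items():
        sums[key] += quantity_by_docid[docid]
    return sums


def cluster_sums(d_nodes):
    """Combine the per-node partial sums into one total per compound key."""
    total = Counter()
    for key_column, quantity_column in d_nodes:
        total.update(local_sums(key_column, quantity_column))
    return dict(total)


# Hypothetical "book:date:instrument" keys; each node holds two columns keyed by docid.
wasax = "679:2011-03-13:WASAX"
eaafx = "679:2011-03-13:EAAFX"
d_nodes = [
    ({1129: wasax, 1130: wasax}, {1129: 8540882, 1130: 100}),
    ({7: wasax, 8: eaafx}, {7: 18, 8: 5}),
]
totals = cluster_sums(d_nodes)
```

Only one partial sum per key leaves each node, so the amount of data crossing the network depends on the number of keys rather than the number of trades.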
  • 34. Initial results with new query: > 30K inserts/second on a single node
  • 35. Co-Occurrence + Aggregate versus Naïve approach [Chart: doc/sec (0 to 70,000) by # of nodes (1DE, 2DE, 3DE); series: Report Query 3, Report Query 2]
  • 36. Techniques
      • Computing Positions
          • Materialized compound key, Co-Occurrence Query and Aggregation
      • Updates
          • Batching
          • In-Forest Eval
  • 37. Transaction Size and Throughput [Chart: insert throughput in #docs per sec. (0 to 35,000) by # doc inserts per transaction (1, 2, 4, 8, 16, 32, 100, 500, 1000, 10000)]
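Batching here means grouping several document inserts into one transaction, so per-transaction work is paid once per batch. A minimal sketch follows; `insert_transaction` is a hypothetical callback standing in for whatever call commits one transaction, and the batch size is a parameter to tune against measurements such as the chart above.

```python
def batches(docs, batch_size):
    """Yield consecutive slices of at most batch_size documents."""
    for start in range(0, len(docs), batch_size):
        yield docs[start:start + batch_size]


def insert_all(docs, batch_size, insert_transaction):
    """Commit one transaction per batch; return the number of transactions."""
    transactions = 0
    for batch in batches(docs, batch_size):
        insert_transaction(batch)
        transactions += 1
    return transactions


committed = []  # stands in for the database: one entry per committed transaction
transaction_count = insert_all(list(range(10)), 4, committed.append)
```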
  • 38. Techniques
      • Computing Positions (Data Warehouse)
          • Hash Key Decoration
          • Co-Occurrence
      • Updates (Transaction)
          • Batching
          • In-Forest Eval
  • 39. Insert Mechanics
      1) New URI+Document arrive at E-node
      2) *URI Probe – determine whether URI exists in any forest
      3) *URI Lock – write locks are created
      4) Forest Assignment – URI is deterministically placed in Forest
      5) Indexing
      6) Journaling
      7) Commit – transaction complete
      8) *Release URI Locks – D nodes are notified to release lock
      * Overhead of these operations increases with cluster size
  • 40. Deterministic Placement
      4) Forest Assignment – URI is deterministically placed in Forest
      • [Figure: URI → hash function → 64-bit number → bucketed into a forest → Fi]
      • Done in C++ within server
      • But…
          • Can also be done in the client
          • Server allows queries to be evaluated against only one forest…
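Deterministic placement (URI, hash function, 64-bit number, bucket) can be sketched on the client side as follows. The slides do not say which hash function or bucketing rule the server uses, so SHA-256 truncated to 64 bits and a modulo bucket are stand-in assumptions; a real client would have to reproduce the server's own function exactly. The name `forest_for_uri` is hypothetical.

```python
import hashlib


def forest_for_uri(uri, forest_count):
    """Map a URI to a forest index in [0, forest_count) deterministically."""
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    number = int.from_bytes(digest[:8], "big")  # 64-bit number
    return number % forest_count                # bucket into a forest


first = forest_for_uri("/trade/153748994", 4)
second = forest_for_uri("/trade/153748994", 4)
```

Because the same URI always maps to the same forest, a client that knows the function can address that forest directly, which is what in-forest eval relies on.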
  • 41. In-Forest Eval [Figure: E node, D nodes D1 and D2, forests F1 to F4; the query Q is sent to a single forest]
  • 42. In-Forest Eval Insert Mechanics
      1) New URI+Document arrive at E-node
          a) Compute Fi using
          b) Ask server to evaluate the insert query only against Fi
      2) URI Probe – Fi Only
      3) URI Lock – Fi Only
      4) Forest Assignment – Fi Only
      5) Indexing
      6) Journaling
      7) Commit – transaction complete
      8) Lock Release – Fi Only
  • 43. Regular Insert Vs. In-Forest Eval [Chart: docs/second (0 to 70,000) by # of nodes (1DE, 2DE, 3DE); series: regular insert, in-forest-eval]
  • 44. Tada! Achieving 100K Updates In-Forest Eval
  • 45. NoSQL document-oriented database with real-time full-text search and transactions Doing 100K transactions per second (Yes that’s marker felt)
  • 46.
  • 47.
  • 48. Thank You! @eedeebee eric.bloch@marklogic.com

Editor's notes

  1. Pain points; three-tiered architecture; housekeeping & availability; straw in a swimming pool
  2. Pain points; three-tiered architecture; housekeeping & availability; straw in a swimming pool
  3. Pain points; three-tiered architecture; housekeeping & availability; straw in a swimming pool
  4. Word and phrase search, boolean search, proximity, wildcarding, stemming, tokenization, decompounding, case-sensitivity options, punctuation-sensitivity options, diacritic-sensitivity options, document quality settings, numerous relevance algorithms, individual term weighting, topic clustering, faceted navigation, custom-indexed fields, and more.