SlideShare ist ein Scribd-Unternehmen logo
1 von 53
NoSQL: The New Normal


Matt Asay (@mjasay)
VP, Corporate Strategy, 10gen
Watch the video with slide
                         synchronization on InfoQ.com!
                      http://www.infoq.com/presentations
                              /nosql-history-future

       InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
  Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Presented at QCon London
                       www.qconlondon.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
 - practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
10gen Overview




       180+ employees                             500+ customers




                                    Offices in New York, Palo Alto, Washington
  Over $81 million in funding       DC, London, Dublin, Barcelona and Sydney


                                2
Global Community

3,800,000+
MongoDB Downloads

47,000+
Online Education Registrants

15,000+
MongoDB User Group Members

14,000+
MongoDB Monitoring Service Users (MMS)

10,000+
Annual MongoDB Days Attendees


                                     3
the past…
is prologue?
Back to the Future?



 What has been will be again,
  what has been done will be done again;
  there is nothing new under the sun.
            (Ecclesiastes 1:9, NIV)




                         5
What Do You Remember about 1969?




                6
nothing? hold that thought




             7
Back to the Future: PreSQL

            •  IBM’s IMS (1969) – Developed
               as part of the Apollo Project
            •  IDS (Integrated Data Store),
               navigational database, 1973
            •  High performance but:
              –  Forced developers to worry
                 about both query design and
                 schema design upfront
              –  Made it hard to change anything
                 mid-stream

                   8
Enter SQL

•  Designed to overcome “PreSQL” deficiencies
  –  Decoupled query design from schema design
  –  Allowed developers to focus on schema design
  –  Could be confident that you could query the data as
     you wanted later


•  30 years of dominance later…




                            9
…the present…
RDBMS Is Like a Spreadsheet




                11
With “Relations” Between Rows




                12
Lots of relations.
 Lots of rows.




         13
It Hides What You’re Really Doing




                  14
It Makes Development Hard




     Code          XML Config        DB Schema




                 Object Relational   Relational
   Application
                     Mapping         Database




                        15
And Makes Things Hard to Change
New
Table                              New
                                  Column
              New
              Table


                           Name    Pet     Phone   Email




          New
         Column




                            3 months later…


                      16
RDBMS Scale = Bigger Computers




     “Clients can also opt to run zEC12 without a raised
     datacenter floor -- a first for high-end IBM mainframes.”

                                 IBM Press Release 28 Aug, 2012



                            17
This Was a Problem for Google

2010 Search Index Size:




                                                                     250,000+ MBP’s == 4.1 miles
100,000,000 GB


New data added per day
100,000+ GB


Databases they could use
0

                       18
                     Source: http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html
And for Facebook

            2010: 13,000,000 queries per second




                    19
And for Facebook

                  2010: 13,000,000 queries per second



                       TPC #1 DB: 504161 tps

TPC Top Results




                          20
And for Facebook

                     2010: 13,000,000 queries per second



                          TPC #1 DB: 504161 tps

TPC Top Results




            Top 10 combined: 1,370,368 tps

                             21
Big Data Changes Everything…
for Everyone
Variety of Data                   Agile Development
•  Unstructured data              •  Iterative
•  Semi-structured                •  Short development
   data                              cycles
•  Polymorphic data               •  New workloads




Volume/Velocity of Data           New Architectures
•  Petabytes of data              •  Horizontal scaling
•  Trillions of records           •  Commodity servers
•  Millions of queries per        •  Cloud computing
   second

                             22
Shift in What We’re Computing




                 23
Living in the Post-transactional Future

                               Order-processing systems largely “done” (RDBMS);
                               primary focus on better search and recommendations
                               or adapting prices on the fly (NoSQL)




Vast majority of its engineering is focused on
recommending better movies (NoSQL), not
processing monthly bills (RDBMS)




                               Easy part is processing the credit card (RDBMS).
                               Hard part is making it location aware, so it knows
                               where you are and what you’re buying (NoSQL)


                                          24
Shift in How We Develop Applications


 “Systems of Engagement are built by front-line
  developers using modern languages who are
 driven by time to market, the need for rapid
deployment and iteration….They value solutions
   that make it easy for them to deploy their
   application code with as little friction as
                possible.” (Forrester 2013)



                       25
Why NoSQL?




  Agile      Scalable   Lower TCO




                27
what does this mean?
Developers Are More Productive




     Code          XML Config        DB Schema




                 Object Relational   Relational
   Application
                     Mapping         Database




                        29
Developers Are More Productive




     Code          XML Config        DB Schema




                 Object Relational   Relational
   Application
                     Mapping         Database




                        30
Developers? They Matter




                 31
some nosql examples
Agility

      RDBMS                   MongoDB




                   {
                       _id : ObjectId("4c4ba5e5e8aabf3"),
                       employee_name: "Dunham, Justin",
                       department : "Marketing",
                       title : "Product Manager, Web",
                       report_up: "Neray, Graham",
                       pay_band: “C",
                       benefits : [
                              {   type : "Health",
                                  plan : "PPO Plus" },
                              {   type :   "Dental",
                                  plan : "Standard" }
                                   ]
                   }




              33
Example

  Serves targeted content to users using MongoDB-
  powered identity system
        Problem                   Why MongoDB                  Results

•  20M+ unique visitors      •  Easy-to-manage        •  Rapid rollout of new
   per month                    dynamic data model       features
                                enables limitless
•  Rigid relational schema      growth, interactive   •  Customized, social
   unable to evolve with                                 conversations
                                content
   changing data types                                   throughout site
   and new features          •  Support for ad hoc
                                                      •  Tracks user data to
                                queries
•  Slow development                                      increase engagement,
   cycles                    •  Highly extensible        revenue


                                         34
Scalability

                       Auto-Sharding




•  Increase capacity as you go
•  Commodity and cloud architectures
•  Improved operational simplicity and cost visibility



                              35
Case Study

  Manages a wide range of content and services
  for its web properties using MongoDB
         Problem                  Why MongoDB                        Results

•  Trouble dealing with a    •  Move from 6 billion rows   •  Significant performance
   huge variety of content      in RDBMS to simplicity        gains despite big
•  MySQL unable to keep         of 1 document                 increase in volume and
   up with performance                                        variety of data
                             •  Automated failover and
   and scalability
   requirements                 ability to add nodes       •  Greater agility, faster
                                without downtime              development iteration
•  Problems compounded
   by integrating            •  “Blazingly fast” query     •  Saved £2m in licenses
   information from T-          performance: “blown           and hardware
   Mobile joint venture         away by [MongoDB’s]
                                performance”
                                          36
Better Total Cost of Ownership (TCO)

Developer/Ops Savings
•  Ease of Use
•  Agile development
•  Less maintenance

Hardware Savings
•  Commodity servers
•  Internal storage (no SAN)
•  Scale out, not up

Software/Support Savings                   DB Alternative
•  No upfront license
•  Cost visibility for usage growth


                                      37
Case Study

  Stores one of world’s largest record repositories and
  searchable catalogues in MongoDB
         Problem                   Why MongoDB                         Results

•  One of world’s largest     •  Fast, easy scalability       •  Will scale to 100s of TB
   record repositories                                           by 2013, PB by 2020
                              •  Full query language
•  Move to SOA required                                       •  Searchable catalogue
   new approach to data       •  Complex metadata
                                                                 of varied data types
   store                         storage
                                                              •  Decreased SW and
•  RDBMS could not            •  Delivers high scalability,
                                                                 support costs
   support centralized data      fast performance, and
   mgt and federation of         easy maintenance, while
   information services          keeping support costs low


                                           38
Performance




 Better Data   In-Memory   In-Place
  Locality      Caching    Updates

                   39
Case Study

  Uses MongoDB to safeguard over 6 billion images
  served to millions of customers
         Problem                   Why MongoDB                     Results

•  6B images, 20TB of         •  JSON-based data          •  5x cost reduction
   data                          model
                                                          •  9x performance
•  Brittle code base on top   •  Agile, high                 improvement
   of Oracle database –          performance, scalable
   hard to scale, add                                     •  Faster time-to-market
   features                   •  Alignment with
                                                          •  Dev cycles in weeks vs.
                                 Shutterfly’s services-
•  High SW and HW costs                                      tens of months
                                 based architecture




                                           40
…the future
Leading Organizations Rely on NoSQL




                  42
NoSQL Adoption




                                                NoSQL
                                                First
                                                Policy
                                 Multiple
                                   NoSQL
                                 NoSQL of
                                   Centre
                                 Projects
                                   Excellence
                 Multiple
                 NoSQL
       First     Projects
       NoSQL
       Project



                            43
NoSQL: The New Normal


                         Document
                         Store Meets
                         Requirements

RDBMSs Meet
Requirements




                         Key/Value or
                         Column Stores
                         Meet Requirements

                44
Is It a Polyglot Future?




                   45
Remember the Long Tail?




                46
It Didn’t Work out Well




                   47
General Purpose, High Performance

                              Database Popularity
                              Jobs, Searches, Mentions, Etc.

               Rank        Database      Database Type    Score     % Change
                   1 Oracle             RDBMS             1567.93       + 8.6
                     Microsoft SQL
                   2 Server             RDBMS             1310.61       - 0.73
                   3 MySQL              RDBMS             1284.78     - 28.89
                   4 Microsoft Access   RDBMS              186.05     + 15.12
                   5 PostgreSQL         RDBMS              183.47         + 16
                   6 DB2                RDBMS              162.05      + 8.39
                   7 MongoDB            Document store     116.14     + 20.01
                   8 Sybase             RDBMS               84.52      + 1.07
                   9 SQLite             RDBMS               81.01      + 0.65
                  10 Solr               Search engine       47.35
                                        Wide column
                  11 Cassandra          store               36.16      + 1.24
                  12 Redis              Key-value store     32.03      + 6.06
                  13 Informix           RDBMS               25.43      + 1.52
                  14 Memcached          Key-value store     23.28      - 1.83
                                        Wide column
                  15 Hbase              store               20.74      + 1.05
                                               Source: DBEngines, Feb 2013

                      48
NoSQL Benefits




Increased Developer Productivity        Better User Experience




     Faster Time to Market                   Lower TCO


                                   49
Leading NoSQL Database
             Google Search                                             LinkedIn Job Skills

                                                                                              MongoDB

                                                                                              Competitor 1

                                                                                              Competitor 2

                                                                                              Competitor 3

                                                                                              Competitor 4

                                                                                              Competitor 5
           MongoDB            Competitor 2         Competitor 4
                                                                                              All Others
           Competitor 1       Competitor 3


    Jaspersoft Big Data Index                                          Indeed.com Trends

 Direct Real-Time Downloads                                                             Top Job Trends
                                                                                        1.  HTML 5
                                             MongoDB                                    2.  MongoDB
                                             Competitor 1                               3.  iOS
                                                                                        4.  Android
                                             Competitor 2                               5.  Mobile Apps
                                                                                        6.  Puppet
                                             Competitor 3
                                                                                        7.  Hadoop
                                                                                        8.  jQuery
                                                                                        9.  PaaS
                                                                  51                    10.  Social Media

Weitere ähnliche Inhalte

Mehr von C4Media

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoC4Media
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileC4Media
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020C4Media
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsC4Media
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No KeeperC4Media
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like OwnersC4Media
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaC4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideC4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 

Mehr von C4Media (20)

Streaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live VideoStreaming a Million Likes/Second: Real-Time Interactions on Live Video
Streaming a Million Likes/Second: Real-Time Interactions on Live Video
 
Next Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy MobileNext Generation Client APIs in Envoy Mobile
Next Generation Client APIs in Envoy Mobile
 
Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020Software Teams and Teamwork Trends Report Q1 2020
Software Teams and Teamwork Trends Report Q1 2020
 
Understand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java ApplicationsUnderstand the Trade-offs Using Compilers for Java Applications
Understand the Trade-offs Using Compilers for Java Applications
 
Kafka Needs No Keeper
Kafka Needs No KeeperKafka Needs No Keeper
Kafka Needs No Keeper
 
High Performing Teams Act Like Owners
High Performing Teams Act Like OwnersHigh Performing Teams Act Like Owners
High Performing Teams Act Like Owners
 
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 

Kürzlich hochgeladen

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 

Kürzlich hochgeladen (20)

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 

The Past, Present, and Future of NoSQL

  • 1. NoSQL: The New Normal Matt Asay (@mjasay) VP, Corporate Strategy, 10gen
  • 2. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations /nosql-history-future InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month
  • 3. Presented at QCon London www.qconlondon.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  • 4. 10gen Overview 180+ employees 500+ customers Offices in New York, Palo Alto, Washington Over $81 million in funding DC, London, Dublin, Barcelona and Sydney 2
  • 5. Global Community 3,800,000+ MongoDB Downloads 47,000+ Online Education Registrants 15,000+ MongoDB User Group Members 14,000+ MongoDB Monitoring Service Users (MMS) 10,000+ Annual MongoDB Days Attendees 3
  • 7. Back to the Future? What has been will be again, what has been done will be done again; there is nothing new under the sun. (Ecclesiastes 1:9, NIV) 5
  • 8. What Do You Remember about 1969? 6
  • 9. nothing? hold that thought 7
  • 10. Back to the Future: PreSQL •  IBM’s IMS (1969) – Developed as part of the Apollo Project •  IDS (Integrated Data Store), navigational database, 1973 •  High performance but: –  Forced developers to worry about both query design and schema design upfront –  Made it hard to change anything mid-stream 8
  • 11. Enter SQL •  Designed to overcome “PreSQL” deficiencies –  Decoupled query design from schema design –  Allowed developers to focus on schema design –  Could be confident that you could query the data as you wanted later •  30 years of dominance later… 9
  • 13. RDBMS Is Like a Spreadsheet 11
  • 15. Lots of relations. Lots of rows. 13
  • 16. It Hides What You’re Really Doing 14
  • 17. It Makes Development Hard Code XML Config DB Schema Object Relational Relational Application Mapping Database 15
  • 18. And Makes Things Hard to Change New Table New Column New Table Name Pet Phone Email New Column 3 months later… 16
  • 19. RDBMS Scale = Bigger Computers “Clients can also opt to run zEC12 without a raised datacenter floor -- a first for high-end IBM mainframes.” IBM Press Release 28 Aug, 2012 17
  • 20. This Was a Problem for Google 2010 Search Index Size: 250,000+ MBP’s == 4.1 miles 100,000,000 GB New data added per day 100,000+ GB Databases they could use 0 18 Source: http://googleblog.blogspot.com/2010/06/our-new-search-index-caffeine.html
  • 21. And for Facebook 2010: 13,000,000 queries per second 19
  • 22. And for Facebook 2010: 13,000,000 queries per second TPC #1 DB: 504161 tps TPC Top Results 20
  • 23. And for Facebook 2010: 13,000,000 queries per second TPC #1 DB: 504161 tps TPC Top Results Top 10 combined: 1,370,368 tps 21
  • 24. Big Data Changes Everything… for Everyone Variety of Data Agile Development •  Unstructured data •  Iterative •  Semi-structured •  Short development data cycles •  Polymorphic data •  New workloads Volume/Velocity of Data New Architectures •  Petabytes of data •  Horizontal scaling •  Trillions of records •  Commodity servers •  Millions of queries per •  Cloud computing second 22
  • 25. Shift in What We’re Computing 23
  • 26. Living in the Post-transactional Future Order-processing systems largely “done” (RDBMS); primary focus on better search and recommendations or adapting prices on the fly (NoSQL) Vast majority of its engineering is focused on recommending better movies (NoSQL), not processing monthly bills (RDBMS) Easy part is processing the credit card (RDBMS). Hard part is making it location aware, so it knows where you are and what you’re buying (NoSQL) 24
  • 27. Shift in How We Develop Applications “Systems of Engagement are built by front-line developers using modern languages who are driven by time to market, the need for rapid deployment and iteration….They value solutions that make it easy for them to deploy their application code with as little friction as possible.” (Forrester 2013) 25
  • 28.
  • 29. Why NoSQL? Agile Scalable Lower TCO 27
  • 30. what does this mean?
  • 31. Developers Are More Productive Code XML Config DB Schema Object Relational Relational Application Mapping Database 29
  • 32. Developers Are More Productive Code XML Config DB Schema Object Relational Relational Application Mapping Database 30
  • 35. Agility RDBMS MongoDB { _id : ObjectId("4c4ba5e5e8aabf3"), employee_name: "Dunham, Justin", department : "Marketing", title : "Product Manager, Web", report_up: "Neray, Graham", pay_band: “C", benefits : [ { type : "Health", plan : "PPO Plus" }, { type : "Dental", plan : "Standard" } ] } 33
  • 36. Example Serves targeted content to users using MongoDB- powered identity system Problem Why MongoDB Results •  20M+ unique visitors •  Easy-to-manage •  Rapid rollout of new per month dynamic data model features enables limitless •  Rigid relational schema growth, interactive •  Customized, social unable to evolve with conversations content changing data types throughout site and new features •  Support for ad hoc •  Tracks user data to queries •  Slow development increase engagement, cycles •  Highly extensible revenue 34
  • 37. Scalability Auto-Sharding •  Increase capacity as you go •  Commodity and cloud architectures •  Improved operational simplicity and cost visibility 35
  • 38. Case Study Manages a wide range of content and services for its web properties using MongoDB Problem Why MongoDB Results •  Trouble dealing with a •  Move from 6 billion rows •  Significant performance huge variety of content in RDBMS to simplicity gains despite big •  MySQL unable to keep of 1 document increase in volume and up with performance variety of data •  Automated failover and and scalability requirements ability to add nodes •  Greater agility, faster without downtime development iteration •  Problems compounded by integrating •  “Blazingly fast” query •  Saved £2m in licenses information from T- performance: “blown and hardware Mobile joint venture away by [MongoDB’s] performance” 36
  • 39. Better Total Cost of Ownership (TCO) Developer/Ops Savings •  Ease of Use •  Agile development •  Less maintenance Hardware Savings •  Commodity servers •  Internal storage (no SAN) •  Scale out, not up Software/Support Savings DB Alternative •  No upfront license •  Cost visibility for usage growth 37
  • 40. Case Study Stores one of world’s largest record repositories and searchable catalogues in MongoDB Problem Why MongoDB Results •  One of world’s largest •  Fast, easy scalability •  Will scale to 100s of TB record repositories by 2013, PB by 2020 •  Full query language •  Move to SOA required •  Searchable catalogue new approach to data •  Complex metadata of varied data types store storage •  Decreased SW and •  RDBMS could not •  Delivers high scalability, support costs support centralized data fast performance, and mgt and federation of easy maintenance, while information services keeping support costs low 38
  • 41. Performance Better Data In-Memory In-Place Locality Caching Updates 39
  • 42. Case Study Uses MongoDB to safeguard over 6 billion images served to millions of customers Problem Why MongoDB Results •  6B images, 20TB of •  JSON-based data •  5x cost reduction data model •  9x performance •  Brittle code base on top •  Agile, high improvement of Oracle database – performance, scalable hard to scale, add •  Faster time-to-market features •  Alignment with •  Dev cycles in weeks vs. Shutterfly’s services- •  High SW and HW costs tens of months based architecture 40
  • 45. NoSQL Adoption NoSQL First Policy Multiple NoSQL NoSQL of Centre Projects Excellence Multiple NoSQL First Projects NoSQL Project 43
  • 46. NoSQL: The New Normal Document Store Meets Requirements RDBMSs Meet Requirements Key/Value or Column Stores Meet Requirements 44
  • 47. Is It a Polyglot Future? 45
  • 48. Remember the Long Tail? 46
  • 49. It Didn’t Work out Well 47
  • 50. General Purpose, High Performance Database Popularity Jobs, Searches, Mentions, Etc. Rank Database Database Type Score % Change 1 Oracle RDBMS 1567.93 + 8.6 Microsoft SQL 2 Server RDBMS 1310.61 - 0.73 3 MySQL RDBMS 1284.78 - 28.89 4 Microsoft Access RDBMS 186.05 + 15.12 5 PostgreSQL RDBMS 183.47 + 16 6 DB2 RDBMS 162.05 + 8.39 7 MongoDB Document store 116.14 + 20.01 8 Sybase RDBMS 84.52 + 1.07 9 SQLite RDBMS 81.01 + 0.65 10 Solr Search engine 47.35 Wide column 11 Cassandra store 36.16 + 1.24 12 Redis Key-value store 32.03 + 6.06 13 Informix RDBMS 25.43 + 1.52 14 Memcached Key-value store 23.28 - 1.83 Wide column 15 Hbase store 20.74 + 1.05 Source: DBEngines, Feb 2013 48
  • 51. NoSQL Benefits Increased Developer Productivity Better User Experience Faster Time to Market Lower TCO 49
  • 52.
  • 53. Leading NoSQL Database Google Search LinkedIn Job Skills MongoDB Competitor 1 Competitor 2 Competitor 3 Competitor 4 Competitor 5 MongoDB Competitor 2 Competitor 4 All Others Competitor 1 Competitor 3 Jaspersoft Big Data Index Indeed.com Trends Direct Real-Time Downloads Top Job Trends 1.  HTML 5 MongoDB 2.  MongoDB Competitor 1 3.  iOS 4.  Android Competitor 2 5.  Mobile Apps 6.  Puppet Competitor 3 7.  Hadoop 8.  jQuery 9.  PaaS 51 10.  Social Media