What ya gonna do?
  without the help of Moore’s Law?
Scope

• Internet effect on corporate data centre
• End of Moore’s law
• Scaling on and off CPU
Internet emerges
• 1980s - Connections
  •   Broadband connectivity at work, modem @ home
  •   Beginnings of e-Commerce (Amazon’s reader recommendations
      show the way)

• 1990s - Few Publishers
  •   Internet bubble
  •   Rise of Search (Google shows the way)
  •   Start of consumer publications (blogs / wikis)
Read-write Internet

• Good connectivity / reach
• Social networking = publication explosion
• Smartphones with Wi-Fi / 3G
Outputs

• More, much more data
• Content is rich (read BIG!!)
 • audio, video, photo
• Data is unstructured or semi-structured
 • users don’t do DBs
We ain’t Twitter
•   OK, but wouldn’t you like to mine all of that
    public information?

    •   See what they are saying about your
        products / competitors / their requirements?

•   Is there any possibility of turning on an internal
    fire hose?

    •   How many fine-grained business events
        happen in your company that you would like
        to track / analyse? Someone will....
Fire Hydrants
• They’re coming - more data, from more people
  and more devices

• Use data to improve decisions
• Gain insight into the organisation
• Jump ahead of the competition or at least maintain pace
Numbers
• Facebook serves 250k unique pages per
    second (June 2010)

• Twitter has seen a rise from 10m to 50m
    tweets per day in the last year (July 2010)

• 1 GB of disk: $700k (1980) → 10c (2010)
•   “Between the birth of the world and 2003, there
    were 5 exabytes of information created. We [now]
    create 5 exabytes every 2 days.” Eric Schmidt,
    CEO, Google
So what?
•   As people share more, they will change the way
    they form their opinions

•   Existing media channels are struggling to adapt
    their business models

•   Traditional market research, product marketing
    and after-sales channels become less relevant to
    these consumers

•   Being out of the loop is bad for business
How bad for business?
• Now: data is a key asset of business
• Future: business data is not only private
 • public content is integrated into analysis
• Maintaining secrecy will rise in cost
 • internal systems management
 • governance as you join the conversation
Effects

• Conventional platforms cannot
   • store so much data cost effectively
   • process the data cost effectively
   • derive meaning from unstructured sources
Hardware Now

• SMP x86-64 & bit players
• Large local RAM (<=2TB)
• NAS for high capacity storage (<=14PB)
• On-premise
Today’s Big Boxes

• Indicate trends and influences
• Use 50k-250k CPU cores
• All Top 10 supercomputers run Linux
• Algorithms must be fault tolerant
Moore’s is Less

•   Moore’s law was the software developer’s friend
    •   30 years of good times, speed ups “for free”
•   Outward effect of Moore’s law only
    maintained if exploiting multiple cores
•   Standard programming models need to adapt
    to use multiple cores
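
A minimal sketch of what "adapting to multiple cores" can look like in practice: the work is split into independent chunks, each chunk is processed by a worker, and the partial results are combined. The function names (`split`, `sum_of_squares`, `parallel_sum_of_squares`) are illustrative assumptions, not something from the slides.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk):
    """Work done independently on one chunk; shares no state with other chunks."""
    return sum(x * x for x in chunk)


def split(data, parts):
    """Cut data into at most `parts` contiguous chunks of near-equal size."""
    size = max(1, -(-len(data) // parts))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def parallel_sum_of_squares(data, executor, parts=4):
    """Map the chunks onto the executor's workers, then combine the partial sums."""
    return sum(executor.map(sum_of_squares, split(data, parts)))


def demo():
    # Hypothetical usage with a small thread pool.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return parallel_sum_of_squares(list(range(10)), pool)
```

In CPython, threads do not run CPU-bound code on several cores at once, so the thread pool here only shows the shape of the program. Passing a `concurrent.futures.ProcessPoolExecutor` instead should spread the chunks over real cores, provided the worker function is defined at module level.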
Hardware Horizon
• Fast inter-core buses and networks
 •   InfiniBand: 10Gb/s - 120Gb/s

• Networked memory
 •   NUMA - not homogeneous

• Exabyte disk clusters
• Elastic scaling
• On and off-premise integrated
Distributed Disruption

• Existing clustering options do not work
• Existing software models do not work
• Existing data models do not work
Old Skool
• Traditional clustering enables all machines
  in a cluster to behave as if they are one in
  space and time
• Not physically possible to cluster online
  access to all data globally with today’s
  hardware and networks (ask Google)
  •   Not news: traditional corporations do not
      have real-time, coherent global BI databases
I’m gonna pop a CAP in your head
• Repeat: clustering does not scale
• So, you can have 2 out of 3 of:
 • Consistency
 • Availability
 • Partition Tolerance
AC / DC

• One needs Partition tolerance to scale, so
  you can only have:
  • Availability OR
  • Consistency
• All attempts to scale out conventional
  databases and application servers prove the
  theorem (who still believes in sharding?)
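
The trade-off above can be sketched with a toy two-replica store. This is a simulation under loud assumptions, not a real datastore: the class name `TwoNodeStore`, the `partitioned` flag and the shared logical `clock` are all invented for illustration (a real cluster has no shared clock). In "CP" mode the store refuses writes during a partition; in "AP" mode it accepts them and lets the replicas diverge until the partition heals.

```python
class TwoNodeStore:
    """Toy two-replica store that is either consistent (CP) or available (AP)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]   # each maps key -> (version, value)
        self.partitioned = False   # True while the replicas cannot talk
        self.clock = 0             # simulation-only logical clock (assumption)

    def write(self, replica, key, value):
        if self.partitioned and self.mode == "CP":
            # Consistency first: give up availability while the peer is unreachable.
            raise RuntimeError("write rejected: peer unreachable")
        self.clock += 1
        # Availability first: during a partition only the local replica is updated.
        targets = [replica] if self.partitioned else [0, 1]
        for i in targets:
            self.replicas[i][key] = (self.clock, value)

    def read(self, replica, key):
        entry = self.replicas[replica].get(key)
        return None if entry is None else entry[1]

    def heal(self):
        """End the partition and reconcile with a naive last-writer-wins rule."""
        self.partitioned = False
        for key in set(self.replicas[0]) | set(self.replicas[1]):
            candidates = [r[key] for r in self.replicas if key in r]
            newest = max(candidates, key=lambda entry: entry[0])
            for r in self.replicas:
                r[key] = newest
```

Last-writer-wins is only one possible reconciliation rule and it silently drops the losing write, which is part of the cost of choosing availability.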
Availability

• Enables high service levels so the site stays
  alive
• Lose global consistency for periods
  (seconds or less)
Consistency

• Focus of RDBMS today
• High cost only appropriate for high value
• Remains the default for non-scaling cases
Eventual Consistency
• A datastore guarantees to eventually
  provide updates to all cluster members
• Some desirable properties
 • Read your own writes
   •   Limited form of cursor stability

 • Monotonic read consistency
   •   Only see updates in the order they happened
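
One way to picture the two properties above is a client session that remembers the newest version it has written or read and refuses to accept anything older. The sketch below is an assumption-laden toy: a single value, a single writer, replica 0 acting as the write target, and the names `Replica` and `Session` are hypothetical.

```python
class Replica:
    """Holds one versioned value; updates may arrive late."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        # Ignore anything that is not newer than what this replica already holds.
        if version > self.version:
            self.version, self.value = version, value


class Session:
    """Client session giving read-your-own-writes and monotonic reads."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.min_version = 0  # newest version this session has written or read

    def write(self, value):
        primary = self.replicas[0]  # assumption: all writes go to replica 0
        version = primary.version + 1
        primary.apply(version, value)
        self.min_version = max(self.min_version, version)
        return version

    def read(self, preferred=0):
        # Try the preferred replica first, then fall back to the others.
        order = [preferred] + [i for i in range(len(self.replicas)) if i != preferred]
        for i in order:
            replica = self.replicas[i]
            if replica.version >= self.min_version:
                self.min_version = replica.version
                return replica.value
        raise RuntimeError("no replica is fresh enough for this session")
```

A second session that has not written anything carries no such floor, so it may still read an older value from a lagging replica. That is the "eventual" part.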
Sclerotic Software
• Early (mostly static) binding of everything
  to everything else
• Point to point traffic routing
• Application to server
• Single thread model of control
• Program language to runtime
• Object models to SQL
Shapeability

• Dynamic data routing
• Runtime, in-place upgrades
• Languages that support parallel functions
• Multiple evolving and coexisting schemas
• Zero impedance mismatching
Dynamic Data Routing
• Cannot rely on per input solutions
• Data transfer protocols should have
  minimal impact on programming models
 • Law of leaky abstractions
• Bus required to allow evolution and to add
  intelligence to routes
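
As a rough illustration of a bus with routes that can evolve at runtime, here is a minimal in-process sketch. The `Bus` class and its method names are hypothetical, and a real message bus would add persistence, ordering and failure handling that are left out here.

```python
class Bus:
    """In-process message bus whose routes can be added or removed while running."""

    def __init__(self):
        self._routes = []  # list of (name, predicate, handler)

    def add_route(self, name, predicate, handler):
        """Register a route; the predicate decides which messages it receives."""
        self._routes.append((name, predicate, handler))

    def remove_route(self, name):
        self._routes = [route for route in self._routes if route[0] != name]

    def publish(self, message):
        """Deliver the message to every matching route and report which ones fired."""
        delivered = []
        for name, predicate, handler in list(self._routes):
            if predicate(message):
                handler(message)
                delivered.append(name)
        return delivered


def demo():
    bus = Bus()
    orders, audit = [], []
    bus.add_route("orders", lambda m: m.get("type") == "order", orders.append)
    first = bus.publish({"type": "order", "id": 1})
    # A new consumer is attached later without touching the producer.
    bus.add_route("audit", lambda m: True, audit.append)
    second = bus.publish({"type": "order", "id": 2})
    return first, second, len(orders), len(audit)
```

The producer only ever calls `publish`, so adding the audit route changes nothing on the sending side, which is the point of routing through a bus rather than point to point.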
Upgrades
• Software must be upgradeable in parts
• Software must stay up while upgrade is
  ongoing
• Modular, transitive upgrades (Maven, OSGi)
• Hypervisor VM mobility (vMotion,
  Teleportation)
SCAlable LAnguages
• Java 7 comes with more concurrency
  support (fork/join ... due mid 2011)
• Functional languages have support for high
  concurrency
  • JVM Languages: Scala, Clojure
  • .NET: F#
  • Others: Erlang, Haskell, OCaml
Schema Shmeema

• Easiest schema evolution is with no schema
  (NoSQL data stores)
• Where a schema is needed, data can travel with
  its schema (Avro, Riak, CouchDB)
• Data can be shared via REST, JMS or trickle
  to RDBMS
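
To make "data can travel with its schema" concrete, here is a small sketch using a JSON envelope. It is not Avro's actual wire format; the envelope layout and the `pack` / `unpack` helpers are assumptions made for illustration. The reader names the fields it wants and a default for each, so an older writer and a newer reader can coexist.

```python
import json


def pack(schema, records):
    """Ship the writer's schema in the same payload as the records."""
    return json.dumps({"schema": schema, "records": records})


def unpack(payload, reader_fields):
    """Read records using the embedded writer schema.

    reader_fields maps each field the reader wants to a default value,
    used when the writer's schema does not define that field.
    """
    envelope = json.loads(payload)
    writer_fields = {field["name"] for field in envelope["schema"]["fields"]}
    rows = []
    for record in envelope["records"]:
        row = {}
        for name, default in reader_fields.items():
            known = name in writer_fields and name in record
            row[name] = record[name] if known else default
        rows.append(row)
    return rows


def demo():
    schema_v1 = {"name": "user", "fields": [{"name": "name"}, {"name": "email"}]}
    payload = pack(schema_v1, [{"name": "Ada", "email": "ada@example.org"}])
    # A newer reader ignores "email" and expects "country", which v1 lacks.
    return unpack(payload, {"name": "", "country": "unknown"})
```

Fields the reader does not ask for are simply dropped, and fields the writer never knew about fall back to the reader's default, so neither side has to upgrade in lockstep.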
Objections to models

• Remove the RM from ORM
• Externalize schemas; don’t internalize them
• Prefer simple persistence options
 • key/value, graph or document-oriented
What scales

• HTTP - it’s stateless . . . but:
 • Caching layers need to be added
 • Protocol can go faster (Google et al
    proposing updates for 1.2)
• Er, that’s it from the current stack
App server scale FAIL

• Threads too coarse grained and expensive
• Need Actor model to be reliable and scale
  out to exploit the hardware
• CAP based design patterns over data
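
A minimal sketch of the actor idea, assuming Python's `asyncio` as the runtime: each actor owns its state and a mailbox, and handles one message at a time, so no locks are needed. The `CounterActor` class and its message names are hypothetical. Real actor runtimes such as Erlang's add supervision and distribution that this toy omits.

```python
import asyncio


class CounterActor:
    """Actor with private state, reachable only through its mailbox."""

    def __init__(self):
        self.mailbox = asyncio.Queue()
        self.count = 0  # touched only inside run()

    async def run(self):
        # Messages are handled strictly one at a time, in arrival order.
        while True:
            message, reply = await self.mailbox.get()
            if message == "stop":
                break
            if message == "increment":
                self.count += 1
            elif message == "get":
                reply.set_result(self.count)

    async def send(self, message):
        """Fire-and-forget message."""
        await self.mailbox.put((message, None))

    async def ask(self, message):
        """Send a message and wait for the actor's reply."""
        reply = asyncio.get_running_loop().create_future()
        await self.mailbox.put((message, reply))
        return await reply


async def demo(n_senders=100):
    actor = CounterActor()
    runner = asyncio.create_task(actor.run())
    # Many concurrent senders, yet the count stays correct without locks.
    await asyncio.gather(*(actor.send("increment") for _ in range(n_senders)))
    total = await actor.ask("get")
    await actor.send("stop")
    await runner
    return total
```

Here the actors are lightweight tasks multiplexed on one event loop rather than one operating-system thread each, which is the contrast with coarse-grained threads that the slide draws.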
Software scaling limits
• Amdahl’s law still applies:
 • Speed-up is capped by the part of the work
    that must run serially
 • Worse if the serial part blocks other tasks,
    which it often does
• Software needed to support cloud design
  and testing
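
Amdahl's law, mentioned above, can be put into numbers. In its standard form, if a fraction p of the work can be parallelised over n workers, the speed-up is 1 / ((1 - p) + p / n). The function names below are illustrative only.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speed-up predicted by Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_limit(parallel_fraction):
    """Upper bound as the worker count grows without limit: 1 / (1 - p)."""
    serial_fraction = 1.0 - parallel_fraction
    return float("inf") if serial_fraction == 0 else 1.0 / serial_fraction
```

With 95% of the work parallelisable, 8 workers give a speed-up of about 5.9, and no number of workers can push it past 20. The remaining 5% of serial work sets the ceiling.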
RDBMS scale FAIL
•   Index updates do not scale linearly with data

•   Normalize to reduce data volumes but then joins
    become too expensive

•   Transactions are costly and often not needed
    (especially for READ)

•   Hard to manage xx,000 MySQL instances (ask
    Yahoo! and Facebook)

•   License fees scale with load ($1m+ / month for
    Facebook just to serve photos)
NoSQL

• Different flavours (with examples)
 • column oriented (Hadoop, HBase)
 • document store (Couch, Mongo)
 • key value store (Riak, Redis)
 • eventually consistent (Dynamo, Voldemort)
 • graph database (Neo4j, InfiniteGraph)
NoSQL gains
• Scale
• Performance
• Reliability and uptime
• Simpler application persistence API
• Some SQL syntax for aggregate operations
• Zero backup, if using HA file system
NoSQL loses
• SQL - especially joins
• Schema
• Transactions
• Consistency (for some coarse-grained
  aspects at least)
• Query tools are immature / low-level
NoSQL


• For the diplomats: No(t Only) SQL
• SQL will live on in many applications and
  Use Cases

What ya gonna do?

  • 1. What ya gonna do? without the help of Moore’s Law?
  • 2. Scope • Internet effect on corporate data centre • End of Moore’s law • Scaling on and off CPU
  • 3. Internet emerges • 1980s - Connections • Broadband connectivity at work, modem @ home • Beginnings of e-Commerce (Amazon’s readers recommendations shows the way) • 1990s - Few Publishers • Internet bubble • Rise of Search (Google shows the way) • Start of consumer publications (Blogs / WIKIs)
  • 4. Read-write Internet • Good connectivity / reach • Social networking = publication explosion • Smart phones WIFI / 3G
  • 5. Outputs • More, much more data • Content is rich (read BIG!!) • audio, video, photo • Data is unstructured or semi-structured • users don’t do DBs
  • 6. We ain’t Twitter • OK, but wouldn’t you like to mine all of that public information? • See what they are saying about your products / competitors / their requirements? • Is there any possibility of turning on an internal fire hose? • How many fine-grained business events happen in your company that you would like to track / analyse? Someone will....
  • 7. Fire Hydrants • They’re coming - more data, from more people and more devices • Use data to improve decisions • Gain insight to the organisation • Jump competition or at least maintain pace
  • 8. Numbers • Facebook serves 250k unique pages per second (June 2010) • Twitter has seen a rise from 10m to 50m tweets per day in the last year (July 2010) • 1 GB of disk: $700k (1980) vs 10c (2010) • “Between the birth of the world and 2003, there were 5 exabytes of information created. We [now] create 5 exabytes every 2 days.” Eric Schmidt, CEO, Google
  • 9. So what? • As people share more, they will change the way they form their opinions • Existing media channels are struggling to adapt their business models • Traditional market research, product marketing and after-sales channels become less relevant to these consumers • Being out of the loop is bad for business
  • 10. How bad for business? • Now: data is a key asset of business • Future: business data is not only private • as public content integrated into analysis • Maintaining secrecy will rise in cost • internal systems management • governance as you join the conversation
  • 11. Effects • Conventional platforms cannot • store so much data cost effectively • process the data cost effectively • derive meaning from unstructured sources
  • 12. Hardware Now • SMP x86-64 & bit players • Large local RAM (<=2TB) • NAS for high capacity storage (<=14PB) • On-premise
  • 13. Today’s Big Boxes • Indicate trends and influences • Use 50k-250k CPU cores • All Top 10 supercomputers run Linux • Algorithms must be fault tolerant
  • 14. Moore’s is Less • Moore’s law was the software developer’s friend • 30 years of good times, speed-ups “for free” • Outward effect of Moore’s law only maintained if exploiting multiple cores • Standard programming models need to adapt to use multiple cores
  • 15. Hardware Horizon • Fast inter-core buses and networks • InfiniBand: 10Gb/s - 120Gb/s • Networked memory • NUMA - not homogeneous • Exabyte disk clusters • Elastic scaling • On and off-premise integrated
  • 16. Distributed Disruption • Existing clustering options do not work • Existing software models do not work • Existing data models do not work
  • 17. Old Skool • Traditional clustering enables all machines in a cluster to behave as if they are one in space and time • Not physically possible to cluster online access to all data globally with today’s hardware and networks (ask Google) • Not news: traditional corporations do not have real-time, coherent global BI databases
  • 18. I’m gonna pop a CAP in your head • Repeat: clustering does not scale • So, you can have 2 from 3 of: • Consistency • Availability • Partition Tolerance
  • 19. AC / DC • One needs Partition tolerance to scale, so you can only have: • Availability OR • Consistency • All attempts to scale out conventional databases and application servers prove the theorem (who still believes in sharding?)
  • 20. Availability • Enables high service levels so the site stays alive • Lose global consistency for periods (seconds or less)
  • 21. Consistency • Focus of RDBMS today • High cost only appropriate for high value • Remains the default for non-scaling cases
  • 22. Eventual Consistency • A datastore guarantees to eventually provide updates to all cluster members • Some desirable properties • Read your own writes • Limited form of cursor stability • Monotonic read consistency • Only see updates in the order they happened
  • 23. Sclerotic Software • Early (mostly static) binding of everything to everything else • Point to point traffic routing • Application to server • Single thread model of control • Program language to runtime • Object models to SQL
  • 24. Shapeability • Dynamic data routing • Runtime, in-place upgrades • Languages that support parallel functions • Multiple evolving and coexisting schemas • Zero impedance mismatching
  • 25. Dynamic Data Routing • Cannot rely on per input solutions • Data transfer protocols should have minimal impact on programming models • Law of leaky abstractions • Bus required to allow evolution and to add intelligence to routes
  • 26. Upgrades • Software must be upgradeable in parts • Software must stay up while upgrade is ongoing • Modular, transitive upgrades (Maven, OSGi) • Hypervisor VM mobility (vMotion, Teleportation)
  • 27. SCAlable LAnguages • Java 7 comes with more concurrency support (fork/join ... due mid 2011) • Functional languages have support for high concurrency • JVM Languages: Scala, Clojure • .NET: F# • Others: Erlang, Haskell, OCaml
  • 28. Schema Shmeema • Easiest schema evolution is with no schema (NoSQL data stores) • Where schema needed, data can travel with its schema (Avro, Riak, CouchDB) • Data can be shared via REST, JMS or trickle to RDBMS
  • 29. Objections to models • Remove the RM from ORM • Externalize schemas; don’t internalize them • Prefer simple persistence options • key/value, graph or document-oriented
  • 30. What scales • HTTP - it’s stateless . . . but: • Caching layers need to be added • Protocol can go faster (Google et al proposing updates for 1.2) • Er, that’s it from the current stack
  • 31. App server scale FAIL • Threads too coarse grained and expensive • Need Actor model to be reliable and scale out to exploit the hardware • CAP based design patterns over data
  • 32. Software scaling limits • Amdahl’s law still applies: • Can only go as fast as the slowest serial (non-parallelizable) task allows • Worse if that task blocks others, which it often does • Software needed to support cloud design and testing
  • 33. RDBMS scale FAIL • Index updates do not scale linearly with data • Normalize to reduce data volumes but then joins become too expensive • Transactions are costly and often not needed (especially for READ) • Hard to manage xx,000 MySQL instances (ask Yahoo! and Facebook) • License fees scale with load ($1m+ / month for Facebook just to serve photos)
  • 34. NoSQL • Different flavours (with examples) • column oriented (Hadoop, HBase) • document store (Couch, Mongo) • key value store (Riak, Redis) • eventually consistent (Dynamo, Voldemort) • graph database (Neo4j, InfiniteGraph)
  • 35. NoSQL gains • Scale • Performance • Reliability and uptime • Simpler application persistence API • Some SQL syntax for aggregate operations • Zero backup, if using HA file system
  • 36. NoSQL loses • SQL - especially joins • Schema • Transactions • Consistency (for some coarse-grained aspects at least) • Query tools are immature / low-level
  • 37. NoSQL • For the diplomats: No(t Only) SQL • SQL will live on in many applications and Use Cases
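
Slide 32's point about Amdahl's law can be made concrete with a little arithmetic. The sketch below is illustrative only and not part of the original deck: the function names and the 95% parallel fraction are assumptions, and the core counts are borrowed from slide 13. It uses the standard form of the law, speedup = 1 / (s + p / n), where p is the parallel fraction, s = 1 - p is the serial fraction and n is the number of cores.

```python
import math


def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law for a given parallel fraction.

    parallel_fraction: share of the work that can run in parallel (0..1).
    workers: number of cores or machines applied to the parallel share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def speedup_ceiling(parallel_fraction: float) -> float:
    """Upper bound on speedup as the number of workers grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    return math.inf if serial_fraction == 0.0 else 1.0 / serial_fraction


if __name__ == "__main__":
    # Hypothetical workload: 95% parallel, 5% serial (an assumed figure).
    for cores in (1, 8, 50_000, 250_000):
        print(cores, round(amdahl_speedup(0.95, cores), 2))
    print("ceiling", round(speedup_ceiling(0.95), 2))
```

Under these assumptions, going from 50k to 250k cores barely moves the result, because the 5% serial share caps the speedup at roughly 20x however much hardware is added.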

Editor's Notes

  13. #1 owned by USA, #2 owned by PRC
  18. CAP theorem proposed in July 2000 by Eric Brewer
  22. Robert Patrick
  27. Java: Fork/Join, parallel arrays, tail call recursion improvements, not closures
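
Slide 22 lists "read your own writes" and "monotonic read consistency" as desirable properties of an eventually consistent store. The following is a minimal sketch of how a client session might enforce both against replicas that lag. It is a toy model, not any real datastore's API: `Replica`, `Session` and `anti_entropy` are hypothetical names, and version numbers are assumed to come from a single writer, which sidesteps conflict resolution entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One cluster member holding versioned values; it may lag behind others."""
    data: dict = field(default_factory=dict)  # key -> (version, value)

    def apply(self, key, version, value):
        # Keep only the newest version seen for each key.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        # Version 0 with value None means "never written here".
        return self.data.get(key, (0, None))


class Session:
    """Client session that refuses reads older than what it has already seen."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.min_version = {}  # key -> lowest version this session accepts

    def write(self, key, value, replica_index=0):
        # Assumption: one writer per key, so the next version is local + 1.
        replica = self.replicas[replica_index]
        version = replica.read(key)[0] + 1
        replica.apply(key, version, value)
        self.min_version[key] = max(self.min_version.get(key, 0), version)
        return version

    def read(self, key):
        floor = self.min_version.get(key, 0)
        for replica in self.replicas:
            version, value = replica.read(key)
            if version >= floor:  # fresh enough for this session
                self.min_version[key] = version  # never go backwards later
                return value
        raise RuntimeError("no replica is fresh enough yet")


def anti_entropy(replicas):
    """Copy the newest version of every key to every replica (the 'eventually')."""
    for source in replicas:
        for key, (version, value) in list(source.data.items()):
            for target in replicas:
                target.apply(key, version, value)
```

A short usage example: the write lands on the second replica only, a raw read of the first replica is stale, but the session skips it and still sees its own write.

```python
replicas = [Replica(), Replica()]
session = Session(replicas)
session.write("cart", "book", replica_index=1)
stale = replicas[0].read("cart")      # (0, None): this replica has not caught up
fresh = session.read("cart")          # "book": session skips the stale replica
anti_entropy(replicas)                # afterwards both replicas hold version 1
```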