YesSQL
Evolution of databases – birth of NoSQL
  • 15 years ago: availability requirements were different from today (ATMs shut down at 2 AM, services had maintenance windows). Small amounts of data. Database loads were small.
  • Today: the internet has changed the game. 24x7 availability. Large data. Insane database loads.
  • Tomorrow: switching to hosted apps and thin clients. Even larger loads.
NoSQL – mosaic of options
  • Key-value caches: memcached, repcached, coherence, infinispan, eXtreme scale, jboss cache, velocity, terracotta
  • Key-value stores: keyspace, flare, schema-free, RAMCloud
  • Eventually consistent key-value stores: dynamo, voldemort, Dynomite, SubRecord, MotionDb, Dovetaildb
  • Ordered key-value stores: tokyo tyrant, lightcloud, NMDB, luxio, memcachedb, actord
  • Data-structure servers: redis
  • Tuple stores: gigaspaces, coord, apache river
  • Object databases: ZopeDB, db4o, Shoal
  • Document stores: CouchDB, Mongo, Jackrabbit, XML Databases, ThruDB, CloudKit, Persevere, Riak Basho, Scalaris
  • Wide columnar stores: BigTable, HBase, Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI
Best of two worlds
  • SQL: transactions, consistency, ad-hoc query language, common language
  • NoSQL: scales horizontally, super fast, always available, commodity hardware
History of ScimoreDB
Driven by demand:
  • 1999: Jubii, memory based / COM interface
  • 2003: transaction/disk enabled
  • 2004: distributed and DQL
  • 2005: Scimore founded
  • 2007: SQL
  • 2009: embedded
  • 2010: replication, merge/bi-directional
  • 2011: new distributed version for massive scale, fault tolerant
ScimoreDB v.4
  • Native SQL database for Windows
  • Distributed
  • Elastic
  • Fault tolerant
  • Transactional / consistent
  • Scales on commodity hardware
Going distributed is easy
  • You used to select a primary key and indexes per table.
  • Now you additionally need to select a distribution per table.
  • All existing SQL queries continue to run.
  • There is no magic – it's just doing it the way you would program your own sharding and map-reduce layer!
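To make the comparison with a hand-written sharding and map-reduce layer concrete, here is a minimal Python sketch of what such a layer might look like. It is not ScimoreDB code: the class `ShardedTable`, its methods, and the modulo placement rule are all hypothetical and chosen only for illustration. The column names are borrowed from the demo schema later in the deck.

```python
from collections import defaultdict


class ShardedTable:
    """Toy in-memory table split across shards by one distribution column.

    Hypothetical illustration only; it does not mirror ScimoreDB internals.
    """

    def __init__(self, dist_column, num_shards=3):
        self.dist_column = dist_column
        self.shards = [[] for _ in range(num_shards)]

    def _shard_for(self, value):
        # Assumes integer keys, so placement is deterministic across runs.
        return value % len(self.shards)

    def insert(self, row):
        self.shards[self._shard_for(row[self.dist_column])].append(row)

    def select_eq(self, column, value):
        # A filter on the distribution column touches exactly one shard;
        # any other filter has to be sent to every shard.
        if column == self.dist_column:
            targets = [self.shards[self._shard_for(value)]]
        else:
            targets = self.shards
        return [r for shard in targets for r in shard if r[column] == value]

    def sum_by(self, group_col, value_col):
        # Map step: every shard computes its own partial sums.
        partials = []
        for shard in self.shards:
            part = defaultdict(int)
            for r in shard:
                part[r[group_col]] += r[value_col]
            partials.append(part)
        # Reduce step: merge the partial sums into one result.
        total = defaultdict(int)
        for part in partials:
            for key, subtotal in part.items():
                total[key] += subtotal
        return dict(total)
```

In this sketch the application has to know which column the table is distributed by. The slide's point is that a distributed SQL engine makes the same routing and merging decisions behind an unchanged query.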
Partition groups (shards)
[Diagram: six nodes (Node #1 to Node #6) spread across three partition groups (Group1, Group2, Group3).]
Data distributed over a large number of partition groups
  • Scales for large data sets
[Diagram: many partition groups, each holding a few nodes.]
Many nodes in each group – higher replication
  • Slower on insert/update: more machines need to be updated
  • Fast reads: more machines hold the same data
[Diagram: few partition groups, each holding many nodes.]
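The trade-off between the two layouts can be illustrated with a small Python sketch. It assumes that every node in a group holds a full copy of the group's data and that a write is applied to all of them, which the slides imply but do not spell out. The class `PartitionGroup` and its counters are hypothetical.

```python
import itertools


class PartitionGroup:
    """Toy partition group in which every node stores the same data."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]
        self.writes_performed = 0
        # Reads are spread over the nodes in turn (an assumed policy).
        self._next_reader = itertools.cycle(range(node_count))

    def write(self, key, value):
        # One logical write costs one physical write per node in the group.
        for node in self.nodes:
            node[key] = value
            self.writes_performed += 1

    def read(self, key):
        # Any single node can answer, so read capacity grows with node count.
        node_index = next(self._next_reader)
        return node_index, self.nodes[node_index].get(key)
```

With this model a group of six nodes performs six physical writes per insert, but can serve six reads in parallel from different machines. A group of two nodes does a third of the write work and offers a third of the read capacity.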
Partitioning
  • Distribute to all: replicated on all groups
  • Partition by column hash value
  • Round-robin
  • Relation
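Two of these options are simple enough to sketch directly. The Python functions below are hypothetical helpers, not ScimoreDB APIs. They show how "distribute to all" and round-robin placement might assign rows to groups. Hash partitioning is covered on the next slide, and "relation" is left out because the slides do not define it.

```python
import itertools


def place_replicated(rows, groups):
    """Distribute to all: every group receives a full copy of the rows."""
    return {group: list(rows) for group in groups}


def place_round_robin(rows, groups):
    """Round-robin: rows are dealt out to the groups in turn."""
    placement = {group: [] for group in groups}
    for row, group in zip(rows, itertools.cycle(groups)):
        placement[group].append(row)
    return placement
```

Replicating to all groups suits small lookup tables that are joined everywhere. Round-robin spreads rows evenly, but a lookup by key then has to ask every group, because placement does not depend on the row's content.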
Partitioning by column hash value
  • Column [col1] -> hash(100) MOD 1024
  • Select * from table where col1 = 100
[Diagram: the hash range 0 to 1024 is split at 512 between Group1 and Group2, which hold Node #1 to Node #4.]
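The routing step on this slide can be sketched in Python as follows. Several details are assumptions rather than facts from the deck: `zlib.crc32` stands in for whatever hash function ScimoreDB actually uses, and the assignment of slots 0 to 511 to Group1 and 512 to 1023 to Group2 is a guess based on the diagram.

```python
import zlib
from bisect import bisect_right

SLOT_COUNT = 1024  # from the slide: hash(value) MOD 1024

# Assumed layout: (first slot owned, group name), sorted by first slot.
SLOT_RANGES = [(0, "Group1"), (512, "Group2")]


def slot_for(value):
    """Map a column value to a slot in [0, SLOT_COUNT)."""
    # crc32 is a stand-in hash, picked because it is stable across runs.
    data = str(value).encode("utf-8")
    return zlib.crc32(data) % SLOT_COUNT


def group_for_slot(slot):
    """Find the group whose slot range contains the given slot."""
    starts = [start for start, _ in SLOT_RANGES]
    return SLOT_RANGES[bisect_right(starts, slot) - 1][1]


def route(value):
    """Return the group that should serve `WHERE col1 = value`."""
    return group_for_slot(slot_for(value))
```

Calling `route(100)` returns the one group that has to answer the query above, so the other group is never contacted. Keeping a fixed slot count and moving slot ranges between groups is a common way to rebalance without rehashing every row.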
Demo on Amazon EC2

Customer
  c_id      bigint
  c_name    varchar
  c_zip     varchar

Products
  p_id      bigint
  p_Name    varchar
  p_price   money

Orders
  o_id      autobigint
  o_c_id    bigint
  o_p_id    bigint
  o_amount  int
  o_date    datetime
  o_price   money

ScimoreDB @ CommunityDays 2011

Performance
  • Single machine, 8 cores: simple select, 75,000 queries/s (10 client threads)
  • Vodafone cluster of 6 machines: 21,000 transactions, each inserting 1 row
  • DTU cluster of 31 small machines: TPC-C, 140,000 transactions/s (35% insert, 35% update, 30% select)
  • ACID transactions
  • Crash-safe recovery
  • Row and table level locking
  • Dynamic two-phase commit (D2PC)
  • Dynamic group commit
  • Transaction isolation levels (read committed, repeatable read)
  • In-doubt transaction state
  • Multiversion concurrency control (MVCC)
  • Local and distributed deadlock detection
  • Write-ahead logging
  • Fuzzy checkpoints (non-blocking checkpoints)
  • B+-tree
  • Page compression
  • TEXT/NTEXT field compression
  • System tables: performance, monitoring, schema
  • T-SQL
  • Recursive queries/CTE
  • Security: users and roles
  • Free text (Lucene)
  • ScimoreOS, fiber-based task scheduling
  • 100% asynchronous, I/O-completion based
  • NUMA aware
  • Distributed query optimizer
  • Distributed tree execution
  • Query prioritization and throttling