SlideShare ist ein Scribd-Unternehmen logo
1 von 57
When to NoSQL and When 
to Know SQL 
Simon Elliston Ball 
Head of Big Data 
@sireb 
#noSQLknowSQL 
http://nosqlknowsql.io
what is NoSQL? 
SQL 
Not only SQL 
Many many things 
NoSQL 
No, SQL
before SQL 
files 
multi-value 
ur… hash maps?
after SQL 
everything is relational 
ORMs fill in the other data structures 
scale up rules 
data first design
and now NoSQL 
datastores that suit applications 
polyglot persistence: the right tools 
scale out rules 
APIs not EDWs
why should you care? 
data growth 
rapid development 
fewer migration headaches… maybe 
machine learning 
social
big bucks. 
O’Reilly 2013 Data Science Salary Survey
So many NoSQLs…
document databases
document databases 
rapid development 
JSON docs 
complex, variable models 
known access pattern
document databases 
learn a very new query language 
denormalize 
document 
form 
joins? JUST DON’T 
http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
document vs SQL 
what can SQL do? 
query all the angles 
sure, you can use blobs… 
… but you can’t get into them
documents in SQL 
SQL xml fields 
mapping xquery paths is painful 
native JSON 
but still structured
query everything: search 
class of database database 
full-text indexing
you know google, right… 
range query 
span query 
keyword query
you know the score 
scores 
"query": { 
"function_score": { 
"query": { 
"match": { "title": "NoSQL"} 
}, 
"functions": [ 
"boost": 1, 
"gauss": { 
"timestamp": { 
"scale": "4w" 
} 
}, 
"script_score" : { 
"script" : "_score * doc['important_document'].value ? 2 : 1" 
} 
], 
"score_mode": "sum" 
} 
}
SQL knows the score too 
scores 
declare @origin float = 0; 
declare @delay_weeks float = 4; 
SELECT TOP 10 * FROM ( 
SELECT title, 
score * 
CASE 
WHEN p.important = 1 THEN 2.0 
WHEN p.important = 0 THEN 1.0 
END 
* exp(-power(timestamp-@origin,2)/(2*@delay*7*24*3600)) 
+ 1 
AS score 
FROM posts p 
WHERE title LIKE '%NoSQL%' 
) as found 
ORDER BY score
you know google, right… 
more like this: instant tf-idf 
{ 
"more_like_this" : { 
"fields" : ["name.first", "name.last"], 
"like_text" : "text like this one", 
"min_term_freq" : 1, 
"max_query_terms" : 12 
} 
}
Facets
Facets
Facets 
SQL: 
x lots 
SELECT a.name, count(p.id) FROM 
people p 
JOIN industry a on a.id = p.industry_id 
JOIN people_keywords pk on pk.person_id = p.id 
JOIN keywords k on k.id = pk.keyword_id 
WHERE CONTAINS(p.description, 'NoSQL') 
OR k.name = 'NoSQL' 
... 
GROUP BY a.name 
SELECT a.name, count(p.id) FROM 
people p 
JOIN area a on a.id = p.area_id 
JOIN people_keywords pk on pk.person_id = p.id 
JOIN keywords k on k.id = pk.keyword_id 
WHERE CONTAINS(p.description, 'NoSQL') 
OR k.name = 'NoSQL' 
... 
GROUP BY a.name
Facets 
Elastic search: 
{ 
"query": { 
“query_string": { 
“default_field”: “content”, 
“query”: “keywords” 
} 
}, 
"facets": { 
“myTerms": { 
"terms": { 
"field" : "lang", 
"all_terms" : true 
} 
} 
} 
}
logs 
untyped free-text documents 
timestamped 
semi-structured 
discovery 
aggregation and statistics
key: value 
close to your programming model 
distributed map | list | set 
keys can be objects
SQL extensions 
hash types 
hstore
SQL and polymorphism 
inheritance 
ORMs hide the horror
turning round the rows 
columnar databases 
physical layout matters
turning round the rows 
key value type 
1 A Home 
2 B Work 
3 C Work 
4 D Work 
Row storage 
00001 1 A Home 00002 2 B Work 00003 3 C Work … 
Column storage 
A B C D Home Work Work Work …
teaching an old SQL new tricks 
MySQL InfoBright 
SQL Server Columnar Indexes 
CREATE NONCLUSTERED COLUMNSTORE INDEX idx_col 
ON Orders (OrderDate, DueDate, ShipDate) 
Great for your data warehouse, but no use for OLTP
column for hadoop and other animals 
ORC files 
Parquet 
http://parquet.io
column families 
wide column databases
millions of columns 
eventually consistent 
CQL 
http://cassandra.apache.org/ 
http://www.datastax.com/ 
set | list | map types
cell level security 
SQL: so many views, so much confusion 
accumulo https://accumulo.apache.org/
time 
retrieving time series and graphs 
window functions 
SELECT business_date, ticker, 
close, 
close / 
LAG(close,1) OVER (PARTITION BY ticker ORDER BY business_date ASC) 
- 1 AS ret 
FROM sp500
time 
Open TSDB 
Influx DB 
key:sub-key:metric 
key:* 
key:*:metric
Queues
queues in SQL 
CREATE procedure [dbo].[Dequeue] 
AS 
set nocount on 
declare @BatchSize int 
set @BatchSize = 10 
declare @Batch table (QueueID int, QueueDateTime datetime, Title nvarchar(255)) 
begin tran 
insert into @Batch 
select Top (@BatchSize) QueueID, QueueDateTime, Title from QueueMeta 
WITH (UPDLOCK, HOLDLOCK) 
where Status = 0 
order by QueueDateTime ASC 
declare @ItemsToUpdate int 
set @ItemsToUpdate = @@ROWCOUNT 
update QueueMeta 
SET Status = 1 
WHERE QueueID IN (select QueueID from @Batch) 
AND Status = 0 
if @@ROWCOUNT = @ItemsToUpdate 
begin 
commit tran 
select b.*, q.TextData from @Batch b 
inner join QueueData q on q.QueueID = b.QueueID 
print 'SUCCESS' 
end 
else 
begin 
rollback tran 
print 'FAILED' 
end
queues in SQL 
index fragmentation is a problem 
but built in logs of a sort
message queues 
specialised apis 
capabilities like fan-out 
routing 
acknowledgement
relationships count 
Graph databases
relationships count 
trees and hierarchies 
overloaded relationships 
fancy algorithms
hierarchies with SQL 
adjacency lists 
CONSTRAIN fk_parent_id_id 
FOREIGN KEY parent_id REFERENCES 
some_table.id 
materialised path 
nested sets (MPTT) 
path = 1.2.23.55.786.33425 
Node Left Right Depth 
A 1 22 1 
B 2 9 2 
C 10 21 2 
D 3 4 3 
E 5 6 3 
F 7 8 3 
G 11 18 3 
H 19 20 3 
I 12 13 4 
J 14 15 4 
K 16 17 4
Consolidation 
OrientDB 
ArangoDB 
http://www.orientechnologies.com/ 
SQLs with NoSQL 
Postgres 
MySQL 
https://www.arangodb.org/
Big Data
SQL on Hadoop
More than SQL 
Shark 
Drill 
Cascading 
Map Reduce
System issues, 
Speed issues, 
Soft issues
the ACID, BASE litmus 
Atomic 
Consistent 
Isolated 
Durable 
Basically Available 
Soft-state 
Eventually consistent 
what matters to you?
CAP it all 
Consistency 
Partition Availability
write fast, ask questions later 
SQL writes cost a lot 
mainly write workload: NoSQL 
low latency write workload: NoSQL
is it web scale? 
most NoSQL scales well 
but clusters still need management 
are you facebook? one machine is easier than n 
can ops handle it? app developers make bad admins
who is going to use it? 
analysts: they want SQL 
developers: they want applications 
data scientists: they want access
choose the 
right tool 
Photo: http://www.homespothq.com/
Thank you! 
Simon Elliston Ball 
simon@simonellistonball.com 
@sireb 
#noSQLknowSQL 
http://nosqlknowsql.io
Questions 
Simon Elliston Ball 
simon@simonellistonball.com 
@sireb 
http://nosqlknowsql.io

Weitere ähnliche Inhalte

Was ist angesagt?

UKOUG Tech14 - Using Database In-Memory Column Store with Complex Datatypes
UKOUG Tech14 - Using Database In-Memory Column Store with Complex DatatypesUKOUG Tech14 - Using Database In-Memory Column Store with Complex Datatypes
UKOUG Tech14 - Using Database In-Memory Column Store with Complex DatatypesMarco Gralike
 
NyaruDBにゃるものを使ってみた話 (+Realm比較)
NyaruDBにゃるものを使ってみた話 (+Realm比較)NyaruDBにゃるものを使ってみた話 (+Realm比較)
NyaruDBにゃるものを使ってみた話 (+Realm比較)Masaki Oshikawa
 
XML In The Real World - Use Cases For Oracle XMLDB
XML In The Real World - Use Cases For Oracle XMLDBXML In The Real World - Use Cases For Oracle XMLDB
XML In The Real World - Use Cases For Oracle XMLDBMarco Gralike
 
Scrap your query boilerplate with specql
Scrap your query boilerplate with specqlScrap your query boilerplate with specql
Scrap your query boilerplate with specqlTatu Tarvainen
 
Starting with JSON Path Expressions in Oracle 12.1.0.2
Starting with JSON Path Expressions in Oracle 12.1.0.2Starting with JSON Path Expressions in Oracle 12.1.0.2
Starting with JSON Path Expressions in Oracle 12.1.0.2Marco Gralike
 
Oracle Database 11g Release 2 - XMLDB New Features
Oracle Database 11g Release 2 - XMLDB New FeaturesOracle Database 11g Release 2 - XMLDB New Features
Oracle Database 11g Release 2 - XMLDB New FeaturesMarco Gralike
 
Is there a perfect data-parallel programming language? (Experiments with More...
Is there a perfect data-parallel programming language? (Experiments with More...Is there a perfect data-parallel programming language? (Experiments with More...
Is there a perfect data-parallel programming language? (Experiments with More...Julian Hyde
 
ODTUG Webcast - Thinking Clearly about XML
ODTUG Webcast - Thinking Clearly about XMLODTUG Webcast - Thinking Clearly about XML
ODTUG Webcast - Thinking Clearly about XMLMarco Gralike
 
MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329Douglas Duncan
 
LATERAL Derived Tables in MySQL 8.0
LATERAL Derived Tables in MySQL 8.0LATERAL Derived Tables in MySQL 8.0
LATERAL Derived Tables in MySQL 8.0Norvald Ryeng
 
OPP2010 (Brussels) - Programming with XML in PL/SQL - Part 2
OPP2010 (Brussels) - Programming with XML in PL/SQL - Part 2OPP2010 (Brussels) - Programming with XML in PL/SQL - Part 2
OPP2010 (Brussels) - Programming with XML in PL/SQL - Part 2Marco Gralike
 
Jdbc 4.0 New Features And Enhancements
Jdbc 4.0 New Features And EnhancementsJdbc 4.0 New Features And Enhancements
Jdbc 4.0 New Features And Enhancementsscacharya
 

Was ist angesagt? (20)

UKOUG Tech14 - Using Database In-Memory Column Store with Complex Datatypes
UKOUG Tech14 - Using Database In-Memory Column Store with Complex DatatypesUKOUG Tech14 - Using Database In-Memory Column Store with Complex Datatypes
UKOUG Tech14 - Using Database In-Memory Column Store with Complex Datatypes
 
NyaruDBにゃるものを使ってみた話 (+Realm比較)
NyaruDBにゃるものを使ってみた話 (+Realm比較)NyaruDBにゃるものを使ってみた話 (+Realm比較)
NyaruDBにゃるものを使ってみた話 (+Realm比較)
 
Mdst 3559-03-01-sql-php
Mdst 3559-03-01-sql-phpMdst 3559-03-01-sql-php
Mdst 3559-03-01-sql-php
 
Polyglot Persistence
Polyglot PersistencePolyglot Persistence
Polyglot Persistence
 
XML In The Real World - Use Cases For Oracle XMLDB
XML In The Real World - Use Cases For Oracle XMLDBXML In The Real World - Use Cases For Oracle XMLDB
XML In The Real World - Use Cases For Oracle XMLDB
 
Scrap your query boilerplate with specql
Scrap your query boilerplate with specqlScrap your query boilerplate with specql
Scrap your query boilerplate with specql
 
Starting with JSON Path Expressions in Oracle 12.1.0.2
Starting with JSON Path Expressions in Oracle 12.1.0.2Starting with JSON Path Expressions in Oracle 12.1.0.2
Starting with JSON Path Expressions in Oracle 12.1.0.2
 
Oracle Database 11g Release 2 - XMLDB New Features
Oracle Database 11g Release 2 - XMLDB New FeaturesOracle Database 11g Release 2 - XMLDB New Features
Oracle Database 11g Release 2 - XMLDB New Features
 
Is there a perfect data-parallel programming language? (Experiments with More...
Is there a perfect data-parallel programming language? (Experiments with More...Is there a perfect data-parallel programming language? (Experiments with More...
Is there a perfect data-parallel programming language? (Experiments with More...
 
Mssql to oracle
Mssql to oracleMssql to oracle
Mssql to oracle
 
SQL Reports in Koha
SQL Reports in KohaSQL Reports in Koha
SQL Reports in Koha
 
Introduction to SPARQL
Introduction to SPARQLIntroduction to SPARQL
Introduction to SPARQL
 
Introduction to database
Introduction to databaseIntroduction to database
Introduction to database
 
ODTUG Webcast - Thinking Clearly about XML
ODTUG Webcast - Thinking Clearly about XMLODTUG Webcast - Thinking Clearly about XML
ODTUG Webcast - Thinking Clearly about XML
 
MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329
 
LATERAL Derived Tables in MySQL 8.0
LATERAL Derived Tables in MySQL 8.0LATERAL Derived Tables in MySQL 8.0
LATERAL Derived Tables in MySQL 8.0
 
How sqlite works
How sqlite worksHow sqlite works
How sqlite works
 
wtf is in Java/JDK/wtf7?
wtf is in Java/JDK/wtf7?wtf is in Java/JDK/wtf7?
wtf is in Java/JDK/wtf7?
 
OPP2010 (Brussels) - Programming with XML in PL/SQL - Part 2
OPP2010 (Brussels) - Programming with XML in PL/SQL - Part 2OPP2010 (Brussels) - Programming with XML in PL/SQL - Part 2
OPP2010 (Brussels) - Programming with XML in PL/SQL - Part 2
 
Jdbc 4.0 New Features And Enhancements
Jdbc 4.0 New Features And EnhancementsJdbc 4.0 New Features And Enhancements
Jdbc 4.0 New Features And Enhancements
 

Andere mochten auch

Parking eye reply
Parking eye replyParking eye reply
Parking eye replyjpskiller
 
Chapter8 contentlit part 2
Chapter8 contentlit part 2Chapter8 contentlit part 2
Chapter8 contentlit part 2Brittany Clawson
 
Hz study internationalisationaerosuppliers
Hz study internationalisationaerosuppliersHz study internationalisationaerosuppliers
Hz study internationalisationaerosuppliersEugenio Fontán
 
Addressing Needs of BYOD for Enterprises with Unified Access
Addressing Needs of BYOD for Enterprises with Unified AccessAddressing Needs of BYOD for Enterprises with Unified Access
Addressing Needs of BYOD for Enterprises with Unified AccessAlcatel-Lucent Enterprise
 
American gangster trailer analysis
American gangster trailer analysisAmerican gangster trailer analysis
American gangster trailer analysisBenjaminSSmith
 
A Practical Guide to Developing a Connected Hospital
A Practical Guide to Developing a Connected HospitalA Practical Guide to Developing a Connected Hospital
A Practical Guide to Developing a Connected HospitalAlcatel-Lucent Enterprise
 
BIT 201 Carta Decano y Portada
BIT 201 Carta Decano y PortadaBIT 201 Carta Decano y Portada
BIT 201 Carta Decano y PortadaEugenio Fontán
 
NDC London 2013 - Mongo db for c# developers
NDC London 2013 - Mongo db for c# developersNDC London 2013 - Mongo db for c# developers
NDC London 2013 - Mongo db for c# developersSimon Elliston Ball
 
Multiplication of binomials
Multiplication of binomialsMultiplication of binomials
Multiplication of binomialsDenjie Magrimbao
 
User guide membuat diagram dengan draw express lite
User guide membuat diagram dengan draw express liteUser guide membuat diagram dengan draw express lite
User guide membuat diagram dengan draw express literesarahadian
 
Nuevo presentación de microsoft office power point (3)
Nuevo presentación de microsoft office power point (3)Nuevo presentación de microsoft office power point (3)
Nuevo presentación de microsoft office power point (3)Bachicmc1A
 
Chat log 12:06:2014 18:33:18
Chat log 12:06:2014 18:33:18Chat log 12:06:2014 18:33:18
Chat log 12:06:2014 18:33:18Remigio Russo
 
Types of documentary
Types of documentaryTypes of documentary
Types of documentarycw00531169
 
Simple guitar lesson
Simple guitar lessonSimple guitar lesson
Simple guitar lessondrdxp
 
Organisation design
Organisation designOrganisation design
Organisation designomardiana
 

Andere mochten auch (20)

Parking eye reply
Parking eye replyParking eye reply
Parking eye reply
 
Swing ui design
Swing ui designSwing ui design
Swing ui design
 
Chapter8 contentlit part 2
Chapter8 contentlit part 2Chapter8 contentlit part 2
Chapter8 contentlit part 2
 
Hz study internationalisationaerosuppliers
Hz study internationalisationaerosuppliersHz study internationalisationaerosuppliers
Hz study internationalisationaerosuppliers
 
Addressing Needs of BYOD for Enterprises with Unified Access
Addressing Needs of BYOD for Enterprises with Unified AccessAddressing Needs of BYOD for Enterprises with Unified Access
Addressing Needs of BYOD for Enterprises with Unified Access
 
ΙΣΤΟΡΙΑ
ΙΣΤΟΡΙΑΙΣΤΟΡΙΑ
ΙΣΤΟΡΙΑ
 
Trabajo momento 2 7
Trabajo momento 2   7Trabajo momento 2   7
Trabajo momento 2 7
 
American gangster trailer analysis
American gangster trailer analysisAmerican gangster trailer analysis
American gangster trailer analysis
 
A Practical Guide to Developing a Connected Hospital
A Practical Guide to Developing a Connected HospitalA Practical Guide to Developing a Connected Hospital
A Practical Guide to Developing a Connected Hospital
 
Plane figures1
Plane figures1Plane figures1
Plane figures1
 
BIT 201 Carta Decano y Portada
BIT 201 Carta Decano y PortadaBIT 201 Carta Decano y Portada
BIT 201 Carta Decano y Portada
 
NDC London 2013 - Mongo db for c# developers
NDC London 2013 - Mongo db for c# developersNDC London 2013 - Mongo db for c# developers
NDC London 2013 - Mongo db for c# developers
 
Multiplication of binomials
Multiplication of binomialsMultiplication of binomials
Multiplication of binomials
 
User guide membuat diagram dengan draw express lite
User guide membuat diagram dengan draw express liteUser guide membuat diagram dengan draw express lite
User guide membuat diagram dengan draw express lite
 
Nuevo presentación de microsoft office power point (3)
Nuevo presentación de microsoft office power point (3)Nuevo presentación de microsoft office power point (3)
Nuevo presentación de microsoft office power point (3)
 
Chat log 12:06:2014 18:33:18
Chat log 12:06:2014 18:33:18Chat log 12:06:2014 18:33:18
Chat log 12:06:2014 18:33:18
 
Types of documentary
Types of documentaryTypes of documentary
Types of documentary
 
Simple guitar lesson
Simple guitar lessonSimple guitar lesson
Simple guitar lesson
 
Magazine analysis
Magazine analysisMagazine analysis
Magazine analysis
 
Organisation design
Organisation designOrganisation design
Organisation design
 

Ähnlich wie When to no sql and when to know sql javaone

Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...NoSQLmatters
 
When to NoSQL and when to know SQL
When to NoSQL and when to know SQLWhen to NoSQL and when to know SQL
When to NoSQL and when to know SQLSimon Elliston Ball
 
Intro to Spark and Spark SQL
Intro to Spark and Spark SQLIntro to Spark and Spark SQL
Intro to Spark and Spark SQLjeykottalam
 
Apache doris (incubating) introduction
Apache doris (incubating) introductionApache doris (incubating) introduction
Apache doris (incubating) introductionleanderlee2
 
Big Data Everywhere Chicago: Apache Spark Plus Many Other Frameworks -- How S...
Big Data Everywhere Chicago: Apache Spark Plus Many Other Frameworks -- How S...Big Data Everywhere Chicago: Apache Spark Plus Many Other Frameworks -- How S...
Big Data Everywhere Chicago: Apache Spark Plus Many Other Frameworks -- How S...BigDataEverywhere
 
Inside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTPInside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTPBob Ward
 
NoSQL Endgame DevoxxUA Conference 2020
NoSQL Endgame DevoxxUA Conference 2020NoSQL Endgame DevoxxUA Conference 2020
NoSQL Endgame DevoxxUA Conference 2020Thodoris Bais
 
2021 04-20 apache arrow and its impact on the database industry.pptx
2021 04-20  apache arrow and its impact on the database industry.pptx2021 04-20  apache arrow and its impact on the database industry.pptx
2021 04-20 apache arrow and its impact on the database industry.pptxAndrew Lamb
 
A Designer's Favourite Security and Privacy Features in SQL Server and Azure ...
A Designer's Favourite Security and Privacy Features in SQL Server and Azure ...A Designer's Favourite Security and Privacy Features in SQL Server and Azure ...
A Designer's Favourite Security and Privacy Features in SQL Server and Azure ...Karen Lopez
 
MySQL Without the SQL -- Oh My! Longhorn PHP Conference
MySQL Without the SQL -- Oh My!  Longhorn PHP ConferenceMySQL Without the SQL -- Oh My!  Longhorn PHP Conference
MySQL Without the SQL -- Oh My! Longhorn PHP ConferenceDave Stokes
 
Kultam MM UI - MySQL for Data Analytics and Business Intelligence.pdf
Kultam MM UI - MySQL for Data Analytics and Business Intelligence.pdfKultam MM UI - MySQL for Data Analytics and Business Intelligence.pdf
Kultam MM UI - MySQL for Data Analytics and Business Intelligence.pdfShaNatasha1
 
SPL_ALL_EN.pptx
SPL_ALL_EN.pptxSPL_ALL_EN.pptx
SPL_ALL_EN.pptx政宏 张
 
How Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscapeHow Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscapePaco Nathan
 
What's New for Developers in SQL Server 2008?
What's New for Developers in SQL Server 2008?What's New for Developers in SQL Server 2008?
What's New for Developers in SQL Server 2008?ukdpe
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 OverviewEric Nelson
 
Apache Spark 2.0: Faster, Easier, and Smarter
Apache Spark 2.0: Faster, Easier, and SmarterApache Spark 2.0: Faster, Easier, and Smarter
Apache Spark 2.0: Faster, Easier, and SmarterDatabricks
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksDatabricks
 
XMLDB Building Blocks And Best Practices - Oracle Open World 2008 - Marco Gra...
XMLDB Building Blocks And Best Practices - Oracle Open World 2008 - Marco Gra...XMLDB Building Blocks And Best Practices - Oracle Open World 2008 - Marco Gra...
XMLDB Building Blocks And Best Practices - Oracle Open World 2008 - Marco Gra...Marco Gralike
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...Insight Technology, Inc.
 

Ähnlich wie When to no sql and when to know sql javaone (20)

Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
Simon Elliston Ball – When to NoSQL and When to Know SQL - NoSQL matters Barc...
 
When to NoSQL and when to know SQL
When to NoSQL and when to know SQLWhen to NoSQL and when to know SQL
When to NoSQL and when to know SQL
 
Intro to Spark and Spark SQL
Intro to Spark and Spark SQLIntro to Spark and Spark SQL
Intro to Spark and Spark SQL
 
Apache doris (incubating) introduction
Apache doris (incubating) introductionApache doris (incubating) introduction
Apache doris (incubating) introduction
 
Big Data Everywhere Chicago: Apache Spark Plus Many Other Frameworks -- How S...
Big Data Everywhere Chicago: Apache Spark Plus Many Other Frameworks -- How S...Big Data Everywhere Chicago: Apache Spark Plus Many Other Frameworks -- How S...
Big Data Everywhere Chicago: Apache Spark Plus Many Other Frameworks -- How S...
 
Inside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTPInside SQL Server In-Memory OLTP
Inside SQL Server In-Memory OLTP
 
NoSQL Endgame DevoxxUA Conference 2020
NoSQL Endgame DevoxxUA Conference 2020NoSQL Endgame DevoxxUA Conference 2020
NoSQL Endgame DevoxxUA Conference 2020
 
2021 04-20 apache arrow and its impact on the database industry.pptx
2021 04-20  apache arrow and its impact on the database industry.pptx2021 04-20  apache arrow and its impact on the database industry.pptx
2021 04-20 apache arrow and its impact on the database industry.pptx
 
A Designer's Favourite Security and Privacy Features in SQL Server and Azure ...
A Designer's Favourite Security and Privacy Features in SQL Server and Azure ...A Designer's Favourite Security and Privacy Features in SQL Server and Azure ...
A Designer's Favourite Security and Privacy Features in SQL Server and Azure ...
 
MySQL Without the SQL -- Oh My! Longhorn PHP Conference
MySQL Without the SQL -- Oh My!  Longhorn PHP ConferenceMySQL Without the SQL -- Oh My!  Longhorn PHP Conference
MySQL Without the SQL -- Oh My! Longhorn PHP Conference
 
Kultam MM UI - MySQL for Data Analytics and Business Intelligence.pdf
Kultam MM UI - MySQL for Data Analytics and Business Intelligence.pdfKultam MM UI - MySQL for Data Analytics and Business Intelligence.pdf
Kultam MM UI - MySQL for Data Analytics and Business Intelligence.pdf
 
SPL_ALL_EN.pptx
SPL_ALL_EN.pptxSPL_ALL_EN.pptx
SPL_ALL_EN.pptx
 
How Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscapeHow Apache Spark fits into the Big Data landscape
How Apache Spark fits into the Big Data landscape
 
What's New for Developers in SQL Server 2008?
What's New for Developers in SQL Server 2008?What's New for Developers in SQL Server 2008?
What's New for Developers in SQL Server 2008?
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 Overview
 
Master tuning
Master   tuningMaster   tuning
Master tuning
 
Apache Spark 2.0: Faster, Easier, and Smarter
Apache Spark 2.0: Faster, Easier, and SmarterApache Spark 2.0: Faster, Easier, and Smarter
Apache Spark 2.0: Faster, Easier, and Smarter
 
Jump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on DatabricksJump Start with Apache Spark 2.0 on Databricks
Jump Start with Apache Spark 2.0 on Databricks
 
XMLDB Building Blocks And Best Practices - Oracle Open World 2008 - Marco Gra...
XMLDB Building Blocks And Best Practices - Oracle Open World 2008 - Marco Gra...XMLDB Building Blocks And Best Practices - Oracle Open World 2008 - Marco Gra...
XMLDB Building Blocks And Best Practices - Oracle Open World 2008 - Marco Gra...
 
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
[db tech showcase Tokyo 2017] C23: Lessons from SQLite4 by SQLite.org - Richa...
 

Mehr von Simon Elliston Ball

A streaming architecture for Cyber Security - Apache Metron
A streaming architecture for Cyber Security - Apache MetronA streaming architecture for Cyber Security - Apache Metron
A streaming architecture for Cyber Security - Apache MetronSimon Elliston Ball
 
mcubed london - data science at the edge
mcubed london - data science at the edgemcubed london - data science at the edge
mcubed london - data science at the edgeSimon Elliston Ball
 
Machine learning without the PhD - azure ml
Machine learning without the PhD - azure mlMachine learning without the PhD - azure ml
Machine learning without the PhD - azure mlSimon Elliston Ball
 
Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dub...
Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dub...Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dub...
Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dub...Simon Elliston Ball
 
Getting your Big Data on with HDInsight
Getting your Big Data on with HDInsightGetting your Big Data on with HDInsight
Getting your Big Data on with HDInsightSimon Elliston Ball
 
Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0Simon Elliston Ball
 
Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0Simon Elliston Ball
 
Finding and Using Big Data in your business
Finding and Using Big Data in your businessFinding and Using Big Data in your business
Finding and Using Big Data in your businessSimon Elliston Ball
 

Mehr von Simon Elliston Ball (10)

A streaming architecture for Cyber Security - Apache Metron
A streaming architecture for Cyber Security - Apache MetronA streaming architecture for Cyber Security - Apache Metron
A streaming architecture for Cyber Security - Apache Metron
 
mcubed london - data science at the edge
mcubed london - data science at the edgemcubed london - data science at the edge
mcubed london - data science at the edge
 
Machine learning without the PhD - azure ml
Machine learning without the PhD - azure mlMachine learning without the PhD - azure ml
Machine learning without the PhD - azure ml
 
Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dub...
Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dub...Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dub...
Why Hadoop and SQL just want to be friends - lightning talk NoSQL Matters Dub...
 
Getting your Big Data on with HDInsight
Getting your Big Data on with HDInsightGetting your Big Data on with HDInsight
Getting your Big Data on with HDInsight
 
Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0
 
Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0
 
Finding and Using Big Data in your business
Finding and Using Big Data in your businessFinding and Using Big Data in your business
Finding and Using Big Data in your business
 
Mongo db for c# developers
Mongo db for c# developersMongo db for c# developers
Mongo db for c# developers
 
Mongo db for C# Developers
Mongo db for C# DevelopersMongo db for C# Developers
Mongo db for C# Developers
 

Kürzlich hochgeladen

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 

Kürzlich hochgeladen (20)

"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 

When to no sql and when to know sql javaone

  • 1. When to NoSQL and When to Know SQL Simon Elliston Ball Head of Big Data @sireb #noSQLknowSQL http://nosqlknowsql.io
  • 2. what is NoSQL? SQL Not only SQL Many many things NoSQL No, SQL
  • 3. before SQL files multi-value ur… hash maps?
  • 4. after SQL everything is relational ORMs fill in the other data structures scale up rules data first design
  • 5. and now NoSQL datastores that suit applications polyglot persistence: the right tools scale out rules APIs not EDWs
  • 6. why should you care? data growth rapid development fewer migration headaches… maybe machine learning social
  • 7. big bucks. O’Reilly 2013 Data Science Salary Survey
  • 9.
  • 11. document databases rapid development JSON docs complex, variable models known access pattern
  • 12. document databases learn a very new query language denormalize document form joins? JUST DON’T http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
  • 13. document vs SQL what can SQL do? query all the angles sure, you can use blobs… … but you can’t get into them
  • 14. documents in SQL SQL xml fields mapping xquery paths is painful native JSON but still structured
  • 15. query everything: search class of database database full-text indexing
  • 16. you know google, right… range query span query keyword query
  • 17. you know the score scores "query": { "function_score": { "query": { "match": { "title": "NoSQL"} }, "functions": [ "boost": 1, "gauss": { "timestamp": { "scale": "4w" } }, "script_score" : { "script" : "_score * doc['important_document'].value ? 2 : 1" } ], "score_mode": "sum" } }
  • 18. SQL knows the score too scores declare @origin float = 0; declare @delay_weeks float = 4; SELECT TOP 10 * FROM ( SELECT title, score * CASE WHEN p.important = 1 THEN 2.0 WHEN p.important = 0 THEN 1.0 END * exp(-power(timestamp-@origin,2)/(2*@delay*7*24*3600)) + 1 AS score FROM posts p WHERE title LIKE '%NoSQL%' ) as found ORDER BY score
  • 19. you know google, right… more like this: instant tf-idf { "more_like_this" : { "fields" : ["name.first", "name.last"], "like_text" : "text like this one", "min_term_freq" : 1, "max_query_terms" : 12 } }
  • 22. Facets SQL: x lots SELECT a.name, count(p.id) FROM people p JOIN industry a on a.id = p.industry_id JOIN people_keywords pk on pk.person_id = p.id JOIN keywords k on k.id = pk.keyword_id WHERE CONTAINS(p.description, 'NoSQL') OR k.name = 'NoSQL' ... GROUP BY a.name SELECT a.name, count(p.id) FROM people p JOIN area a on a.id = p.area_id JOIN people_keywords pk on pk.person_id = p.id JOIN keywords k on k.id = pk.keyword_id WHERE CONTAINS(p.description, 'NoSQL') OR k.name = 'NoSQL' ... GROUP BY a.name
  • 23. Facets Elastic search: { "query": { “query_string": { “default_field”: “content”, “query”: “keywords” } }, "facets": { “myTerms": { "terms": { "field" : "lang", "all_terms" : true } } } }
  • 24. logs untyped free-text documents timestamped semi-structured discovery aggregation and statistics
  • 25. key: value close to your programming model distributed map | list | set keys can be objects
  • 26. SQL extensions hash types hstore
  • 27. SQL and polymorphism inheritance ORMs hide the horror
  • 28. turning round the rows columnar databases physical layout matters
  • 29. turning round the rows key value type 1 A Home 2 B Work 3 C Work 4 D Work Row storage 00001 1 A Home 00002 2 B Work 00003 3 C Work … Column storage A B C D Home Work Work Work …
  • 30. teaching an old SQL new tricks MySQL InfoBright SQL Server Columnar Indexes CREATE NONCLUSTERED COLUMNSTORE INDEX idx_col ON Orders (OrderDate, DueDate, ShipDate) Great for your data warehouse, but no use for OLTP
  • 31. column for hadoop and other animals ORC files Parquet http://parquet.io
  • 32. column families wide column databases
  • 33. millions of columns eventually consistent CQL http://cassandra.apache.org/ http://www.datastax.com/ set | list | map types
  • 34. cell level security SQL: so many views, so much confusion accumulo https://accumulo.apache.org/
  • 35.
  • 36. time retrieving time series and graphs window functions SELECT business_date, ticker, close, close / LAG(close,1) OVER (PARTITION BY ticker ORDER BY business_date ASC) - 1 AS ret FROM sp500
  • 37. time Open TSDB Influx DB key:sub-key:metric key:* key:*:metric
  • 39. queues in SQL CREATE procedure [dbo].[Dequeue] AS set nocount on declare @BatchSize int set @BatchSize = 10 declare @Batch table (QueueID int, QueueDateTime datetime, Title nvarchar(255)) begin tran insert into @Batch select Top (@BatchSize) QueueID, QueueDateTime, Title from QueueMeta WITH (UPDLOCK, HOLDLOCK) where Status = 0 order by QueueDateTime ASC declare @ItemsToUpdate int set @ItemsToUpdate = @@ROWCOUNT update QueueMeta SET Status = 1 WHERE QueueID IN (select QueueID from @Batch) AND Status = 0 if @@ROWCOUNT = @ItemsToUpdate begin commit tran select b.*, q.TextData from @Batch b inner join QueueData q on q.QueueID = b.QueueID print 'SUCCESS' end else begin rollback tran print 'FAILED' end
  • 40. queues in SQL index fragmentation is a problem but built in logs of a sort
  • 41. message queues specialised apis capabilities like fan-out routing acknowledgement
  • 43. relationships count trees and hierarchies overloaded relationships fancy algorithms
  • 44. hierarchies with SQL adjacency lists CONSTRAIN fk_parent_id_id FOREIGN KEY parent_id REFERENCES some_table.id materialised path nested sets (MPTT) path = 1.2.23.55.786.33425 Node Left Right Depth A 1 22 1 B 2 9 2 C 10 21 2 D 3 4 3 E 5 6 3 F 7 8 3 G 11 18 3 H 19 20 3 I 12 13 4 J 14 15 4 K 16 17 4
  • 45. Consolidation OrientDB ArangoDB http://www.orientechnologies.com/ SQLs with NoSQL Postgres MySQL https://www.arangodb.org/
  • 48. More than SQL Shark Drill Cascading Map Reduce
  • 49. System issues, Speed issues, Soft issues
  • 50. the ACID, BASE litmus Atomic Consistent Isolated Durable Basically Available Soft-state Eventually consistent what matters to you?
  • 51. CAP it all Consistency Partition Availability
  • 52. write fast, ask questions later SQL writes cost a lot mainly write workload: NoSQL low latency write workload: NoSQL
  • 53. is it web scale? most NoSQL scales well but clusters still need management are you facebook? one machine is easier than n can ops handle it? app developers make bad admins
  • 54. who is going to use it? analysts: they want SQL developers: they want applications data scientists: they want access
  • 55. choose the right tool Photo: http://www.homespothq.com/
  • 56. Thank you! Simon Elliston Ball simon@simonellistonball.com @sireb #noSQLknowSQL http://nosqlknowsql.io
  • 57. Questions Simon Elliston Ball simon@simonellistonball.com @sireb http://nosqlknowsql.io

Hinweis der Redaktion

  1. Some SQL implementations have added a certain amount of document style implementations…. SQL Server’s XML type lets you dig into things with Xquery. Not much fun Postgres also supports native JSON objects, so it makes a great solution for slinging JSON to a web server if you have multiple columns, or rather keys you want to use to get the data out. These do tend to be very bolt on solutions.
  2. Range queries, span, boosting gives you a probabilistic match.
  3. Search engines can also be thought of as sort engines. They order your results by a relevance score. This can be extremely flexible. For example Elastic search will let you control the sort function in huge detail. Here gausian decay of relevance for articles over time. Besting queries, adding boost factors to certain fields or certain classes on document based on custom expressions and scripts can be very powerful, and much easier to express than it would be in SQL.
  4. So how would you do this in SQL? That doesn’t seem so bad, however, here we’ve do a sub query, which requires the entire matching set to evaluate, we’re also using a simple scoring function here, which can just be a case statement, if we wanted to involve multiple fields, these scorings can get quite obtuse in SQL.
  5. you CAN implement a queue in SQL. Doesn’t mean you should.
  6. Big Data is a big topic. Hadoop is the leading platform, but it’s a platform.
  7. go through each one, say who made it, what it does. spark: technically, spark is the framework, the SQL bit is called Shark, but doesn’t have a logo, so it can’t go on the logo slide. hive: aiming for SQL completion, compiles to MapReduce, currently does most things, not correlated subqueries.