50 Shades of Data - how, when and why Big, Fast, Relational, NoSQL, Elastic, Event, CQRS (Tokyo, Japan, November 13th, Oracle Groundbreakers JAPAC Tour)

On the many types of data, data stores and data usages
Lucas Jellema, CTO of AMIS
Oracle Groundbreakers APAC Tour

Lucas Jellema
Architect / Developer
1994 started in IT at Oracle
2002 joined AMIS
Currently CTO & Solution Architect
Good evening

Overview
• Multiple types of data
• Stored and processed in different ways
• Same data sometimes used in multiple, different ways
• Stored and processed multiple times – optimized for each use case
• The meaning of some terms cannot be taken too literally
• Real Time and Fresh
• Integrity and Truth
• Consistency and transactions
• Understand your data
• Meta: What does it mean?
• Master: Where is the source?

Select from <stream of tweet events>
select text
,      author
,      timestamp
from   tweets    -- streaming data
where  tag = 'codeone'

Select Running Count from <stream of tweet events>
select tag
,      count(*) tweet_count
from   tweets
group by tag
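A running count over a stream differs from a one-off GROUP BY in that the result is updated with every incoming event. The following is a minimal Python sketch of that idea, not the Event Hub implementation from the talk; the sample events and the function name `running_tag_counts` are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def running_tag_counts(tweets: Iterable[Dict[str, str]]) -> Iterator[Tuple[str, int]]:
    """Yield (tag, tweet_count) after every event, like a continuously updated GROUP BY."""
    counts: Counter = Counter()
    for tweet in tweets:
        counts[tweet["tag"]] += 1
        # Emit the new count for the tag that just changed.
        yield tweet["tag"], counts[tweet["tag"]]


# Hypothetical sample events standing in for the live tweet stream.
sample_stream = [
    {"text": "Keynote starting", "author": "ann", "tag": "codeone"},
    {"text": "Streams are fun", "author": "bob", "tag": "java"},
    {"text": "Great session", "author": "cat", "tag": "codeone"},
]

updates = list(running_tag_counts(sample_stream))
```

Because the function is a generator, it can consume an unbounded stream and publish each update (for example to a TWEET_COUNT topic) as soon as it is produced.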

[Diagram: Tweets on #CodeOne #java #oraclecode → Tweets Topic (Oracle Cloud Event Hub) → Running Tweets Aggregation (Application Container) → TWEET_COUNT Topic → Clients]
Other sources of streaming events:
• IoT metrics from hundreds of devices
• User actions & click events from webshop
• Live Traffic Events
• Microservices chatter
• Social Media events (Facebook, WhatsApp, …)
• IT Operations – monitoring metrics

[Same diagram as above, now with tweets on #JEEConf #java #oraclecode]

Real Time
live | fresh | instantaneous | online | synchronous

Response time scale, from Machine Response to Human Reaction:
< 10 ms | < 100 ms | < 500 ms | < 3 secs | > 3 secs


Integrity
• Madelon's card
• Real world vs World of Databases
• Relax!
• Anomaly detection

Data Constraints
to protect integrity
• Allowable values
• Mandatory attributes
• (Foreign Key) References
• NULL
• Constraints on
• type
• length
• format
• Spelling
• Character encoding
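The constraint types listed above can be sketched as explicit checks on a single record. This is an illustrative Python sketch only; the field names, the allowed country codes, the 50-character limit and the email pattern are all assumptions, not rules from the talk.

```python
import re
from typing import Any, Dict, List

ALLOWED_COUNTRIES = {"NL", "JP", "US"}  # allowable values (hypothetical)
EMAIL_FORMAT = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # format constraint (simplified)


def violations(person: Dict[str, Any]) -> List[str]:
    """Return a list of constraint violations for one record (empty list = valid)."""
    problems: List[str] = []
    name = person.get("name")
    if name is None:  # mandatory attribute
        problems.append("name is mandatory")
    elif not isinstance(name, str):  # constraint on type
        problems.append("name must be text")
    elif len(name) > 50:  # constraint on length
        problems.append("name is longer than 50 characters")
    if person.get("country") not in ALLOWED_COUNTRIES:  # allowable values
        problems.append("country is not an allowable value")
    email = person.get("email")  # optional attribute: NULL is allowed
    if email is not None and not EMAIL_FORMAT.match(email):
        problems.append("email has an invalid format")
    return problems
```

In a relational database the same rules would typically be declared as NOT NULL, CHECK and foreign key constraints, so that the store itself rejects violating data.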

Data is a representation of the known real world
• How useful is it to enforce data integrity?

Data Integrity
• Why?
• Is it about truth?
• About regulations and by-the-book?
• Allow IT systems to run smoothly and not get confused?
• About auditability and non-repudiation?
• What about the real world?
• Data in IT is just a representation;
if the world is not by the book – what should IT do?

Anomaly Detection
• Find fishy values and derive business integrity rules by scanning data
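One simple way to derive a candidate integrity rule from existing data is to treat frequently occurring values as the allowable set and flag the rare ones for review. The sketch below assumes a frequency threshold (`min_share`) and made-up sample data; real anomaly detection would usually be more sophisticated.

```python
from collections import Counter
from typing import List, Set, Tuple


def derive_allowed_values(values: List[str], min_share: float = 0.05) -> Tuple[Set[str], List[str]]:
    """Derive an 'allowable values' rule from the data and list the values that break it."""
    counts = Counter(values)
    total = len(values)
    # Values that make up at least min_share of all rows are considered legitimate.
    allowed = {value for value, n in counts.items() if n / total >= min_share}
    # Everything else is fishy and should be reviewed by someone who knows the business.
    fishy = [value for value in values if value not in allowed]
    return allowed, fishy


# Hypothetical column of order statuses with two suspicious entries.
statuses = ["open"] * 60 + ["closed"] * 38 + ["opne", "CLOSED?"]
allowed, fishy = derive_allowed_values(statuses)
```

The derived set is only a proposal: a rare value may be a typo, or it may be a legitimate exception from the real world.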

BOL - CQRS

Books Online - WebShop
Products database (behind the firewall), fed by product updates:
• Data manipulation
• Data Quality (enforcement)
• <10K transactions
• Batch jobs next to online
• Speed is nice
Webshop visits (searches, product details, orders):
• Read only
• Online
• Speed is crucial
• XHTML & JSON
• > 5M visits

Products database (behind the firewall), fed by product updates:
• Data manipulation
• Data Quality (enforcement)
• <10K transactions
• Batch jobs next to online
• Speed is nice
Products store in the DMZ (nightly generation):
• Read only
• JSON documents
• Images
• Text Search
• Scale Horizontally
• Stale but consistent
Webshop visits (searches, product details, orders):
• Read only
• Online
• Speed is crucial
• XHTML & JSON
• > 1M visits

Products
Data Manipulation
Data Retrieval

Special Products
Product Clusters
Products
Data Manipulation
Data Retrieval
Food
Stuff
Toys
Quick Product Search Index
Product Store in SaaS app

Command Query Responsibility Segregation = CQRS
Special Products
Product Clusters
Products
Data Manipulation
Data Retrieval
Food Stuff
Toys
Quick Product Search Index
Product Store in SaaS app
Detect changes
Extract Data
Transport Data
Convert Data
Apply Data
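The five steps above (detect, extract, transport, convert, apply) can be sketched as a small in-memory pipeline from the command store to a query store. Everything here is an assumption for illustration: the version counter used for change detection, the JSON transport format and the document shape on the query side.

```python
import json
from typing import Dict, List

# Command side: the system of record (hypothetical product rows with a version counter).
command_store: Dict[int, dict] = {
    1: {"id": 1, "name": "Lego Castle", "cluster": "Toys", "version": 2},
    2: {"id": 2, "name": "Green Tea", "cluster": "Food", "version": 1},
}
# Query side: read-only JSON documents, plus the version already applied per product.
query_store: Dict[int, str] = {}
applied_versions: Dict[int, int] = {}


def detect_changes() -> List[int]:
    """Detect changes: products whose version is newer than what the query side has."""
    return [pid for pid, row in command_store.items()
            if applied_versions.get(pid, 0) < row["version"]]


def extract(pid: int) -> dict:
    """Extract data: take a copy of the changed row."""
    return dict(command_store[pid])


def transport(row: dict) -> bytes:
    """Transport data: serialize the row as it would travel over a queue or network."""
    return json.dumps(row).encode("utf-8")


def convert(message: bytes) -> dict:
    """Convert data: reshape the row into the document format the query side wants."""
    row = json.loads(message.decode("utf-8"))
    return {"id": row["id"], "title": row["name"],
            "cluster": row["cluster"], "version": row["version"]}


def apply_document(doc: dict) -> None:
    """Apply data: store the document and remember which version was applied."""
    query_store[doc["id"]] = json.dumps(doc)
    applied_versions[doc["id"]] = doc["version"]


def synchronize() -> int:
    """Run one synchronization round; return the number of products that were updated."""
    changed = detect_changes()
    for pid in changed:
        apply_document(convert(transport(extract(pid))))
    return len(changed)
```

Calling `synchronize()` repeatedly is idempotent: once the query store has caught up, a further round applies nothing. How often it runs determines how stale the query side may become.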

From C to Q
• How quickly?
• How frequently?
• How reliably?
• How atomically?
• Data Authorization Considerations
• Locations & Connectivity
• Full resync | restore of Query Store
Products
Index

CQRS is not new

Event Sourcing Driving CQRS
Events → Event Store → Current State
accountId: 123
amount: 10
owner: Jane Doe

Event Sourcing Driving CQRS
Events → Event Store → Current State | Other State Aggregate
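With event sourcing, the event store is the source of truth and every state is derived by replaying events. The sketch below reproduces the account example from the slide (accountId 123, amount 10, owner Jane Doe); the event types and the individual deposit and withdrawal amounts are assumptions chosen so that the replay ends at that state.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    account_id: int
    kind: str  # "opened", "deposited" or "withdrawn" (hypothetical event types)
    amount: int = 0
    owner: str = ""


# Append-only event store: events are never updated or deleted.
event_store: List[Event] = [
    Event(123, "opened", owner="Jane Doe"),
    Event(123, "deposited", amount=25),
    Event(123, "withdrawn", amount=15),
]


def current_state(events: List[Event]) -> Dict[int, dict]:
    """Derive the current state of every account by replaying all events in order."""
    accounts: Dict[int, dict] = {}
    for event in events:
        if event.kind == "opened":
            accounts[event.account_id] = {
                "accountId": event.account_id, "amount": 0, "owner": event.owner}
        elif event.kind == "deposited":
            accounts[event.account_id]["amount"] += event.amount
        elif event.kind == "withdrawn":
            accounts[event.account_id]["amount"] -= event.amount
    return accounts


def total_deposited(events: List[Event]) -> int:
    """A second, independent aggregate derived from the same events."""
    return sum(event.amount for event in events if event.kind == "deposited")
```

Because each aggregate is just a function of the events, a new query store can be added later and filled by replaying the full history.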

Distributed Database with Event Sourcing & Current State
World State

SQL is not good at anything
• But it sucks at nothing

Session Recommendation Engine for CodeOne
• Recommend sessions to me
• That are Presented by Speakers
• Who are Liked by People
• Who Attended the same Sessions that I Attended
• Start from me and the sessions I attended
• Locate other attendees in these sessions
• Find the speakers they like
• Retrieve the sessions presented by those speakers
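The four traversal steps above can be sketched with plain Python sets, hopping from one relationship to the next the way a graph query would. The people, speakers and session identifiers are made-up sample data, and the dictionaries are a stand-in for a real graph store.

```python
from typing import Dict, Set

# Hypothetical sample graph: who attended what, who likes whom, who presented what.
attended: Dict[str, Set[str]] = {
    "me": {"S1", "S2"},
    "ann": {"S1", "S3"},
    "bob": {"S2"},
    "cat": {"S4"},
}
likes: Dict[str, Set[str]] = {
    "ann": {"speaker_x"},
    "bob": {"speaker_y"},
    "cat": {"speaker_z"},
}
presented_by: Dict[str, Set[str]] = {
    "speaker_x": {"S3", "S5"},
    "speaker_y": {"S2", "S6"},
    "speaker_z": {"S7"},
}


def recommend(person: str) -> Set[str]:
    """Follow the path: my sessions -> other attendees -> speakers they like -> their sessions."""
    my_sessions = attended[person]
    # Other attendees who share at least one session with me.
    peers = {p for p, sessions in attended.items() if p != person and sessions & my_sessions}
    # Speakers liked by those attendees.
    speakers = set().union(*(likes.get(p, set()) for p in peers))
    # Sessions presented by those speakers.
    return set().union(*(presented_by.get(s, set()) for s in speakers))
```

Each step is a single hop along one relationship type, which is the shape of query that graph databases are designed for.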

The Relational Approach
Tables: PEOPLE, SESSIONS, ATTENDANCE, SPEAKERS, SPEAKER_LIKING

SQL Query to find the Recommendations
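The query shown on the original slide is not part of this text. As a rough illustration of the relational approach, the sketch below builds the five tables in an in-memory SQLite database and joins them to find recommendations; the column names, sample rows and the query itself are assumptions, not the author's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE speakers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE sessions (id INTEGER PRIMARY KEY, title TEXT NOT NULL,
                       speaker_id INTEGER NOT NULL REFERENCES speakers(id));
CREATE TABLE attendance     (person_id INTEGER REFERENCES people(id),
                             session_id INTEGER REFERENCES sessions(id));
CREATE TABLE speaker_liking (person_id INTEGER REFERENCES people(id),
                             speaker_id INTEGER REFERENCES speakers(id));
INSERT INTO people   VALUES (1, 'me'), (2, 'ann'), (3, 'bob');
INSERT INTO speakers VALUES (10, 'speaker_x'), (11, 'speaker_y');
INSERT INTO sessions VALUES (100, 'Streams', 10), (101, 'Graphs', 10), (102, 'CQRS', 11);
INSERT INTO attendance     VALUES (1, 100), (2, 100), (3, 102);
INSERT INTO speaker_liking VALUES (2, 10), (3, 11);
""")

# Every hop in the recommendation path becomes one join.
RECOMMEND = """
SELECT DISTINCT s.title
FROM   attendance mine
JOIN   attendance theirs ON theirs.session_id = mine.session_id
                        AND theirs.person_id <> mine.person_id
JOIN   speaker_liking sl ON sl.person_id = theirs.person_id
JOIN   sessions s        ON s.speaker_id = sl.speaker_id
WHERE  mine.person_id = ?
ORDER  BY s.title
"""

recommended = [row[0] for row in conn.execute(RECOMMEND, (1,))]
```

The query works, but the number of joins grows with the length of the path, which is the contrast the graph approach is meant to highlight.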

The Graph DB Approach (Neo4J using Cypher)
• No tables are created
• As data is created, meta-data is derived

Performing the Recommendations Query

Graph Database
• Natural fit during development
• Easier to write and maintain
• Superior (10-1000 times better) performance
Example: find people liked by anyone liked by Bob

SQL vs NoSQL
ACID vs BASE
Relational vs …

Relational Databases
• Based on relational model of data (E.F. Codd), a mathematical foundation
• Uses SQL for query, DML and DDL
• Transactions are ACID (Atomicity, Consistency, Isolation, Durability)
• Atomicity: all or nothing
• Consistency: constraint compliant
• Isolation: individual experience [in a multi-session environment] (aka concurrency)
• Durability: down does not hurt
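The "all or nothing" and "constraint compliant" properties can be demonstrated with a small SQLite sketch: a transfer that would violate a constraint is rolled back as a whole, including the part that had already succeeded. The accounts table, the CHECK constraint and the amounts are assumptions for illustration.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.execute("INSERT INTO accounts VALUES (1, 10), (2, 0)")
conn.commit()


def transfer(amount: int) -> bool:
    """Move amount from account 1 to account 2 in a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = 2", (amount,))
            # This debit violates the CHECK constraint when the balance would go negative.
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = 1", (amount,))
        return True
    except sqlite3.IntegrityError:
        return False


def balances() -> List[int]:
    return [row[0] for row in conn.execute("SELECT balance FROM accounts ORDER BY id")]
```

When `transfer(25)` fails on the debit, the credit to account 2 is undone as well, so both balances stay as they were.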

ACID comes at a cost – performance & scalability
• Transaction results have to be persisted [before the transaction completes] in order to guarantee D
• Concurrency requires some degree of locking (and multi-versioning) in order to have I
• Constraint compliance (unique key, foreign key) means all data hangs together (as do all transactions) in order to have C
• Two-phase commit (across multiple participants) introduces complexity, dependencies and delays, yet is required for A

NoSQL is not No SQL

When things were simple
RDBMS (SQL, ACID)
Data files, log files
Backups
SAN

And then stuff happened
Client Tier: Browsers, Mobile Apps (offline)
Middle Tier: Java EE (Stateful) application
Data Warehouse
OO, XML, JSON
Content Management
Big Data
Fast Data
APIs
µ λ

50 Shades of Data
Oracle Database: SQL, RDBMS, ACID
• http
• IoT Fast Data Ingestion
• Sharding
• Machine Learning
• NoSQL
• Big Data SQL
• Multitenant (Pluggable Database) Architecture
• Flashback

Oracle Database XE – eXpress Edition
• Current version: XE 11gR2
• Available since October 2018: XE 18c, with yearly releases (19c, 20c, …)
• All functionality of single instance Oracle Database Enterprise Edition plus Extra Options (including R, Machine Learning, Spatial, Compression, Multi Tenant – for 3 PDBs, Partitioning)
• Code and Data Compatible with other editions – including plug/unplug
• Resource Limitations for 18c:
• 2 CPUs
• 2 GB of memory
• 12 GB of disk space (using Compression effectively 40 GB of data)
• No patches or support

Total Cost of Data Ownership
Factors: usage, authorization, distribution, format, volatility, volume, ACID demands, availability, freshness requirements (staleness allowance), location, speed, ownership, required consistency, integrity, query patterns

Summary
• Multiple types of data
• Stored and processed in different ways
• Same data sometimes used in multiple, different ways
• Stored and processed multiple times – optimized for each use case
• The meaning of some terms cannot be taken too literally
• Real Time and Fresh
• Integrity and Truth
• Consistency and transactions
• Understand your data
• Meta: What does it mean?
• Master: Where is the source?

Thank you!
Thank you very much
• Blog: technology.amis.nl
• Email: lucas.jellema@amis.nl
• Twitter: @lucasjellema
• lucas-jellema
• www.amis.nl, info@amis.nl
