This document provides an overview of OrientDB, an open-source multi-model NoSQL database that supports document, key-value, graph and object-oriented models. It discusses OrientDB's logical concepts like classes, clusters, vertices and edges. It also covers querying in OrientDB using SQL-like statements and traversing relationships between vertices. The document demonstrates how to create and query data in OrientDB as well as different traversal strategies like depth-first and breadth-first search.
4. ONCE UPON A TIME - 1979
• first commercially available RDBMS
• written in assembly
• runs in 128K of memory
• no support for transactions
• support for basic SQL queries and joins
5. RELATIONAL DATABASES
• data is presented to the user in the form of rows and columns (a relation)
• data can be manipulated through relational operators in a tabular form
6. OVER TIME
• data started growing in size
• data became heterogeneous
• structured, semi-structured, unstructured data
• the rate at which data is generated increased
8. 30 YEARS LATER (2009)
• NoSQL movement
• some intents of NoSQL databases:
• being non-relational
• simplicity of design
• simpler horizontal scaling
• speed up some operations
• distributed
9. (SOME) TYPES OF NOSQL DATABASES
• document
• key-value
• object-oriented
• graph
• multi-model
10. DOCUMENT MODEL
• a document encapsulates data in a standard format: YAML, JSON, XML, BSON
{
  "id": 45,
  "name": "Andrea",
  "fav_colours": ["blue", "green"],
  "driver_license": {
    "number": "AA123"
  }
}
11. KEY-VALUE MODEL
• dictionary in which data is represented as a collection of key-value pairs
> SET akey "Andrea"
> GET akey
"Andrea"
(diagram: key akey maps to value Andrea)
12. OBJECT-ORIENTED MODEL
• data is represented in the form of objects
(diagram: Animal at the top, with Dog and Cat below it)
13. GRAPH MODEL
• data is represented in the form of a graph
14. MULTI-MODEL
(diagram labels: Key-Value, Document, Object-oriented, Graph)
15. RELATIONAL VS NOSQL
• how data is represented
• how data is related
• relational databases have the concept of joins
• NoSQL databases have multiple concepts
• aggregation
• relation (through edges)
16. ISSUES WITH JOIN
User
name    id
Andrea  45
John    48
Steven  53
Bill    70
Like
user_id  food_id
45       13
45       49
70       38
Food
id  name
13  Pasta
38  Sushi
49  Kebab
63  Meat
SELECT F.name FROM User U, Like L, Food F
WHERE U.name='Andrea' AND U.id=L.user_id AND L.food_id=F.id;
17. ISSUES WITH JOIN
double JOIN per record at runtime
18. ISSUES WITH JOIN
• the relationships are computed every time a query is performed
• time complexity grows with data: O(log n) per index lookup
• heavy runtime cost with large datasets
• index lookup does not help
• speeds up searches but slows down inserts, updates, deletes
• imagine this on billions of records
speakerdeck.com/agiuliano/index-management-in-depth
19. SUMMING UP JOIN
• a join operation involves
• searching for a record in the starting table (User)
• using the foreign key to look up the intermediate table (Like) through its index
• traversing the intermediate table to look up the target table (Food) ids
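The steps above can be tried out on the User, Like and Food tables from slide 16. This is a minimal sketch using Python's built-in sqlite3 module, chosen only for illustration since the slides do not name a specific RDBMS. The helper name foods_liked_by is hypothetical, and the Like table is quoted because LIKE is an SQL keyword.

```python
import sqlite3

# In-memory database holding the three tables from the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (name TEXT, id INTEGER PRIMARY KEY);
    CREATE TABLE "Like" (user_id INTEGER, food_id INTEGER);
    CREATE TABLE Food (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO User VALUES ('Andrea', 45), ('John', 48), ('Steven', 53), ('Bill', 70);
    INSERT INTO "Like" VALUES (45, 13), (45, 49), (70, 38);
    INSERT INTO Food VALUES (13, 'Pasta'), (38, 'Sushi'), (49, 'Kebab'), (63, 'Meat');
""")

def foods_liked_by(user_name):
    """Hypothetical helper: runs the double join from the slide for one user."""
    rows = conn.execute(
        """
        SELECT F.name FROM User U, "Like" L, Food F
        WHERE U.name = ? AND U.id = L.user_id AND L.food_id = F.id
        ORDER BY F.name
        """,
        (user_name,),
    )
    return [name for (name,) in rows]

print(foods_liked_by("Andrea"))  # ['Kebab', 'Pasta']
```

Every call recomputes the relationship from the three tables; nothing about the link between Andrea and the foods is stored directly.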
20. The more entries you have, the SLOWER your queries are
www.flickr.com/photos/blacktigersdream/8737830046
22. SAVING PROJECTIONS
advantages
• data is predetermined
disadvantages
• data synchronization
• solves only reads
UserLikesFood
User    user_id  Like   food_id
Andrea  45       Pasta  13
Andrea  45       Kebab  49
Bill    70       Sushi  38
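The synchronization disadvantage can be made concrete with a small sketch, again using Python's sqlite3 module as a stand-in RDBMS. The column names user_name and food_name, and the helpers refresh_projection and likes, are assumptions made for this example; the slide only shows the resulting UserLikesFood table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (name TEXT, id INTEGER PRIMARY KEY);
    CREATE TABLE "Like" (user_id INTEGER, food_id INTEGER);
    CREATE TABLE Food (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO User VALUES ('Andrea', 45), ('Bill', 70);
    INSERT INTO "Like" VALUES (45, 13), (45, 49), (70, 38);
    INSERT INTO Food VALUES (13, 'Pasta'), (38, 'Sushi'), (49, 'Kebab'), (63, 'Meat');
""")

def refresh_projection():
    """Rebuild the saved projection by running the join once."""
    conn.executescript("""
        DROP TABLE IF EXISTS UserLikesFood;
        CREATE TABLE UserLikesFood AS
            SELECT U.name AS user_name, U.id AS user_id,
                   F.name AS food_name, F.id AS food_id
            FROM User U, "Like" L, Food F
            WHERE U.id = L.user_id AND L.food_id = F.id;
    """)

def likes(user_name):
    """Read from the projection only: no join at query time."""
    rows = conn.execute(
        "SELECT food_name FROM UserLikesFood WHERE user_name = ? ORDER BY food_name",
        (user_name,),
    )
    return [food for (food,) in rows]

refresh_projection()
before = likes("Bill")                              # ['Sushi']
conn.execute('INSERT INTO "Like" VALUES (70, 63)')  # Bill now also likes Meat
stale = likes("Bill")                               # still ['Sushi']
refresh_projection()
fresh = likes("Bill")                               # ['Meat', 'Sushi']
```

Reads become a single-table lookup, but the projection stays stale until it is rebuilt after each write.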
23. RELATIONSHIPS IN THE NOSQL WORLD
24. RELATIONSHIPS IN DOCUMENTS
• embed information in documents where you need it
• data duplication
• faster access
{
  "id": 45,
  "name": "Andrea",
  "likes": ["Pasta", "Kebab"]
}
26. GRAPH
G = (V, E)
where G is the graph, V the set of vertices and E the set of edges
(diagram: a small graph with one edge and one vertex labelled)
27. GRAPH
(diagram: vertex Andrea with name: Andrea, connected by the edge drives with license: A123 to vertex BMW with model: X5 and doors: 5)
EDGES ARE DIRECTED
VERTICES CAN HAVE PROPERTIES
EDGES CAN HAVE PROPERTIES
28. GRAPH
(diagram: Andrea connected to BMW by two edges, drives and owns)
N-M relationships can be represented using multiple edges
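As a rough illustration of the graph slides, the sketch below models vertices and directed edges that both carry properties, with two edges between the same pair of vertices. The class and function names are hypothetical, and placing license on the drives edge and doors on the BMW vertex is an assumed reading of the diagram.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    name: str
    properties: dict = field(default_factory=dict)

@dataclass
class Edge:
    label: str
    source: Vertex  # the edge leaves this vertex
    target: Vertex  # and enters this one
    properties: dict = field(default_factory=dict)

andrea = Vertex("Andrea", {"name": "Andrea"})
bmw = Vertex("BMW", {"model": "X5", "doors": 5})

# Two edges between the same pair of vertices.
edges = [
    Edge("drives", andrea, bmw, {"license": "A123"}),
    Edge("owns", andrea, bmw),
]

def labels_between(source, target):
    """Labels of all edges going from source to target (direction matters)."""
    return [e.label for e in edges if e.source is source and e.target is target]

print(labels_between(andrea, bmw))  # ['drives', 'owns']
print(labels_between(bmw, andrea))  # []
```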
29. BUILD SMART RELATIONSHIPS
(diagram labels: Andrea, John, BMW, Ferrari, Customers, Cars, Luxury Cars, Root vertices)
30. BUILD SMART RELATIONSHIPS
• root vertices can be meta graphs
• meta graphs add information to make traversal easier and faster
31. EXAMPLE
a Car can be enriched with information regarding
• date of purchase
• country of manufacture
www.flickr.com/photos/aigle_dore/5952275132
32. BUILD SMART RELATIONSHIPS
(diagram: Purchase, then Year 2016, then Month Jan 2016 and Month Feb 2016, then Day 01/15/2016 and Day 02/01/2016; the cars BMW, Ferrari and Maserati are attached to this hierarchy)
33. BUILD SMART RELATIONSHIPS
(diagram: Made, then Europe, then Italy and Germany; the cars BMW, Ferrari and Maserati are attached to this hierarchy)
34. BUILD SMART RELATIONSHIPS
(diagram: the Purchase and Made hierarchies from the two previous slides, combined around the cars BMW, Ferrari and Maserati)
35. BUILD SMART RELATIONSHIPS
get all the Italian cars sold on 01/15/2016
36. BUILD SMART RELATIONSHIPS
let’s start from Made
38. BUILD SMART RELATIONSHIPS
found the cars made in Italy
now filter by date using incoming edges
40. BUILD SMART RELATIONSHIPS
let’s try from Purchase
42. BUILD SMART RELATIONSHIPS
found the cars purchased on 01/15/2016
now filter by country using incoming edges
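The two ways of answering "get all the Italian cars sold on 01/15/2016" can be sketched with a plain adjacency dictionary. The vertex names come from the slides, but which car is attached to which country and which day is an assumption made for this example, since the transcript does not preserve the edges. The function names are hypothetical.

```python
# Outgoing edges of the meta graph (parent -> children).
# ASSUMPTION: the car-to-country and car-to-day links below are illustrative.
OUT = {
    "Made": ["Europe"],
    "Europe": ["Italy", "Germany"],
    "Italy": ["Ferrari", "Maserati"],
    "Germany": ["BMW"],
    "Purchase": ["2016"],
    "2016": ["Jan 2016", "Feb 2016"],
    "Jan 2016": ["01/15/2016"],
    "Feb 2016": ["02/01/2016"],
    "01/15/2016": ["BMW", "Ferrari"],
    "02/01/2016": ["Maserati"],
}

# Incoming edges, derived from OUT (child -> parents).
IN = {}
for parent, children in OUT.items():
    for child in children:
        IN.setdefault(child, []).append(parent)

def start_from_made(country, day):
    """Take the cars under a country, keep those with an incoming edge from the day."""
    return [car for car in OUT[country] if day in IN[car]]

def start_from_purchase(day, country):
    """Take the cars under a day, keep those with an incoming edge from the country."""
    return [car for car in OUT[day] if country in IN[car]]

print(start_from_made("Italy", "01/15/2016"))      # ['Ferrari']
print(start_from_purchase("01/15/2016", "Italy"))  # ['Ferrari']
```

Both starting points reach the same answer; which one is cheaper depends on how many cars hang off the chosen country versus the chosen day.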
45. ORIENTDB
• NoSQL database
• multi-model
• high performance (can write 400,000 records/sec*)
• HTTP REST and JSON API
• ACID
*On Intel i7 8 core CPU, 16 GB RAM, SSD RPM, Multi-threads, no indexes (orientdb.com)
47. INSTALLATION
orientdb.com/docs/2.1/Tutorial-Installation.html
$ docker run -d -v … orientdb/orientdb
$ brew install orientdb
48. LOGICAL CONCEPTS
• class
• type of data model
• cluster
• stores groups of records within a class
(diagram: class Car with two clusters, USA_car and Italy_car)
49. VERTICES
• record identifier (RID)
• each record has its own self-assigned unique ID
• composed of 2 parts
#<cluster-id>:<cluster-position>
• list of properties
• edge’s RID
• in
• out
50. EDGES
• record identifier (RID)
• each record has its own self-assigned unique ID
• composed of 2 parts
#<cluster-id>:<cluster-position>
• in
• RID of the ingoing vertex
• out
• RID of the outgoing vertex
51. RELATIONSHIPS
• does not use JOINs as an RDBMS does
• physical links O(1)
• relationship managed by storing the edge’s RID in both vertices as “out” and “in”
• for 1-to-n relationships, collections of RIDs are used
Andrea  #13:35   name: Andrea, out: [#14:54]
drives  #14:54   out: [#13:35], in: [#15:100], license: A123
BMW     #15:100  model: X5, in: [#14:54]
52. TRAVERSE A RELATIONSHIP
(diagram: Andrea #13:35 with out: [#14:54]; edge drives #14:54 with out: [#13:35] and in: [#15:100]; BMW #15:100 with in: [#14:54])
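Following the links in the diagram can be sketched as two direct lookups by RID. This is an illustrative model only, not OrientDB's storage format: a Python dict keyed by RID stands in for the physical record position, and the "label" key and the out_vertices helper are hypothetical.

```python
# Records keyed by RID, mirroring the values shown on the slides.
RECORDS = {
    "#13:35": {"name": "Andrea", "out": ["#14:54"]},
    "#14:54": {"label": "drives", "out": ["#13:35"], "in": ["#15:100"], "license": "A123"},
    "#15:100": {"model": "X5", "in": ["#14:54"]},
}

def out_vertices(rid, label):
    """RIDs of the vertices reached from `rid` through outgoing `label` edges."""
    found = []
    for edge_rid in RECORDS[rid].get("out", []):  # vertex -> edge: direct lookup
        edge = RECORDS[edge_rid]
        if edge["label"] == label:
            found.extend(edge["in"])              # edge -> target vertex: direct lookup
    return found

targets = out_vertices("#13:35", "drives")
print(targets)                                    # ['#15:100']
print([RECORDS[rid]["model"] for rid in targets]) # ['X5']
```

No table is searched at any step: each hop costs one lookup regardless of how many records exist.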
54. CREATE A CLASS
CREATE CLASS Car EXTENDS V
CREATE CLASS drives EXTENDS E
(diagram: Car extends V, drives extends E)
55. ADD PROPERTIES TO A CLASS
• creating a property involves defining its name and its type
• this is mandatory in order to define indexes or constraints
CREATE PROPERTY Car.model String
(diagram: Car with property model: String)
56. ADD CONSTRAINTS TO A PROPERTY
• alter the defined property to add the constraint
ALTER PROPERTY Car.model MANDATORY TRUE
(diagram: Car with property model: String)
57. QUERYING
SELECT FROM Car WHERE model='X5'
SELECT FROM #15:6
(diagram: Car record with rid: #15:6 and model: X5)
58. QUERYING
SELECT FROM [#15:6, #15:7]
(diagram: two Car records, rid: #15:6 with model: X5 and rid: #15:7 with model: Z4)
59. QUERYING
SELECT name, OUT("drives").model AS DrivesCar
FROM #17:0
name    DrivesCar
Andrea  ["X5", "Z4"]
60. QUERYING
SELECT name, OUT("drives").model AS DrivesCar
FROM #17:0
UNWIND DrivesCar
name    DrivesCar
Andrea  X5
Andrea  Z4
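The effect of UNWIND on the result above can be mimicked in a few lines of Python. This is only a sketch of the idea shown on the slide (one output row per element of the collection), not OrientDB's implementation; the unwind function is hypothetical.

```python
def unwind(rows, field):
    """Yield one copy of each row per element of the list stored in `field`."""
    for row in rows:
        value = row[field]
        if isinstance(value, list):
            for item in value:
                yield {**row, field: item}
        else:
            yield row  # nothing to flatten

result = [{"name": "Andrea", "DrivesCar": ["X5", "Z4"]}]
for row in unwind(result, "DrivesCar"):
    print(row["name"], row["DrivesCar"])
# Andrea X5
# Andrea Z4
```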
61. QUERYING
TRAVERSE * FROM #17:0 MAXDEPTH 4
(diagram: Andrea connected by two drives edges, one to BMW and one to Maserati)
62. DEPTH FIRST SEARCH
TRAVERSE * FROM #17:0 STRATEGY DEPTH_FIRST
(diagram: a tree whose 12 nodes are numbered 1 to 12 in depth-first visit order)
63. BREADTH FIRST SEARCH
TRAVERSE * FROM #17:0 STRATEGY BREADTH_FIRST
(diagram: a tree whose 12 nodes are numbered 1 to 12 in breadth-first visit order)
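The difference between the two strategies can be sketched on a small tree. The tree shape below is an assumed reading of the slide figures, with nodes named after their depth-first visit number; the traverse function is a generic illustration, not OrientDB's TRAVERSE implementation.

```python
from collections import deque

# ASSUMPTION: tree shape reconstructed from the slide figures.
# Each key maps a node to its children, listed left to right.
TREE = {1: [2, 7, 8], 2: [3, 6], 3: [4, 5], 8: [9, 12], 9: [10, 11]}

def traverse(root, strategy="DEPTH_FIRST"):
    """Return the visit order. Safe without a visited set only because TREE has no cycles."""
    pending = deque([root])
    order = []
    while pending:
        if strategy == "DEPTH_FIRST":
            node = pending.pop()                      # stack: newest first
            children = reversed(TREE.get(node, []))   # so the leftmost child is popped next
        else:
            node = pending.popleft()                  # queue: oldest first
            children = TREE.get(node, [])
        order.append(node)
        pending.extend(children)
    return order

print(traverse(1, "DEPTH_FIRST"))    # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(traverse(1, "BREADTH_FIRST"))  # [1, 2, 7, 8, 3, 6, 9, 12, 4, 5, 10, 11]
```

Depth-first follows one branch to its end before backtracking, while breadth-first visits every node at one level before moving to the next. On a graph with cycles, a set of already visited nodes would be needed as well.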
64. WHEN
• store inter-connected data
• query data through relationships of arbitrary length
• continuously evolving data set
• make it easy to evolve the database