An Overview of Data Management Paradigms:
     Relational, Document, and Graph



                 Marko A. Rodriguez
          T-5, Center for Nonlinear Studies
          Los Alamos National Laboratory
             http://markorodriguez.com

                 February 15, 2010
Relational, Document, and Graph Database Data Models
       Relational Database        Document Database        Graph Database

       MySQL                      MongoDB                  Neo4j
       PostgreSQL                 CouchDB                  AllegroGraph
       Oracle                                              HyperGraphDB

Database models are optimized for solving particular types of problems. This is why different database
models exist — there are many types of problems in the world.




                                         Data Management Workshop – Albuquerque, New Mexico – February 15, 2010
Finding the Right Solution to your Problem
1. Come to terms with your problem.
   • “I have metadata for a massive number of objects and I don’t know
     how to get at my data.”

2. Identify the solution to your problem.
   • “I need to be able to find objects based on their metadata.”

3. Identify the type of database that is optimized for that type of solution.
   • “A document database scales, stores metadata, and can be queried.”

4. Identify the database of that type that best meets your particular needs.
   • “CouchDB has a REST web interface and all my developers are
     good with REST.”


Relational Databases

• Relational databases have been the de facto data management solution
  for many years.




MySQL is available at http://www.mysql.com
PostgreSQL is available at http://www.postgresql.org
Oracle is available at http://www.oracle.com



Relational Databases: The Relational Structure

• Relational databases require a schema before data can be inserted.

• Relational databases organize data according to relations, or tables.

                             columns (attributes/properties)
                                         j
rows (tuples/objects)




                        i               x

                                                                  Object i has the value x for property j.

Relational Databases: Creating a Table

• Relational databases organize data according to relations, or tables.

• Relational databases require a schema before data can be inserted.

• Let's create a table for Grateful Dead songs.

mysql> CREATE TABLE songs (
  name VARCHAR(255) PRIMARY KEY,
  performances INT,
  song_type VARCHAR(20));
Query OK, 0 rows affected (0.40 sec)




Relational Databases: Viewing the Table Schema

• It's possible to look at the defined structure (schema) of your newly
  created table.

mysql> DESCRIBE songs;
+--------------+--------------+------+-----+---------+-------+
| Field        | Type         | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+-------+
| name         | varchar(255) | NO   | PRI | NULL    |       |
| performances | int(11)      | YES  |     | NULL    |       |
| song_type    | varchar(20)  | YES  |     | NULL    |       |
+--------------+--------------+------+-----+---------+-------+
3 rows in set (0.00 sec)



Relational Databases: Inserting Rows into a Table

• Let's insert song names, the number of times they were played in
  concert, and whether they were an original or a cover.

mysql> INSERT INTO songs VALUES ("DARK STAR", 219, "original");
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO songs VALUES (
   "FRIEND OF THE DEVIL", 304, "original");
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO songs VALUES (
   "MONKEY AND THE ENGINEER", 32, "cover");
Query OK, 1 row affected (0.00 sec)


Relational Databases: Searching a Table

• Let's look at the entire songs table.

mysql> SELECT * FROM songs;
+-------------------------+--------------+-----------+
| name                    | performances | song_type |
+-------------------------+--------------+-----------+
| DARK STAR               |          219 | original  |
| FRIEND OF THE DEVIL     |          304 | original  |
| MONKEY AND THE ENGINEER |           32 | cover     |
+-------------------------+--------------+-----------+
3 rows in set (0.00 sec)




Relational Databases: Searching a Table

• Let's look at all original songs.

mysql> SELECT * FROM songs WHERE song_type="original";
+---------------------+--------------+-----------+
| name                | performances | song_type |
+---------------------+--------------+-----------+
| DARK STAR           |          219 | original  |
| FRIEND OF THE DEVIL |          304 | original  |
+---------------------+--------------+-----------+
2 rows in set (0.00 sec)




Relational Databases: Searching a Table

• Let's look at only the names of the original songs.

mysql> SELECT name FROM songs WHERE song_type="original";
+---------------------+
| name                |
+---------------------+
| DARK STAR           |
| FRIEND OF THE DEVIL |
+---------------------+
2 rows in set (0.00 sec)
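
The create, insert, and select steps above can be reproduced without a MySQL server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made for portability, not what the slides ran). The table and rows are those shown above; SQLite accepts the same column types but enforces them only loosely.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server in the slides.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE songs ("
    " name VARCHAR(255) PRIMARY KEY,"
    " performances INT,"
    " song_type VARCHAR(20))"
)

# Parameterized inserts of the three rows shown above.
rows = [
    ("DARK STAR", 219, "original"),
    ("FRIEND OF THE DEVIL", 304, "original"),
    ("MONKEY AND THE ENGINEER", 32, "cover"),
]
conn.executemany("INSERT INTO songs VALUES (?, ?, ?)", rows)

# Names of the original songs, ordered so the result is deterministic.
original_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM songs WHERE song_type = ? ORDER BY name", ("original",)
    )
]
print(original_names)
```

Passing values as parameters (the `?` placeholders) rather than pasting them into the SQL string is the usual way to avoid quoting problems.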




Document Databases
• Document databases store structured documents. Usually these
  documents are organized according to a standard (e.g. JavaScript Object
  Notation (JSON), XML, etc.).

• Document databases tend to be schema-less. That is, they do not
  require the database engineer to specify a priori the structure of the data
  to be held in the database.




  MongoDB is available at http://mongodb.org and CouchDB is available at http://couchdb.org



Document Databases: JavaScript Object Notation

• A JSON document is a collection of key/value pairs, where a value can
  be yet another collection of key/value pairs.
     string: a string value (e.g. "marko", "rodriguez").
     number: a numeric value (e.g. 1234, 67.012).
     boolean: a true/false value (e.g. true, false).
     null: a nonexistent value.
     array: an array of values (e.g. [1, "marko", true]).
     object: a key/value map (e.g. { "key" : 123 }).

The JSON specification is very simple and can be found at http://www.json.org/.




Document Databases: JavaScript Object Notation
{
    _id : "D0DC29E9-51AE-4A8C-8769-541501246737",
    name : "Marko A. Rodriguez",
    homepage : "http://markorodriguez.com",
    age : 30,
    location : {
        country : "United States",
        state : "New Mexico",
        city : "Santa Fe",
        zipcode : 87501
    },
    interests : ["graphs", "hockey", "motorcycles"]
}



Document Databases: Handling JSON Documents

• Use object-oriented “dot notation” to access components.

> marko = eval({_id : "D0DC29E9...", name : "Marko..."})
> marko._id
D0DC29E9-51AE-4A8C-8769-541501246737
> marko.location.city
Santa Fe
> marko.interests[0]
graphs

All document database examples presented are using MongoDB [http://mongodb.org].
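
Outside the MongoDB shell, the same document can be handled with any JSON library. The sketch below uses Python's standard json module (a choice made here for illustration, not something the slides use). Strict JSON requires quoted keys, so the document is rewritten accordingly; nested access then goes through dictionary keys and list indexes rather than dot notation.

```python
import json

# The document from the earlier slide, written as strict JSON (keys quoted).
text = """
{
  "_id": "D0DC29E9-51AE-4A8C-8769-541501246737",
  "name": "Marko A. Rodriguez",
  "homepage": "http://markorodriguez.com",
  "age": 30,
  "location": {
    "country": "United States",
    "state": "New Mexico",
    "city": "Santa Fe",
    "zipcode": 87501
  },
  "interests": ["graphs", "hockey", "motorcycles"]
}
"""

marko = json.loads(text)

# JSON objects become dicts and arrays become lists, so nested access
# uses keys and indexes instead of dot notation.
city = marko["location"]["city"]
first_interest = marko["interests"][0]
print(marko["_id"], city, first_interest)
```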




Document Databases: Inserting JSON Documents
• Let's insert a Grateful Dead document into the database.




> db.songs.insert({
  _id : "91",
  properties : {
    name : "TERRAPIN STATION",
    song_type : "original",
    performances : 302
  }
})

Document Databases: Finding JSON Documents

• Searching is based on creating a “subset” document and pattern matching
  it against the documents in the database.

• Find all songs where properties.name equals TERRAPIN STATION.

> db.songs.find({"properties.name" : "TERRAPIN STATION"})
{ "_id" : "91", "properties" :
  { "name" : "TERRAPIN STATION", "song_type" : "original",
    "performances" : 302 }}
>




Document Databases: Finding JSON Documents

• You can also do comparison-type operations.

• Find all songs where properties.performances is greater than 200.

> db.songs.find({"properties.performances" : { $gt : 200 }})
{ "_id" : "104", "properties" :
  { "name" : "FRIEND OF THE DEVIL", "song_type" : "original",
    "performances" : 304}}
{ "_id" : "122", "properties" :
  { "name" : "CASEY JONES", "song_type" :
    "original", "performances" : 312}}
has more
>
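
To make the idea of matching a “subset” document concrete, here is a minimal Python sketch of how such a matcher could work. The helpers `get_path` and `matches` are hypothetical names introduced for illustration; they are not MongoDB's implementation and support only equality and `$gt`. The three documents are the ones shown in the find() results above.

```python
def get_path(doc, dotted_key):
    """Follow a dotted key such as 'properties.name' into nested dicts."""
    current = doc
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(doc, query):
    """Return True if doc satisfies every condition in the query document."""
    for dotted_key, condition in query.items():
        value = get_path(doc, dotted_key)
        if isinstance(condition, dict) and "$gt" in condition:
            # Comparison operator: the stored value must exceed the bound.
            if value is None or not value > condition["$gt"]:
                return False
        elif value != condition:
            # Plain value: the stored value must be equal.
            return False
    return True


songs = [
    {"_id": "91", "properties": {"name": "TERRAPIN STATION",
                                 "song_type": "original", "performances": 302}},
    {"_id": "104", "properties": {"name": "FRIEND OF THE DEVIL",
                                  "song_type": "original", "performances": 304}},
    {"_id": "122", "properties": {"name": "CASEY JONES",
                                  "song_type": "original", "performances": 312}},
]

by_name = [s["_id"] for s in songs
           if matches(s, {"properties.name": "TERRAPIN STATION"})]
over_303 = [s["_id"] for s in songs
            if matches(s, {"properties.performances": {"$gt": 303}})]
print(by_name, over_303)
```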

Document Databases: Processing JSON Documents
• Sharding is the process of distributing a database’s data across multiple
  machines. Each partition of the data is known as a shard.

• Document databases shard easily because there are no explicit references
  between documents.

                                  client application


                             communication service



                { _id : }   { _id : }            { _id : }        { _id : }
                { _id : }   { _id : }            { _id : }        { _id : }
                { _id : }   { _id : }            { _id : }        { _id : }
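
One simple way to picture the routing in the diagram is hash-based sharding on `_id`. The Python sketch below is an illustration under stated assumptions: the shard count, the `shard_for`, `insert`, and `find_by_id` helpers, and the placeholder documents are all hypothetical, and real document databases may partition differently (for example by key ranges).

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Map a document _id to a shard index using a stable hash."""
    digest = hashlib.sha256(str(doc_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Each shard is modeled as an independent in-memory dictionary.
shards = [dict() for _ in range(NUM_SHARDS)]


def insert(doc):
    """Store the document on the shard chosen by its _id."""
    shards[shard_for(doc["_id"])][doc["_id"]] = doc


def find_by_id(doc_id):
    """Only the owning shard needs to be consulted for an _id lookup."""
    return shards[shard_for(doc_id)].get(doc_id)


# Placeholder documents; the names are made up for the example.
for i in range(12):
    insert({"_id": str(i), "properties": {"name": "song %d" % i}})

print([len(s) for s in shards])
print(find_by_id("7"))
```

Because no document refers to another, a lookup by `_id` touches exactly one shard, which is the property the bullet above relies on.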




Document Databases: Processing JSON Documents

• Most document databases come with a Map/Reduce feature to allow for
  the parallel processing of all documents in the database.
     Map function: apply a function to every document in the database.
     Reduce function: apply a function to the grouped results of the map.

                             $M : D \rightarrow (K, V)$,
where $D$ is the space of documents, $K$ is the space of keys, and $V$ is the
space of values.
                         $R : (K, V^n) \rightarrow (K, V)$,
where $V^n$ is the space of all possible combinations of values.




Document Databases: Processing JSON Documents
• Create a distribution of the Grateful Dead original song performances.

> map = function(){
  if(this.properties.song_type == "original")
    emit(this.properties.performances, 1);
};

> reduce = function(key, values) {
  var sum = 0;
  for(var i in values) {
    sum = sum + values[i];
  }
  return sum;
};

Document Databases: Processing JSON Documents
> results = db.songs.mapReduce(map, reduce)
{
  "result" : "tmp.mr.mapreduce_1266016122_8",
  "timeMillis" : 72,
  "counts" : {
     "input" : 809,
     "emit" : 184,
     "output" : 119
  },
  "ok" : 1,
}




{ _id : 122,                      { _id : 100,                       { _id : 91,
  properties : {                    properties : {                     properties : {
    name : "CASEY ..."                name : "PLAYIN..."                 name : "TERRAP..."
    performances : 312                performances : 312                 performances : 302
  }}                                }}                                 }}


                         map = function(){
                           if(this.properties.song_type == "original")
                             emit(this.properties.performances, 1);
                         };

                                         key     value
                                           312 : 1
                                           312 : 1
                                           302 : 1
                                             ...
                                         key       values
                                         312 : [1,1]
                                         302 : [1]
                                             ...


                             reduce = function(key, values) {
                               var sum = 0;
                               for(var i in values) {
                                 sum = sum + values[i];
                               }
                               return sum;
                             };


                                         {
                                             312 : 2
                                             302 : 1
                                             ...
                                         }
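
The flow in the diagram above (map, group by key, reduce) can be traced in plain Python. This is a single-process sketch for illustration, not how MongoDB executes mapReduce. The diagram elides `song_type`, so it is assumed to be "original" for all three documents; `map_fn` and `reduce_fn` are names chosen here.

```python
from collections import defaultdict

# The three documents from the diagram. song_type is elided there; it is
# assumed to be "original" for all three so that each document is emitted.
songs = [
    {"_id": 122, "properties": {"name": "CASEY ...",
                                "song_type": "original", "performances": 312}},
    {"_id": 100, "properties": {"name": "PLAYIN...",
                                "song_type": "original", "performances": 312}},
    {"_id": 91, "properties": {"name": "TERRAP...",
                               "song_type": "original", "performances": 302}},
]


def map_fn(doc):
    """Yield (performances, 1) for each original song, mirroring emit()."""
    props = doc["properties"]
    if props["song_type"] == "original":
        yield props["performances"], 1


def reduce_fn(key, values):
    """Sum the values that were grouped under one key."""
    return sum(values)


# Grouping step: collect emitted values under their key.
grouped = defaultdict(list)
for doc in songs:
    for key, value in map_fn(doc):
        grouped[key].append(value)

# Reduce step: one output value per key.
distribution = {key: reduce_fn(key, values) for key, values in grouped.items()}
print(distribution)
```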




Document Databases: Processing JSON Documents
> db[results.result].find()
{ "_id" : 0, "value" : 11 }
{ "_id" : 1, "value" : 14 }
{ "_id" : 2, "value" : 5 }
{ "_id" : 3, "value" : 8 }
{ "_id" : 4, "value" : 3 }
{ "_id" : 5, "value" : 4 }
...
{ "_id" : 554, "value" : 1 }
{ "_id" : 582, "value" : 1 }
{ "_id" : 583, "value" : 1 }
{ "_id" : 594, "value" : 1 }
{ "_id" : 1386, "value" : 1 }



Graph Databases
• Graph databases store objects (vertices) and their relationships to one
  another (edges). Usually these relationships are typed/labeled and
  directed.

• Graph databases tend to be optimized for graph-based traversal
  algorithms.




Neo4j is available at http://neo4j.org
AllegroGraph is available at http://www.franz.com/agraph/allegrograph
HyperGraphDB is available at http://www.kobrix.com/hgdb.jsp


Graph Databases: Property Graph Model
Vertices:
  1: name = "marko", age = 29
  2: name = "vadas", age = 27
  3: name = "lop", lang = "java"
  4: name = "josh", age = 32
  5: name = "ripple", lang = "java"
  6: name = "peter", age = 35

Edges:
  7: 1 -knows-> 2, weight = 0.5
  8: 1 -knows-> 4, weight = 1.0
  9: 1 -created-> 3, weight = 0.4
  10: 4 -created-> 5, weight = 1.0
  11: 4 -created-> 3, weight = 0.4
  12: 6 -created-> 3, weight = 0.2




Graph data models vary. This section will use the data model popularized by Neo4j.


Graph Databases: Handling Property Graphs
• Gremlin is a graph-based programming language that can be used to
  interact with graph databases.

• However, graph databases also come with their own APIs.




                                           Gremlin                G = (V, E)

Gremlin is available at http://gremlin.tinkerpop.com.
All the examples in this section are using Gremlin and Neo4j.


Graph Databases: Moving Around a Graph in Gremlin
gremlin> $_ := g:key('name','marko')
==>v[1]
gremlin> ./outE
==>e[7][1-knows->2]
==>e[9][1-created->3]
==>e[8][1-knows->4]
gremlin> ./outE/inV
==>v[2]
==>v[3]
==>v[4]
gremlin> ./outE/inV/@name
==>vadas
==>lop
==>josh
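
For readers unfamiliar with Gremlin's path syntax, the same three steps can be mimicked over plain Python dictionaries. This is a minimal sketch rather than a graph database: `out_edges` and `in_vertex` are hypothetical helpers standing in for the `outE` and `inV` steps, and the vertex and edge ids are read from the Property Graph Model figure.

```python
# Vertices of the example property graph (ids as in the figure).
vertices = {
    1: {"name": "marko", "age": 29},
    2: {"name": "vadas", "age": 27},
    3: {"name": "lop", "lang": "java"},
    4: {"name": "josh", "age": 32},
    5: {"name": "ripple", "lang": "java"},
    6: {"name": "peter", "age": 35},
}
# edge id -> (out vertex, label, in vertex)
edges = {
    7: (1, "knows", 2),
    8: (1, "knows", 4),
    9: (1, "created", 3),
    10: (4, "created", 5),
    11: (4, "created", 3),
    12: (6, "created", 3),
}


def out_edges(vertex_id):
    """Edge ids whose tail is the given vertex (the outE step)."""
    return [eid for eid, (out_v, _, _) in edges.items() if out_v == vertex_id]


def in_vertex(edge_id):
    """The vertex an edge points to (the inV step)."""
    return edges[edge_id][2]


# Equivalent of g:key('name','marko'): look the start vertex up by property.
start = next(vid for vid, props in vertices.items() if props["name"] == "marko")

# Equivalent of ./outE/inV/@name; result order follows edge id here and may
# differ from the order Gremlin prints.
names = [vertices[in_vertex(eid)]["name"] for eid in out_edges(start)]
print(names)
```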


Graph Databases: Inserting Vertices and Edges

• Let's create a Grateful Dead graph.

gremlin> $_g := neo4j:open('/tmp/grateful-dead')
==>neo4jgraph[/tmp/grateful-dead]
gremlin> $v := g:add-v(g:map('name','TERRAPIN STATION'))
==>v[0]
gremlin> $u := g:add-v(g:map('name','TRUCKIN'))
==>v[1]
gremlin> $e := g:add-e(g:map('weight',1),$v,'followed_by',$u)
==>e[2][0-followed_by->1]

You can also batch load graph data stored in the GraphML format
[http://graphml.graphdrawing.org/]: g:load('data/grateful-dead.xml')



Graph Databases: Inserting Vertices and Edges
• When all the data is in, you have a directed, weighted graph of the
  concert behavior of the Grateful Dead. A song is followed by another
  song if the second song was played next in concert. The weight of the
  edge denotes the number of times this happened in concert over the 30
  years that the Grateful Dead performed.




Graph Databases: Finding Vertices

• Find the vertex with the name TERRAPIN STATION.

• Find the names of all the songs that followed TERRAPIN STATION in
  concert more than 3 times.

gremlin> $_ := g:key('name','TERRAPIN STATION')
==>v[0]
gremlin> ./outE[@weight > 3]/inV/@name
==>DRUMS
==>MORNING DEW
==>DONT NEED LOVE
==>ESTIMATED PROPHET
==>PLAYING IN THE BAND

Graph Databases: Processing Graphs
• Most graph algorithms are aimed at traversing a graph in some manner.

• The traverser makes use of vertex and edge properties in order to
  guide its walk through the graph.




Graph Databases: Processing Graphs
• Find all songs related to TERRAPIN STATION according to concert
  behavior.

$e := 1.0
$scores := g:map()
repeat 75
  $_ := (./outE[@label='followed_by']/inV)[g:rand-nat()]
  if $_ != null()
    g:op-value('+',$scores,$_/@name,$e)
    $e := $e * 0.85
  else
    $_ := g:key('name', 'TERRAPIN STATION')
    $e := 1.0
  end
end
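
The loop above is a random walk with decaying energy and restarts. A rough Python rendering of the same idea follows; it is a sketch, not a translation. The `followed_by` adjacency lists are a tiny hypothetical graph, the walk is assumed to pick uniformly among outgoing neighbors and to restart only at dead ends (the exact behavior of indexing with `g:rand-nat()` is not reproduced), and the seed is fixed only to make runs repeatable.

```python
import random
from collections import defaultdict

# Hypothetical followed_by adjacency lists; the real graph is far larger.
followed_by = {
    "TERRAPIN STATION": ["PLAYING IN THE BAND", "DRUMS"],
    "PLAYING IN THE BAND": ["DRUMS"],
    "DRUMS": [],  # dead end, forces a restart
}


def related_songs(start, steps=75, decay=0.85, seed=0):
    """Score songs by how often a decaying random walk from start visits them."""
    rng = random.Random(seed)
    scores = defaultdict(float)
    current, energy = start, 1.0
    for _ in range(steps):
        neighbors = followed_by[current]
        if neighbors:
            # Step to a random successor and credit it with the current energy.
            current = rng.choice(neighbors)
            scores[current] += energy
            energy *= decay
        else:
            # No outgoing edge: jump back to the start with full energy.
            current, energy = start, 1.0
    return dict(scores)


scores = related_songs("TERRAPIN STATION")
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(score, 3))
```

Songs reached early in a walk receive more energy than songs reached after several hops, so the ranking favors songs that tend to follow the start song closely in concert.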

Graph Databases: Processing Graphs
gremlin> g:sort($scores,'value',true())
==>PLAYING IN THE BAND=1.9949905250390623
==>THE MUSIC NEVER STOPPED=0.85
==>MEXICALI BLUES=0.5220420095726453
==>DARK STAR=0.3645706137191774
==>SAINT OF CIRCUMSTANCE=0.20585176856988666
==>ALTHEA=0.16745479118927242
==>ITS ALL OVER NOW=0.14224175713617204
==>ESTIMATED PROPHET=0.12657286655816163
...




Conclusions
• Relational Databases
    Stable, solid technology that has been used in production for decades.
    Good for storing inter-linked tables of data and querying within and across tables.
    They do not scale horizontally due to the interconnectivity of table keys and the
    cost of joins.
• Document Databases
    For JSON documents, there exists a one-to-one mapping from document to
    programming object.
    They scale horizontally and allow for parallel processing because data is
    sharded at the document level.
    Performing complicated queries requires relatively sophisticated programming skills.
• Graph Databases
    Optimized for graph traversal algorithms and local neighborhood searches.
    Low impedance mismatch between a graph in a database and a graph of objects in
    object-oriented programming.
    They do not scale well horizontally due to interconnectivity of vertices.


Fin.
Thank you for your time...

• My homepage: http://markorodriguez.com

• TinkerPop: http://tinkerpop.com




Acknowledgements: Peter Neubauer (Neo Technology) for comments and review.


                                      Data Management Workshop – Albuquerque, New Mexico – February 15, 2010

Weitere ähnliche Inhalte

Was ist angesagt?

Improvement in the turn around time
Improvement in the turn around timeImprovement in the turn around time
Improvement in the turn around timeDr Jasbeer Singh
 
SEO Başarı Hikayesi - Hangikredi.com 12 Mart'tan 24 Eylül Google Core Güncell...
SEO Başarı Hikayesi - Hangikredi.com 12 Mart'tan 24 Eylül Google Core Güncell...SEO Başarı Hikayesi - Hangikredi.com 12 Mart'tan 24 Eylül Google Core Güncell...
SEO Başarı Hikayesi - Hangikredi.com 12 Mart'tan 24 Eylül Google Core Güncell...Koray Tugberk GUBUR
 
Oracle to Azure PostgreSQL database migration webinar
Oracle to Azure PostgreSQL database migration webinarOracle to Azure PostgreSQL database migration webinar
Oracle to Azure PostgreSQL database migration webinarMinnie Seungmin Cho
 
PROCEDURES FOR VALIDATION.docx
PROCEDURES FOR VALIDATION.docxPROCEDURES FOR VALIDATION.docx
PROCEDURES FOR VALIDATION.docxLAURENCERAMIREZ3
 
What do clinicians need to know about lab tests?
What do clinicians need to know about lab tests?What do clinicians need to know about lab tests?
What do clinicians need to know about lab tests?Ola Elgaddar
 
Controlling clinical laboratory errors
Controlling clinical laboratory errorsControlling clinical laboratory errors
Controlling clinical laboratory errorsDr. Rajesh Bendre
 
Achieving Optimal Laboratory Blood Testing TAT
Achieving Optimal Laboratory Blood Testing TATAchieving Optimal Laboratory Blood Testing TAT
Achieving Optimal Laboratory Blood Testing TATDavid Wranovics
 
The ABCs of Clinical Trial Management Systems
The ABCs of Clinical Trial Management SystemsThe ABCs of Clinical Trial Management Systems
The ABCs of Clinical Trial Management SystemsPerficient, Inc.
 
How to write Standard Operating Procedures (SOPs) for clinical laboratories -...
How to write Standard Operating Procedures (SOPs) for clinical laboratories -...How to write Standard Operating Procedures (SOPs) for clinical laboratories -...
How to write Standard Operating Procedures (SOPs) for clinical laboratories -...Tamer Soliman
 
Dr. negi quality assurance
Dr. negi quality assuranceDr. negi quality assurance
Dr. negi quality assurancesanjay negi
 
Reporting and interpretation of laboratory results
Reporting and interpretation of laboratory resultsReporting and interpretation of laboratory results
Reporting and interpretation of laboratory resultsJoyce Mwatonoka
 
20180805-Hematology-Calibration-1.pptx
20180805-Hematology-Calibration-1.pptx20180805-Hematology-Calibration-1.pptx
20180805-Hematology-Calibration-1.pptxPUPUTPUJIANTI
 
GCP Guidelines: By RxVichuZ!! :)
GCP Guidelines: By RxVichuZ!! :)GCP Guidelines: By RxVichuZ!! :)
GCP Guidelines: By RxVichuZ!! :)RxVichuZ
 
Big data in pharmaceutical industry
Big data in pharmaceutical industryBig data in pharmaceutical industry
Big data in pharmaceutical industryKevin Lee
 
ICH GCP by Yogesh Yadav.pptx
ICH GCP by Yogesh Yadav.pptxICH GCP by Yogesh Yadav.pptx
ICH GCP by Yogesh Yadav.pptxyogesh532361
 

Was ist angesagt? (17)

Quality control
Quality controlQuality control
Quality control
 
Improvement in the turn around time
Improvement in the turn around timeImprovement in the turn around time
Improvement in the turn around time
 
SEO Başarı Hikayesi - Hangikredi.com 12 Mart'tan 24 Eylül Google Core Güncell...
SEO Başarı Hikayesi - Hangikredi.com 12 Mart'tan 24 Eylül Google Core Güncell...SEO Başarı Hikayesi - Hangikredi.com 12 Mart'tan 24 Eylül Google Core Güncell...
SEO Başarı Hikayesi - Hangikredi.com 12 Mart'tan 24 Eylül Google Core Güncell...
 
Oracle to Azure PostgreSQL database migration webinar
Oracle to Azure PostgreSQL database migration webinarOracle to Azure PostgreSQL database migration webinar
Oracle to Azure PostgreSQL database migration webinar
 
PROCEDURES FOR VALIDATION.docx
PROCEDURES FOR VALIDATION.docxPROCEDURES FOR VALIDATION.docx
PROCEDURES FOR VALIDATION.docx
 
What do clinicians need to know about lab tests?
What do clinicians need to know about lab tests?What do clinicians need to know about lab tests?
What do clinicians need to know about lab tests?
 
Controlling clinical laboratory errors
Controlling clinical laboratory errorsControlling clinical laboratory errors
Controlling clinical laboratory errors
 
Achieving Optimal Laboratory Blood Testing TAT
Achieving Optimal Laboratory Blood Testing TATAchieving Optimal Laboratory Blood Testing TAT
Achieving Optimal Laboratory Blood Testing TAT
 
The ABCs of Clinical Trial Management Systems
The ABCs of Clinical Trial Management SystemsThe ABCs of Clinical Trial Management Systems
The ABCs of Clinical Trial Management Systems
 
How to write Standard Operating Procedures (SOPs) for clinical laboratories -...
How to write Standard Operating Procedures (SOPs) for clinical laboratories -...How to write Standard Operating Procedures (SOPs) for clinical laboratories -...
How to write Standard Operating Procedures (SOPs) for clinical laboratories -...
 
Dr. negi quality assurance
Dr. negi quality assuranceDr. negi quality assurance
Dr. negi quality assurance
 
Reporting and interpretation of laboratory results
Reporting and interpretation of laboratory resultsReporting and interpretation of laboratory results
Reporting and interpretation of laboratory results
 
20180805-Hematology-Calibration-1.pptx
20180805-Hematology-Calibration-1.pptx20180805-Hematology-Calibration-1.pptx
20180805-Hematology-Calibration-1.pptx
 
GCP Guidelines: By RxVichuZ!! :)
GCP Guidelines: By RxVichuZ!! :)GCP Guidelines: By RxVichuZ!! :)
GCP Guidelines: By RxVichuZ!! :)
 
SPS-Power BI Introduction
SPS-Power BI IntroductionSPS-Power BI Introduction
SPS-Power BI Introduction
 
Big data in pharmaceutical industry
Big data in pharmaceutical industryBig data in pharmaceutical industry
Big data in pharmaceutical industry
 
ICH GCP by Yogesh Yadav.pptx
ICH GCP by Yogesh Yadav.pptxICH GCP by Yogesh Yadav.pptx
ICH GCP by Yogesh Yadav.pptx
 

Ähnlich wie An Overview of Data Management Paradigms: Relational, Document, and Graph

Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...Moving Library Metadata Toward Linked Data:  Opportunities Provided by the eX...
Moving Library Metadata Toward Linked Data: Opportunities Provided by the eX...Jennifer Bowen
 
Hybrid Databases - PHP UK Conference 22 February 2019
Hybrid Databases - PHP UK Conference 22 February 2019Hybrid Databases - PHP UK Conference 22 February 2019
Hybrid Databases - PHP UK Conference 22 February 2019Dave Stokes
 
Mongo Bb - NoSQL tutorial
Mongo Bb - NoSQL tutorialMongo Bb - NoSQL tutorial
Mongo Bb - NoSQL tutorialMohan Rathour
 
Presentation: mongo db & elasticsearch & membase
Presentation: mongo db & elasticsearch & membasePresentation: mongo db & elasticsearch & membase
Presentation: mongo db & elasticsearch & membaseArdak Shalkarbayuli
 
Python Utilities for Managing MySQL Databases
Python Utilities for Managing MySQL DatabasesPython Utilities for Managing MySQL Databases
Python Utilities for Managing MySQL DatabasesMats Kindahl
 
An Overview of Data Management Paradigms: Relational, Document, and Graph

  • 1. An Overview of Data Management Paradigms: Relational, Document, and Graph
    Marko A. Rodriguez
    T-5, Center for Nonlinear Studies
    Los Alamos National Laboratory
    http://markorodriguez.com
    February 15, 2010
  • 2. Relational, Document, and Graph Database Data Models
    Relational Database: MySQL, PostgreSQL, Oracle
    Document Database: MongoDB, CouchDB
    Graph Database: Neo4j, AllegroGraph, HyperGraphDB
    Database models are optimized for solving particular types of problems. This is why different database models exist: there are many types of problems in the world.
    Data Management Workshop – Albuquerque, New Mexico – February 15, 2010
  • 3. Finding the Right Solution to your Problem
    1. Come to terms with your problem.
       • “I have metadata for a massive number of objects and I don’t know how to get at my data.”
    2. Identify the solution to your problem.
       • “I need to be able to find objects based on their metadata.”
    3. Identify the type of database that is optimized for that type of solution.
       • “A document database scales, stores metadata, and can be queried.”
    4. Identify the database of that type that best meets your particular needs.
       • “CouchDB has a REST web interface and all my developers are good with REST.”
  • 4. Relational Databases
    • Relational databases have been the de facto data management solution for many years.
    MySQL is available at http://www.mysql.com
    PostgreSQL is available at http://www.postgresql.org
    Oracle is available at http://www.oracle.com
  • 5. Relational Databases: The Relational Structure
    • Relational databases require a schema before data can be inserted.
    • Relational databases organize data according to relations, or tables.
    [Figure: a table whose columns are attributes (properties) and whose rows are tuples (objects). The cell at row i, column j holds the value x: object i has the value x for property j.]
  • 6. Relational Databases: Creating a Table
    • Relational databases organize data according to relations, or tables.
    • Relational databases require a schema before data can be inserted.
    • Let's create a table for Grateful Dead songs.
      mysql> CREATE TABLE songs (
               name VARCHAR(255) PRIMARY KEY,
               performances INT,
               song_type VARCHAR(20));
      Query OK, 0 rows affected (0.40 sec)
  • 7. Relational Databases: Viewing the Table Schema
    • It's possible to look at the defined structure (schema) of your newly created table.
      mysql> DESCRIBE songs;
      +--------------+--------------+------+-----+---------+-------+
      | Field        | Type         | Null | Key | Default | Extra |
      +--------------+--------------+------+-----+---------+-------+
      | name         | varchar(255) | NO   | PRI | NULL    |       |
      | performances | int(11)      | YES  |     | NULL    |       |
      | song_type    | varchar(20)  | YES  |     | NULL    |       |
      +--------------+--------------+------+-----+---------+-------+
      3 rows in set (0.00 sec)
  • 8. Relational Databases: Inserting Rows into a Table
    • Let's insert song names, the number of times they were played in concert, and whether they were an original or a cover.
      mysql> INSERT INTO songs VALUES ("DARK STAR", 219, "original");
      Query OK, 1 row affected (0.00 sec)
      mysql> INSERT INTO songs VALUES ("FRIEND OF THE DEVIL", 304, "original");
      Query OK, 1 row affected (0.00 sec)
      mysql> INSERT INTO songs VALUES ("MONKEY AND THE ENGINEER", 32, "cover");
      Query OK, 1 row affected (0.00 sec)
  • 9. Relational Databases: Searching a Table
    • Let's look at the entire songs table.
      mysql> SELECT * FROM songs;
      +-------------------------+--------------+-----------+
      | name                    | performances | song_type |
      +-------------------------+--------------+-----------+
      | DARK STAR               |          219 | original  |
      | FRIEND OF THE DEVIL     |          304 | original  |
      | MONKEY AND THE ENGINEER |           32 | cover     |
      +-------------------------+--------------+-----------+
      3 rows in set (0.00 sec)
  • 10. Relational Databases: Searching a Table
    • Let's look at all original songs.
      mysql> SELECT * FROM songs WHERE song_type="original";
      +---------------------+--------------+-----------+
      | name                | performances | song_type |
      +---------------------+--------------+-----------+
      | DARK STAR           |          219 | original  |
      | FRIEND OF THE DEVIL |          304 | original  |
      +---------------------+--------------+-----------+
      2 rows in set (0.00 sec)
  • 11. Relational Databases: Searching a Table
    • Let's look at only the names of the original songs.
      mysql> SELECT name FROM songs WHERE song_type="original";
      +---------------------+
      | name                |
      +---------------------+
      | DARK STAR           |
      | FRIEND OF THE DEVIL |
      +---------------------+
      2 rows in set (0.00 sec)
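The relational walkthrough above (create a table, insert rows, filter with WHERE) can be replayed without a MySQL server. The following is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL; the in-memory database and the ORDER BY clause are additions for reproducibility, not part of the slides.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL server.
conn = sqlite3.connect(":memory:")

# The schema must exist before any row can be inserted.
conn.execute(
    "CREATE TABLE songs ("
    " name VARCHAR(255) PRIMARY KEY,"
    " performances INT,"
    " song_type VARCHAR(20))"
)

rows = [
    ("DARK STAR", 219, "original"),
    ("FRIEND OF THE DEVIL", 304, "original"),
    ("MONKEY AND THE ENGINEER", 32, "cover"),
]
conn.executemany("INSERT INTO songs VALUES (?, ?, ?)", rows)

# Same query as slide 11: only the names of the original songs.
cursor = conn.execute(
    "SELECT name FROM songs WHERE song_type = ? ORDER BY name", ("original",)
)
original_names = [name for (name,) in cursor]
print(original_names)
```

The parameter placeholders (?) are used instead of string concatenation so the values are passed to the database separately from the SQL text.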
  • 12. Document Databases
    • Document databases store structured documents. Usually these documents are organized according to a standard (e.g. JavaScript Object Notation (JSON), XML, etc.).
    • Document databases tend to be schema-less. That is, they do not require the database engineer to specify a priori the structure of the data to be held in the database.
    MongoDB is available at http://mongodb.org and CouchDB is available at http://couchdb.org
  • 13. Document Databases: JavaScript Object Notation
    • A JSON document is a collection of key/value pairs, where a value can be yet another collection of key/value pairs.
      string: a string value (e.g. "marko", "rodriguez").
      number: a numeric value (e.g. 1234, 67.012).
      boolean: a true/false value (e.g. true, false).
      null: a nonexistent value.
      array: an array of values (e.g. [1, "marko", true]).
      object: a key/value map (e.g. { "key" : 123 }).
    The JSON specification is very simple and can be found at http://www.json.org/.
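To make the six JSON value types concrete, here is a small sketch that parses one value of each type with Python's standard json module and records the Python type each one becomes. The key names (s, n, i, b, z, a, o) are placeholders invented for this example.

```python
import json

# One value per JSON type listed on the slide; the key names are hypothetical.
text = (
    '{"s": "marko", "n": 67.012, "i": 1234, "b": true, '
    '"z": null, "a": [1, "marko", true], "o": {"key": 123}}'
)

doc = json.loads(text)

# Map each key to the name of the Python type its value was parsed into.
type_names = {key: type(value).__name__ for key, value in doc.items()}
print(type_names)
```

JSON has a single number type; Python's parser splits it into int and float depending on whether the literal has a fractional part.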
  • 14. Document Databases: JavaScript Object Notation
      {
        _id : "D0DC29E9-51AE-4A8C-8769-541501246737",
        name : "Marko A. Rodriguez",
        homepage : "http://markorodriguez.com",
        age : 30,
        location : {
          country : "United States",
          state : "New Mexico",
          city : "Santa Fe",
          zipcode : 87501
        },
        interests : ["graphs", "hockey", "motorcycles"]
      }
  • 15. Document Databases: Handling JSON Documents
    • Use object-oriented “dot notation” to access components.
      > marko = eval({_id : "D0DC29E9...", name : "Marko...})
      > marko._id
      D0DC29E9-51AE-4A8C-8769-541501246737
      > marko.location.city
      Santa Fe
      > marko.interests[0]
      graphs
    All document database examples presented are using MongoDB [http://mongodb.org].
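The same component access can be sketched outside the MongoDB shell. In Python a parsed JSON document is a plain dict, so the shell's dot notation becomes chained key and index lookups. The document below copies a subset of the fields from slide 14.

```python
# A subset of the document from slide 14, as a Python dict.
marko = {
    "_id": "D0DC29E9-51AE-4A8C-8769-541501246737",
    "name": "Marko A. Rodriguez",
    "location": {
        "country": "United States",
        "state": "New Mexico",
        "city": "Santa Fe",
        "zipcode": 87501,
    },
    "interests": ["graphs", "hockey", "motorcycles"],
}

# marko.location.city in the shell corresponds to chained key lookups here.
city = marko["location"]["city"]

# marko.interests[0] is an index into the nested array.
first_interest = marko["interests"][0]

print(marko["_id"], city, first_interest)
```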
  • 16. Document Databases: Inserting JSON Documents
    • Let's insert a Grateful Dead document into the database.
      > db.songs.insert({
          _id : "91",
          properties : {
            name : "TERRAPIN STATION",
            song_type : "original",
            performances : 302
          }
        })
  • 17. Document Databases: Finding JSON Documents
    • Searching is based on creating a “subset” document and pattern matching it against the documents in the database.
    • Find all songs where properties.name equals TERRAPIN STATION.
      > db.songs.find({"properties.name" : "TERRAPIN STATION"})
      { "_id" : "91", "properties" : { "name" : "TERRAPIN STATION", "song_type" : "original", "performances" : 302 }}
      >
  • 18. Document Databases: Finding JSON Documents
    • You can also do comparison-type operations.
    • Find all songs where properties.performances is greater than 200.
      > db.songs.find({"properties.performances" : { $gt : 200 }})
      { "_id" : "104", "properties" : { "name" : "FRIEND OF THE DEVIL", "song_type" : "original", "performances" : 304}}
      { "_id" : "122", "properties" : { "name" : "CASEY JONES", "song_type" : "original", "performances" : 312}}
      has more
      >
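The “subset document” idea behind find can be illustrated with a toy matcher. This is a sketch only: get_path and matches are hypothetical helpers written for this example, they handle just equality and the $gt operator, and they do not reflect how MongoDB evaluates queries internally.

```python
def get_path(doc, dotted):
    """Follow a dotted path such as 'properties.name'; return None if absent."""
    current = doc
    for part in dotted.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(doc, query):
    """Return True if every condition in the query document holds for doc."""
    for path, condition in query.items():
        value = get_path(doc, path)
        if isinstance(condition, dict) and "$gt" in condition:
            # Comparison operator: the value must exist and exceed the bound.
            if value is None or not value > condition["$gt"]:
                return False
        elif value != condition:
            # Plain value: exact equality.
            return False
    return True


# Documents taken from slides 16 to 18.
songs = [
    {"_id": "91", "properties": {"name": "TERRAPIN STATION", "song_type": "original", "performances": 302}},
    {"_id": "104", "properties": {"name": "FRIEND OF THE DEVIL", "song_type": "original", "performances": 304}},
    {"_id": "122", "properties": {"name": "CASEY JONES", "song_type": "original", "performances": 312}},
]

by_name = [d["_id"] for d in songs if matches(d, {"properties.name": "TERRAPIN STATION"})]
over_310 = [d["_id"] for d in songs if matches(d, {"properties.performances": {"$gt": 310}})]
print(by_name, over_310)
```

A linear scan like this is what an unindexed query amounts to; a real document database would normally consult an index on the queried path instead.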
  • 19. Document Databases: Processing JSON Documents
    • Sharding is the process of distributing a database’s data across multiple machines. Each partition of the data is known as a shard.
    • Document databases shard easily because there are no explicit references between documents.
    [Figure: a client application connects to a communication service, behind which sit several shards, each holding a group of { _id : } documents.]
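One way to picture sharding is to assign each document to a partition based only on its _id. The sketch below uses hash-based partitioning, which is an assumption made for illustration; the slides do not say how documents are assigned to shards, and real systems often use range-based or configurable shard keys. The names shard_for and partition are hypothetical.

```python
import hashlib


def shard_for(doc_id, shard_count):
    """Pick a shard index from the document id alone (hash partitioning)."""
    # md5 is used only as a stable hash; Python's built-in hash() for
    # strings changes between interpreter runs.
    digest = hashlib.md5(str(doc_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def partition(documents, shard_count):
    """Distribute documents across shard_count lists."""
    shards = [[] for _ in range(shard_count)]
    for doc in documents:
        shards[shard_for(doc["_id"], shard_count)].append(doc)
    return shards


documents = [{"_id": "91"}, {"_id": "104"}, {"_id": "122"}]
shards = partition(documents, 3)
print([len(shard) for shard in shards])
```

Because no document refers to another, each one can be placed independently, which is the property the slide credits for easy sharding.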
  • 20. Document Databases: Processing JSON Documents
    • Most document databases come with a Map/Reduce feature to allow for the parallel processing of all documents in the database.
      Map function: apply a function to every document in the database.
      Reduce function: apply a function to the grouped results of the map.
      M : D → (K, V), where D is the space of documents, K is the space of keys, and V is the space of values.
      R : (K, V^n) → (K, V), where V^n is the space of all possible combinations of values.
  • 21. Document Databases: Processing JSON Documents
    • Create a distribution of the Grateful Dead original song performances.
      > map = function(){
          if(this.properties.song_type == "original")
            emit(this.properties.performances, 1);
        };
      > reduce = function(key, values) {
          var sum = 0;
          for(var i in values) {
            sum = sum + values[i];
          }
          return sum;
        };
  • 22. Document Databases: Processing JSON Documents
      > results = db.songs.mapReduce(map, reduce)
      {
        "result" : "tmp.mr.mapreduce_1266016122_8",
        "timeMillis" : 72,
        "counts" : {
          "input" : 809,
          "emit" : 184,
          "output" : 119
        },
        "ok" : 1,
      }
  • 23. [Illustration of the map/reduce flow]
    Input documents:
      { _id : 122, properties : { name : "CASEY ...", performances : 312 }}
      { _id : 100, properties : { name : "PLAYIN...", performances : 312 }}
      { _id : 91, properties : { name : "TERRAP...", performances : 302 }}
    map = function(){
      if(this.properties.song_type == "original")
        emit(this.properties.performances, 1);
    };
    Emitted pairs (key : value):
      312 : 1
      312 : 1
      302 : 1
      ...
    Grouped by key (key : values):
      312 : [1,1]
      302 : [1]
      ...
    reduce = function(key, values) {
      var sum = 0;
      for(var i in values) {
        sum = sum + values[i];
      }
      return sum;
    };
    Result:
      { 312 : 2, 302 : 1, ... }
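The map, group, and reduce stages illustrated above can be traced end to end in a few lines. This is a single-process sketch, not MongoDB's implementation: map_reduce is a hypothetical driver, and the song_type field on the three documents is an assumption, since the illustration omits it.

```python
from collections import defaultdict

# The three documents from the illustration. song_type is assumed to be
# "original" for all of them because the figure does not show that field.
songs = [
    {"_id": 122, "properties": {"name": "CASEY ...", "song_type": "original", "performances": 312}},
    {"_id": 100, "properties": {"name": "PLAYIN...", "song_type": "original", "performances": 312}},
    {"_id": 91, "properties": {"name": "TERRAP...", "song_type": "original", "performances": 302}},
]


def map_fn(doc):
    """M : D -> (K, V). Emit (performances, 1) for each original song."""
    if doc["properties"]["song_type"] == "original":
        yield doc["properties"]["performances"], 1


def reduce_fn(key, values):
    """R : (K, V^n) -> (K, V). Sum the values grouped under one key."""
    return sum(values)


def map_reduce(docs, map_fn, reduce_fn):
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)  # grouping step: 312 -> [1, 1]
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


distribution = map_reduce(songs, map_fn, reduce_fn)
print(distribution)
```

Each call to map_fn depends on one document only, which is what lets a database run the map stage in parallel across shards before grouping.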
  • 24. Document Databases: Processing JSON Documents
      > db[results.result].find()
      { "_id" : 0, "value" : 11 }
      { "_id" : 1, "value" : 14 }
      { "_id" : 2, "value" : 5 }
      { "_id" : 3, "value" : 8 }
      { "_id" : 4, "value" : 3 }
      { "_id" : 5, "value" : 4 }
      ...
      { "_id" : 554, "value" : 1 }
      { "_id" : 582, "value" : 1 }
      { "_id" : 583, "value" : 1 }
      { "_id" : 594, "value" : 1 }
      { "_id" : 1386, "value" : 1 }
  • 25. Graph Databases • Graph databases store objects (vertices) and their relationships to one another (edges). Usually these relationships are typed/labeled and directed. • Graph databases tend to be optimized for graph-based traversal algorithms.
      Neo4j is available at http://neo4j.org
      AllegroGraph is available at http://www.franz.com/agraph/allegrograph
      HyperGraphDB is available at http://www.kobrix.com/hgdb.jsp
  • 26. Graph Databases: Property Graph Model
      [Figure: example property graph]
      Vertices: 1 (name = "marko", age = 29), 2 (name = "vadas", age = 27), 3 (name = "lop", lang = "java"), 4 (name = "josh", age = 32), 5 (name = "ripple", lang = "java"), 6 (name = "peter", age = 35)
      Edges: 7 (1 knows 2, weight = 0.5), 8 (1 knows 4, weight = 1.0), 9 (1 created 3, weight = 0.4), 10 (4 created 5, weight = 1.0), 11 (4 created 3, weight = 0.4), 12 (6 created 3, weight = 0.2)
    Graph data models vary. This section will use the data model popularized by Neo4j.
  • 27. Graph Databases: Handling Property Graphs • Gremlin is a graph-based programming language that can be used to interact with graph databases. • However, graph databases also come with their own APIs. Gremlin is available at http://gremlin.tinkerpop.com. All the examples in this section use Gremlin and Neo4j.
  • 28. Graph Databases: Moving Around a Graph in Gremlin
      gremlin> $_ := g:key('name','marko')
      ==>v[1]
      gremlin> ./outE
      ==>e[7][1-knows->2]
      ==>e[9][1-created->3]
      ==>e[8][1-knows->4]
      gremlin> ./outE/inV
      ==>v[2]
      ==>v[3]
      ==>v[4]
      gremlin> ./outE/inV/@name
      ==>vadas
      ==>lop
      ==>josh
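To make the traversal steps concrete, here is a small Python sketch of the same walk over plain dictionaries. It is an illustration only, not how Gremlin or Neo4j are implemented: the function names `key`, `out_e`, `in_v`, and `prop` are hypothetical stand-ins for `g:key`, `outE`, `inV`, and `@name`, and only the part of the property graph reachable from marko is included.

```python
# Vertices: id -> properties (the subset of the slide's graph around marko).
vertices = {
    1: {"name": "marko", "age": 29},
    2: {"name": "vadas", "age": 27},
    3: {"name": "lop", "lang": "java"},
    4: {"name": "josh", "age": 32},
}

# Edges: id -> (out vertex, label, in vertex), in the order the session prints them.
edges = {
    7: (1, "knows", 2),
    9: (1, "created", 3),
    8: (1, "knows", 4),
}


def key(prop_name, value):
    """Look up vertex ids by a property value (stand-in for g:key)."""
    return [v for v, props in vertices.items() if props.get(prop_name) == value]


def out_e(vertex_ids):
    """Outgoing edge ids of the given vertices (stand-in for outE)."""
    return [e for v in vertex_ids for e, (tail, _, _) in edges.items() if tail == v]


def in_v(edge_ids):
    """Head (incoming) vertex of each edge (stand-in for inV)."""
    return [edges[e][2] for e in edge_ids]


def prop(vertex_ids, prop_name):
    """Read one property from each vertex (stand-in for @name)."""
    return [vertices[v][prop_name] for v in vertex_ids]


start = key("name", "marko")
print(start)                              # [1]
print(out_e(start))                       # [7, 9, 8]
print(in_v(out_e(start)))                 # [2, 3, 4]
print(prop(in_v(out_e(start)), "name"))   # ['vadas', 'lop', 'josh']
```

Each step takes a list and returns a list, which is why the steps compose in the same left-to-right way as the path expression `./outE/inV/@name`.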
  • 29. Graph Databases: Inserting Vertices and Edges • Let's create a Grateful Dead graph.
      gremlin> $_g := neo4j:open('/tmp/grateful-dead')
      ==>neo4jgraph[/tmp/grateful-dead]
      gremlin> $v := g:add-v(g:map('name','TERRAPIN STATION'))
      ==>v[0]
      gremlin> $u := g:add-v(g:map('name','TRUCKIN'))
      ==>v[1]
      gremlin> $e := g:add-e(g:map('weight',1),$v,'followed_by',$u)
      ==>e[2][0-followed_by->1]
    You can also batch load graph data stored in the GraphML format [http://graphml.graphdrawing.org/]: g:load('data/grateful-dead.xml')
  • 30. Graph Databases: Inserting Vertices and Edges • When all the data is in, you have a directed, weighted graph of the concert behavior of the Grateful Dead. A song is followed by another song if the second song was played next in concert. The weight of the edge denotes the number of times this happened in concert over the 30 years that the Grateful Dead performed.
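As a rough illustration of how such edge weights could be derived, the Python sketch below counts consecutive song pairs across setlists. The setlists are invented for the example and are not real concert data, and the function name `followed_by_edges` is hypothetical.

```python
from collections import Counter

# HYPOTHETICAL setlists, invented for illustration; real data would hold one
# ordered list of songs per concert.
setlists = [
    ["TERRAPIN STATION", "DRUMS", "MORNING DEW"],
    ["TERRAPIN STATION", "DRUMS"],
    ["PLAYING IN THE BAND", "TERRAPIN STATION", "PLAYING IN THE BAND"],
]


def followed_by_edges(setlists):
    """Count how often each song was immediately followed by another song.

    Returns a Counter mapping (song, next_song) -> weight, i.e. one directed,
    weighted followed_by edge per distinct consecutive pair.
    """
    weights = Counter()
    for setlist in setlists:
        for song, next_song in zip(setlist, setlist[1:]):
            weights[(song, next_song)] += 1
    return weights


edges = followed_by_edges(setlists)
for (song, next_song), weight in edges.items():
    print(f"{song} -followed_by-> {next_song} (weight={weight})")
```

With these made-up setlists, the edge from TERRAPIN STATION to DRUMS gets weight 2 because the pair occurs in two concerts; direction matters, so the reverse pair would be a separate edge.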
  • 31. Graph Databases: Finding Vertices • Find the vertex with the name TERRAPIN STATION. • Find the names of all the songs that followed TERRAPIN STATION in concert more than 3 times.
      gremlin> $_ := g:key('name','TERRAPIN STATION')
      ==>v[0]
      gremlin> ./outE[@weight > 3]/inV/@name
      ==>DRUMS
      ==>MORNING DEW
      ==>DONT NEED LOVE
      ==>ESTIMATED PROPHET
      ==>PLAYING IN THE BAND
  • 32. Graph Databases: Processing Graphs • Most graph algorithms are aimed at traversing a graph in some manner. • The traverser makes use of vertex and edge properties in order to guide its walk through the graph.
  • 33. Graph Databases: Processing Graphs • Find all songs related to TERRAPIN STATION according to concert behavior.
      $e := 1.0
      $scores := g:map()
      repeat 75
        $_ := (./outE[@label='followed_by']/inV)[g:rand-nat()]
        if $_ != null()
          g:op-value('+',$scores,$_/@name,$e)
          $e := $e * 0.85
        else
          $_ := g:key('name','TERRAPIN STATION')
          $e := 1.0
        end
      end
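The loop on this slide is a random walk with decaying energy: each step adds the current energy to the score of the song reached, multiplies the energy by 0.85, and restarts from TERRAPIN STATION when no next song is available. The Python sketch below follows that idea under stated assumptions: the adjacency data is invented, the function name `related_songs` is hypothetical, the next song is chosen uniformly at random, and a restart happens only at a dead end, which may differ from how `g:rand-nat()` indexing behaves in Gremlin.

```python
import random

# HYPOTHETICAL followed_by adjacency, invented for illustration.
followed_by = {
    "TERRAPIN STATION": ["PLAYING IN THE BAND", "DRUMS"],
    "PLAYING IN THE BAND": ["DRUMS"],
    "DRUMS": [],  # dead end: forces a restart from the start song
}


def related_songs(graph, start, steps=75, decay=0.85, rng=None):
    """Score songs by a decaying-energy random walk that begins at `start`.

    ASSUMPTIONS: uniform choice among followers, restart only at dead ends.
    Returns (name, score) pairs sorted by score, highest first.
    """
    rng = rng or random.Random()
    scores = {}
    current, energy = start, 1.0
    for _ in range(steps):
        neighbours = graph.get(current, [])
        if neighbours:
            current = rng.choice(neighbours)
            scores[current] = scores.get(current, 0.0) + energy
            energy *= decay
        else:
            # Nothing follows the current song: go back to the start, full energy.
            current, energy = start, 1.0
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# A fixed seed makes the walk repeatable; any seed works.
ranking = related_songs(followed_by, "TERRAPIN STATION", rng=random.Random(42))
for name, score in ranking:
    print(f"{name}={score}")
```

Songs reached early and often in the walk accumulate the most energy, so the ranking favors songs that sit close to the start song in the followed_by graph.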
  • 34. Graph Databases: Processing Graphs
      gremlin> g:sort($scores,'value',true())
      ==>PLAYING IN THE BAND=1.9949905250390623
      ==>THE MUSIC NEVER STOPPED=0.85
      ==>MEXICALI BLUES=0.5220420095726453
      ==>DARK STAR=0.3645706137191774
      ==>SAINT OF CIRCUMSTANCE=0.20585176856988666
      ==>ALTHEA=0.16745479118927242
      ==>ITS ALL OVER NOW=0.14224175713617204
      ==>ESTIMATED PROPHET=0.12657286655816163
      ...
  • 35. Conclusions
    • Relational Databases: Stable, solid technology that has been used in production for decades. Good for storing inter-linked tables of data and querying within and across tables. They do not scale horizontally, due to the interconnectivity of table keys and the cost of joins.
    • Document Databases: For JSON documents, there is a one-to-one mapping from document to programming object. They scale horizontally and allow for parallel processing because data is sharded at the document level. Performing complicated queries requires relatively sophisticated programming skills.
    • Graph Databases: Optimized for graph traversal algorithms and local neighborhood searches. Low impedance mismatch between a graph in a database and a graph of objects in object-oriented programming. They do not scale well horizontally, due to the interconnectivity of vertices.
  • 37. Fin. Thank you for your time... • My homepage: http://markorodriguez.com • TinkerPop: http://tinkerpop.com Acknowledgements: Peter Neubauer (Neo Technology) for comments and review.