Inferring Versioned Schemas from NoSQL Databases and its Applications
ER’15
Stockholm, October 2015
[{ "id": "90234af", "value": { "author": "Diego Sevilla Ruiz",
                               "e-mail": "dsevilla@um.es",
                               "institution": "U. of Murcia"}},
 { "id": "a243bb5", "value": { "author": "Severino Feliciano Morales",
                               "e-mail": "severino.feliciano@um.es",
                               "institution": "U. of Murcia"}},
 { "id": "096705d", "value": { "author": "Jesús García Molina",
                               "e-mail": "jmolina@um.es",
                               "institution": "U. of Murcia"}}]
Motivation
NoSQL Databases are Schemaless
Benefits
▶ No need to define a Schema beforehand
▶ Non-uniform data
▶ Custom fields
▶ Non-uniform types
▶ Easier evolution
Drawbacks
▶ Harder to reason about the DB
▶ Static checking is lost
▶ Some of the data logic lives in the application code (more error-prone)
▶ Some utilities need Schema information to work
Schemas for NoSQL Databases
▶ How to alleviate the problems of schemaless databases? ⇒ Inferring a Schema
▶ The Schema Model contains information about Entities and Relationships
▶ Take into account the different Entity Versions in the Database
▶ Heterogeneity usually stems from slight variations in Entities
▶ We obtain a precise database model
▶ The Schema allows us to automate the construction of tools:
▶ migration, refactoring, visualization, …
Related Work
▶ JSON Schema
▶ Object versions and relationships are not considered
▶ Apache Spark SQL/Drill: SQL-like schemas
▶ Union of all fields, nullable ⇒ incorrect combinations
▶ Over-generalization to String
▶ Aggregations and Reference relations not considered
▶ MongoDB-Schema
▶ Prototype to infer schemas from MongoDB collections
▶ Same limitations as Spark SQL
▶ JSON Discoverer
▶ An MDE (Model-Driven Engineering) solution to infer domain models from REST web services (i.e. JSON documents)
▶ Not database-oriented; Object versions not considered
Spark SQL Example
{"name":"Michael"}
{"name":"Andy", "age":30}
{"name":"Justin", "age":19}
{"name":"Peter", "age":"tiny"}
{"name":"Martina", "address":"home!"}
> people.printSchema
root
|-- address: string (nullable = true)
|-- age: string (nullable = true)
|-- name: string (nullable = true)
▶ age promoted to string
▶ age and address are never part of the same object
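The loss of information can be reproduced with a small sketch. This is not Spark's inference code: `unionSchema` and `versionSignatures` are hypothetical helpers that only imitate the behaviour shown above (all fields merged, conflicting types widened to string) and contrast it with grouping objects by their exact set of fields.

```javascript
// Hypothetical helpers; they imitate the behaviour above, not Spark's real code.
const people = [
  { name: "Michael" },
  { name: "Andy", age: 30 },
  { name: "Justin", age: 19 },
  { name: "Peter", age: "tiny" },
  { name: "Martina", address: "home!" },
];

// Union of all fields; a field seen with two different types is widened to "string".
function unionSchema(objects) {
  const schema = {};
  for (const obj of objects) {
    for (const [key, value] of Object.entries(obj)) {
      const type = typeof value;
      if (!(key in schema)) {
        schema[key] = type;
      } else if (schema[key] !== type) {
        schema[key] = "string";
      }
    }
  }
  return schema;
}

// One signature per distinct combination of field names and types.
function versionSignatures(objects) {
  const signatures = new Set();
  for (const obj of objects) {
    const fields = Object.keys(obj).sort().map((key) => `${key}:${typeof obj[key]}`);
    signatures.add(fields.join(","));
  }
  return [...signatures];
}

console.log(unionSchema(people));
console.log(versionSignatures(people));
```

The union reports `age` as a string and lists `age` next to `address`, while the signatures keep the four field combinations that actually occur, none of which contains both.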
{
  "rows":[
    {
      "content":{
        "chapters":33,
        "pages":527
      },
      "authors":[
        {
          "company":{
            "country":"USA",
            "name":"IBM"
          },
          "name":"Grady Booch",
          "_id":"210"
        },
        {
          "company":{
            "country":"USA",
            "name":"IBM"
          },
          "name":"James Rumbaugh",
          "_id":"310"
        },
        {
          "country":"USA",
          "company":"Ivar Jacobson Consulting",
          "name":"Ivar Jacobson",
          "_id":"410"
        }],
      "type":"book",
      "year":2013,
      "publisher_id":"345679",
      "title":"The Unified Modeling Language",
      "_id":"1"
    },
    {
      "discipline":"software engineering",
      "issn":[
        "0098-5589",
        "1939-3520"
      ],
      "name":"IEEE Trans. on Software Engineering",
      "type":"journal",
      "_id":"11"
    },
    {
      "name":"Automated Software Engineering",
      "issn":[
        "0928-8910",
        "1573-7535"
      ],
      "discipline":"software engineering",
      "type":"journal",
      "_id":"12",
      "number":10515
    },
    {
      "city":"Barcelona",
      "name":"Omega",
      "type":"publisher",
      "_id":"123451"
    },
    {
      "type":"publisher",
      "city":"Newton",
      "name":"O’Reilly Media",
      "_id":"928672"
    },
    {
      "type":"book",
      "author":{
        "_id":"101",
        "name":"Bradley Holt",
        "company":{
          "country":"USA",
          "name":"IBM Cloudant"
        }
      },
      "title":"Writing and Querying MapReduce Views in CouchDB",
      "publisher_id":"928672",
      "_id":"2"
    },
    {
      "name":"Addison-Wesley",
      "type":"publisher",
      "_id":"345679"
    },
    {
      "type":"publisher",
      "journals":[
        "11",
        "12"
      ],
      "name":"IEEE Publications",
      "_id":"907863"
    }]}
NoSQL Database Model
▶ Objects (Entities) and Entity Versions
▶ Attributes
▶ Relationships
▶ Aggregation
▶ References
{
  "type":"publisher",
  "city":"Newton",
  "name":"O’Reilly Media",
  "_id":"928672"
},
{
  "type":"book",
  "author":{
    "_id":"101",
    "name":"Bradley Holt",
    "company":{
      "country":"USA",
      "name":"IBM Cloudant"
    }
  },
  "title":"Writing and Querying MapReduce Views in CouchDB",
  "publisher_id":"928672",
  "_id":"2"
},
Schema & Entity Versions Description
Entity Publisher {
Version 1 {
name: String
city: String
}
Version 2 {
name: String
}
Version 3 {
name: String
journal[+]: [Ref]->[Journal] (opposite=False)
}
}
Entity Journal {
Version 1 {
issn: Tuple [String, String]
name: String
discipline: String
}
Version 2 {
issn: Tuple [String, String]
name: String
discipline: String
number: int
}
}
Entity Book {
Version 1 {
title: String
year: int
publisher[1]: [Ref]->[Publisher] (opposite=False)
content[1]: [Aggregate]Content1
author[+]: [Aggregate]Author1
}
Version 2 {
title: String
publisher[1]: [Ref]->[Publisher] (opposite=False)
author[1]: [Aggregate]Author1
}
}
Entity Author {
Version 1 {
name: String
company[1]: [Aggregate]Company
}
Version 2 {
country: String
company: String
name: String
}
}
Entity Company {
Version 1 {
name: String
country: String
}
}
Entity Content {
Version 1 {
chapters: int
pages: int
}
}
[Figure, panels (a) and (b): entity diagram with relationship labels [1..1] company, [1..1] publisher, [1..1] content, [1..*] authors, [1..*] journals]
Solution Design Considerations
▶ We have to process all the objects in the Database
⇒ Map-Reduce
▶ Natural data processing on NoSQL databases
▶ Leverage MDE technologies
▶ Reuse EMF/Ecore tooling to show entity diagrams
▶ Automation & Code Generation by Metamodeling &
Model Transformations
Proposed MDE Architecture
[Figure: architecture diagram. Labeled elements, in reading order: NoSQL Database; MapReduce; Object Versions (JSON); JSON Injection; JSON Model (instance of the JSON Metamodel); Schema Reverse Eng; Schema Model (instance of the Schema Metamodel); Application Generation; Applications: Schema Viewer / Data Validator / Migration Assistant]
Reverse Engineering Process (i)
▶ Map-Reduce process
▶ Map: obtains the Raw Schema for each object
▶ Reduce: selects an archetype for each Entity Version
▶ Entity Type
▶ Root objects ⇒ “type” field or collection name
▶ Aggregated objects ⇒ key of the pair (e.g. “author”)
JSON object ⇒ Raw Schema

{name: "Omega", city: "Barcelona"}
⇒ {name: String, city: String}

{title: "Writing and...",
 publisher_id: "928672",
 author: {name: "Bradley Holt",
          company: {country: "USA",
                    name: "IBM Cloudant"} } }
⇒ {title: String,
   publisher_id: String,
   author: {name: String,
            company: {country: String,
                      name: String} } }
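The two steps can be sketched in plain JavaScript, outside any particular database engine. The function names (`rawSchema`, `mapReduceVersions`), the type names other than `String` and `int`, and the choice to leave `type` and `_id` out of the raw schema are assumptions made for this illustration, not details taken from the talk.

```javascript
// Sketch only: names and conventions here are illustrative assumptions.

// Map step: replace every primitive value by the name of its type.
// Keys are sorted so that field order does not produce different schemas.
function rawSchema(value) {
  if (Array.isArray(value)) {
    return value.map(rawSchema);
  }
  if (value !== null && typeof value === "object") {
    const schema = {};
    for (const key of Object.keys(value).sort()) {
      schema[key] = rawSchema(value[key]);
    }
    return schema;
  }
  if (typeof value === "string") return "String";
  if (typeof value === "number") return Number.isInteger(value) ? "int" : "float";
  if (typeof value === "boolean") return "Boolean";
  return "Null";
}

// Reduce step: objects with the same entity type and raw schema collapse into
// one entry; the first object seen is kept as the archetype.
function mapReduceVersions(objects) {
  const versions = new Map();
  for (const obj of objects) {
    const { type, _id, ...fields } = obj;
    const schema = rawSchema(fields);
    const key = `${type}|${JSON.stringify(schema)}`;
    if (!versions.has(key)) {
      versions.set(key, { entity: type, schema, archetype: obj, count: 0 });
    }
    versions.get(key).count += 1;
  }
  return [...versions.values()];
}

const publishers = [
  { type: "publisher", _id: "123451", name: "Omega", city: "Barcelona" },
  { type: "publisher", _id: "928672", city: "Newton", name: "O'Reilly Media" },
  { type: "publisher", _id: "345679", name: "Addison-Wesley" },
];
console.log(mapReduceVersions(publishers));
```

With these three publishers the sketch yields two versions: one with `name` and `city` (two objects) and one with `name` only.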
Reverse Engineering Process (ii)
▶ Attributes: primitive or tuple
▶ Aggregated Entities
▶ Value of the pair is an Object (or array of objects)
▶ Entity type inferred from the key
▶ References
▶ Heuristics/Conventions
▶ Key: <entity_name>_id
▶ Value: MongoDB’s DBRef abstraction:
{"$ref": "<entity_name>", "$id": <id_value>}
▶ Honor cardinalities (arrays)
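A minimal sketch of the two reference conventions, assuming they are checked on the direct fields of an object. `findReferences`, its result shape and the `reviewers` field in the sample object are hypothetical and only serve the illustration.

```javascript
// Sketch only: findReferences and its result shape are illustrative assumptions.

// Returns the references found among the direct fields of an object, using the
// two conventions above: a "<entity_name>_id" key or a DBRef-like value.
// Arrays are reported with many = true so that cardinality is preserved.
function findReferences(obj) {
  const refs = [];
  for (const [key, value] of Object.entries(obj)) {
    const many = Array.isArray(value);
    if (key !== "_id" && key.endsWith("_id")) {
      refs.push({ field: key, target: key.slice(0, -3), many });
      continue;
    }
    const first = many ? value[0] : value;
    if (first !== null && typeof first === "object" && "$ref" in first && "$id" in first) {
      refs.push({ field: key, target: first["$ref"], many });
    }
  }
  return refs;
}

const book = {
  _id: "2",
  title: "Writing and Querying MapReduce Views in CouchDB",
  publisher_id: "928672",
  reviewers: [{ $ref: "person", $id: "7" }], // hypothetical field, not in the slides' data
};
console.log(findReferences(book));
```

Here `publisher_id` is reported as a single reference to `publisher`, and the array of DBRef-like values as a multi-valued reference to `person`.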
Example NoSQL Applications
▶ From the DBSchema model, using Model
Transformations and Model-to-Text transformations
(Code Generation), we can:
▶ Generate models that Characterize each Entity
Version
▶ That characterization can be used to Visualize the
Database
▶ And also to generate code to Validate objects
entering the Database
▶ Generate models that allow Database Migration to
the desired Entity Versions
Type Discrimination/Characterization Metamodel
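// Returns true only if obj passes every check that characterizes Book version 2.
function isOfExactTypeBook_2(obj) {
  // The entity type must be present and equal to "Book".
  if (!("type" in obj)) {
    return false;
  }
  if (obj["type"] !== "Book") {
    return false;
  }
  // Fields that must be present.
  if (!("title" in obj)) {
    return false;
  }
  if (!("author" in obj)) {
    return false;
  }
  // Fields whose presence rules this version out.
  if ("publisher" in obj) {
    return false;
  }
  if ("content" in obj) {
    return false;
  }
  if ("year" in obj) {
    return false;
  }
  return true;
}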
Generated using a Model-to-Text transformation from an instance of the previous Type Discrimination Metamodel
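The same checks can be expressed as data plus one generic function, which hints at what a characterization model has to carry. This is a hand-written sketch, not the output of the authors' transformation; `makeDiscriminator` and the `{ type, required, forbidden }` shape are assumptions.

```javascript
// Sketch only: the characterization shape ({ type, required, forbidden }) is an assumption.

// Builds a predicate that performs the same kind of checks as the generated function above.
function makeDiscriminator({ type, required, forbidden }) {
  return function (obj) {
    if (!("type" in obj) || obj["type"] !== type) return false;
    if (required.some((field) => !(field in obj))) return false;
    if (forbidden.some((field) => field in obj)) return false;
    return true;
  };
}

// Same field lists as in isOfExactTypeBook_2.
const isBook2 = makeDiscriminator({
  type: "Book",
  required: ["title", "author"],
  forbidden: ["publisher", "content", "year"],
});

console.log(isBook2({ type: "Book", title: "Writing and...", author: { name: "Bradley Holt" } })); // true
console.log(isBook2({ type: "Book", title: "UML", author: {}, year: 2013 })); // false
```

Keeping the field lists as data is one way a validator for objects entering the database could be driven by the model instead of by hand-written code.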
Entity Versions
Alternate: D3.js Treemap
Type Transformation Metamodel
db.<collection>.update(
  <query>,
  <update>,
  {
    multi: true
  }
)
Obtained by Entity Type Characterization
Generate the correct MongoDB update statement using $set, $push, etc., possibly with user assistance through a DSL.
For example, for Journal_1 to Journal_2:
$set: { "number": 1 }
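A sketch of how the query and update documents might be derived from two entity versions, assuming each version is given as a plain list of field names. `buildMigration` and its `defaults` argument are hypothetical; only the operators `$exists`, `$set` and `$unset` and the `multi` option come from MongoDB itself.

```javascript
// Sketch only: buildMigration and the defaults argument are illustrative assumptions.

// Derives a filter and an update document that move objects from one entity
// version to another: missing fields are added with $set, extra ones dropped with $unset.
function buildMigration(entity, fromFields, toFields, defaults) {
  const query = { type: entity };
  for (const field of fromFields) {
    query[field] = { $exists: true };
  }
  const update = {};
  for (const field of toFields) {
    if (fromFields.includes(field)) continue;
    query[field] = { $exists: false };
    update.$set = update.$set || {};
    update.$set[field] = defaults[field];
  }
  for (const field of fromFields) {
    if (toFields.includes(field)) continue;
    update.$unset = update.$unset || {};
    update.$unset[field] = "";
  }
  return { query, update, options: { multi: true } };
}

const journal1 = ["issn", "name", "discipline"];
const journal2 = ["issn", "name", "discipline", "number"];
const migration = buildMigration("journal", journal1, journal2, { number: 1 });
console.log(JSON.stringify(migration.update)); // {"$set":{"number":1}}
```

The three parts of the result correspond to the `<query>`, `<update>` and options arguments of the update call shown above. The default value for a new field (here `1`, as in the slide) is the kind of input a migration DSL would have to collect from the user.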
Conclusions & Future work
▶ A process for obtaining Conceptual Model Schemas for NoSQL Databases is shown
▶ The process takes into account the different Entity Versions present in the Database
▶ An MDE process allows us to automate the production of several applications from the Schemas
▶ Example applications that allow Database Visualization and Migration are shown
Conclusions & Future work (ii)
▶ Future work includes:
▶ Building a NoSQL Database Tool Set (NoSQL Data
Engineering)
▶ DSL for Entity Version migration
▶ Refining the Schema to allow a richer Type System
▶ Allow value ranges or enumerated sets
▶ Infer attribute dependencies (derived attributes,
i.e. the value of an attribute dictates the value of
another attribute)
▶ etc.

Mature dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 

Inferring Versioned Schemas from NoSQL Databases and its Applications

  • 1. Inferring Versioned Schemas from NoSQL Databases and its Applications. ER’15, Stockholm, October 2015
      [{ "id": "90234af", "value": { "author": "Diego Sevilla Ruiz", "e-mail": "dsevilla@um.es", "institution": "U. of Murcia" }},
       { "id": "a243bb5", "value": { "author": "Severino Feliciano Morales", "e-mail": "severino.feliciano@um.es", "institution": "U. of Murcia" }},
       { "id": "096705d", "value": { "author": "Jesús García Molina", "e-mail": "jmolina@um.es", "institution": "U. of Murcia" }}]
  • 2. Motivation: NoSQL Databases are Schemaless
      Benefits
      ▶ No need to define a Schema beforehand
      ▶ Non-uniform data
        ▶ Custom fields
        ▶ Non-uniform types
      ▶ Easier evolution
      Drawbacks
      ▶ Harder to reason about the DB
      ▶ Static checking is lost
      ▶ Some of the data logic lives in the application code (more error-prone)
      ▶ Some utilities need Schema information to work
  • 3. Schemas for NoSQL Databases
      ▶ How to alleviate the problems of schemaless databases? ⇒ Inferring a Schema
      ▶ The Schema Model contains information about Entities and Relationships
      ▶ Take into account the different Entity Versions in the Database
        ▶ Heterogeneity is usually due to slight variations in Entities
      ▶ We obtain a precise database model
      ▶ The Schema allows us to automate the construction of tools:
        ▶ migration, refactoring, visualization, …
  • 4. Related Work
      ▶ JSON Schema
        ▶ Object versions and relationships are not considered
      ▶ Apache Spark SQL/Drill: SQL-like schemas
        ▶ Union of all fields, nullable ⇒ incorrect combinations
        ▶ Over-generalization to String
        ▶ Aggregations and Reference relations not considered
      ▶ MongoDB-Schema
        ▶ Prototype to infer schemas from MongoDB collections
        ▶ Same limitations as Spark SQL
      ▶ JSON Discoverer
        ▶ An MDE solution to infer domain models from REST web services (i.e. JSON documents)
        ▶ Not database-oriented; Object versions not considered
  • 5. Spark SQL Example
      {"name":"Michael"}
      {"name":"Andy", "age":30}
      {"name":"Justin", "age":19}
      {"name":"Peter", "age":"tiny"}
      {"name":"Martina", "address":"home!"}
      > people.printSchema
      root
       |-- address: string (nullable = true)
       |-- age: string (nullable = true)
       |-- name: string (nullable = true)
      ▶ age is promoted to string
      ▶ age and address are never part of the same object
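
The two problems noted on this slide can be reproduced with a deliberately naive inference routine. The sketch below is not Spark's actual algorithm; `typeName` and `inferUnionSchema` are hypothetical helpers that only imitate the "union of all fields, widen conflicts to string" behaviour described above.

```javascript
// Hypothetical helper: coarse type name of a JSON value.
function typeName(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  return typeof value; // "string", "number", "boolean" or "object"
}

// Naive inference (an assumption, for illustration only):
// take the union of all fields and widen conflicting types to "string".
function inferUnionSchema(docs) {
  const schema = {};
  for (const doc of docs) {
    for (const [key, value] of Object.entries(doc)) {
      const t = typeName(value);
      if (!(key in schema)) {
        schema[key] = t;
      } else if (schema[key] !== t) {
        schema[key] = "string"; // over-generalization to string
      }
    }
  }
  return schema;
}

const people = [
  { name: "Michael" },
  { name: "Andy", age: 30 },
  { name: "Justin", age: 19 },
  { name: "Peter", age: "tiny" },
  { name: "Martina", address: "home!" },
];

const unionSchema = inferUnionSchema(people);
console.log(unionSchema);
```

Every field ends up as a string, and nothing in the result records that `age` and `address` never occur in the same object. That is the information a versioned schema keeps.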
  • 6.
      { "rows": [
          { "content": { "chapters": 33, "pages": 527 },
            "authors": [
              { "company": { "country": "USA", "name": "IBM" }, "name": "Grady Booch", "_id": "210" },
              { "company": { "country": "USA", "name": "IBM" }, "name": "James Rumbaugh", "_id": "310" },
              { "country": "USA", "company": "Ivar Jacobson Consulting", "name": "Ivar Jacobson", "_id": "410" } ],
            "type": "book", "year": 2013, "publisher_id": "345679",
            "title": "The Unified Modeling Language", "_id": "1" },
          { "discipline": "software engineering", "issn": [ "0098-5589", "1939-3520" ],
            "name": "IEEE Trans. on Software Engineering", "type": "journal", "_id": "11" },
          { "name": "Automated Software Engineering", "issn": [ "0928-8910", "1573-7535" ],
            "discipline": "software engineering", "type": "journal", "_id": "12", "number": 10515 },
          { "city": "Barcelona", "name": "Omega", "type": "publisher", "_id": "123451" },
          { "type": "publisher", "city": "Newton", "name": "O’Reilly Media", "_id": "928672" },
          { "type": "book",
            "author": { "_id": "101", "name": "Bradley Holt",
                        "company": { "country": "USA", "name": "IBM Cloudant" } },
            "title": "Writing and Querying MapReduce Views in CouchDB",
            "publisher_id": "928672", "_id": "2" },
          { "name": "Addison-Wesley", "type": "publisher", "_id": "345679" },
          { "type": "publisher", "journals": [ "11", "12" ], "name": "IEEE Publications", "_id": "907863" } ] }
  • 7. NoSQL Database Model
      ▶ Objects (Entities) and Entity Versions
      ▶ Attributes
      ▶ Relationships
        ▶ Aggregation
        ▶ References
      { "type": "publisher", "city": "Newton", "name": "O’Reilly Media", "_id": "928672" },
      { "type": "book",
        "author": { "_id": "101", "name": "Bradley Holt",
                    "company": { "country": "USA", "name": "IBM Cloudant" } },
        "title": "Writing and Querying MapReduce Views in CouchDB",
        "publisher_id": "928672", "_id": "2" },
  • 8. Schema & Entity Versions Description
      Entity Publisher {
        Version 1 {
          name: String
          city: String
        }
        Version 2 {
          name: String
        }
        Version 3 {
          name: String
          journal[+]: [Ref]->[Journal] (opposite=False)
        }
      }
      Entity Journal {
        Version 1 {
          issn: Tuple [String, String]
          name: String
          discipline: String
        }
        Version 2 {
          issn: Tuple [String, String]
          name: String
          discipline: String
          number: int
        }
      }
      Entity Book {
        Version 1 {
          title: String
          year: int
          publisher[1]: [Ref]->[Publisher] (opposite=False)
          content[1]: [Aggregate]Content1
          author[+]: [Aggregate]Author1
        }
        Version 2 {
          title: String
          publisher[1]: [Ref]->[Publisher] (opposite=False)
          author[1]: [Aggregate]Author1
        }
      }
      Entity Author {
        Version 1 {
          name: String
          company[1]: [Aggregate]Company
        }
        Version 2 {
          country: String
          company: String
          name: String
        }
      }
      Entity Company {
        Version 1 {
          name: String
          country: String
        }
      }
      Entity Content {
        Version 1 {
          chapters: int
          pages: int
        }
      }
      [Figure, panels (a) and (b). Diagram edge labels: company [1..1], publisher [1..1], content [1..1], authors [1..*], journals [1..*]]
  • 9. Solution Design Considerations
      ▶ We have to process all the objects in the Database ⇒ Map-Reduce
        ▶ Natural data processing on NoSQL databases
      ▶ Leverage MDE technologies
        ▶ Reuse EMF/Ecore tooling to show entity diagrams
        ▶ Automation & Code Generation by Metamodeling & Model Transformations
  • 11. Reverse Engineering Process (i)
      ▶ Map-Reduce process
        ▶ Map: obtains the Raw Schema for each object
        ▶ Reduce: selects an archetype for each Entity Version
      ▶ Entity Type
        ▶ Root objects ⇒ "type" field or collection name
        ▶ Aggregated objects ⇒ key of the pair (e.g. "author")
      JSON object: {name:"Omega", city:"Barcelona"}
      Raw Schema:  {name:String, city:String}
      JSON object: {title:"Writing and...", publisher_id:"928672", author:{name:"Bradley Holt", company:{country:"USA", name:"IBM Cloudant"} } }
      Raw Schema:  {title:String, publisher_id:String, author:{name:String, company:{country:String, name:String} } }
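
The Map and Reduce steps on this slide can be sketched in plain JavaScript. This is a minimal in-memory illustration, not the authors' implementation: `rawSchema` and `inferEntityVersions` are hypothetical names, the "float" and "Boolean" type labels are assumptions, and a real run would execute inside the database's Map-Reduce engine.

```javascript
// Map step (sketch): replace every value by the name of its type,
// recursing into nested objects and arrays.
function rawSchema(value) {
  if (Array.isArray(value)) return value.map(rawSchema);
  if (value !== null && typeof value === "object") {
    const schema = {};
    // Sort keys so that field order does not create spurious versions.
    for (const key of Object.keys(value).sort()) {
      schema[key] = rawSchema(value[key]);
    }
    return schema;
  }
  if (typeof value === "string") return "String";
  if (typeof value === "number") return Number.isInteger(value) ? "int" : "float"; // "float" is an assumption
  if (typeof value === "boolean") return "Boolean"; // assumption
  return "null";
}

// Reduce step (sketch): group objects of the same entity by raw schema.
// Each distinct raw schema becomes one Entity Version; the first schema
// seen serves as its archetype.
function inferEntityVersions(docs, collectionName) {
  const entities = new Map(); // entity name -> Map(schema key -> { schema, count })
  for (const doc of docs) {
    const { type, ...fields } = doc;
    // Root objects: entity name from the "type" field, else the collection name.
    const entity = type !== undefined ? type : collectionName;
    const schema = rawSchema(fields);
    const key = JSON.stringify(schema);
    if (!entities.has(entity)) entities.set(entity, new Map());
    const versions = entities.get(entity);
    const entry = versions.get(key) || { schema, count: 0 };
    entry.count += 1;
    versions.set(key, entry);
  }
  return entities;
}

const publishers = [
  { city: "Barcelona", name: "Omega", type: "publisher" },
  { type: "publisher", city: "Newton", name: "O'Reilly Media" },
  { name: "Addison-Wesley", type: "publisher" },
];
const entities = inferEntityVersions(publishers, "publications");
console.log([...entities.get("publisher").values()]);
```

Serializing the sorted raw schema with `JSON.stringify` gives a cheap equality key. The first two publishers fall into one version despite their different field order, and the third forms a second version.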
  • 12. Reverse Engineering Process (ii)
      ▶ Attributes: primitive or tuple
      ▶ Aggregated Entities
        ▶ Value of the pair is an Object (or array of objects)
        ▶ Entity type inferred from the key
      ▶ References
        ▶ Heuristics/Conventions
          ▶ Key: <entity_name>_id
          ▶ Value: MongoDB’s DBRef abstraction: {"$ref": "<entity_name>", "$id": <id_value>}
        ▶ Honor cardinalities (arrays)
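
The two reference conventions listed on this slide can be sketched as a small classifier. `detectReference` is a hypothetical helper, and the shape of its return value (`target`, `many`) is an assumption made for the example. It covers only the `<entity_name>_id` key convention and the DBRef value convention, with arrays mapped to a to-many cardinality.

```javascript
// Returns { target, many } when the key/value pair looks like a reference,
// or null otherwise. Sketch only: covers the two conventions on the slide.
function detectReference(key, value) {
  const isPrimitiveId = (v) => typeof v === "string" || typeof v === "number";
  const isDbRef = (v) =>
    v !== null && typeof v === "object" && !Array.isArray(v) &&
    "$ref" in v && "$id" in v;

  // Convention 1: the key is <entity_name>_id (a bare "_id" does not match).
  const match = /^(.+)_id$/.exec(key);
  if (match) {
    if (isPrimitiveId(value)) return { target: match[1], many: false };
    if (Array.isArray(value) && value.length > 0 && value.every(isPrimitiveId)) {
      return { target: match[1], many: true }; // honor cardinality
    }
  }

  // Convention 2: the value is a DBRef, or an array of DBRefs.
  if (isDbRef(value)) return { target: value["$ref"], many: false };
  if (Array.isArray(value) && value.length > 0 && value.every(isDbRef)) {
    return { target: value[0]["$ref"], many: true };
  }
  return null;
}

console.log(detectReference("publisher_id", "928672"));
console.log(detectReference("reviewer", { $ref: "author", $id: "101" }));
console.log(detectReference("title", "The Unified Modeling Language"));
```

Pairs for which the function returns `null` would then be classified as attributes or aggregated entities according to the other rules on the slide.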
  • 14. Example NoSQL Applications
      ▶ From the DBSchema model, using Model Transformations and Model-to-Text transformations (Code Generation), we can:
        ▶ Generate models that Characterize each Entity Version
          ▶ That characterization can be used to Visualize the Database
          ▶ And also to generate code to Validate objects entering the Database
        ▶ Generate models that allow Database Migration to the desired Entity Versions
  • 16.
      function isOfExactTypeBook_2(obj) {
        if (!("type" in obj)) { return false; }
        if (obj["type"] !== "Book") { return false; }
        if (!("title" in obj)) { return false; }
        if (!("author" in obj)) { return false; }
        if ("publisher" in obj) { return false; }
        if ("content" in obj) { return false; }
        if ("year" in obj) { return false; }
        return true;
      }
      Generated using a Model-to-Text transformation from an instance of the previous Type Discrimination Metamodel
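
A Model-to-Text step of this kind can be approximated with string templates. The sketch below is not the authors' transformation: `generateTypeCheck` is a hypothetical generator, and the lists of required and forbidden fields stand in for an instance of the Type Discrimination Metamodel, whose actual structure is not shown in this transcript.

```javascript
// Hypothetical generator: emits the source text of a discrimination function
// from the fields a version must have and the fields it must not have.
function generateTypeCheck(entity, version, required, forbidden) {
  const lines = [`function isOfExactType${entity}_${version}(obj) {`];
  lines.push(`  if (!("type" in obj)) { return false; }`);
  lines.push(`  if (obj["type"] !== ${JSON.stringify(entity)}) { return false; }`);
  for (const field of required) {
    lines.push(`  if (!(${JSON.stringify(field)} in obj)) { return false; }`);
  }
  for (const field of forbidden) {
    lines.push(`  if (${JSON.stringify(field)} in obj) { return false; }`);
  }
  lines.push("  return true;", "}");
  return lines.join("\n");
}

const source = generateTypeCheck(
  "Book", 2,
  ["title", "author"],            // fields taken from the slide's function
  ["publisher", "content", "year"]
);
console.log(source);

// For demonstration only: evaluate the generated text to obtain a callable.
const checkBook2 = new Function(`${source}\nreturn isOfExactTypeBook_2;`)();
console.log(checkBook2({ type: "Book", title: "Writing and Querying...", author: {} }));
```

With these inputs the emitted text has the same checks, in the same order, as the function on the slide.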
  • 21. Type Transformation Metamodel
      db.<collection>.update( <query>, <update>, { multi: true } )
      ▶ <query>: obtained by Entity Type Characterization
      ▶ <update>: generate the correct MongoDB update statement using $set, $push, etc., maybe via user assistance through a DSL. For example, for Journal_1 to Journal_2: $set: { "number": 1 }
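
The `<query>` and `<update>` arguments of that statement can be derived from the field lists of two Entity Versions, as in the sketch below. `buildMigration` is a hypothetical helper and not part of any MongoDB API. It only builds plain objects using the standard `$exists`, `$set` and `$unset` operators, and it assumes the caller supplies a default value for every added field (the slide suggests a DSL for this).

```javascript
// Builds the query and update documents that move objects from one
// Entity Version to another. Sketch under the assumptions stated above.
function buildMigration(entityType, fromFields, toFields, defaults) {
  const added = toFields.filter((f) => !fromFields.includes(f));
  const removed = fromFields.filter((f) => !toFields.includes(f));

  // Query: characterize the source version (its fields exist, the new ones do not).
  const query = { type: entityType };
  for (const f of fromFields) query[f] = { $exists: true };
  for (const f of added) query[f] = { $exists: false };

  // Update: set defaults for added fields, unset removed fields.
  const update = {};
  if (added.length > 0) {
    update.$set = {};
    for (const f of added) {
      if (!(f in defaults)) throw new Error(`no default value for field "${f}"`);
      update.$set[f] = defaults[f];
    }
  }
  if (removed.length > 0) {
    update.$unset = {};
    for (const f of removed) update.$unset[f] = "";
  }
  return { query, update };
}

const journal1 = ["issn", "name", "discipline"];
const journal2 = ["issn", "name", "discipline", "number"];
const migration = buildMigration("journal", journal1, journal2, { number: 1 });
console.log(JSON.stringify(migration, null, 2));

// In the mongo shell this would be applied as (collection name is hypothetical):
// db.publications.update(migration.query, migration.update, { multi: true })
```

For Journal_1 to Journal_2 the update document is the `$set: { "number": 1 }` from the slide, and the query selects only objects that still lack `number`.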
  • 22. Conclusions & Future work
      ▶ A process for obtaining Conceptual Model Schemas for NoSQL Databases is shown
      ▶ The process takes into account the different Entity Versions present in the Database
      ▶ An MDE process allows us to automate the production of several applications from the Schemas
      ▶ Example applications that allow Database Visualization and Migration are shown
  • 23. Conclusions & Future work (ii)
      ▶ Future work includes:
        ▶ Building a NoSQL Database Tool Set (NoSQL Data Engineering)
        ▶ DSL for Entity Version migration
        ▶ Refining the Schema to allow a richer Type System
          ▶ Allow value ranges or enumerated sets
          ▶ Infer attribute dependencies (derived attributes, i.e. the value of an attribute dictates the value of another attribute)
        ▶ etc.