Code generators and low-code tools need to target a combination of SQL and NoSQL databases as storage for the applications they generate. Our UMLtoNoSQL solution enables this.
2. Why Heterogeneous Datastores?
Many data storage solutions
Relational databases are still massively used
More than 250 solutions in the NoSQL family alone
Efficient data storage and processing solutions
Specific representations
Graphs, Documents, Key-Values …
Flexible schemas, high scalability, availability …
Specific use cases
Atomic accesses, highly-connected data, temporal versioning …
3. Why Heterogeneous Datastores?
Applications use multiple solutions to maximize their benefit
Multi-store infrastructure
Cloud-based data storage
Add new features stored in new databases
4. So what’s the problem?
Defining such applications is a complex task
Need to take into account multiple storage types
Query languages
Data representation
Implicit schemas
Manually implemented in the application
The elephant in the room: querying multiple data sources
5. Conceptual Modeling to the rescue
Several solutions to map conceptual schemas to specific data stores
UML/ER to Relational DB
UML to GraphDB
UML to HBase
Limited support for multi-store systems
Splitting data across multiple storage solutions
Integrity constraints
Uniform data access
6. UmlTo[No]SQL
MDA approach for multi-store systems
From UML/OCL
Conceptual schema partitioning
"Logical" schema generation
Constraint to query translation
Code generation
7. UmlTo[No]SQL – Starting Point
Conceptual schema
context Client inv maxUnpaidOrders:
  self.orders->select(o | not o.paid)->size() < 3
Integrity constraints
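As an illustration only, the conceptual schema and its invariant can be mirrored in plain Python. The attribute names `name` and `reference` are assumptions, since the slide only shows `orders` and `paid`; this is a sketch of what the constraint means, not output of the tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    reference: str  # hypothetical attribute, not shown on the slide
    paid: bool = False


@dataclass
class Client:
    name: str  # hypothetical attribute, not shown on the slide
    orders: List[Order] = field(default_factory=list)

    def max_unpaid_orders(self) -> bool:
        # Mirrors: self.orders->select(o | not o.paid)->size() < 3
        return len([o for o in self.orders if not o.paid]) < 3


alice = Client("Alice", [Order("o1"), Order("o2", paid=True), Order("o3")])
print(alice.max_unpaid_orders())  # two unpaid orders, so the invariant holds
```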
8. UmlTo[No]SQL – Model Partitioning
Model partitioning
UML profile
Use packages to define regions over the model
Provide datastore-specific information: relational, document, etc.
Defined at the element level
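A minimal sketch of what a partitioned model carries, assuming regions are packages tagged with a datastore family. The region names, the `Product` class, and both helper functions are hypothetical; the actual UML profile is not shown in the slides.

```python
from collections import defaultdict

# Hypothetical region annotations: package name -> datastore family.
REGIONS = {"business": "relational", "catalog": "document"}

# Hypothetical class placement: class name -> package (region).
CLASS_REGION = {"Client": "business", "Order": "catalog", "Product": "catalog"}

# Associations of the conceptual schema as (source, target) pairs.
ASSOCIATIONS = [("Client", "Order"), ("Order", "Product")]


def classes_by_store(regions, class_region):
    """Group classes by the datastore family of their region."""
    grouped = defaultdict(list)
    for cls, region in class_region.items():
        grouped[regions[region]].append(cls)
    return dict(grouped)


def cross_store_associations(regions, class_region, associations):
    """Keep only the associations whose two ends live in different stores."""
    return [
        (source, target)
        for source, target in associations
        if regions[class_region[source]] != regions[class_region[target]]
    ]


print(classes_by_store(REGIONS, CLASS_REGION))
print(cross_store_associations(REGIONS, CLASS_REGION, ASSOCIATIONS))
```

The second helper singles out the associations that need special handling later, because their ends are stored in different databases.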
9. UmlTo[No]SQL – Logical Schema
Metamodels representing families of data stores
Relational Metamodel
10. UmlTo[No]SQL – Logical Schema
Metamodels representing families of data stores
Graph Metamodel
11. UmlTo[No]SQL – Logical Schema
Metamodels representing families of data stores
Document Metamodel
12. UmlTo[No]SQL – Logical Schema
Model Transformations
(Annotated) class diagram to
Relational metamodel
Graph metamodel
Document metamodel
A common UUID type to represent cross-datastore associations
Optional transformations from family metamodel to specific platform
E.g. DocumentDB to MongoDB / GraphDB to Neo4j
Integrate advanced optimizations (indexes, specific data structures, etc.)
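The transformations above might be pictured as follows. This is a simplified sketch: the type mapping, both function names, and the dictionary layout are assumptions, and SQLite stands in for the relational platform. UUIDs are stored as text so that the same identifier can be used in either store.

```python
import sqlite3
import uuid

# Assumed mapping from UML primitive types to SQLite column types.
TYPE_MAP = {"String": "TEXT", "Boolean": "INTEGER", "Integer": "INTEGER"}


def to_relational_ddl(class_name, attributes):
    """Render one class as a CREATE TABLE statement with a UUID primary key."""
    columns = ["id TEXT PRIMARY KEY"]  # UUID stored as text
    columns += [f"{name} {TYPE_MAP[typ]}" for name, typ in attributes.items()]
    return f"CREATE TABLE {class_name.lower()} ({', '.join(columns)})"


def to_document_collection(class_name, attributes):
    """Describe one class as a document collection keyed by a UUID."""
    return {"collection": class_name.lower(), "fields": {"_id": "UUID", **attributes}}


ddl = to_relational_ddl("Client", {"name": "String"})
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, client_id TEXT)")

client_id = str(uuid.uuid4())
order_id = str(uuid.uuid4())
conn.execute("INSERT INTO client (id, name) VALUES (?, ?)", (client_id, "Alice"))

# The order itself lives in the document store; only its UUID crosses over.
order_doc = {"_id": order_id, "paid": False}
conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, client_id))
```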
13. UmlTo[No]SQL – Mapping Constraints
Model Transformations
Metamodels for specific query languages
Single datastore constraints mapping
OCL → SQL
OCL → Gremlin
OCL → MongoQL
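For the single-datastore case, a hand-written SQL counterpart of maxUnpaidOrders gives an idea of the target of an OCL → SQL mapping. It assumes that both Client and Order sit in the relational region, with a hypothetical `orders(id, client_id, paid)` table; it is not the query the tool generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (id TEXT PRIMARY KEY);
    CREATE TABLE orders (id TEXT PRIMARY KEY, client_id TEXT, paid INTEGER);
    INSERT INTO client VALUES ('c1'), ('c2');
    INSERT INTO orders VALUES
        ('o1', 'c1', 0), ('o2', 'c1', 0), ('o3', 'c1', 0),
        ('o4', 'c2', 0), ('o5', 'c2', 1);
""")

# Clients violating: self.orders->select(o | not o.paid)->size() < 3
VIOLATIONS_SQL = """
    SELECT c.id
    FROM client c
    JOIN orders o ON o.client_id = c.id
    WHERE o.paid = 0
    GROUP BY c.id
    HAVING COUNT(*) >= 3
"""

violating = [row[0] for row in conn.execute(VIOLATIONS_SQL)]
print(violating)  # ['c1']
```

Returning the violating instances, rather than a single boolean, is a design choice of this sketch: an empty result means the invariant holds for every client.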
14. UmlTo[No]SQL – Mapping Constraints
Model Transformations
Cross-datastore constraints mapping
Split the query into datastore-specific sub-queries
Generate intermediate variables to store the results
Join functions based on the UUID datatype
20. UmlTo[No]SQL – Mapping Constraints
context Client inv maxUnpaidOrders:
  self.orders->select(o | not o.paid)->size() < 3
Client.allInstances().orders
orders->select(o | not o.paid)->size() < 3
sql_orders := select order_id from orders where client_id in (select id from client)
mongo_orders := relToDoc(sql_orders)
db.order.find({_id : {$in: mongo_orders}, $where : "!o.paid"}).length < 3
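The three steps of this example can be replayed in a small, self-contained sketch. SQLite plays the relational store and a plain list plays the document store; `rel_to_doc` and `find_unpaid` are hypothetical stand-ins for `relToDoc` and the MongoDB `find` call, whose real implementations are not shown in the slides. Like the slide, the sketch evaluates the expression over the orders of all clients at once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (id TEXT PRIMARY KEY);
    CREATE TABLE orders (order_id TEXT PRIMARY KEY, client_id TEXT);
    INSERT INTO client VALUES ('c1');
    INSERT INTO orders VALUES ('o1', 'c1'), ('o2', 'c1'), ('o3', 'c1');
""")

# Stand-in for the document store: a list of order documents.
ORDER_DOCS = [
    {"_id": "o1", "paid": False},
    {"_id": "o2", "paid": True},
    {"_id": "o3", "paid": False},
]


def rel_to_doc(rows):
    """Hypothetical relToDoc: turn relational rows into a set of document ids."""
    return {row[0] for row in rows}


def find_unpaid(docs, ids):
    """Mimics find({_id: {$in: ids}}) restricted to unpaid orders."""
    return [doc for doc in docs if doc["_id"] in ids and not doc["paid"]]


# Step 1: relational sub-query, result kept in an intermediate variable.
sql_orders = conn.execute(
    "SELECT order_id FROM orders WHERE client_id IN (SELECT id FROM client)"
).fetchall()
# Step 2: convert the identifiers for the document store.
mongo_orders = rel_to_doc(sql_orders)
# Step 3: document sub-query plus the final comparison.
constraint_holds = len(find_unpaid(ORDER_DOCS, mongo_orders)) < 3
print(constraint_holds)  # True
```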
21. UmlTo[No]SQL – Code Generation
Deploy the database
Relational world: DDL scripts
NoSQL world: configuration scripts when possible
Runtime data access
Custom data access API
From conceptual model (e.g. getClient, createOrder)
Manages the concrete datastore
Constraint checking
Native query execution
Query orchestration for cross-datastore constraints
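A data access API of this kind could look roughly like the sketch below. Everything here is an assumption made for illustration: the class name `DataAccess`, the method signatures, and the in-memory dictionaries that stand in for the relational and document regions. The point is only that callers use conceptual-level operations while the facade handles storage and constraint checking.

```python
import uuid


class DataAccess:
    """Facade in the spirit of a generated API; both stores are in-memory stand-ins."""

    def __init__(self):
        self.clients = {}        # stand-in for the relational region
        self.client_orders = {}  # client id -> list of order UUIDs (association)
        self.order_docs = {}     # stand-in for the document region

    def create_client(self, name):
        client_id = str(uuid.uuid4())
        self.clients[client_id] = {"name": name}
        self.client_orders[client_id] = []
        return client_id

    def get_client(self, client_id):
        return self.clients[client_id]

    def unpaid_count(self, client_id):
        ids = self.client_orders[client_id]
        return sum(1 for order_id in ids if not self.order_docs[order_id]["paid"])

    def create_order(self, client_id, paid=False):
        # Check maxUnpaidOrders before writing: the unpaid count must stay below 3.
        if not paid and self.unpaid_count(client_id) + 1 >= 3:
            raise ValueError("maxUnpaidOrders would be violated")
        order_id = str(uuid.uuid4())
        self.order_docs[order_id] = {"paid": paid}
        self.client_orders[client_id].append(order_id)
        return order_id
```

Usage: after `api = DataAccess()` and `c = api.create_client("Alice")`, two calls to `api.create_order(c)` succeed, and a third unpaid order raises `ValueError`.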
22. Conclusion
Top-down approach for multi-database application design
Model partitioning
Region mapping
Constraint mapping (including cross-datastore)
Code generation + runtime data access
23. Future Work
Performance evaluation
Integrate existing cross-datastore query languages
CloudMdsQL, Apache Drill …
Automatic schema partitioning
Reverse direction is also promising: extracting conceptual schemas from heterogeneous datastores
Applications in legacy systems / open data / …