SlideShare ist ein Scribd-Unternehmen logo
1 von 40
Downloaden Sie, um offline zu lesen
Efficient Persistence, Query, and
Transformation of Large Models
Gwendal DANIEL
Inria, IMT Atlantique, LS2N
gwendal.daniel@inria.fr
Supervisor: Jordi Cabot
Co-Supervisor: Gerson Sunyé
Co-Supervisor: Massimo Tisi
Context
“A model of a system or process is a theoretical description that can help you understand how the
system or process works, or how it might work.’’ (Collins 2017)
2
Credit: https://www.autodesk.com/solutions/bim
Context
• Software Modeling
3
Forward Engineering
Reverse Engineering
Context
4
Injection
Transformations
Queries
Generation
Model Driven
Reverse Engineering
Context
5
Injection
Transformations
Queries
Generation
Model Driven
Reverse Engineering
Execution Time
Latencies
Memory
Consumption
Limited adoption in the industry
• Scalability of existing solutions
• Large generated models
[J. Whittle et. al., 2014]
Research Challenges
1. Storing large models
• Efficient data structures
• Efficient persistence solutions
• Efficient memory management
6
2. Accessing stored models efficiently
• Data representation
• Caching and prefetching
3. Querying large models
• Complex model navigations
• Efficient constraint checking
4. Transforming large models
• Refactoring operations
• Code generation
Contributions
• A novel scalable modeling
ecosystem
7
MogwaïGremlin-ATL
NeoEMF PrefetchML
Scalable Model Persistence
Efficient
Transformations
Efficient Queries
1. A multi-database model
persistence solution
4. A scalable model transformation
engine
3. An OCL to database query
language approach
2. A model prefetching and caching
component
Research Process
8
State of the Art
Analysis
Problematic
Definition
Hypothesis &
Solution
Prototype
Implementation
Evaluation &
Analysis
Hypothesis
Validation and
Reformulation
Challenge
Identification
Efficient Model Persistence
9
NeoEMF
MogwaïGremlin-ATL
NeoEMF PrefetchML
Scalable Model Persistence
Efficient
Transformations
Efficient Queries
Model Persistence
• Default serialization mechanism: XMI
• Scalable model persistence frameworks: CDO, Morsa
• Use databases to store models
• Lazy loading
• Low memory footprint
10
NeoEMF
State of the Art
Analysis
Problematic
• Performance & memory issues
• [Pagan et.al., 2011; Gomez et. al., 2015]
• Mostly relational solutions
• Generic scalability improvements
• Single storage solutions
• Not aware of the modeling scenario
• External (non-standard) APIs
11
NeoEMF
Problematic
Definition
NeoEMF
• A rich infrastructure that allows to choose the database and data
representation to use.
• Compliant with existing modeling solutions
• Extensible architecture
• Memory efficiency
12
NeoEMF
Hypothesis
& Solution
NeoEMF
• NoSQL persistence
• Specific purpose
• Efficient to process large amount of data
• Built-in scaling capabilities
• Advanced query language
13
NeoEMF
Hypothesis
& solution
NeoEMF
14
Model-Based Tools
Modeling Framework
Persistence
Framework
NeoEMF Core
/Graph
NeoEMF Advanced API
Importers
/Map /Column
Data-store
Connector
Model Access API
Persistence API
Backend Connector
Backend API
Caching
Blueprints MapDB
BerkeleyDB
HBase +
Zookeeper
NeoEMF
Prototype
Implementation
NeoEMF
• Implementation
• Eclipse plugin
• Maven central
• Open source (EPL v2)
• 1700 Unit Tests
• 80% code coverage
15
NeoEMF
Prototype
Implementation
Evaluation
• Reference Benchmarks
• Transformation Tool Contest (TTC)
• Grabats
• Model Driven Engineering integration
• MoDisco
• Persistence layer for the MONDO platform
• European project on MDE scalability
• JMH Benchmarks
• Reproducible using a Docker image
16
NeoEMF
Evaluation
Evaluation
17
0
10000
20000
30000
40000
50000
60000
70000
80000
90000
100000
set1 set2 set3
Invisible Methods Results (ms)
EMF-XMI CDO NeoEMF/Graph NeoEMF/Map
OutOfMemory
549892
693537
… …
NeoEMF
3 models (6000 to 1.5M elements)
Constrained memory environment
4 17
Evaluation
& Analysis
Model Caching and Prefetching
18
PrefetchML
MogwaïGremlin-ATL
NeoEMF PrefetchML
Scalable Model Persistence
Efficient
Transformations
Efficient Queries
Prefetching & Caching
• Prefetching
• Bring objects in memory before they are requested
• Caching
• Retain objects in memory to speed-up their access
• Integrated in databases and file systems
• Speeds-up I/O intensive applications
19
PrefetchML
State of the Art
Analysis
Problematic
• Database prefetchers & caches
• Lack of fine-grained configuration
• Tightly coupled to data representation
• Not common in NoSQL stores
• Prefetching components in persistence frameworks
• Not expressive enough
• Lack of fine-grained strategies
20
PrefetchML
Problematic
Definition
Hypothesis
• We need to prefetch and cache models
• Scalable persistence frameworks
• Databases to store large models (relational, NoSQL)
• Latencies to bring elements from the database (lazy-loading)
• Metamodel information
• High-level prefetching and caching rules
• Decoupled from the data representation
21
PrefetchML
Hypothesis
& Solution
PrefetchML
• A prefetching and caching framework at the model level
• PrefetchML DSL (rule definition)
• Metamodel and model level
• Use case dependent
• Readable
• PrefetchML Engine (rule execution)
• Datastore independent
• Transparent to the persistence solution
22
PrefetchML
Hypothesis
& Solution
PrefetchML
Import ‘http://www.example.com/Java’
plan samplePlan {
rule r1: on starting
fetch Package.allInstances()
}
23
rule r2: on access type Class
fetch self.methods.modifier
use cache LRU[size=100, chunk=10]
(self.name = ‘MyClass’)
remove type Package
p1: Package
+ name = 'package1'
c1: Class
+ name = 'class1'
c2: Class
+ name = 'class2'
m1: Method
+ name = 'method1'
t1: Type
+ name = 'void'
mod1: Modifier
+ visibility = #private
classes
classes
imports
methods
returnType
modifier
PrefetchML
Prototype
Implementation
Evaluation
• Up to 18% improvement on NeoEMF/Graph
• Up to 95% improvement on NeoEMF/Map
• Invalid plans detected at runtime
• New rule suggestions
24
PrefetchML
Evaluation
& Analysis
Efficient Model Queries
25
Mogwaï
MogwaïGremlin-ATL
NeoEMF PrefetchML
Scalable Model Persistence
Efficient
Transformations
Efficient Queries
State of the Art
26
API Call1
…
API Calln
OCL
Query
OCL
Interpreter
EMF API Database
Package.allInstances().name
get(p1)
get(p1,name)
…
get(pn)
get(pn,name)
Modeler
Point of View
Under the Hood
Low-level modeling API
→ Not aligned with the
database capabilities
Fragmented queries
→ Not efficient
→ Remote database
Intermediate objects
→ Memory consumption
→ Execution time overhead
Mogwaï
State of the Art
Analysis
Problematic
• Not efficient to compute model queries
• Database queries are more efficient but
• Modern persistence frameworks typically rely on NoSQL databases
• Multiple query languages
• Multiple data representations
• Low-level queries are hard to understand and maintain
• Modeling expertise vs. Database expertise
• Solution: generate them!
27
Mogwaï
Problematic
Definition
Mogwaï
28
OCL
Query
Model
Transformation
Gremlin
Traversal
Blueprints
Database
Reification
Modeler
v1
v3
v2
p1: Package
p2: Package
p3: Package
• Generate graph database
queries from OCL expressions
• Bypass modeling framework
API
• Gremlin Query Language
Mogwaï
Package.allInstances()
• Single execution of the queries
• Compatible with existing
modeling frameworks
Hypothesis
& Solution
Mogwai
• Gremlin metamodel
• ~ 100 classes
• ATL Transformation
• OCL to Gremlin mapping
• Query composition
• 70 rules and helpers
29
Mogwaï
Package.allInstances().classes
->select(c | c.name = ‘class1’)
g.idx(‘’metaclasses’’)[[name:’’Package’’]]
inE(‘’instanceof’’).outV.
outE(‘’classes’’).inV.
filter{it.name=‘class1’}
OCL Expression Gremlin Steps
Type g.Idx(‘’metaclasses’’)[[name:’Type’]]
AllInstances() inE(‘instanceof’).outV
Collect(att) Att
Collect(ref) outE(‘ref’).inV
Select(cond) Filter{cond}
oclIsTypeOf(Type) outE(‘instanceof’).inV.transform{it.next() ==
Type}
Prototype
Implementation
30
0
10000
20000
30000
40000
50000
60000
70000
80000
90000
100000
set2 set3
Invisible Methods Results (ms)
OCL IncQuery Mogwaï NeoEMF/Map
175000
……
606000
Mogwaï
2 models (80000 to 1.5M elements)
Constrained memory environment
NeoEMF/Graph
Evaluation
Evaluation
& Analysis
Scalable Model Transformations
31
Gremlin-ATL
MogwaïGremlin-ATL
NeoEMF PrefetchML
Scalable Model Persistence
Efficient
Transformations
Efficient Queries
• Extension of Mogwaï for model transformations
• ATL to Gremlin transformation
• Multiple database support
• Advanced configuration
Gremlin-ATL
32
ATLtoGremlin
Transformation
Gremlin
Traversal
Output
DatabaseATL Transformation
Input
Database
• Evaluation: reference transformation
• Comparison with the standard ATL engine
• Reduces the execution time by a factor of 2 to 134
• Reduces the memory consumption by a factor of 10
Gremlin-ATL
Conclusion
Conclusion
• Novel modeling ecosystem
• Scalable persistence approach
• Multi-databases: adapted to the expected modeling scenario
• Compliant with existing modeling solutions
• On-demand loading
• Prefetching and caching component
• Improve large model access efficiency
• Define precise plans adapted to a given model access pattern
34
Scalable Model
Persistence
Conclusion
• Novel modeling ecosystem
• Translation-based approach for query and
transformation computation
• Standard language support
• Generate database-specific operations
• Bypasses modeling framework API
• Positive performances on large models
35
Efficient Model
Queries & Transformations
Perspectives & Future Work
• NeoEMF
• Combine multiple databases
• Multi-resource database
• Integrate collaborative modeling techniques
• PrefetchML
• Automatic plan generation
• Static analysis: existing code
• Dynamic analysis: execution logs
• Advanced monitoring component
• Rule creation / deletion
• Rule modifications
36
Perspectives & Future Work
• Application outside the MDE community
• UML to GraphDB
• Use NeoEMF internal mapping and Mogwaï to generate graph code from UML
• Gremlin-ATL to express data migration operations
• Promising results on the Neo4j panama paper database
• Native database prefetching with PrefetchML
• A model-based prefetcher for NoSQL databases
37
Publications
• Journals
• G. Daniel, G. Sunyé, A. Benelallam, M. Tisi, Y. Vernageau, A. Gomez & J. Cabot
NeoEMF, a Multi-Database Model Persistence Framework for Very Large Models. SCP 2017
• G. Daniel, G. Sunyé, J. Cabot
Advanced Prefetching and Caching of Models with PrefetchML. SoSym 2018 (accepted)
• International Conferences
• G. Daniel, G. Sunyé, J. Cabot
Mogwaï, a Framework to Handle Complex Queries on Large Models. RCIS 2016
• G. Daniel, G. Sunyé, J. Cabot
PrefetchML, a Framework for Prefetching and Caching Models. MoDELS 2016. Distinguished paper award
• G. Daniel, G. Sunyé, J. Cabot
UMLtoGraphDB: Mapping Conceptual Schemas to Graph Databases. ER 2016
• G. Daniel, F. Jouault, G. Sunyé, J. Cabot
Gremlin-ATL: a Scalable Model Transformation Framework. ASE 2017
• 3 Workshops
38
Websites & Software Repositories
NeoEMF: neoemf.com
PrefetchML: github.com/atlanmod/Prefetching_Caching_DSL
Mogwaï/Gremlin-ATL: github.com/atlanmod/mogwai
UmltoGraphDB : github.com/atlanmod/uml2nosql
39
40
Question?
Thank you for your attention!
☺ P, F, M, A, P-Y, R, M, G, E, J, S

Weitere ähnliche Inhalte

Ähnlich wie Thesis Defense (Gwendal DANIEL) - Nov 2017

StackNet Meta-Modelling framework
StackNet Meta-Modelling frameworkStackNet Meta-Modelling framework
StackNet Meta-Modelling frameworkSri Ambati
 
IncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the CloudIncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the CloudGábor Szárnyas
 
Clipper: A Low-Latency Online Prediction Serving System
Clipper: A Low-Latency Online Prediction Serving SystemClipper: A Low-Latency Online Prediction Serving System
Clipper: A Low-Latency Online Prediction Serving SystemDatabricks
 
Search-Based Robustness Testing of Data Processing Systems
Search-Based Robustness Testing of Data Processing SystemsSearch-Based Robustness Testing of Data Processing Systems
Search-Based Robustness Testing of Data Processing SystemsLionel Briand
 
Deep learning with keras
Deep learning with kerasDeep learning with keras
Deep learning with kerasMOHITKUMAR1379
 
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...Benoit Combemale
 
Gremlin-ATL: a Scalable Model Transformation Framework
Gremlin-ATL: a Scalable Model Transformation FrameworkGremlin-ATL: a Scalable Model Transformation Framework
Gremlin-ATL: a Scalable Model Transformation FrameworkGwendal Daniel
 
Legion - AI Runtime Platform
Legion -  AI Runtime PlatformLegion -  AI Runtime Platform
Legion - AI Runtime PlatformAlexey Kharlamov
 
230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptxArthur240715
 
DDD, CQRS and testing with ASP.Net MVC
DDD, CQRS and testing with ASP.Net MVCDDD, CQRS and testing with ASP.Net MVC
DDD, CQRS and testing with ASP.Net MVCAndy Butland
 
Model-Driven Cloud Data Storage
Model-Driven Cloud Data StorageModel-Driven Cloud Data Storage
Model-Driven Cloud Data Storagejccastrejon
 
Modular ObjectLens
Modular ObjectLensModular ObjectLens
Modular ObjectLensESUG
 
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuantUniversity
 
(CMP305) Deep Learning on AWS Made EasyCmp305
(CMP305) Deep Learning on AWS Made EasyCmp305(CMP305) Deep Learning on AWS Made EasyCmp305
(CMP305) Deep Learning on AWS Made EasyCmp305Amazon Web Services
 
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...inside-BigData.com
 
Tensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with HummingbirdTensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with HummingbirdDatabricks
 
Clipper at UC Berkeley RISECamp 2017
Clipper at UC Berkeley RISECamp 2017Clipper at UC Berkeley RISECamp 2017
Clipper at UC Berkeley RISECamp 2017Dan Crankshaw
 
New types of tests for Java projects
New types of tests for Java projectsNew types of tests for Java projects
New types of tests for Java projectsVincent Massol
 
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)Benoit Combemale
 

Ähnlich wie Thesis Defense (Gwendal DANIEL) - Nov 2017 (20)

StackNet Meta-Modelling framework
StackNet Meta-Modelling frameworkStackNet Meta-Modelling framework
StackNet Meta-Modelling framework
 
IncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the CloudIncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the Cloud
 
Clipper: A Low-Latency Online Prediction Serving System
Clipper: A Low-Latency Online Prediction Serving SystemClipper: A Low-Latency Online Prediction Serving System
Clipper: A Low-Latency Online Prediction Serving System
 
Search-Based Robustness Testing of Data Processing Systems
Search-Based Robustness Testing of Data Processing SystemsSearch-Based Robustness Testing of Data Processing Systems
Search-Based Robustness Testing of Data Processing Systems
 
Deep learning with keras
Deep learning with kerasDeep learning with keras
Deep learning with keras
 
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
On Modeling and Testing When Unpredictability Becomes the Pattern (April 2nd,...
 
Gremlin-ATL: a Scalable Model Transformation Framework
Gremlin-ATL: a Scalable Model Transformation FrameworkGremlin-ATL: a Scalable Model Transformation Framework
Gremlin-ATL: a Scalable Model Transformation Framework
 
Legion - AI Runtime Platform
Legion -  AI Runtime PlatformLegion -  AI Runtime Platform
Legion - AI Runtime Platform
 
230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx230208 MLOps Getting from Good to Great.pptx
230208 MLOps Getting from Good to Great.pptx
 
DDD, CQRS and testing with ASP.Net MVC
DDD, CQRS and testing with ASP.Net MVCDDD, CQRS and testing with ASP.Net MVC
DDD, CQRS and testing with ASP.Net MVC
 
Model-Driven Cloud Data Storage
Model-Driven Cloud Data StorageModel-Driven Cloud Data Storage
Model-Driven Cloud Data Storage
 
Modular ObjectLens
Modular ObjectLensModular ObjectLens
Modular ObjectLens
 
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
QuTrack: Model Life Cycle Management for AI and ML models using a Blockchain ...
 
(CMP305) Deep Learning on AWS Made EasyCmp305
(CMP305) Deep Learning on AWS Made EasyCmp305(CMP305) Deep Learning on AWS Made EasyCmp305
(CMP305) Deep Learning on AWS Made EasyCmp305
 
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
Abstractions and Directives for Adapting Wavefront Algorithms to Future Archi...
 
Tensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with HummingbirdTensors Are All You Need: Faster Inference with Hummingbird
Tensors Are All You Need: Faster Inference with Hummingbird
 
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
2019 GDRR: Blockchain Data Analytics - QuTrack: Model Life Cycle Management f...
 
Clipper at UC Berkeley RISECamp 2017
Clipper at UC Berkeley RISECamp 2017Clipper at UC Berkeley RISECamp 2017
Clipper at UC Berkeley RISECamp 2017
 
New types of tests for Java projects
New types of tests for Java projectsNew types of tests for Java projects
New types of tests for Java projects
 
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)
Efficient and Advanced Omniscient Debugging for xDSMLs (SLE 2015)
 

Kürzlich hochgeladen

Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸mathanramanathan2005
 
The Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationThe Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationNathan Young
 
SBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSebastiano Panichella
 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comsaastr
 
Work Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxWork Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxmavinoikein
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSebastiano Panichella
 
Event 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxEvent 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxaryanv1753
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Escort Service
 
The 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringThe 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringSebastiano Panichella
 
Genshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxGenshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxJohnree4
 
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...Henrik Hanke
 
Early Modern Spain. All about this period
Early Modern Spain. All about this periodEarly Modern Spain. All about this period
Early Modern Spain. All about this periodSaraIsabelJimenez
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRachelAnnTenibroAmaz
 
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Krijn Poppe
 
Quality by design.. ppt for RA (1ST SEM
Quality by design.. ppt for  RA (1ST SEMQuality by design.. ppt for  RA (1ST SEM
Quality by design.. ppt for RA (1ST SEMCharmi13
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptxogubuikealex
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRRsarwankumar4524
 
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC  - NANOTECHNOLOGYPHYSICS PROJECT BY MSC  - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC - NANOTECHNOLOGYpruthirajnayak525
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...漢銘 謝
 
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...marjmae69
 

Kürzlich hochgeladen (20)

Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸Mathan flower ppt.pptx slide orchids ✨🌸
Mathan flower ppt.pptx slide orchids ✨🌸
 
The Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism PresentationThe Ten Facts About People With Autism Presentation
The Ten Facts About People With Autism Presentation
 
SBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation TrackSBFT Tool Competition 2024 -- Python Test Case Generation Track
SBFT Tool Competition 2024 -- Python Test Case Generation Track
 
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.comSaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
SaaStr Workshop Wednesday w/ Kyle Norton, Owner.com
 
Work Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptxWork Remotely with Confluence ACE 2.pptx
Work Remotely with Confluence ACE 2.pptx
 
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with AerialistSimulation-based Testing of Unmanned Aerial Vehicles with Aerialist
Simulation-based Testing of Unmanned Aerial Vehicles with Aerialist
 
Event 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptxEvent 4 Introduction to Open Source.pptx
Event 4 Introduction to Open Source.pptx
 
Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170Call Girls In Aerocity 🤳 Call Us +919599264170
Call Girls In Aerocity 🤳 Call Us +919599264170
 
The 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software EngineeringThe 3rd Intl. Workshop on NL-based Software Engineering
The 3rd Intl. Workshop on NL-based Software Engineering
 
Genshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptxGenshin Impact PPT Template by EaTemp.pptx
Genshin Impact PPT Template by EaTemp.pptx
 
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
DGT @ CTAC 2024 Valencia: Most crucial invest to digitalisation_Sven Zoelle_v...
 
Early Modern Spain. All about this period
Early Modern Spain. All about this periodEarly Modern Spain. All about this period
Early Modern Spain. All about this period
 
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATIONRACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
RACHEL-ANN M. TENIBRO PRODUCT RESEARCH PRESENTATION
 
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
Presentation for the Strategic Dialogue on the Future of Agriculture, Brussel...
 
Quality by design.. ppt for RA (1ST SEM
Quality by design.. ppt for  RA (1ST SEMQuality by design.. ppt for  RA (1ST SEM
Quality by design.. ppt for RA (1ST SEM
 
Chizaram's Women Tech Makers Deck. .pptx
Chizaram's Women Tech Makers Deck.  .pptxChizaram's Women Tech Makers Deck.  .pptx
Chizaram's Women Tech Makers Deck. .pptx
 
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRRINDIAN GCP GUIDELINE. for Regulatory  affair 1st sem CRR
INDIAN GCP GUIDELINE. for Regulatory affair 1st sem CRR
 
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC  - NANOTECHNOLOGYPHYSICS PROJECT BY MSC  - NANOTECHNOLOGY
PHYSICS PROJECT BY MSC - NANOTECHNOLOGY
 
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
THE COUNTRY WHO SOLVED THE WORLD_HOW CHINA LAUNCHED THE CIVILIZATION REVOLUTI...
 
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
Gaps, Issues and Challenges in the Implementation of Mother Tongue Based-Mult...
 

Thesis Defense (Gwendal DANIEL) - Nov 2017

  • 1. Efficient Persistence, Query, and Transformation of Large Models Gwendal DANIEL Inria, IMT Atlantique, LS2N gwendal.daniel@inria.fr Supervisor: Jordi Cabot Co-Supervisor: Gerson Sunyé Co-Supervisor: Massimo Tisi
  • 2. Context “A model of a system or process is a theoretical description that can help you understand how the system or process works, or how it might work.’’ (Collins 2017) 2 Credit: https://www.autodesk.com/solutions/bim
  • 3. Context • Software Modeling 3 Forward Engineering Reverse Engineering
  • 5. Context 5 Injection Transformations Queries Generation Model Driven Reverse Engineering Execution Time Latencies Memory Consumption Limited adoption in the industry • Scalability of existing solutions • Large generated models [J. Whittle et. al., 2014]
  • 6. Research Challenges 1. Storing large models • Efficient data structures • Efficient persistence solutions • Efficient memory management 6 2. Accessing stored models efficiently • Data representation • Caching and prefetching 3. Querying large models • Complex model navigations • Efficient constraint checking 4. Transforming large models • Refactoring operations • Code generation
  • 7. Contributions • A novel scalable modeling ecosystem 7 MogwaïGremlin-ATL NeoEMF PrefetchML Scalable Model Persistence Efficient Transformations Efficient Queries 1. A multi-database model persistence solution 4. A scalable model transformation engine 3. An OCL to database query language approach 2. A model prefetching and caching component
  • 8. Research Process 8 State of the Art Analysis Problematic Definition Hypothesis & Solution Prototype Implementation Evaluation & Analysis Hypothesis Validation and Reformulation Challenge Identification
  • 9. Efficient Model Persistence 9 NeoEMF MogwaïGremlin-ATL NeoEMF PrefetchML Scalable Model Persistence Efficient Transformations Efficient Queries
  • 10. Model Persistence • Default serialization mechanism: XMI • Scalable model persistence frameworks: CDO, Morsa • Use databases to store models • Lazy loading • Low memory footprint 10 NeoEMF State of the Art Analysis
  • 11. Problematic • Performance & memory issues • [Pagan et.al., 2011; Gomez et. al., 2015] • Mostly relational solutions • Generic scalability improvements • Single storage solutions • Not aware of the modeling scenario • External (non-standard) APIs 11 NeoEMF Problematic Definition
  • 12. NeoEMF • A rich infrastructure that allows to choose the database and data representation to use. • Compliant with existing modeling solutions • Extensible architecture • Memory efficiency 12 NeoEMF Hypothesis & Solution
  • 13. NeoEMF • NoSQL persistence • Specific purpose • Efficient to process large amount of data • Built-in scaling capabilities • Advanced query language 13 NeoEMF Hypothesis & solution
  • 14. NeoEMF 14 Model-Based Tools Modeling Framework Persistence Framework NeoEMF Core /Graph NeoEMF Advanced API Importers /Map /Column Data-store Connector Model Access API Persistence API Backend Connector Backend API Caching Blueprints MapDB BerkeleyDB HBase + Zookeeper NeoEMF Prototype Implementation
  • 15. NeoEMF • Implementation • Eclipse plugin • Maven central • Open source (EPL v2) • 1700 Unit Tests • 80% code coverage 15 NeoEMF Prototype Implementation
  • 16. Evaluation • Reference Benchmarks • Transformation Tool Contest (TTC) • Grabats • Model Driven Engineering integration • MoDisco • Persistence layer for the MONDO platform • European project on MDE scalability • JMH Benchmarks • Reproducible using a Docker image 16 NeoEMF Evaluation
  • 17. Evaluation 17 0 10000 20000 30000 40000 50000 60000 70000 80000 90000 100000 set1 set2 set3 Invisible Methods Results (ms) EMF-XMI CDO NeoEMF/Graph NeoEMF/Map OutOfMemory 549892 693537 … … NeoEMF 3 models (6000 to 1.5M elements) Constrained memory environment 4 17 Evaluation & Analysis
  • 18. Model Caching and Prefetching 18 PrefetchML MogwaïGremlin-ATL NeoEMF PrefetchML Scalable Model Persistence Efficient Transformations Efficient Queries
  • 19. Prefetching & Caching • Prefetching • Bring objects in memory before they are requested • Caching • Retain objects in memory to speed-up their access • Integrated in databases and file systems • Speeds-up I/O intensive applications 19 PrefetchML State of the Art Analysis
  • 20. Problematic • Database prefetchers & caches • Lack of fine-grained configuration • Tightly coupled to data representation • Not common in NoSQL stores • Prefetching components in persistence frameworks • Not expressive enough • Lack of fine-grained strategies 20 PrefetchML Problematic Definition
  • 21. Hypothesis • We need to prefetch and cache models • Scalable persistence frameworks • Databases to store large models (relational, NoSQL) • Latencies to bring elements from the database (lazy-loading) • Metamodel information • High-level prefetching and caching rules • Decoupled from the data representation 21 PrefetchML Hypothesis & Solution
  • 22. PrefetchML • A prefetching and caching framework at the model level • PrefetchML DSL (rule definition) • Metamodel and model level • Use case dependent • Readable • PrefetchML Engine (rule execution) • Datastore independent • Transparent to the persistence solution 22 PrefetchML Hypothesis & Solution
  • 23. PrefetchML Import ‘http://www.example.com/Java’ plan samplePlan { rule r1: on starting fetch Package.allInstances() } 23 rule r2: on access type Class fetch self.methods.modifier use cache LRU[size=100, chunk=10] (self.name = ‘MyClass’) remove type Package p1: Package + name = 'package1' c1: Class + name = 'class1' c2: Class + name = 'class2' m1: Method + name = 'method1' t1: Type + name = 'void' mod1: Modifier + visibility = #private classes classes imports methods returnType modifier PrefetchML Prototype Implementation
  • 24. Evaluation • Up to 18% improvement on NeoEMF/Graph • Up to 95% improvement on NeoEMF/Map • Invalid plans detected at runtime • New rule suggestions 24 PrefetchML Evaluation & Analysis
  • 25. Efficient Model Queries 25 Mogwaï MogwaïGremlin-ATL NeoEMF PrefetchML Scalable Model Persistence Efficient Transformations Efficient Queries
  • 26. State of the Art 26 API Call1 … API Calln OCL Query OCL Interpreter EMF API Database Package.allInstances().name get(p1) get(p1,name) … get(pn) get(pn,name) Modeler Point of View Under the Hood Low-level modeling API → Not aligned with the database capabilities Fragmented queries → Not efficient → Remote database Intermediate objects → Memory consumption → Execution time overhead Mogwaï State of the Art Analysis
  • 27. Problematic • Not efficient to compute model queries • Database queries are more efficient but • Modern persistence frameworks typically rely on NoSQL databases • Multiple query languages • Multiple data representations • Low-level queries are hard to understand and maintain • Modeling expertise vs. Database expertise • Solution: generate them! 27 Mogwaï Problematic Definition
  • 28. Mogwaï 28 OCL Query Model Transformation Gremlin Traversal Blueprints Database Reification Modeler v1 v3 v2 p1: Package p2: Package p3: Package • Generate graph database queries from OCL expressions • Bypass modeling framework API • Gremlin Query Language Mogwaï Package.allInstances() • Single execution of the queries • Compatible with existing modeling frameworks Hypothesis & Solution
  • 29. Mogwai • Gremlin metamodel • ~ 100 classes • ATL Transformation • OCL to Gremlin mapping • Query composition • 70 rules and helpers 29 Mogwaï Package.allInstances().classes ->select(c | c.name = ‘class1’) g.idx(‘’metaclasses’’)[[name:’’Package’’]] inE(‘’instanceof’’).outV. outE(‘’classes’’).inV. filter{it.name=‘class1’} OCL Expression Gremlin Steps Type g.Idx(‘’metaclasses’’)[[name:’Type’]] AllInstances() inE(‘instanceof’).outV Collect(att) Att Collect(ref) outE(‘ref’).inV Select(cond) Filter{cond} oclIsTypeOf(Type) outE(‘instanceof’).inV.transform{it.next() == Type} Prototype Implementation
  • 30. 30 0 10000 20000 30000 40000 50000 60000 70000 80000 90000 100000 set2 set3 Invisible Methods Results (ms) OCL IncQuery Mogwaï NeoEMF/Map 175000 …… 606000 Mogwaï 2 models (80000 to 1.5M elements) Constrained memory environment NeoEMF/Graph Evaluation Evaluation & Analysis
  • 31. Scalable Model Transformations 31 Gremlin-ATL MogwaïGremlin-ATL NeoEMF PrefetchML Scalable Model Persistence Efficient Transformations Efficient Queries
  • 32. • Extension of Mogwaï for model transformations • ATL to Gremlin transformation • Multiple database support • Advanced configuration Gremlin-ATL 32 ATLtoGremlin Transformation Gremlin Traversal Output DatabaseATL Transformation Input Database • Evaluation: reference transformation • Comparison with the standard ATL engine • Reduces the execution time by a factor of 2 to 134 • Reduces the memory consumption by a factor of 10 Gremlin-ATL
  • 34. Conclusion • Novel modeling ecosystem • Scalable persistence approach • Multi-databases: adapted to the expected modeling scenario • Compliant with existing modeling solutions • On-demand loading • Prefetching and caching component • Improve large model access efficiency • Define precise plans adapted to a given model access pattern 34 Scalable Model Persistence
  • 35. Conclusion • Novel modeling ecosystem • Translation-based approach for query and transformation computation • Standard language support • Generate database-specific operations • Bypasses modeling framework API • Positive performances on large models 35 Efficient Model Queries & Transformations
  • 36. Perspectives & Future Work • NeoEMF • Combine multiple databases • Multi-resource database • Integrate collaborative modeling techniques • PrefetchML • Automatic plan generation • Static analysis: existing code • Dynamic analysis: execution logs • Advanced monitoring component • Rule creation / deletion • Rule modifications 36
  • 37. Perspectives & Future Work • Application outside the MDE community • UML to GraphDB • Use NeoEMF internal mapping and Mogwaï to generate graph code from UML • Gremlin-ATL to express data migration operations • Promising results on the Neo4j panama paper database • Native database prefetching with PrefetchML • A model-based prefetcher for NoSQL databases 37
  • 38. Publications • Journals • G. Daniel, G. Sunyé, A. Benelallam, M. Tisi, Y. Vernageau, A. Gomez & J. Cabot NeoEMF, a Multi-Database Model Persistence Framework for Very Large Models. SCP 2017 • G. Daniel, G. Sunyé, J. Cabot Advanced Prefetching and Caching of Models with PrefetchML. SoSym 2018 (accepted) • International Conferences • G. Daniel, G. Sunyé, J. Cabot Mogwaï, a Framework to Handle Complex Queries on Large Models. RCIS 2016 • G. Daniel, G. Sunyé, J. Cabot PrefetchML, a Framework for Prefetching and Caching Models. MoDELS 2016. Distinguished paper award • G. Daniel, G. Sunyé, J. Cabot UMLtoGraphDB: Mapping Conceptual Schemas to Graph Databases. ER 2016 • G. Daniel, F. Jouault, G. Sunyé, J. Cabot Gremlin-ATL: a Scalable Model Transformation Framework. ASE 2017 • 3 Workshops 38
  • 39. Websites & Software Repositories NeoEMF: neoemf.com PrefetchML: github.com/atlanmod/Prefetching_Caching_DSL Mogwaï/Gremlin-ATL: github.com/atlanmod/mogwai UmltoGraphDB : github.com/atlanmod/uml2nosql 39
  • 40. 40 Question? Thank you for your attention! ☺ P, F, M, A, P-Y, R, M, G, E, J, S