The document describes research on data model evolution. It discusses how data models change over time through the addition, removal, and modification of entities and properties. The research aims to specify data model evolutions through a language that describes transformations at the level of entities and properties. This includes basic transformations like adding/removing entities/properties as well as more advanced transformations like moving properties. The transformations are applied using program rewriting to migrate data between different versions of a data model.
Model Driven Software Development - Data Model Evolution
1. Sander Vermolen
Eelco Visser
Data Model Evolution
This research is supported by NWO/JACQUARD project
638.001.610, MoDSE: Model-Driven Software Evolution.
9. No page count
No revisions
User 1
name bob
real name Bob Johnson
email b.johnson@mail.com
Page 1
title "The first page"
isRedirect false
text "Hello world"
13. Costly
High risk
Holds back the development process
Large infrequent development steps
14. User
id : integer
name : varchar
realName : varchar
email : tinytext
Page
id : integer
title : varchar
author - User *
isRedirect : boolean
content : text
set of
15. User
id : integer Unique
name : varchar
realName : varchar ?
email : tinytext
Page : Medium
author - User min(1) max(8)
content : text
refs : url * Indexed
abstract Medium
id : integer Unique
title : ANY_NAME
22. User                          User
      id :: integer                 id :: integer
      name :: varchar               name :: varchar
      realName :: varchar           realName :: varchar
      email :: tinytext             email :: tinytext
    Page                          Page
      id :: integer                 id :: integer
      title :: varchar              title :: varchar
      author User                   counter :: biginteger
      isRedirect :: boolean         isRedirect :: boolean
      content :: text               revisions set of Revision
                                  Revision
                                    id :: integer
                                    page Page
                                    comment :: tinyblob
                                    timestamp :: time
                                    revisionText :: text
                                    author User
23. What happened?
    Added type revisions
      Revision
        id :: integer
        page Page
        comment :: tinyblob
        timestamp :: time
        author User
    Added attribute revisions
      Page
        revisions set of Revision
    Moved content to revision text
      Revision
        revisionText :: text
    Added attribute counter
      Page
        counter :: biginteger
24. 8 Basic Transformations
    add or remove entity        add or remove property
    change name of entity       change name of property
    change type of set          change type of property
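A minimal sketch of how some of these basic transformations could act on a data model. It assumes, purely for illustration, that a model is a Python dict mapping entity names to property/type dicts; the function names are hypothetical and not taken from the slides.

```python
import copy

# Hypothetical representation: entity name -> {property name -> type name}.
model = {
    "User": {"id": "integer", "name": "varchar", "email": "tinytext"},
    "Page": {"id": "integer", "title": "varchar", "content": "text"},
}

def add_entity(m, entity):
    """Return a copy of the model with a new, empty entity."""
    m = copy.deepcopy(m)
    m[entity] = {}
    return m

def add_property(m, entity, prop, type_name):
    """Return a copy of the model with a property added to an entity."""
    m = copy.deepcopy(m)
    m[entity][prop] = type_name
    return m

def remove_property(m, entity, prop):
    """Return a copy of the model with a property removed from an entity."""
    m = copy.deepcopy(m)
    del m[entity][prop]
    return m

def rename_property(m, entity, old, new):
    """Return a copy of the model with a property renamed."""
    m = copy.deepcopy(m)
    m[entity][new] = m[entity].pop(old)
    return m

def change_type(m, entity, prop, type_name):
    """Return a copy of the model with a property's type changed."""
    m = copy.deepcopy(m)
    m[entity][prop] = type_name
    return m

# Compose basic steps into a small evolution.
evolved = add_property(model, "Page", "counter", "biginteger")
evolved = add_entity(evolved, "Revision")
evolved = add_property(evolved, "Revision", "revisionText", "text")
evolved = remove_property(evolved, "Page", "content")
```

Each function returns a new model, so the old version stays available for comparison while the evolution is built up step by step.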
31. Revision
revisionText :: text
At / Entity Revision / Property timeStamp
move page.content
to revisionText :: text
32. at Entity Page / Property Title
add counter :: biginteger
;
at Entity Revision / Property timeStamp
add revisionText :: text
from page.content
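The specification above can be read as a data migration. The sketch below is one possible reading, not the authors' implementation: objects are plain Python dicts, the helper names are hypothetical, the default counter value of 0 is an assumption, and the move is modelled as "add from" followed by removing the source property.

```python
# Illustrative data: pages and revisions keyed by object id.
pages = {1: {"title": "The first page", "content": "Hello world"}}
revisions = {10: {"page": 1, "comment": "initial"}}

def add_property(objects, name, default):
    """Add a property with a default value to every object that lacks it."""
    for obj in objects.values():
        obj.setdefault(name, default)

def add_from(targets, name, ref, sources, source_prop):
    """Add `name` to each target, copying `source_prop` from the object
    that the target references through its `ref` property."""
    for obj in targets.values():
        obj[name] = sources[obj[ref]][source_prop]

def remove_property(objects, name):
    """Remove a property from every object that has it."""
    for obj in objects.values():
        obj.pop(name, None)

# add counter :: biginteger (default 0 is an assumption)
add_property(pages, "counter", 0)
# add revisionText :: text from page.content
add_from(revisions, "revisionText", "page", pages, "content")
# completing the move: the source property is dropped afterwards
remove_property(pages, "content")
```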
33. Evolving
Data Models
8 basic transformations
1 advanced transformation
Language to specify transformations
Positioning sub language
Specify data model evolutions
37. Program transformation for data migration
Because
we really like program transformations
generally richer than regular data accessing languages (SQL)
data migration = model transformation
38. User(                                   User
      id(1),                                  id :: integer
      name("John"),                           name :: varchar
      email("johnnyboy@mail.com")             email :: tinytext
    )
    Page(                                   Page
      id(2),                                  id :: integer
      title("Hello World"),                   title :: varchar
      [author(1)]                             author User
    )
40. Remove Attribute (1)
    User(                                   User
      id(1),                                  id :: integer
      name("John"),                           name :: varchar
      .....                                   email :: tinytext
    )
    Page(                                   Page
      id(2),                                  id :: integer
      title("Hello World"),                   title :: varchar
      [author(1)]                             author User
    )
    Signature:
    User := Id * Name * Email
41. Remove Attribute (2)
    User(                                   User
      id(1),                                  id :: integer
      name("John"),                           name :: varchar
      email("johnnyboy@mail.com")             email :: tinytext
    )
    Page(                                   Page
      id(2),                                  id :: integer
      title("Hello World"),                   title :: varchar
      [author(1)]                             author User
    )
42. Generic ATerm (GTerm)
    User( 0,                                User
      [
        id(1),                                id :: integer
        name("John"),                         name :: varchar
        email("johnnyboy@mail.com")           email :: tinytext
      ]
    )
    Page( 1,                                Page
      [
        id(2),                                id :: integer
        title("Hello World"),                 title :: varchar
        author(0)                             author User
      ]
    )
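A rough Python analogue of the generic term shape shown above: every object carries its type name, an object id, and a list of named attributes, so one generic function can remove an attribute for any type without a per-type signature. The `gterm` and `remove_attribute` helpers are hypothetical names for illustration; the actual library is written in Stratego.

```python
def gterm(type_name, object_id, **attrs):
    """Build a generic term: a type name, an object id, and named attributes."""
    return {"type": type_name, "id": object_id, "attrs": dict(attrs)}

# The two objects from the slide; author(0) refers to the User object id.
objects = [
    gterm("User", 0, id=1, name="John", email="johnnyboy@mail.com"),
    gterm("Page", 1, id=2, title="Hello World", author=0),
]

def remove_attribute(objs, type_name, attr):
    """Return new objects with `attr` dropped from every object of `type_name`.
    Objects of other types are copied unchanged."""
    result = []
    for obj in objs:
        attrs = dict(obj["attrs"])
        if obj["type"] == type_name:
            attrs.pop(attr, None)
        result.append({"type": obj["type"], "id": obj["id"], "attrs": attrs})
    return result

migrated = remove_attribute(objects, "User", "email")
```

Because attributes are looked up by name rather than by position, the same function works for User, Page, or any type added later.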
45. GTerm Transformation
GTerm library
Object creation
Modifying attributes
(add, remove, change, rename, ...)
Object equivalence
Object traversals
(Object graph traversals)
Data model library
Type examination
Super/Sub type handling
Abstract type handling
46. GTerm Storage
Large quantities of data...
Storage engine:
In memory list based ~10K
In memory hash table based ~500K
In database ~25M ...
47. GTerm Storage – In database (1)
    User( 0,                      0 User
      [
        id(1),                    0 User id 1
        name("John"),             0 User name John
        email("jb@m.com")         0 User email jb@m.com
      ]
    )
    Page( 1,                      1 Page
      [
        id(2),                    1 Page id 2
        title("Hello World"),     1 Page title Hello World
        author(0)                 1 Page author 0
      ]
    )
48. GTerm Storage – In database (2)
CREATE TABLE Attributes (
  id varchar(16),
  type varchar(30),
  name varchar(30),
  value text,
  INDEX USING HASH (id (5)),
  INDEX USING BTREE (value (10))
);
CREATE TABLE Objects (
  id varchar(16),
  type varchar(30)
);
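A small sketch of the same two-table layout using Python's built-in `sqlite3` module. This is a stand-in chosen so the example runs without a server: SQLite has no `USING HASH` / `USING BTREE` clause or prefix lengths, so plain indexes replace the ones on the slide. The `save` and `load` helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Objects (id TEXT, type TEXT);
CREATE TABLE Attributes (id TEXT, type TEXT, name TEXT, value TEXT);
CREATE INDEX attr_id ON Attributes (id);
CREATE INDEX attr_value ON Attributes (value);
""")

def save(conn, obj_id, type_name, attrs):
    """Store one object as an Objects row plus one Attributes row per attribute."""
    conn.execute("INSERT INTO Objects VALUES (?, ?)", (obj_id, type_name))
    conn.executemany(
        "INSERT INTO Attributes VALUES (?, ?, ?, ?)",
        [(obj_id, type_name, name, str(value)) for name, value in attrs.items()],
    )

def load(conn, obj_id):
    """Rebuild an object's attributes (all values come back as text)."""
    rows = conn.execute(
        "SELECT name, value FROM Attributes WHERE id = ?", (obj_id,)
    ).fetchall()
    return dict(rows)

save(conn, "0", "User", {"id": 1, "name": "John", "email": "jb@m.com"})
save(conn, "1", "Page", {"id": 2, "title": "Hello World", "author": 0})
user = load(conn, "0")
```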
50. GTerm Storage – Regular database
    0 User id 1                   User:
    0 User name John                1  John  jb@m.com
    0 User email jb@m.com
    1 Page id 2                   Page:
    1 Page title Hello World        2  Hello World
    1 Page author 0
                                  PageUser:
                                    2  1
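The mapping from generic rows to regular tables can be sketched as follows. This is an illustrative reading of the slide, not the actual GTerm 2 SQL tool: the `references` argument, which says which attributes are references and which join table they go to, is an assumption standing in for information that would come from the data model.

```python
# Generic attribute rows: (object id, type, attribute name, value).
rows = [
    ("0", "User", "id", "1"),
    ("0", "User", "name", "John"),
    ("0", "User", "email", "jb@m.com"),
    ("1", "Page", "id", "2"),
    ("1", "Page", "title", "Hello World"),
    ("1", "Page", "author", "0"),
]

def to_objects(rows):
    """Group generic rows by object id."""
    objs = {}
    for oid, type_name, name, value in rows:
        obj = objs.setdefault(oid, {"type": type_name, "attrs": {}})
        obj["attrs"][name] = value
    return objs

def to_regular(rows, references):
    """Build one table per type; reference attributes go to a join table
    holding (source id attribute, target id attribute) pairs."""
    objs = to_objects(rows)
    tables = {}
    for obj in objs.values():
        record = {}
        for name, value in obj["attrs"].items():
            join_table = references.get((obj["type"], name))
            if join_table is None:
                record[name] = value
            else:
                # Resolve the generic object id to the target's own id attribute.
                target_id = objs[value]["attrs"]["id"]
                tables.setdefault(join_table, []).append(
                    (obj["attrs"]["id"], target_id)
                )
        tables.setdefault(obj["type"], []).append(record)
    return tables

tables = to_regular(rows, {("Page", "author"): "PageUser"})
```

Here the Page row `author 0` becomes the PageUser pair `(2, 1)`: Page id 2 refers to the User whose id attribute is 1.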
51. [Diagram of the migration pipeline. Labels: Data model, SQL Script, GTerm 2 SQL, Old Database, Generic Database, SQL Script, Migration (Stratego), New Database]
64. onType
1. find objects of type
2. divide into chunks
3. per chunk in parallel
   load objects in chunk
   per object
     apply s
     save object if changed
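The steps above can be sketched in Python with a thread pool. This is an analogue under stated assumptions, not the Stratego strategy itself: the store is an in-memory dict, `s` is any function that returns a new object (or the same one if nothing changes), and `chunk_size` and `workers` are hypothetical parameters.

```python
from concurrent.futures import ThreadPoolExecutor

def on_type(store, type_name, s, chunk_size=2, workers=4):
    """Apply `s` to every object of `type_name`; return how many were saved."""
    # 1. find objects of type
    ids = [oid for oid, obj in store.items() if obj["type"] == type_name]
    # 2. divide into chunks
    chunks = [ids[i:i + chunk_size] for i in range(0, len(ids), chunk_size)]

    def process(chunk):
        saved = 0
        for oid in chunk:              # load objects in chunk, per object
            old = store[oid]
            new = s(old)               # apply s
            if new != old:             # save object if changed
                store[oid] = new
                saved += 1
        return saved

    # 3. per chunk in parallel (chunks touch disjoint ids)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process, chunks))

def add_counter(obj):
    """Example `s`: add a counter attribute unless one already exists."""
    if "counter" in obj["attrs"]:
        return obj
    return {**obj, "attrs": {**obj["attrs"], "counter": 0}}

store = {
    0: {"type": "User", "attrs": {"name": "John"}},
    1: {"type": "Page", "attrs": {"title": "Hello World"}},
    2: {"type": "Page", "attrs": {"title": "The first page", "counter": 5}},
}
changed = on_type(store, "Page", add_counter)
```

Comparing the result of `s` with the original object is what lets unchanged objects skip the save step.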
65. Publication {                     Publication {
      key : string Unique               key : string Unique
      title : string ?                  title : string ?
      authors : string + Indexed        authors - Author + Indexed
      year : string                     year : string
      ...                               ...
    }                                 }
                                      Author {
                                        alias : string
                                      }
    ?Transformation(path, Substitution(DeclType(Name(newTypeName))), _)
    ...
    Author alias mandatory
    Author alias not unique
    onType(
      for each author
        create author object
        set author attribute
    )
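A sketch of the migration outlined above, assuming publications are Python dicts with a list of author strings. Because Author alias is not unique, every author string gets its own new Author object, with no lookup for an existing one. The function name and the sample publications are illustrative only.

```python
import itertools

def migrate_authors(publications):
    """Replace each author string by a reference to a newly created Author."""
    next_id = itertools.count()
    authors = {}        # author object id -> Author object
    migrated = []
    for pub in publications:
        refs = []
        for alias in pub["authors"]:       # for each author
            author_id = next(next_id)
            authors[author_id] = {"alias": alias}   # create author object
            refs.append(author_id)
        migrated.append({**pub, "authors": refs})   # set author attribute
    return migrated, authors

# Illustrative data, not taken from the slides.
publications = [
    {"key": "p1", "authors": ["alice", "bob"], "year": "2008"},
    {"key": "p2", "authors": ["bob"], "year": "2009"},
]
migrated_pubs, author_objects = migrate_authors(publications)
```

The alias "bob" ends up in two separate Author objects, which is acceptable exactly because the new model does not require alias to be unique.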
66. Author {                          Author {
      alias : string                    alias : Alias
    }                                 }
                                      Alias {
                                        name : string Unique
                                      }
    ?Transformation(path, Substitution(DeclType(Name(newTypeName))), _)
    ...
    Alias name mandatory
    Alias name unique
    onType(
      if alias exists then
        set alias attribute to existing id
      else
        create alias object
        set alias attribute to new id
    )
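Since Alias name is unique, this migration has to reuse an existing Alias object when one exists. A minimal sketch of that get-or-create step, assuming authors are Python dicts keyed by object id; the function name and sample data are hypothetical.

```python
def migrate_aliases(authors):
    """Replace each alias string by a reference to a unique Alias object."""
    alias_ids = {}   # alias name -> Alias object id (enforces uniqueness)
    aliases = {}     # Alias object id -> Alias object
    migrated = {}
    for author_id, author in authors.items():
        name = author["alias"]
        if name not in alias_ids:
            # create alias object
            new_id = len(aliases)
            aliases[new_id] = {"name": name}
            alias_ids[name] = new_id
        # set alias attribute to the existing or new id
        migrated[author_id] = {**author, "alias": alias_ids[name]}
    return migrated, aliases

# Illustrative data, not taken from the slides.
authors = {0: {"alias": "alice"}, 1: {"alias": "bob"}, 2: {"alias": "bob"}}
migrated_authors, alias_objects = migrate_aliases(authors)
```

The only difference from the non-unique case is the lookup table, which is why the uniqueness annotation in the data model changes the generated migration.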
67. Supported transformations
Identity
Primitive attribute addition (3)
Complex attribute addition
Attribute removal
Attribute name change
Attribute move (2)
Primitive type change
Implicit reference resolution
Attribute wrapping
Type addition
Type removal
Type name change
Abstract type handling
Inverse annotation handling (2)
Cardinality changes (2)
68. In memory vs. Database transformations
    In memory            Database
    Easy to define       Hard to define
    Easy to optimize     No need to optimize
    Expressive           Limited expressiveness
    Easy to abstract     Abstraction near impossible
    Performance OK       Performance great
87. User
      name :: varchar
      realName :: varchar
      email :: tinytext
    Page
      title :: varchar
      author User
      isRedirect :: boolean
    Labels: Entities, Properties, Types
    Meta model / Grammar
89. Lists
More types
Inverse associations
Abstract types
101. Entity*             > DataM   Model
     Id "{" Prop* "}"    > Entity  Entity
     Id "::" Type        > Prop    Prop
     "int"               > Type    Int
     "bool"              > Type    Bool
     Id                  > Type
     "set of" Type       > Type    Set
     NAME                > Id      Id
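To make the productions above concrete, here is a small hand-written recursive-descent parser for the same shape of language. It is only a sketch: the slides define the grammar declaratively, whereas this code, its tuple-based tree format, and the assumption that names are alphanumeric identifiers are all illustrative choices.

```python
import re

# Tokens: "::", braces, and names. Other characters are silently skipped.
TOKEN = re.compile(r"::|[{}]|[A-Za-z_][A-Za-z0-9_]*")

def parse_model(text):
    """Parse `Entity*` into ("Model", [entities]) using tuples as tree nodes."""
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def parse_type():
        tok = take()
        if tok == "int":
            return ("Int",)
        if tok == "bool":
            return ("Bool",)
        if tok == "set":                  # "set of" Type > Type Set
            take("of")
            return ("Set", parse_type())
        return ("Id", tok)                # Id > Type

    def parse_prop():                     # Id "::" Type > Prop
        name = take()
        take("::")
        return ("Prop", ("Id", name), parse_type())

    def parse_entity():                   # Id "{" Prop* "}" > Entity
        name = take()
        take("{")
        props = []
        while peek() != "}":
            props.append(parse_prop())
        take("}")
        return ("Entity", ("Id", name), props)

    entities = []
    while peek() is not None:
        entities.append(parse_entity())
    return ("Model", entities)

tree = parse_model("Page { title :: int tags :: set of Tag }")
```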
103. Lexicals ''...''
NAME > Id Id
"substitute" NAME > LocalTransformation
104. Multiple productions >*
"int" > Type Int
"bool" > Type Bool
Id > Type
"set of" Type > Type Set
"substitute" Type > LocalTransformation
105. Type checking .../...
"at" APath LocalTransformation > Transformation
Generation of local transformation domains
APath type derivation
Type checking