SlideShare ist ein Scribd-Unternehmen logo
1 von 283
Downloaden Sie, um offline zu lesen
Create Your Own Information Systems
            on the Basis of DDI-Lifecycle

                    4th Annual European DDI User Conference
                                              (EDDI 2012)


                                              03.12.2012

               Thomas Bosch                                           Matthäus Zloch
                  M.Sc. (TUM)                                               M.Sc.
                  Ph.D. student                                         Ph.D. student
          boschthomas.blogspot.com
GESIS - Leibniz Institute for the Social Sciences      GESIS - Leibniz Institute for the Social Sciences
Introduction of Attendees
• What is your name?
• What’s your organization?
• Why are you here?
   • What are you mostly interested in?
   • What do you expect from this tutorial?
   • …




                     Appr. 30 seconds
                                              2
Goals of This Presentation
• Introduction to DDI-Lifecycle
   •   Give a DDI Lifecycle overview
   •   Show you (basic) elements of the DDI-Lifecycle and the main structure
   •   How does Identification, Versioning, Maintenance work?
   •   DDI Discovery Vocabulary “disco”
   •   How to use the DDI Documentation
• Individual software project
   •   Introduce you to basics of software architecture design
   •   Show you how the software architecture of your project may look like
   •   Show you how a software project leverages DDI and the disco model
   •   Introduce you to a project template for using the disco model in the
       back-end
                                                                               3
Outline
Matthäus                               Thomas

 • Missy – General Information            •     DDI-L Overview
     •   Requirements to Developers       •     Identification, Versioning, Maintenance
     •   Use-Cases in Missy               •     DDI-L Main Structures
                                          •     DDI Instance
                                          •     Study Unit


 • Software Architecture                  •     Conceptual Component
     •   Multitier                        •     Logical Products
     •   MVC                              •     Data Collection
 • Missy Data Model                       •     disco-model
     •   Extendable Data Model

                                                                                          4
Outline

• Persistence Layer                       •   Physical Data Product
    •   Programming Interface             •   Physical Instance
    •   Types of physical data storages
                                          •   Archive
    •   Examples
                                          •   DDIProfile

• Outlook                                 • DDI Serialization Examples
                                               •   DDI-XML
                                               •   DDI-RDF / disco-spec
                                               •   Relational-DB




                                                                          5
Outline
Matthäus                                Thomas

 • Missy – General Information             •     DDI-L Overview
      •   Requirements to Developers       •     Identification, Versioning, Maintenance
      •   Use-Cases in Missy               •     DDI-L Main Structures
                                           •     DDI Instance
Presentation
                                           •     Study Unit


 • Software Architecture                   •     Conceptual Component
      •   Multitier                        •     Logical Products
      •   MVC                              •     Data Collection
 • Missy Data Model                        •     disco-model
      •   Extendable Data Model                                            Business Layer

                                                                                            6
Outline

• Persistence Layer                       •   Physical Data Product
    •   Programming Interface             •   Physical Instance
    •   Types of physical data storages
                                          •   Archive
    •   Examples
                                          •   DDIProfile          Data Storage Access


• Outlook                                 • DDI Serialization Examples
                                               •   DDI-XML
                                               •   DDI-RDF / disco-spec
                                               •   Relational-DB
                                                                          Data Storage



                                                                                         7
Feel free to ask questions at any time!




                                          8
What comes next?
• Before we start with the use-case Missy…

• Thomas, can you give us a top-level overview of DDI?
   • You might also include how DDI treats identification, versioning and
     maintenance…




                                                                            9
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                      10
DDI-L Stages




               11
using

                                   Survey
           Study
                                Instruments



measures                                      made up of

                        about




Concepts                                       Questions


                   Universes
with values of
                                                         Categories/
                                                           Codes,
                                                          Numbers
Questions


                                   Variables

      collect                               made up of




 Responses
                               Data Files
                resulting in
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                     Business Layer

                                                                                     14
Element Types




           IDs must be unique within maintainables
Not identified elements are contained by other element types
                                                               15
Identification


     Agency, ID, Version

             or

URN (= Agency + ID + Version)

                                16
Referencing Maintainables




                            17
Schemes




Very often, identifiable and versionable elements are maintained in parent schemes


                                                                                18
Maintenance Rules
Maintenance Agency
•   identified by a reserved code (based on its domain name; similar to it’s
    website and e-mail)
•   Registration of maintenance agency (e.g. GESIS)

Ownership
•   Maintenance agencies own the objects they maintain
•   Only they are allowed to change or version the objects

Reference
• Other organizations may reference external items in their own schemes,
  but may not change those items
• You can make a copy which you change and maintain, but once you do
  that, you own it!
                                                                               19
Versioning Rules




                   20
Identifiable (DDI 3.1 XML)
<DataItem id=“AB347”>
 …
</DataItem>




                                      21
Versionable (DDI 3.1 XML)

<Variable id=“V1” version=“1.1.0” versionDate=“20012-11-12”>
  <VersionResponsibility>
    Thomas Bosch
  </VersionResponsibility>
 <VersionRationale>
    Spelling Correction
  </VersionRationale>
 …
</Variable >
                                                               22
Maintainable (DDI 3.1 XML)
<VariableScheme
  id=“VarSch01”
  agency =“us.mpc”
  version=“1.4.0”
  versionDate=“2009-02-12”>
 …
</VariableScheme>



                                     23
URN (DDI 3.1)

urn:ddi:us.mpc:VariableScheme.
VarSch01.1.4.0:Variable.V1.1.1.0




                                   24
URN (DDI 3.1)

urn:ddi:us.mpc:VariableScheme.
VarSch01.1.4.0:Variable.V1.1.1.0




                                   25
URN (DDI 3.1)

urn:ddi:us.mpc:VariableScheme.
VarSch01.1.4.0:Variable.V1.1.1.0

          Maintaining agency




                                   26
URN (DDI 3.1)

urn:ddi:us.mpc:VariableScheme.
VarSch01.1.4.0:Variable.V1.1.1.0

    Maintainable (element type, ID, version)




                                               27
URN (DDI 3.1)

urn:ddi:us.mpc:VariableScheme.
VarSch01.1.4.0:Variable.V1.1.1.0

  Identifiable (element type, ID, version number)

      ID must be unique within maintainable




                                                    28
URN Resolution
• DNS-based resolution for DDI services
   • URN can be resolved by DNS means
   • http://registry.ddialliance.org/
       • details and library
       • list of available DNS services for this DDI agency

• URNs are mapped to local URLs
• XML Catalog can be used as simple resolution mechanism




                                                              29
Referencing (DDI 3.1)




            (Agency, ID, Version) vs. URN
Often, these are inherited from a maintainable object
                                                        30
Referencing (DDI 3.1 + 3.2)

• Internal
   • In same DDI Instance
   • (agency, ID, version)
       • Maintainable (agency, ID, version)
       • Identifiable (ID), versionable (ID, version)

• External
   • To other DDI Instances
   • URN > (agency, ID, version)




                                                        31
Internal Reference (DDI 3.1)
<VariableReference isReference=“true” isExternal=“false”
   lateBound=“false”>
 <Scheme isReference=“true” isExternal=“false”
   lateBound=“false”>
  <ID>VarSch01</ID>
  <IdenftifyingAgency>us.mpc</IdentifyingAgency>
  <Version>1.4.0</Version>
 </Scheme>
 <ID>V1</ID>
 <IdenftifyingAgency>us.mpc</IdentifyingAgency>
 <Version>1.1.0</Version>
</VariableReference>                                       32
External Reference (DDI 3.1)

<VariableReference isReference=“true” isExternal=“true”
  lateBound=“false”>

  <urn>
      urn:ddi:us.mpc:VariableScheme.VarSch01.1.4.0:Variable.
      V1.1.1.0
  </urn>

</VariableReference>
                                                               33
Identifiers (DDI 3.2)

• Why new IDs?
   • better support of persistent identification (less semantics, context
     information)
• URN or (agency, ID, version)
• Version changes?
   • Administrative metadata  No
   • Payload metadata  yes
• No inheritance
   • Why? (implementation problems, …)
• Deprecated vs. Canonical IDs
   • Deprecated (no version of maintainables, colons as separators)
   • Mechanism for roundtrip available between DDI 3.2 canonical ID, DDI 3.2
     deprecated ID, DDI 3.1 ID
                                                                               34
Canonical URN (DDI 3.2)

urn:ddi:us.mpc:VariableScheme.
VarSch01.1.4.0:Variable.V1.1.1.0

          No maintainables




                                   35
Canonical URN (DDI 3.2)

urn:ddi:us.mpc:Variable.V1.1.1.0


          No maintainables




                                   36
Canonical URN (DDI 3.2)

urn:ddi:us.mpc:Variable.V1.1.1.0



  Agency, ID, version (Identifiable, Versionable)




                                                    37
Canonical URN (DDI 3.2)

urn:ddi:us.mpc:Variable.V1.1.1.0




    ID unique in agency (not maintainable)



                                             38
Canonical URN (DDI 3.2)

urn:ddi:us.mpc:Variable.V1.1.1.0




          No element types

                                   39
Canonical URN (DDI 3.2)

urn:ddi:us.mpc:V1.1.1.0




      No element types

                          40
Canonical URN (DDI 3.2)

urn:ddi:us.mpc:Variable.V1.1.1.0




          Colon is separator
                                   41
Canonical URN (DDI 3.2)

urn:ddi:us.mpc:Variable.V1:1.1.0




          Colon is separator
                                   42
Canonical URN (DDI 3.2)

  urn:ddi:us.mpc:V1:1.1.0


              No maintainables
Agency, ID, version (Identifiable, Versionable)
  ID unique in agency (not maintainable)
              No element types
              Colon is separator
                                                  43
What comes next?
• This was the DDI Overview with identification, versioning, and
  maintenance

• But, what is the use-case actually?
• What do you want to use DDI for?




                                                                   44
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                      45
General Information About Missy
• Missy – Microdata Information System

• Largest household survey in Europe
• Provides detailed information about individual datasets
• Currently consists of “microcensus" survey, which is
  comprised of statistics about, e.g.
   • general population in Germany
   • situation about employment market
       • occupation, professional education, income, legal insurance, etc.



                                                                             46
General Information About Missy
• May be split in two parts
   • Missy Web (end-user front-end part)
   • Missy Editor for documentation (back-end part)


• Consists of approx. 500 Variables & Questions
• Captures 25 years, since 1973




                                                      47
Missy 3
• That’s what it’s all about! The “next generation Missy”

• Integration of further surveys, e.g. EU-SILC, EU-LFS, …
• Implementation of Missy Editor as a Web-Application




                                                            48
Use-Cases in Missy
•   General Information about survey “microcensus”
•   Variables by thematic classification and year
•   List of variables by year
•   List of generated variables by year
•   Show details of a variable with statistics
•   Variable-Time Matrix
    • List of variables by thematic classification and year (selectable)


• Questionnaire Catalogue
                                                                           49
Use-Cases in Missy
•   General Information about survey “microcensus”
•   Variables by thematic classification and year
•   List of variables by year
•   List of generated variables by year
•   Show details of a variable with statistics
•   Variable-Time Matrix
    • List of variables by thematic classification and year (selectable)


• Questionnaire Catalogue
• In future: browse by survey and country!
                                                                           50
Use-Cases in Missy
•   General Information about survey “microcensus”
•   Variables by thematic classification and year
•   List of variables by year
•   List of generated variables by year
•   Show details of a variable with statistics
•   Variable-Time Matrix
    • List of variables by thematic classification and year (selectable)


• Questionnaire Catalogue
                                                                           51
52
53
54
55
56
Requirements to Software Developers
• Complex data model to represent use-cases
• Focus lies on reusability
• Flexible Web Application Framework and modern architecture
   • Service-oriented
   • Semantic Web technologies
• Creation of an abstract framework and architecture
   • should be well designed to be able to be extended and reusable




                                                                      57
What comes next?
• We have seen Missy as an use-case with several fields which
  holds values

• I would like to use DDI
• What are the main structures of DDI-L, Thomas?




                                                                58
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                      59
60
61
Modules




          62
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                      63
DDIInstance




              64
DDIInstance




              65
ResourcePackage




                  66
ResourcePackage




Allows packaging of any maintainable item as a resource item
 Structure to publish non-study-specific materials for reuse
                                                               67
ResourcePackage




Any module (except: StudyUnit, Group, LocalHoldingPackage)
                                                             68
ResourcePackage




                               Any Scheme
Internal or external references to schemes and elements within schemes
                                                                         69
Group




        70
Group




Groups allow inheritance
                           71
Group




        72
Organization




               73
Organization




               74
LocalHoldingPackage




                      75
LocalHoldingPackage




 • Add local content to a deposited study unit or group
• without changing the version of the study unit, group
                                                          76
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                      77
StudyUnit




            78
StudyUnit




            79
StudyUnit




            80
Coverage




           81
Coverage




           82
TemporalCoverage




                   83
TopicalCoverage




                  84
SpatialCoverage




                  85
SpatialCoverage




                  86
TopicalCoverage




                  87
TemporalCoverage




                   88
FundingInformation




                     89
FundingInformation




                     90
Citation




           91
Citation




           92
Citation




           93
Citation




           94
How the presentation is continued
• Ok, I have shown you
   • the main structures as well as
   • the main modules of DDI-Lifecycle


• If you want to use DDI you have to have a model or idea of
  your software architecture
• How would your software architecture look like?




                                                               95
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                      96
What makes Missy interesting
• Cross-linking between different multilingual studies
   • enabled by the common data model
   • provides different use-cases
• Missy leverages DDI as its back-end data model and exchange
  format
• Modern web project architecture, e.g. Multitier, MVC, etc.

• Is designed to be published as open-source project with an
  API to persist DDI data

                                                                97
Software Architecture
• Standard technologies to develop software
   • a multitier architecture
   • Model-View-Controller (MVC-Pattern)
   • Project management software, e.g. Maven


• Multitier architecture separates the project into logical parts
   •   presentation
   •   business logic or application processing
   •   data management
   •   persistence
   •   …
                                                                    98
99
Model-View-Controller
• separation of
   • representation of information
   • interactions with the data
   • components


• the key is to have logically separated parts, where people
  might work collaboratively




                                                               100
MVC – Interactions

                           View
               controls




Controller                         accesses




             manipulates


                           Model




                                              101
Model
• Represents the business objects, i.e. real life concepts, and
  their relations to each other
• Sometimes also includes some business logic

• Independent of the presentation and the controls




                                                                  102
View
• Responsible for the presentation of information to the user
• Provides actions, so that the user can interact with the
  application

• "Knows" the model and can access it

• Usually does not display the whole model, but rather special
  "views" or use-cases of it


                                                                 103
Controller
• Controls the presentations, i.e. views
• Receives actions from users
   • interprets/evaluates them
   • acts accordingly


• Manipulates the data / model
• Usually each view has its own controller

• Be warned of code reuse of controllers
   • controllers are most often the throwaway components
                                                           104
View Technologies


                      View
         controls
                                         …

Controller                    accesses




        manipulates


                      Model


                                             105
Data Model


                        View
         controls
                                           …

Controller                      accesses




        manipulates


                        Model


                                               106
Missy Technologies


                      View
         controls




Controller                    accesses




        manipulates


                      Model


                                         107
What we have discussed so far
• Brief introduction to Missy and the use-cases
• With which technologies a web application framework might
  be build up today
• How a modern software architecture looks like




                                                              108
How the presentation is continued
• What the DDI-L core modules are and how they are organized

• Introduction the disco model and
• How to extend it




                                                               109
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                     110
ConceptualComponent

ConceptualComponent




                      111
ConceptualComponent




                      112
ConceptualComponent




                      113
ConceptScheme




                114
ConceptScheme




                115
Concept




          116
Concept




          117
Universe




           118
Universe




           119
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                  120
LogicalProduct




                 121
LogicalProduct




                 122
LogicalProduct




                 123
CodeScheme




             124
CodeScheme




             125
Code




       126
Code




       127
CategoryScheme




                 128
CategoryScheme




                 129
Category




           130
Category




           131
VariableScheme




                 132
VariableScheme




                 133
Variable




           134
Variable




           135
Representation




                 136
Representation




                 137
CodeRepresentation




                     138
CodeRepresentation




                     139
NumericRepresentation




                        140
NumericRepresentation




                        141
TextRepresentation




                     142
TextRepresentation




                     143
VariableGroup




                144
VariableGroup




                145
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                  146
DataCollection




                 147
DataCollection




                 148
DataCollection




                 149
Methodology




              150
Methodology




              151
CollectionEvent




                  152
CollectionEvent




                  153
QuestionScheme




                 154
QuestionScheme




                 155
QuestionItem




               156
QuestionItem




               157
Response Domain




                  158
Response Domain




                  159
Instrument




             160
Instrument




             161
ControlConstruct




                   162
ControlConstructScheme




                         163
ControlConstruct




                   164
Sequence




           165
Sequence (bahavioral)




                        166
Sequence (XML)

<Sequence>
  <ControlConstructReference> [Q1] </ControlConstructReference>
  <ControlConstructReference> [Q2] </ControlConstructReference>
  <ControlConstructReference> [Q3] </ControlConstructReference>
</Sequence>




                                                                  167
Sequence (structural)




                        168
StatementItem




                169
StatementItem (behavioral)




                             170
StatementItem (XML)
<StatementItem>
  <DisplayText>
    statement
  </DisplayText>
</StatementItem>




                                  171
StatementItem (structural)




                             172
QuestionConstruct




                    173
QuestionConstruct (behavioral)




                                 174
QuestionConstruct (XML)
<QuestionConstruct>
  <QuestionReference> [Q1] </QuestionReference>
</QuestionConstruct>




                                                  175
QuestionConstruct (structural)




                                 176
IfThenElse




             177
IfThenElse (behavioral)




                          178
IfThenElse (XML)
<IfThenElse>
  <IfCondition>
     <Code programmingLanguage="…"> yes </Code>
     <SourceQuestionReference> [Q1] </SourceQuestionReference>
  </IfCondition>
  <ThenConstructReference> [Q2] </ThenConstructReference>
  <ElseIf>
     <IfCondition>
       <Code programmingLanguage="…"> no </Code>
       <SourceQuestionReference> [Q1] </SourceQuestionReference>
     </IfCondition>
     <ThenConstructReference> [Q3] </d:ThenConstructReference>
  </ElseIf>
</IfThenElse>
                                                                   179
IfThenElse (structural)




                          180
Loop




       181
Loop (behavioral)




                    182
Loop (XML)
<Loop>
  <InitialValue>
    <Code programmingLanguage=“Java"> i = 0 </Code>
  </InitialValue>
  <LoopWhile>
    <Code programmingLanguage=“Java"> i < 10 </Code>
  </LoopWhile>
  <StepValue>
    <Code programmingLanguage=“Java"> i = i + 1 </Code>
  </StepValue>
  <ControlConstructReference> [Q1] </ControlConstructReference>
</Loop>


                                                                  183
Loop (structural)




                    184
RepeatWhile




              185
RepeatWhile (behavioral)




                           186
RepeatWhile (XML)
<RepeatWhile>
  <WhileCondition>
    <Code programmingLanguage=“Neutral"> no </Code>
    <SourceQuestionReference> [Q1] </ SourceQuestionReference >
  </WhileCondition>
  <WhileConstructReference> [Q1] </WhileConstructReference>
</RepeatWhile>



                                                                  187
RepeatWhile (structural)




                           188
RepeatUntil




              189
RepeatUntil (behavioral)




                           190
RepeatUntil (XML)
<RepeatUntil>
  <UntilCondition>
    <Code programmingLanguage="Neutral"> yes </Code>
    <SourceQuestionReference> [Q1] </SourceQuestionReference>
  </UntilCondition>
  <UntilConstructReference> [Q1] </UntilConstructReference>
</d:RepeatUntil>



                                                                191
RepeatUntil (structural)




                           192
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                  193
194
How to extend the disco model?




                                 195
What we have seen so far
• An introduction to the core DDI modules

• A short introduction to the disco-model, which covers the
  most important parts of DDI-Lifecycle and DDI-Codebook




                                                              196
How the presentation is continued
• Now that we know the data model…

• How can it be used for the Missy use-case, Matthäus?

• Can we create a mapping of the fields in the disco-model and
  the fields in Missy?




                                                                 197
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                  198
Use-case “variable details”




                              199
200
201
202
203
204
Sub-Sample:
• is-a Sub-Universe with the
    UniverseType „subSample“
Filter Statement
• is-a Sub-Universe with the
    UniverseType „filterStatement“
                                     205
206
207
208
209
210
211
212
213
What we have seen so far
• disco-model
   • designed for the discovery use-case
   • provides object types, properties and data type properties designed
     for discovery use-case


• One Missy use-case and all the data type and object
  properties that are covered

• The idea of how theoretically the disco can be extended

                                                                           214
How the presentation is continued
• But, what if a project has specific requirements to the model,
  that are not covered by the disco model?

• How are we going to implement that?




                                                                   215
What comes next?

 • Missy – General Information         •   DDI-L Overview
      •   Requirements to Developers   •   Identification, Versioning, Maintenance
      •   Use-Cases in Missy           •   DDI-L Main Structures
                                       •   DDI Instance
Presentation
                                       •   Study Unit


 • Software Architecture               •   Conceptual Component
      •   Multitier                    •   Logical Products
      •   MVC                          •   Data Collection
 • Missy Data Model                    •   disco-model
      •   Extendable Data Model                                      Business Layer

                                                                                  216
Business Logic
                 217
218
219
disco-model API
• The disco-model implementation is designed to be extended
• For one project’s individual requirements

• Speaking in MVC-pattern language: the model is separated
  and independent from the views and the controllers




                                                              220
disco-model API
• abstract classes
   • define the basic elements, i.e. data type and object properties, that
     are provided and covered by the disco-model
   • may be extended, but cannot be instantiated
   • e.g. AbstractVariable
• model classes
   • represent the actual entity types of disco-model
   • may be instantiated and reused
   • e.g. Variable



                                                                             221
disco-model API
• abstract classes
   • define the basic elements, i.e. data type and object properties, that
     are provided and covered by the disco-model
   • may be extended, but cannot be instantiated
   • e.g. AbstractVariable
• model classes
   • represent the actual entity types of disco-model
   • may be instantiated and reused
   • e.g. Variable


• Implementation of the Template Pattern
                                                                             222
223
Provided Types




          224
Project specific extensions


                        225
Project Specific Extensions
• Projects may just simply use
• and extend the defined classes
   • Introduce new fields and methods, such as getter/setter
   • Change the type of existing fields when they overwrite the return type
     (in Java this is only possible in case of covariant returns)




                                                                              226
How is the presentation continued
• Now that I have the data model… and I can extend it
• I just need to store my data somewhere…

• Are there any persistence mechanisms that are offered by the
  DDI-Lifecycle model?




                                                                 227
Outline

• Persistence Layer                       •   Physical Data Product
    •   Programming Interface             •   Physical Instance
    •   Types of physical data storages
                                          •   Archive
    •   Examples
                                          •   DDIProfile         Data Storage Access


• Outlook                                 • DDI Serialization Examples
                                               •   DDI-XML
                                               •   DDI-RDF / disco-spec
                                               •   Relational-DB
                                                                          Data Storage



                                                                                     228
PhysicalDataProduct




                      229
PhysicalDataProduct




                      230
PhysicalDataProduct




                      231
Outline

• Persistence Layer                       •   Physical Data Product
    •   Programming Interface             •   Physical Instance
    •   Types of physical data storages
                                          •   Archive
    •   Examples
                                          •   DDIProfile         Data Storage Access


• Outlook                                 • DDI Serialization Examples
                                               •   DDI-XML
                                               •   DDI-RDF / disco-spec
                                               •   Relational-DB
                                                                          Data Storage



                                                                                     232
PhysicalInstance




                   233
PhysicalInstance




                   234
PhysicalInstance




                   235
Outline

• Persistence Layer                       •   Physical Data Product
    •   Programming Interface             •   Physical Instance
    •   Types of physical data storages
                                          •   Archive
    •   Examples
                                          •   DDIProfile         Data Storage Access


• Outlook                                 • DDI Serialization Examples
                                               •   DDI-XML
                                               •   DDI-RDF / disco-spec
                                               •   Relational-DB
                                                                          Data Storage



                                                                                     236
Archive




          237
Archive




          238
Archive




                   Contains persistent lifecycle events
Contains archive-specific information (archive reference, access, funding)
                                                                         239
Outline

• Persistence Layer                       •   Physical Data Product
    •   Programming Interface             •   Physical Instance
    •   Types of physical data storages
                                          •   Archive
    •   Examples
                                          •   DDIProfile         Data Storage Access


• Outlook                                 • DDI Serialization Examples
                                               •   DDI-XML
                                               •   DDI-RDF / disco-spec
                                               •   Relational-DB
                                                                          Data Storage



                                                                                     240
DDIProfile




             241
DDIProfile




             242
DDIProfile




Describes which DDI-L elements you use in your institution’s DDI flavor
                                                                          243
How the presentation is continued
• How can the disco-model be persisted?

and

• How can it be done in a project with individual data model
  elements?
• How do we do this in the Missy application?



                                                               244
Outline

• Persistence Layer                       •   Physical Data Product
    •   Programming Interface             •   Physical Instance
    •   Types of physical data storages
                                          •   Archive
    •   Examples
                                          •   DDIProfile         Data Storage Access


• Outlook                                 • DDI Serialization Examples
                                               •   DDI-XML
                                               •   DDI-RDF / disco-spec
                                               •   Relational-DB
                                                                          Data Storage



                                                                                     245
Data Storage Access




                      246
247
248
Persistence Layer – Motivation
• In a well-defined software architecture the application itself
  does not need to know how the data is stored
• It just needs to know the API
   • Methods that are provided to access and store objects
   • May be abstracted away from the actual implementation


• An actual implementation or strategy can just be a matter of
  configuration



                                                                   249
Persistence Layer – Strategy Pattern
• Implementation of the Strategy Pattern
   • simple interface
   • actual implementations are the different strategies


• A strategy is an implementation of the actual type of
  persistence or physical storage, respectively
   • e.g. DDI-L-XML, DDI-RDF, XML-DB, Relational-DB, etc.




                                                            250
251
Modules
• disco-persistence-api
   • Defines persistence functionality for model components regardless of the
     actual type of physical persistence
• disco-persistence-relational
   • Implements the persistence functionality defined in disco-persistence-api
     with respect to the usage of relational DBs
• disco-persistence-xml
   • Implements the persistence functionality defined in disco-persistence-api
     with respect to the usage of DDI-XML
• disco-persistence-rdf
   • Implements the persistence functionality defined in disco-persistence-api
     with respect to the usage of the disco-specification

                                                                                 252
Data Access Object
• A DAO is an object that is responsible for providing methods
  for reading and storing of objects in our model
• A DAO is again implemented against an interface

• For each business object in the data model there exists a DAO

• Persistence strategy interface defines methods for obtaining
  specific DAOs (data access objects)


                                                                  253
DAO Pattern




              254
disco-persistence-api
• Provides an API and implementations for the disco-model

• PersistenceService
   • Available in the disco-persistence-api
   • Encapsulates the actual strategy


   • Strategy can be instantiated and injected into the PersistenceService




                                                                             255
Extendable disco-persistence-api
• But, projects have own requirements to the data model
   • => new business objects are included
   • => DAOs need to be written for reading and storing these objects
   • => these DAOs also have to implement the appropriate type of
     persistence used by that project


• => this API has to be extendable




                                                                        256
Extendable disco-persistence-api
• But, projects have own requirements to the data model
   • => new business objects are included
   • => DAOs need to be written for reading and storing these objects
   • => these DAOs also have to implement the appropriate type of
     persistence used by that project


• => this API has to be extendable

• Luckily it is, because of the interface structure!
• Projects only need to implement the defined methods
                                                                        257
258
259
Missy Specific Modules
• missy-persistence-api
   • Defines additional persistence functionality for model components
     regardless of the actual type of physical persistence
• missy-persistence-relational
   • Implements additional persistence functionality for model
     components with respect to the usage of relational DBs
• missy-persistence-xml
   • Implements additional persistence functionality for model
     components with respect to the usage of DDI-XML
• missy-persistence-rdf
   • Implements additional persistence functionality for model
     components with respect to the usage of the disco-specification
                                                                         260
What we have seen so far
• Implementation of a persistence API for the disco model
• How the persistence API can be extended in individual
  projects




                                                            261
How the presentation is continued
• How does, on the persistence level, the serialization of data
  stored in DDI really look like?




                                                                  262
Outline

• Persistence Layer                       •   Physical Data Product
    •   Programming Interface             •   Physical Instance
    •   Types of physical data storages
                                          •   Archive
    •   Examples
                                          •   DDIProfile         Data Storage Access


• Outlook                                 • DDI Serialization Examples
                                               •   DDI-XML
                                               •   DDI-RDF / disco-spec
                                               •   Relational-DB
                                                                          Data Storage



                                                                                    263
Questionnaire - Question




                           264
Questionnaire – Question (DDI 3.1 XML)




                                     265
DataCollection

  Instrument
      IntrumentName
      ControlConstructReference

  ControlCostructScheme
     QuestionConstruct
        QuestionReference

  QuestionScheme
    QuestionItem
        QuestionItemName
        QuestionText
                                  266
Questionnaire – Question (DDI-RDF)




                                     267
Variable - SummaryStatistics




                               268
Variable – SummaryStatistics (DDI 3.1 XML)
 PhysicalInstance
   Statistics
     VariableStatistics
        VariableReference
        SummaryStatistic
           SummaryStatisticType
              'Minimum'
           Value
              10

 Variable
   VariableName
      'v1'
   Label
      'Age'                             269
Variable – SummaryStatistics (DDI-RDF)




                                         270
Outline

• Persistence Layer                       •   Physical Data Product
    •   Programming Interface             •   Physical Instance
    •   Types of physical data storages
                                          •   Archive
    •   Examples
                                          •   DDIProfile         Data Storage Access


• Outlook                                 • DDI Serialization Examples
                                               •   DDI-XML
                                               •   DDI-RDF / disco-spec
                                               •   Relational-DB
                                                                          Data Storage



                                                                                    271
Relational-DB Example




                        272
Relational-DB Example




                        273
Relational-DB Example




                        274
Inheritance of Properties
                            From Identifiable




                                           275
Inheritance of Properties
                            From abstract class




                                           276
Declaration of own Properties
                          Own properties




                                       277
What comes next?
• What are you going to do with the MISSY project?

• Is there ongoing work?




                                                     278
Outline

• Persistence Layer                       •   Physical Data Product
    •   Programming Interface             •   Physical Instance
    •   Types of physical data storages
                                          •   Archive
    •   Examples
                                          •   DDIProfile         Data Storage Access


• Outlook                                 • DDI Serialization Examples
                                               •   DDI-XML
                                               •   DDI-RDF / disco-spec
                                               •   Relational-DB
                                                                          Data Storage



                                                                                    279
Outlook
• YES! There is ongoing work!

• Use the API in other projects, e.g. GESIS internal projects like
  StarDat
• Publish the persistence API as open-source on github

• Two presentations at IASSIST 2013
• One poster at IASSIST 2013


                                                                     280
Poster
• Shows detailed information regarding Missy
   •   Multitier Architecture, MVC
   •   Employed Technologies
   •   Specific Maven Modules
   •   Project Home
• Excerpts from disco-model and the mapping of the fields




                                                            281
282
Thank you for your attention…
  • Feel free to download the sources from GitHub:
                      https://github.com/missy-project


  • Give us feedback!
  • Feel free to criticize!


               Thomas Bosch                                        Matthäus Zloch
GESIS - Leibniz Institute for the Social Sciences   GESIS - Leibniz Institute for the Social Sciences
          thomas.bosch@gesis.org                             matthaeus.zloch@gesis.org
         boschthomas.blogspot.com

                                                                                                  283

Weitere ähnliche Inhalte

Andere mochten auch

Strength Based Organizational Change
Strength Based Organizational ChangeStrength Based Organizational Change
Strength Based Organizational ChangeBusiness901
 
Strength Based leadership for AIESEC + Prism
Strength Based leadership for AIESEC + PrismStrength Based leadership for AIESEC + Prism
Strength Based leadership for AIESEC + PrismJason Kwan
 
Strength-based by Design: a high-engagement approach to sparking innovation a...
Strength-based by Design: a high-engagement approach to sparking innovation a...Strength-based by Design: a high-engagement approach to sparking innovation a...
Strength-based by Design: a high-engagement approach to sparking innovation a...Jen Silbert
 
Appreciative Inquiry and Strength-Based Systems
Appreciative Inquiry and Strength-Based SystemsAppreciative Inquiry and Strength-Based Systems
Appreciative Inquiry and Strength-Based SystemsRobyn Stratton-Berkessel
 
Graft and corruption
Graft and corruptionGraft and corruption
Graft and corruptionJoash Medina
 
Corruption in the Philippines
Corruption in the PhilippinesCorruption in the Philippines
Corruption in the Philippinesbrianbelen
 
6 Charts You Can't Afford to Ignore: Talent Management Intelligence for the F...
6 Charts You Can't Afford to Ignore: Talent Management Intelligence for the F...6 Charts You Can't Afford to Ignore: Talent Management Intelligence for the F...
6 Charts You Can't Afford to Ignore: Talent Management Intelligence for the F...DDI | Development Dimensions International
 
Strategic planning powerpoint
Strategic planning powerpointStrategic planning powerpoint
Strategic planning powerpointrobdude9626
 

Andere mochten auch (10)

Strength Based Organizational Change
Strength Based Organizational ChangeStrength Based Organizational Change
Strength Based Organizational Change
 
Strength Based leadership for AIESEC + Prism
Strength Based leadership for AIESEC + PrismStrength Based leadership for AIESEC + Prism
Strength Based leadership for AIESEC + Prism
 
Strength-based Management
Strength-based ManagementStrength-based Management
Strength-based Management
 
Strength-based by Design: a high-engagement approach to sparking innovation a...
Strength-based by Design: a high-engagement approach to sparking innovation a...Strength-based by Design: a high-engagement approach to sparking innovation a...
Strength-based by Design: a high-engagement approach to sparking innovation a...
 
Appreciative Inquiry and Strength-Based Systems
Appreciative Inquiry and Strength-Based SystemsAppreciative Inquiry and Strength-Based Systems
Appreciative Inquiry and Strength-Based Systems
 
Graft and corruption
Graft and corruptionGraft and corruption
Graft and corruption
 
Corruption in the Philippines
Corruption in the PhilippinesCorruption in the Philippines
Corruption in the Philippines
 
6 Charts You Can't Afford to Ignore: Talent Management Intelligence for the F...
6 Charts You Can't Afford to Ignore: Talent Management Intelligence for the F...6 Charts You Can't Afford to Ignore: Talent Management Intelligence for the F...
6 Charts You Can't Afford to Ignore: Talent Management Intelligence for the F...
 
Graft & corruption in the government
Graft & corruption in the governmentGraft & corruption in the government
Graft & corruption in the government
 
Strategic planning powerpoint
Strategic planning powerpointStrategic planning powerpoint
Strategic planning powerpoint
 

Ähnlich wie 2012.12 - EDDI 2012 - Workshop

2012.10 - DDI Lifecycle - Moving Forward - 3
2012.10 - DDI Lifecycle - Moving Forward - 32012.10 - DDI Lifecycle - Moving Forward - 3
2012.10 - DDI Lifecycle - Moving Forward - 3Dr.-Ing. Thomas Hartmann
 
DRI Introductory Training: Preparing Your Collections for the DRI
DRI Introductory Training: Preparing Your Collections for the DRIDRI Introductory Training: Preparing Your Collections for the DRI
DRI Introductory Training: Preparing Your Collections for the DRIdri_ireland
 
ESWC 2011 - Designing an Ontology for the Data Documentation Initiative
ESWC 2011 -  Designing an Ontology for the Data Documentation InitiativeESWC 2011 -  Designing an Ontology for the Data Documentation Initiative
ESWC 2011 - Designing an Ontology for the Data Documentation InitiativeDr.-Ing. Thomas Hartmann
 
Sakai And The Academic Enterprise
Sakai And The Academic EnterpriseSakai And The Academic Enterprise
Sakai And The Academic EnterpriseMichael Feldstein
 
Kevin Long - Preparing your collection for DRI
Kevin Long - Preparing your collection for DRIKevin Long - Preparing your collection for DRI
Kevin Long - Preparing your collection for DRIdri_ireland
 
2012.10 - Workshop on Semantic Statistics - 1
2012.10 - Workshop on Semantic Statistics - 12012.10 - Workshop on Semantic Statistics - 1
2012.10 - Workshop on Semantic Statistics - 1Dr.-Ing. Thomas Hartmann
 
e-Science, Research Data and Libaries
e-Science, Research Data and Libariese-Science, Research Data and Libaries
e-Science, Research Data and LibariesRob Grim
 
Large scale computing
Large scale computing Large scale computing
Large scale computing Bhupesh Bansal
 
DC 2012 - Leveraging the DDI Model for Linked Statistical Data in the Social...
DC 2012 - Leveraging the DDI Model for Linked Statistical Data in the Social...DC 2012 - Leveraging the DDI Model for Linked Statistical Data in the Social...
DC 2012 - Leveraging the DDI Model for Linked Statistical Data in the Social...Dr.-Ing. Thomas Hartmann
 
Silicon valley nosql meetup april 2012
Silicon valley nosql meetup  april 2012Silicon valley nosql meetup  april 2012
Silicon valley nosql meetup april 2012InfiniteGraph
 
Choosing the Right Big Data Tools for the Job - A Polyglot Approach
Choosing the Right Big Data Tools for the Job - A Polyglot ApproachChoosing the Right Big Data Tools for the Job - A Polyglot Approach
Choosing the Right Big Data Tools for the Job - A Polyglot ApproachDATAVERSITY
 
When big data meet python @ COSCUP 2012
When big data meet python @ COSCUP 2012When big data meet python @ COSCUP 2012
When big data meet python @ COSCUP 2012Jimmy Lai
 

Ähnlich wie 2012.12 - EDDI 2012 - Workshop (20)

2012.10 - DDI Lifecycle - Moving Forward - 3
2012.10 - DDI Lifecycle - Moving Forward - 32012.10 - DDI Lifecycle - Moving Forward - 3
2012.10 - DDI Lifecycle - Moving Forward - 3
 
2013.05 - IASSIST 2013 - 2
2013.05 - IASSIST 2013 - 22013.05 - IASSIST 2013 - 2
2013.05 - IASSIST 2013 - 2
 
Zloch, Bosch, Wegener: A technical perspective...
Zloch, Bosch, Wegener: A technical perspective... Zloch, Bosch, Wegener: A technical perspective...
Zloch, Bosch, Wegener: A technical perspective...
 
2013.05 - IASSIST 2013
2013.05 - IASSIST 20132013.05 - IASSIST 2013
2013.05 - IASSIST 2013
 
2012.12 - EDDI 2012 - Poster Demo
2012.12 - EDDI 2012 - Poster Demo2012.12 - EDDI 2012 - Poster Demo
2012.12 - EDDI 2012 - Poster Demo
 
DRI Introductory Training: Preparing Your Collections for the DRI
DRI Introductory Training: Preparing Your Collections for the DRIDRI Introductory Training: Preparing Your Collections for the DRI
DRI Introductory Training: Preparing Your Collections for the DRI
 
ESWC 2011 - Designing an Ontology for the Data Documentation Initiative
ESWC 2011 -  Designing an Ontology for the Data Documentation InitiativeESWC 2011 -  Designing an Ontology for the Data Documentation Initiative
ESWC 2011 - Designing an Ontology for the Data Documentation Initiative
 
Sakai And The Academic Enterprise
Sakai And The Academic EnterpriseSakai And The Academic Enterprise
Sakai And The Academic Enterprise
 
Kevin Long - Preparing your collection for DRI
Kevin Long - Preparing your collection for DRIKevin Long - Preparing your collection for DRI
Kevin Long - Preparing your collection for DRI
 
2012.10 - Workshop on Semantic Statistics - 1
2012.10 - Workshop on Semantic Statistics - 12012.10 - Workshop on Semantic Statistics - 1
2012.10 - Workshop on Semantic Statistics - 1
 
e-Science, Research Data and Libaries
e-Science, Research Data and Libariese-Science, Research Data and Libaries
e-Science, Research Data and Libaries
 
Large scale computing
Large scale computing Large scale computing
Large scale computing
 
2013.05 - LDOW 2013 @ WWW 2013
2013.05 - LDOW 2013 @ WWW 20132013.05 - LDOW 2013 @ WWW 2013
2013.05 - LDOW 2013 @ WWW 2013
 
Bosch, Wackerow: Linked data on the web
Bosch, Wackerow: Linked data on the web Bosch, Wackerow: Linked data on the web
Bosch, Wackerow: Linked data on the web
 
DC 2012 - Leveraging the DDI Model for Linked Statistical Data in the Social...
DC 2012 - Leveraging the DDI Model for Linked Statistical Data in the Social...DC 2012 - Leveraging the DDI Model for Linked Statistical Data in the Social...
DC 2012 - Leveraging the DDI Model for Linked Statistical Data in the Social...
 
Silicon valley nosql meetup april 2012
Silicon valley nosql meetup  april 2012Silicon valley nosql meetup  april 2012
Silicon valley nosql meetup april 2012
 
Choosing the Right Big Data Tools for the Job - A Polyglot Approach
Choosing the Right Big Data Tools for the Job - A Polyglot ApproachChoosing the Right Big Data Tools for the Job - A Polyglot Approach
Choosing the Right Big Data Tools for the Job - A Polyglot Approach
 
When big data meet python @ COSCUP 2012
When big data meet python @ COSCUP 2012When big data meet python @ COSCUP 2012
When big data meet python @ COSCUP 2012
 
VRA 2010, MDID users group
VRA 2010, MDID users groupVRA 2010, MDID users group
VRA 2010, MDID users group
 
DOI in HE
DOI in HEDOI in HE
DOI in HE
 

Mehr von Dr.-Ing. Thomas Hartmann

Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)Dr.-Ing. Thomas Hartmann
 
2016.02 - Validating RDF Data Quality using Constraints to Direct the Develop...
2016.02 - Validating RDF Data Quality using Constraints to Direct the Develop...2016.02 - Validating RDF Data Quality using Constraints to Direct the Develop...
2016.02 - Validating RDF Data Quality using Constraints to Direct the Develop...Dr.-Ing. Thomas Hartmann
 
2015.09. - The Role of Reasoning for RDF Validation (SEMANTiCS 2015)
2015.09. - The Role of Reasoning for RDF Validation (SEMANTiCS 2015)2015.09. - The Role of Reasoning for RDF Validation (SEMANTiCS 2015)
2015.09. - The Role of Reasoning for RDF Validation (SEMANTiCS 2015)Dr.-Ing. Thomas Hartmann
 
2015.09 - Guidance, Please! Towards a Framework for RDF-Based Constraint Lang...
2015.09 - Guidance, Please! Towards a Framework for RDF-Based Constraint Lang...2015.09 - Guidance, Please! Towards a Framework for RDF-Based Constraint Lang...
2015.09 - Guidance, Please! Towards a Framework for RDF-Based Constraint Lang...Dr.-Ing. Thomas Hartmann
 
2015.03 - The RDF Validator - A Tool to Validate RDF Data (KIM)
2015.03 - The RDF Validator - A Tool to Validate RDF Data (KIM)2015.03 - The RDF Validator - A Tool to Validate RDF Data (KIM)
2015.03 - The RDF Validator - A Tool to Validate RDF Data (KIM)Dr.-Ing. Thomas Hartmann
 
2014.10 - How to Formulate and Validate Constraints (DC 2014)
2014.10 - How to Formulate and Validate Constraints (DC 2014)2014.10 - How to Formulate and Validate Constraints (DC 2014)
2014.10 - How to Formulate and Validate Constraints (DC 2014)Dr.-Ing. Thomas Hartmann
 
2014.10 - Towards Description Set Profiles for RDF Using SPARQL as Intermedia...
2014.10 - Towards Description Set Profiles for RDF Using SPARQL as Intermedia...2014.10 - Towards Description Set Profiles for RDF Using SPARQL as Intermedia...
2014.10 - Towards Description Set Profiles for RDF Using SPARQL as Intermedia...Dr.-Ing. Thomas Hartmann
 
2014.10 - Requirements on RDF Constraint Formulation and Validation (DC 2014)
2014.10 - Requirements on RDF Constraint Formulation and Validation (DC 2014)2014.10 - Requirements on RDF Constraint Formulation and Validation (DC 2014)
2014.10 - Requirements on RDF Constraint Formulation and Validation (DC 2014)Dr.-Ing. Thomas Hartmann
 
The Next Generation of the Microdata Information System MISSY - An Integrated...
The Next Generation of the Microdata Information System MISSY - An Integrated...The Next Generation of the Microdata Information System MISSY - An Integrated...
The Next Generation of the Microdata Information System MISSY - An Integrated...Dr.-Ing. Thomas Hartmann
 
The New Microdata Information System (MISSY) - Integration of DDI-based Data ...
The New Microdata Information System (MISSY) - Integration of DDI-based Data ...The New Microdata Information System (MISSY) - Integration of DDI-based Data ...
The New Microdata Information System (MISSY) - Integration of DDI-based Data ...Dr.-Ing. Thomas Hartmann
 
Use Cases and Vocabularies Related to the DDI-RDF Discovery Vocabulary (EDDI ...
Use Cases and Vocabularies Related to the DDI-RDF Discovery Vocabulary (EDDI ...Use Cases and Vocabularies Related to the DDI-RDF Discovery Vocabulary (EDDI ...
Use Cases and Vocabularies Related to the DDI-RDF Discovery Vocabulary (EDDI ...Dr.-Ing. Thomas Hartmann
 
Towards the Discovery of Person-Level Data (SemStats, ISWC 2013) [2013.10]
Towards the Discovery of Person-Level Data (SemStats, ISWC 2013) [2013.10]Towards the Discovery of Person-Level Data (SemStats, ISWC 2013) [2013.10]
Towards the Discovery of Person-Level Data (SemStats, ISWC 2013) [2013.10]Dr.-Ing. Thomas Hartmann
 
2013.02 - 7th Workshop of German Panel Surveys
2013.02 - 7th Workshop of German Panel Surveys2013.02 - 7th Workshop of German Panel Surveys
2013.02 - 7th Workshop of German Panel SurveysDr.-Ing. Thomas Hartmann
 
2012.10 - DDI Lifecycle - Moving Forward - 3
2012.10 - DDI Lifecycle - Moving Forward - 32012.10 - DDI Lifecycle - Moving Forward - 3
2012.10 - DDI Lifecycle - Moving Forward - 3Dr.-Ing. Thomas Hartmann
 

Mehr von Dr.-Ing. Thomas Hartmann (20)

Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)
Doctoral Examination at the Karlsruhe Institute of Technology (08.07.2016)
 
KIT Graduiertenkolloquium 11.05.2016
KIT Graduiertenkolloquium 11.05.2016KIT Graduiertenkolloquium 11.05.2016
KIT Graduiertenkolloquium 11.05.2016
 
2016.02 - Validating RDF Data Quality using Constraints to Direct the Develop...
2016.02 - Validating RDF Data Quality using Constraints to Direct the Develop...2016.02 - Validating RDF Data Quality using Constraints to Direct the Develop...
2016.02 - Validating RDF Data Quality using Constraints to Direct the Develop...
 
2015.09. - The Role of Reasoning for RDF Validation (SEMANTiCS 2015)
2015.09. - The Role of Reasoning for RDF Validation (SEMANTiCS 2015)2015.09. - The Role of Reasoning for RDF Validation (SEMANTiCS 2015)
2015.09. - The Role of Reasoning for RDF Validation (SEMANTiCS 2015)
 
2015.09 - Guidance, Please! Towards a Framework for RDF-Based Constraint Lang...
2015.09 - Guidance, Please! Towards a Framework for RDF-Based Constraint Lang...2015.09 - Guidance, Please! Towards a Framework for RDF-Based Constraint Lang...
2015.09 - Guidance, Please! Towards a Framework for RDF-Based Constraint Lang...
 
2015.03 - The RDF Validator - A Tool to Validate RDF Data (KIM)
2015.03 - The RDF Validator - A Tool to Validate RDF Data (KIM)2015.03 - The RDF Validator - A Tool to Validate RDF Data (KIM)
2015.03 - The RDF Validator - A Tool to Validate RDF Data (KIM)
 
2014.12 - Let's Disco - 2 (EDDI 2014)
2014.12 - Let's Disco - 2 (EDDI 2014)2014.12 - Let's Disco - 2 (EDDI 2014)
2014.12 - Let's Disco - 2 (EDDI 2014)
 
2014.12 - Let's Disco (EDDI 2014)
2014.12 - Let's Disco (EDDI 2014)2014.12 - Let's Disco (EDDI 2014)
2014.12 - Let's Disco (EDDI 2014)
 
2014.10 - How to Formulate and Validate Constraints (DC 2014)
2014.10 - How to Formulate and Validate Constraints (DC 2014)2014.10 - How to Formulate and Validate Constraints (DC 2014)
2014.10 - How to Formulate and Validate Constraints (DC 2014)
 
2014.10 - Towards Description Set Profiles for RDF Using SPARQL as Intermedia...
2014.10 - Towards Description Set Profiles for RDF Using SPARQL as Intermedia...2014.10 - Towards Description Set Profiles for RDF Using SPARQL as Intermedia...
2014.10 - Towards Description Set Profiles for RDF Using SPARQL as Intermedia...
 
2014.10 - Requirements on RDF Constraint Formulation and Validation (DC 2014)
2014.10 - Requirements on RDF Constraint Formulation and Validation (DC 2014)2014.10 - Requirements on RDF Constraint Formulation and Validation (DC 2014)
2014.10 - Requirements on RDF Constraint Formulation and Validation (DC 2014)
 
The Next Generation of the Microdata Information System MISSY - An Integrated...
The Next Generation of the Microdata Information System MISSY - An Integrated...The Next Generation of the Microdata Information System MISSY - An Integrated...
The Next Generation of the Microdata Information System MISSY - An Integrated...
 
The New Microdata Information System (MISSY) - Integration of DDI-based Data ...
The New Microdata Information System (MISSY) - Integration of DDI-based Data ...The New Microdata Information System (MISSY) - Integration of DDI-based Data ...
The New Microdata Information System (MISSY) - Integration of DDI-based Data ...
 
Use Cases and Vocabularies Related to the DDI-RDF Discovery Vocabulary (EDDI ...
Use Cases and Vocabularies Related to the DDI-RDF Discovery Vocabulary (EDDI ...Use Cases and Vocabularies Related to the DDI-RDF Discovery Vocabulary (EDDI ...
Use Cases and Vocabularies Related to the DDI-RDF Discovery Vocabulary (EDDI ...
 
Towards the Discovery of Person-Level Data (SemStats, ISWC 2013) [2013.10]
Towards the Discovery of Person-Level Data (SemStats, ISWC 2013) [2013.10]Towards the Discovery of Person-Level Data (SemStats, ISWC 2013) [2013.10]
Towards the Discovery of Person-Level Data (SemStats, ISWC 2013) [2013.10]
 
2013.05 - IASSIST 2013 - 3
2013.05 - IASSIST 2013 - 32013.05 - IASSIST 2013 - 3
2013.05 - IASSIST 2013 - 3
 
2013.02 - 7th Workshop of German Panel Surveys
2013.02 - 7th Workshop of German Panel Surveys2013.02 - 7th Workshop of German Panel Surveys
2013.02 - 7th Workshop of German Panel Surveys
 
2012.11 - ISWC 2012 - DC - 2
2012.11 - ISWC 2012 - DC -  22012.11 - ISWC 2012 - DC -  2
2012.11 - ISWC 2012 - DC - 2
 
2012.11 - ISWC 2012 - DC - 1
2012.11 - ISWC 2012 - DC - 12012.11 - ISWC 2012 - DC - 1
2012.11 - ISWC 2012 - DC - 1
 
2012.10 - DDI Lifecycle - Moving Forward - 3
2012.10 - DDI Lifecycle - Moving Forward - 32012.10 - DDI Lifecycle - Moving Forward - 3
2012.10 - DDI Lifecycle - Moving Forward - 3
 

2012.12 - EDDI 2012 - Workshop

  • 1. Create Your Own Information Systems on the Basis of DDI-Lifecycle 4th Annual European DDI User Conference (EDDI 2012) 03.12.2012 Thomas Bosch Matthäus Zloch M.Sc. (TUM) M.Sc. Ph.D. student Ph.D. student boschthomas.blogspot.com GESIS - Leibniz Institute for the Social Sciences GESIS - Leibniz Institute for the Social Sciences
  • 2. Introduction of Attendees • What is your name? • What’s your organization? • Why are you here? • What are you mostly interested in? • What do you expect from this tutorial? • … Appr. 30 seconds 2
  • 3. Goals of This Presentation • Introduction to DDI-Lifecycle • Give a DDI Lifecycle overview • Show you (basic) elements of the DDI-Lifecycle and the main structure • How does Identification, Versioning, Maintenance work? • DDI Discovery Vocabulary “disco” • How to use the DDI Documentation • Individual software project • Introduce you to basics of software architecture design • Show you how the software architecture of your project may look like • Show you how a software project leverages DDI and the disco model • Introduce you to a project template for using the disco model in the back-end 3
  • 4. Outline Matthäus Thomas • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model 4
  • 5. Outline • Persistence Layer • Physical Data Product • Programming Interface • Physical Instance • Types of physical data storages • Archive • Examples • DDIProfile • Outlook • DDI Serialization Examples • DDI-XML • DDI-RDF / disco-spec • Relational-DB 5
  • 6. Outline Matthäus Thomas • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 6
  • 7. Outline • Persistence Layer • Physical Data Product • Programming Interface • Physical Instance • Types of physical data storages • Archive • Examples • DDIProfile Data Storage Access • Outlook • DDI Serialization Examples • DDI-XML • DDI-RDF / disco-spec • Relational-DB Data Storage 7
  • 8. Feel free to ask questions at any time! 8
  • 9. What comes next? • Before we start with the use-case Missy… • Thomas, can you give us a top-level overview of DDI? • You might also include how DDI treats identification, versioning and maintenance… 9
  • 10. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 10
  • 12. using Survey Study Instruments measures made up of about Concepts Questions Universes
  • 13. with values of Categories/ Codes, Numbers Questions Variables collect made up of Responses Data Files resulting in
  • 14. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 14
  • 15. Element Types IDs must be unique within maintainables Not identified elements are contained by other element types 15
  • 16. Identification Agency, ID, Version or URN (= Agency + ID + Version) 16
  • 18. Schemes Very often, identifiable and versionable elements are maintained in parent schemes 18
  • 19. Maintenance Rules Maintenance Agency • identified by a reserved code (based on its domain name; similar to it’s website and e-mail) • Registration of maintenance agency (e.g. GESIS) Ownership • Maintenance agencies own the objects they maintain • Only they are allowed to change or version the objects Reference • Other organizations may reference external items in their own schemes, but may not change those items • You can make a copy which you change and maintain, but once you do that, you own it! 19
  • 21. Identifiable (DDI 3.1 XML) <DataItem id=“AB347”> … </DataItem> 21
  • 22. Versionable (DDI 3.1 XML) <Variable id=“V1” version=“1.1.0” versionDate=“20012-11-12”> <VersionResponsibility> Thomas Bosch </VersionResponsibility> <VersionRationale> Spelling Correction </VersionRationale> … </Variable > 22
  • 23. Maintainable (DDI 3.1 XML) <VariableScheme id=“VarSch01” agency =“us.mpc” version=“1.4.0” versionDate=“2009-02-12”> … </VariableScheme> 23
  • 28. URN (DDI 3.1) urn:ddi:us.mpc:VariableScheme. VarSch01.1.4.0:Variable.V1.1.1.0 Identifiable (element type, ID, version number) ID must be unique within maintainable 28
  • 29. URN Resolution • DNS-based resolution for DDI services • URN can be resolved by DNS means • http://registry.ddialliance.org/ • details and library • list of available DNS services for this DDI agency • URNs are mapped to local URLs • XML Catalog can be used as simple resolution mechanism 29
  • 30. Referencing (DDI 3.1) (Agency, ID, Version) vs. URN Often, these are inherited from a maintainable object 30
  • 31. Referencing (DDI 3.1 + 3.2) • Internal • In same DDI Instance • (agency, ID, version) • Maintainable (agency, ID, version) • Identifiable (ID), versionable (ID, version) • External • To other DDI Instances • URN > (agency, ID, version) 31
  • 32. Internal Reference (DDI 3.1) <VariableReference isReference=“true” isExternal=“false” lateBound=“false”> <Scheme isReference=“true” isExternal=“false” lateBound=“false”> <ID>VarSch01</ID> <IdenftifyingAgency>us.mpc</IdentifyingAgency> <Version>1.4.0</Version> </Scheme> <ID>V1</ID> <IdenftifyingAgency>us.mpc</IdentifyingAgency> <Version>1.1.0</Version> </VariableReference> 32
  • 33. External Reference (DDI 3.1) <VariableReference isReference=“true” isExternal=“true” lateBound=“false”> <urn> urn:ddi:us.mpc:VariableScheme.VarSch01.1.4.0:Variable. V1.1.1.0 </urn> </VariableReference> 33
  • 34. Identifiers (DDI 3.2) • Why new IDs? • better support of persistent identification (less semantics, context information) • URN or (agency, ID, version) • Version changes? • Administrative metadata  No • Payload metadata  yes • No inheritance • Why? (implementation problems, …) • Deprecated vs. Canonical IDs • Deprecated (no version of maintainables, colons as separators) • Mechanism for roundtrip available between DDI 3.2 canonical ID, DDI 3.2 deprecated ID, DDI 3.1 ID 34
  • 35. Canonical URN (DDI 3.2) urn:ddi:us.mpc:VariableScheme. VarSch01.1.4.0:Variable.V1.1.1.0 No maintainables 35
  • 36. Canonical URN (DDI 3.2) urn:ddi:us.mpc:Variable.V1.1.1.0 No maintainables 36
  • 37. Canonical URN (DDI 3.2) urn:ddi:us.mpc:Variable.V1.1.1.0 Agency, ID, version (Identifiable, Versionable) 37
  • 38. Canonical URN (DDI 3.2) urn:ddi:us.mpc:Variable.V1.1.1.0 ID unique in agency (not maintainable) 38
  • 39. Canonical URN (DDI 3.2) urn:ddi:us.mpc:Variable.V1.1.1.0 No element types 39
  • 40. Canonical URN (DDI 3.2) urn:ddi:us.mpc:V1.1.1.0 No element types 40
  • 41. Canonical URN (DDI 3.2) urn:ddi:us.mpc:Variable.V1.1.1.0 Colon is separator 41
  • 42. Canonical URN (DDI 3.2) urn:ddi:us.mpc:Variable.V1:1.1.0 Colon is separator 42
  • 43. Canonical URN (DDI 3.2) urn:ddi:us.mpc:V1:1.1.0 No maintainables Agency, ID, version (Identifiable, Versionable) ID unique in agency (not maintainable) No element types Colon is separator 43
  • 44. What comes next? • This was the DDI Overview with identification, versioning, and maintenance • But, what is the use-case actually? • What do you want to use DDI for? 44
  • 45. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 45
  • 46. General Information About Missy • Missy – Microdata Information System • Largest household survey in Europe • Provides detailed information about individual datasets • Currently consists of “microcensus" survey, which is comprised of statistics about, e.g. • general population in Germany • situation about employment market • occupation, professional education, income, legal insurance, etc. 46
  • 47. General Information About Missy • May be split in two parts • Missy Web (end-user front-end part) • Missy Editor for documentation (back-end part) • Consists of approx. 500 Variables & Questions • Captures 25 years, since 1973 47
  • 48. Missy 3 • That’s what it’s all about! The “next generation Missy” • Integration of further surveys, e.g. EU-SILC, EU-LFS, … • Implementation of Missy Editor as a Web-Application 48
  • 49. Use-Cases in Missy • General Information about survey “microcensus” • Variables by thematic classification and year • List of variables by year • List of generated variables by year • Show details of a variable with statistics • Variable-Time Matrix • List of variables by thematic classification and year (selectable) • Questionnaire Catalogue 49
  • 50. Use-Cases in Missy • General Information about survey “microcensus” • Variables by thematic classification and year • List of variables by year • List of generated variables by year • Show details of a variable with statistics • Variable-Time Matrix • List of variables by thematic classification and year (selectable) • Questionnaire Catalogue • In future: browse by survey and country! 50
  • 51. Use-Cases in Missy • General Information about survey “microcensus” • Variables by thematic classification and year • List of variables by year • List of generated variables by year • Show details of a variable with statistics • Variable-Time Matrix • List of variables by thematic classification and year (selectable) • Questionnaire Catalogue 51
  • 52. 52
  • 53. 53
  • 54. 54
  • 55. 55
  • 56. 56
  • 57. Requirements to Software Developers • Complex data model to represent use-cases • Focus lies on reusability • Flexible Web Application Framework and modern architecture • Service-oriented • Semantic Web technologies • Creation of an abstract framework and architecture • should be well designed to be able to be extended and reusable 57
  • 58. What comes next? • We have seen Missy as an use-case with several fields which holds values • I would like to use DDI • What are the main structures of DDI-L, Thomas? 58
  • 59. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 59
  • 60. 60
  • 61. 61
  • 62. Modules 62
  • 63. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 63
  • 67. ResourcePackage Allows packaging of any maintainable item as a resource item Structure to publish non-study-specific materials for reuse 67
  • 68. ResourcePackage Any module (except: StudyUnit, Group, LocalHoldingPackage) 68
  • 69. ResourcePackage Any Scheme Internal or external references to schemes and elements within schemes 69
  • 70. Group 70
  • 72. Group 72
  • 76. LocalHoldingPackage • Add local content to a deposited study unit or group • without changing the version of the study unit, group 76
  • 77. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 77
  • 78. StudyUnit 78
  • 79. StudyUnit 79
  • 80. StudyUnit 80
  • 81. Coverage 81
  • 82. Coverage 82
  • 91. Citation 91
  • 92. Citation 92
  • 93. Citation 93
  • 94. Citation 94
  • 95. How the presentation is continued • Ok, I have shown you • the main structures as well as • the main modules of DDI-Lifecycle • If you want to use DDI you have to have a model or idea of your software architecture • How would your software architecture look like? 95
  • 96. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 96
  • 97. What makes Missy interesting • Cross-linking between different multilingual studies • enabled by the common data model • provides different use-cases • Missy leverages DDI as its back-end data model and exchange format • Modern web project architecture, e.g. Multitier, MVC, etc. • Is designed to be published as open-source project with an API to persist DDI data 97
  • 98. Software Architecture • Standard technologies to develop software • a multitier architecture • Model-View-Controller (MVC-Pattern) • Project management software, e.g. Maven • Multitier architecture separates the project into logical parts • presentation • business logic or application processing • data management • persistence • … 98
  • 99. 99
  • 100. Model-View-Controller • separation of • representation of information • interactions with the data • components • the key is to have logically separated parts, where people might work collaboratively 100
  • 101. MVC – Interactions View controls Controller accesses manipulates Model 101
  • 102. Model • Represents the business objects, i.e. real life concepts, and their relations to each other • Sometimes also includes some business logic • Independent of the presentation and the controls 102
  • 103. View • Responsible for the presentation of information to the user • Provides actions, so that the user can interact with the application • "Knows" the model and can access it • Usually does not display the whole model, but rather special "views" or use-cases of it 103
  • 104. Controller • Controls the presentations, i.e. views • Receives actions from users • interprets/evaluates them • acts accordingly • Manipulates the data / model • Usually each view has its own controller • Be warned of code reuse of controllers • controllers are most often the throwaway components 104
  • 105. View Technologies View controls … Controller accesses manipulates Model 105
  • 106. Data Model View controls … Controller accesses manipulates Model 106
  • 107. Missy Technologies View controls Controller accesses manipulates Model 107
  • 108. What we have discussed so far • Brief introduction to Missy and the use-cases • With which technologies a web application framework might be build up today • How a modern software architecture looks like 108
  • 109. How the presentation is continued • What the DDI-L core modules are and how they are organized • Introduction the disco model and • How to extend it 109
  • 110. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 110
  • 116. Concept 116
  • 117. Concept 117
  • 118. Universe 118
  • 119. Universe 119
  • 120. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 120
  • 124. CodeScheme 124
  • 125. CodeScheme 125
  • 126. Code 126
  • 127. Code 127
  • 130. Category 130
  • 131. Category 131
  • 134. Variable 134
  • 135. Variable 135
  • 146. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 146
  • 150. Methodology 150
  • 151. Methodology 151
  • 156. QuestionItem 156
  • 157. QuestionItem 157
  • 160. Instrument 160
  • 161. Instrument 161
  • 165. Sequence 165
  • 167. Sequence (XML) <Sequence> <ControlConstructReference> [Q1] </ControlConstructReference> <ControlConstructReference> [Q2] </ControlConstructReference> <ControlConstructReference> [Q3] </ControlConstructReference> </Sequence> 167
  • 171. StatementItem (XML) <StatementItem> <DisplayText> statement </DisplayText> </StatementItem> 171
  • 175. QuestionConstruct (XML) <QuestionConstruct> <QuestionReference> [Q1] </QuestionReference> </QuestionConstruct> 175
  • 177. IfThenElse 177
  • 179. IfThenElse (XML) <IfThenElse> <IfCondition> <Code programmingLanguage="…"> yes </Code> <SourceQuestionReference> [Q1] </SourceQuestionReference> </IfCondition> <ThenConstructReference> [Q2] </ThenConstructReference> <ElseIf> <IfCondition> <Code programmingLanguage="…"> no </Code> <SourceQuestionReference> [Q1] </SourceQuestionReference> </IfCondition> <ThenConstructReference> [Q3] </d:ThenConstructReference> </ElseIf> </IfThenElse> 179
  • 181. Loop 181
  • 183. Loop (XML) <Loop> <InitialValue> <Code programmingLanguage=“Java"> i = 0 </Code> </InitialValue> <LoopWhile> <Code programmingLanguage=“Java"> i < 10 </Code> </LoopWhile> <StepValue> <Code programmingLanguage=“Java"> i = i + 1 </Code> </StepValue> <ControlConstructReference> [Q1] </ControlConstructReference> </Loop> 183
  • 185. RepeatWhile 185
  • 187. RepeatWhile (XML) <RepeatWhile> <WhileCondition> <Code programmingLanguage=“Neutral"> no </Code> <SourceQuestionReference> [Q1] </ SourceQuestionReference > </WhileCondition> <WhileConstructReference> [Q1] </WhileConstructReference> </RepeatWhile> 187
  • 189. RepeatUntil 189
  • 191. RepeatUntil (XML) <RepeatUntil> <UntilCondition> <Code programmingLanguage="Neutral"> yes </Code> <SourceQuestionReference> [Q1] </SourceQuestionReference> </UntilCondition> <UntilConstructReference> [Q1] </UntilConstructReference> </d:RepeatUntil> 191
  • 193. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 193
  • 194. 194
  • 195. How to extend the disco model? 195
  • 196. What we have seen so far • An introduction to the core DDI modules • A short introduction to the disco-model, which covers the most important parts of DDI-Lifecycle and DDI-Codebook 196
  • 197. How the presentation is continued • Now that we know the data model… • How can it be used for the Missy use-case, Matthäus? • Can we create a mapping of the fields in the disco-model and the fields in Missy? 197
  • 198. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 198
  • 200. 200
  • 201. 201
  • 202. 202
  • 203. 203
  • 204. 204
  • 205. Sub-Sample: • is-a Sub-Universe with the UniverseType „subSample“ Filter Statement • is-a Sub-Universe with the UniverseType „filterStatement“ 205
  • 206. 206
  • 207. 207
  • 208. 208
  • 209. 209
  • 210. 210
  • 211. 211
  • 212. 212
  • 213. 213
  • 214. What we have seen so far • disco-model • designed for the discovery use-case • provides object types, properties and data type properties designed for discovery use-case • One Missy use-case and all the data type and object properties that are covered • The idea of how theoretically the disco can be extended 214
  • 215. How the presentation is continued • But, what if a project has specific requirements to the model, that are not covered by the disco model? • How are we going to implement that? 215
  • 216. What comes next? • Missy – General Information • DDI-L Overview • Requirements to Developers • Identification, Versioning, Maintenance • Use-Cases in Missy • DDI-L Main Structures • DDI Instance Presentation • Study Unit • Software Architecture • Conceptual Component • Multitier • Logical Products • MVC • Data Collection • Missy Data Model • disco-model • Extendable Data Model Business Layer 216
  • 218. 218
  • 219. 219
  • 220. disco-model API • The disco-model implementation is designed to be extended • For one project’s individual requirements • Speaking in MVC-pattern language: the model is separated and independent from the views and the controllers 220
  • 221. disco-model API • abstract classes • define the basic elements, i.e. data type and object properties, that are provided and covered by the disco-model • may be extended, but cannot be instantiated • e.g. AbstractVariable • model classes • represent the actual entity types of disco-model • may be instantiated and reused • e.g. Variable 221
  • 222. disco-model API • abstract classes • define the basic elements, i.e. data type and object properties, that are provided and covered by the disco-model • may be extended, but cannot be instantiated • e.g. AbstractVariable • model classes • represent the actual entity types of disco-model • may be instantiated and reused • e.g. Variable • Implementation of the Template Pattern 222
  • 223. 223
  • 226. Project Specific Extensions • Projects may just simply use • and extend the defined classes • Introduce new fields and methods, such as getter/setter • Change the type of existing fields when they overwrite the return type (in Java this is only possible in case of covariant returns) 226
  • 227. How is the presentation continued • Now that I have the data model… and I can extend it • I just need to store my data somewhere… • Are there any persistence mechanisms that are offered by the DDI-Lifecycle model? 227
  • 228. Outline • Persistence Layer • Physical Data Product • Programming Interface • Physical Instance • Types of physical data storages • Archive • Examples • DDIProfile Data Storage Access • Outlook • DDI Serialization Examples • DDI-XML • DDI-RDF / disco-spec • Relational-DB Data Storage 228
  • 232. Outline • Persistence Layer • Physical Data Product • Programming Interface • Physical Instance • Types of physical data storages • Archive • Examples • DDIProfile Data Storage Access • Outlook • DDI Serialization Examples • DDI-XML • DDI-RDF / disco-spec • Relational-DB Data Storage 232
  • 236. Outline • Persistence Layer • Physical Data Product • Programming Interface • Physical Instance • Types of physical data storages • Archive • Examples • DDIProfile Data Storage Access • Outlook • DDI Serialization Examples • DDI-XML • DDI-RDF / disco-spec • Relational-DB Data Storage 236
  • 237. Archive 237
  • 238. Archive 238
  • 239. Archive Contains persistent lifecycle events Contains archive-specific information (archive reference, access, funding) 239
  • 240. Outline • Persistence Layer • Physical Data Product • Programming Interface • Physical Instance • Types of physical data storages • Archive • Examples • DDIProfile Data Storage Access • Outlook • DDI Serialization Examples • DDI-XML • DDI-RDF / disco-spec • Relational-DB Data Storage 240
  • 241. DDIProfile 241
  • 242. DDIProfile 242
  • 243. DDIProfile Describes which DDI-L elements you use in your institution’s DDI flavor 243
  • 244. How the presentation is continued • How can the disco-model be persisted? and • How can it be done in a project with individual data model elements? • How do we do this in the Missy application? 244
  • 245. Outline • Persistence Layer • Physical Data Product • Programming Interface • Physical Instance • Types of physical data storages • Archive • Examples • DDIProfile Data Storage Access • Outlook • DDI Serialization Examples • DDI-XML • DDI-RDF / disco-spec • Relational-DB Data Storage 245
  • 247. 247
  • 248. 248
  • 249. Persistence Layer – Motivation • In a well-defined software architecture the application itself does not need to know how the data is stored • It just needs to know the API • Methods that are provided to access and store objects • May be abstracted away from the actual implementation • An actual implementation or strategy can just be a matter of configuration 249
  • 250. Persistence Layer – Strategy Pattern • Implementation of the Strategy Pattern • simple interface • actual implementations are the different strategies • A strategy is an implementation of the actual type of persistence or physical storage, respectively • e.g. DDI-L-XML, DDI-RDF, XML-DB, Relational-DB, etc. 250
  • 251. 251
  • 252. Modules • disco-persistence-api • Defines persistence functionality for model components regardless of the actual type of physical persistence • disco-persistence-relational • Implements the persistence functionality defined in disco-persistence-api with respect to the usage of relational DBs • disco-persistence-xml • Implements the persistence functionality defined in disco-persistence-api with respect to the usage of DDI-XML • disco-persistence-rdf • Implements the persistence functionality defined in disco-persistence-api with respect to the usage of the disco-specification 252
  • 253. Data Access Object • A DAO is an object that is responsible for providing methods for reading and storing of objects in our model • A DAO is again implemented against an interface • For each business object in the data model there exists a DAO • Persistence strategy interface defines methods for obtaining specific DAOs (data access objects) 253
  • 254. DAO Pattern 254
  • 255. disco-persistence-api • Provides an API and implementations for the disco-model • PersistenceService • Available in the disco-persistence-api • Encapsulates the actual strategy • Strategy can be instantiated and injected into the PersistenceService 255
  • 256. Extendable disco-persistence-api • But, projects have own requirements to the data model • => new business objects are included • => DAOs need to be written for reading and storing these objects • => these DAOs also have to implement the appropriate type of persistence used by that project • => this API has to be extendable 256
  • 257. Extendable disco-persistence-api • But, projects have own requirements to the data model • => new business objects are included • => DAOs need to be written for reading and storing these objects • => these DAOs also have to implement the appropriate type of persistence used by that project • => this API has to be extendable • Luckily it is, because of the interface structure! • Projects only need to implement the defined methods 257
  • 258. 258
  • 259. 259
  • 260. Missy Specific Modules • missy-persistence-api • Defines additional persistence functionality for model components regardless of the actual type of physical persistence • missy-persistence-relational • Implements additional persistence functionality for model components with respect to the usage of relational DBs • missy-persistence-xml • Implements additional persistence functionality for model components with respect to the usage of DDI-XML • missy-persistence-rdf • Implements additional persistence functionality for model components with respect to the usage of the disco-specification 260
  • 261. What we have seen so far • Implementation of a persistence API for the disco model • How the persistence API can be extended in individual projects 261
  • 262. How the presentation is continued • How does, on the persistence level, the serialization of data stored in DDI really look like? 262
  • 263. Outline • Persistence Layer • Physical Data Product • Programming Interface • Physical Instance • Types of physical data storages • Archive • Examples • DDIProfile Data Storage Access • Outlook • DDI Serialization Examples • DDI-XML • DDI-RDF / disco-spec • Relational-DB Data Storage 263
  • 265. Questionnaire – Question (DDI 3.1 XML) 265
  • 266. DataCollection Instrument IntrumentName ControlConstructReference ControlCostructScheme QuestionConstruct QuestionReference QuestionScheme QuestionItem QuestionItemName QuestionText 266
  • 267. Questionnaire – Question (DDI-RDF) 267
  • 269. Variable – SummaryStatistics (DDI 3.1 XML) PhysicalInstance Statistics VariableStatistics VariableReference SummaryStatistic SummaryStatisticType 'Minimum' Value 10 Variable VariableName 'v1' Label 'Age' 269
  • 271. Outline • Persistence Layer • Physical Data Product • Programming Interface • Physical Instance • Types of physical data storages • Archive • Examples • DDIProfile Data Storage Access • Outlook • DDI Serialization Examples • DDI-XML • DDI-RDF / disco-spec • Relational-DB Data Storage 271
  • 275. Inheritance of Properties From Identifiable 275
  • 276. Inheritance of Properties From abstract class 276
  • 277. Declaration of own Properties Own properties 277
  • 278. What comes next? • What are you going to do with the MISSY project? • Is there ongoing work? 278
  • 279. Outline • Persistence Layer • Physical Data Product • Programming Interface • Physical Instance • Types of physical data storages • Archive • Examples • DDIProfile Data Storage Access • Outlook • DDI Serialization Examples • DDI-XML • DDI-RDF / disco-spec • Relational-DB Data Storage 279
  • 280. Outlook • YES! There is ongoing work! • Use the API in other projects, e.g. GESIS internal projects like StarDat • Publish the persistence API as open-source on github • Two presentations at IASSIST 2013 • One poster at IASSIST 2013 280
  • 281. Poster • Shows detailed information regarding Missy • Multitier Architecture, MVC • Employed Technologies • Specific Maven Modules • Project Home • Excerpts from disco-model and the mapping of the fields 281
  • 282. 282
  • 283. Thank you for your attention… • Feel free to download the sources from GitHub: https://github.com/missy-project • Give us feedback! • Feel free to criticize! Thomas Bosch Matthäus Zloch GESIS - Leibniz Institute for the Social Sciences GESIS - Leibniz Institute for the Social Sciences thomas.bosch@gesis.org matthaeus.zloch@gesis.org boschthomas.blogspot.com 283