7. SDMX SDMX is primarily focused on the exchange and dissemination of statistical data and metadata. There are normally two different approaches to exchanging data: PUSH and PULL.
8. SDMX PUSH mode means that the data provider takes action to send the data to the party collecting the data. PULL mode means that the data provider makes the data available via the Internet, and the data consumer then fetches the data on its own initiative.
9. SDMX SDMX promotes a “data sharing” model to facilitate low-cost, high-quality exchange of statistical data and metadata. Data Providers publish the availability of data/metadata to Data Consumers, and the latter are responsible for fetching the data/metadata at will.
20. Structural Definitions
Concepts: TOPIC, COUNTRY, FLOW
Topic: A = Brady Bonds; B = Bank Loans; C = Debt Securities
Country: AR = Argentina; MX = Mexico; SA = South Africa
Stock/Flow: 1 = Stock; 2 = Flow
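The concepts and code lists on this slide can be sketched in code as follows. This is only an illustration: the class name `Dimension`, the function `describe_key`, and the dot-separated key notation are assumptions made for the example and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Dimension:
    """One concept together with its code list (hypothetical helper class)."""
    concept: str
    codes: dict  # code -> label


# Concepts and codes copied from the slide.
DIMENSIONS = [
    Dimension("TOPIC", {"A": "Brady Bonds", "B": "Bank Loans", "C": "Debt Securities"}),
    Dimension("COUNTRY", {"AR": "Argentina", "MX": "Mexico", "SA": "South Africa"}),
    Dimension("FLOW", {"1": "Stock", "2": "Flow"}),
]


def describe_key(key, dimensions=DIMENSIONS, sep="."):
    """Check a key such as 'B.MX.1' against the code lists and decode it.

    The separator and the ordering of the parts are assumptions of this sketch.
    """
    parts = key.split(sep)
    if len(parts) != len(dimensions):
        raise ValueError(f"expected {len(dimensions)} codes, got {len(parts)}")
    described = {}
    for dim, code in zip(dimensions, parts):
        if code not in dim.codes:
            raise ValueError(f"{code!r} is not a valid code for {dim.concept}")
        described[dim.concept] = dim.codes[code]
    return described


print(describe_key("B.MX.1"))
```

A key that uses a code missing from its code list, for example `B.XX.1`, raises a `ValueError`. Rejecting such keys is the practical purpose of agreeing on structural definitions before any data are exchanged.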
34. IPA 2007 - Tirana - INSTAT
Data Set Identifier | Variables | Form | Description
STSIND_PROD (_M, _Q) | 110 | I | Production in industry
STSIND_TURN (_M, _Q) | 120, 121, 122 | N, I | Turnover in industry, total, domestic and non-domestic (total, Euro-zone, non-Euro-zone)
STSIND_ORD (_M, _Q) | 130, 131, 132 | N, I | New orders received in industry, total, domestic and non-domestic (total, Euro-zone, non-Euro-zone)
STSIND_EMPL (_M, _Q) | 210 | N, I | Number of persons employed, Number of employees, in industry
STSIND_HOUR (_M, _Q) | 220 | N, I | Hours worked in industry
STSIND_EARN (_M, _Q) | 230 | N, I | Gross wages and salaries in industry
STSIND_PRIC (_M, _Q) | 310, 311, 312, 340 | I | Output prices in industry, total, domestic market, non-domestic market (total, Euro-zone, non-Euro-zone), import prices (total, Euro-zone, non-Euro-zone)
STSCONS_PROD (_M, _Q) | 110, 115, 116 | I | Production in construction, total, building construction, civil engineering
STSCONS_ORD (_M, _Q) | 130, 135, 136 | N, I | New orders received in construction, total, building construction and civil engineering
STSCONS_EMPL (_M, _Q) | 210, 211 | N, I | Number of persons employed, Number of employees, in construction
STSCONS_HOUR (_M, _Q) | 220 | N, I | Hours worked in construction
STSCONS_EARN (_M, _Q) | 230 | N, I | Gross wages and salaries in construction
STSCONS_PRIC (_M, _Q) | 310, 320, 321, 322 | I | Output prices in construction, construction costs, material costs, labour costs
STSCONS_PERM (_M, _Q) | 411, 412 | N, I | Building permits, number of dwellings or square metres of useful floor area
STSRTD_TURN (_M, _Q) | 120, 123 | N, I | Turnover in retail trade, value or deflated
STSRTD_EMPL (_M, _Q) | 210, 211 | N, I | Number of persons employed, Number of employees, in retail trade
STSSERV_TURN (_M, _Q) | 120, 123 | N, I | Turnover in repair and other services, value or deflated
STSSERV_PRIC (_M, _Q) | 310 | I | Output prices in other services
STSSERV_EMPL (_M, _Q) | 210, 211 | N, I | Number of persons employed, Number of employees, in repair and other services
STSSERV_CAR (_M, _Q) | | | Number of car registrations
STSOTHER_OTH (_M, _Q) | | | Any other indicator not mentioned in the list above
35. IPA 2007 - Tirana - INSTAT
Concept Mnemonic | Concept Name | Format | Description | Code list
ADJUSTMENT | Adjustment | AN1 | Code defining the adjustment of data such as working day or seasonally adjusted, etc. | CL_ADJUSTMENT
FREQ | Frequency | AN1 | Frequency of the series (e.g. A, Q, M). | CL_FREQ
OBS_CONF | Confidentiality flag | AN1 | Confidentiality status of the observation | CL_OBS_CONF
OBS_PRE_BREAK | Pre-break observation value | AN…15 | Observation value if the reason of the "break" did not show up. [Conditional] |
OBS_STATUS | Status flag | AN1 | Status of the observation, such as normal, estimated or provisional | CL_OBS_STATUS
OBS_VALUE | Value | AN…15 | The value of the index. |
ORGANISATION | Organisation | AN3 | Reporting/sending or receiving organisation used in the message administration section. | CL_ORGANISATION
REF_AREA | Reference area | AN2 | Reporting Country in ISO code (The country, or geographical/political group of countries that the measured economic phenomenon relates to) | CL_AREA_EE
STS_ACTIVITY | Economic Activity code | AN6 | NACE Rev. 1.1 & special STS aggregates | CL_STS_ACTIVITY
STS_BASE_YEAR | Series variation in short-term stats context | AN4 | Concept to distinguish series variations in a short-term stats context | CL_STS_BASE_YEAR
STS_INDICATOR | STS Indicator | AN4 | Type of indicator, such as production, turnover, etc. | CL_STS_INDICATOR
STS_INSTITUTION | Institution originating STS dataflow | AN1 | Institution originating STS dataflow | CL_STS_INSTITUTION
TIME_FORMAT | Time Format Code | AN3 | Technical use in message. |
TIME_PERIOD | Time Period | AN…35 | The time period of the data. |
36. IPA 2007 - Tirana - INSTAT
Code List Mnemonic | Code List Name | Format
CL_ADJUSTMENT | Adjustment code | AN1
CL_AREA_EE | Country code | AN2
CL_FREQ | Frequency code | AN1
CL_OBS_CONF | Confidentiality flag | AN1
CL_OBS_STATUS | Observation status flag | AN1
CL_ORGANISATION | Organisation code list | AN3
CL_STS_ACTIVITY | STS Economic Activity code list | AN6
CL_STS_BASE_YEAR | Suffix in short-term stats context code list | AN4
CL_STS_INDICATOR | Indicators index code | AN4
CL_STS_INSTITUTION | Institution originating STS dataflow code list | AN1
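The Format column in the concept and code list tables uses values such as AN1 and AN…15. The slides do not define this notation. The sketch below assumes the usual EDIFACT-style reading, in which AN followed by a number means exactly that many characters and AN…n means between 1 and n characters. The function name `matches_format` is hypothetical.

```python
import re

# Accept "AN3", "AN..15" and "AN…15" (the slides use the ellipsis character).
_FORMAT = re.compile(r"AN(\.\.|…)?(\d+)")


def matches_format(value, fmt):
    """Return True if `value` has a length allowed by `fmt`.

    Assumption of this sketch: only the length is checked, not the
    character set of the value.
    """
    m = _FORMAT.fullmatch(fmt.replace(" ", ""))
    if m is None:
        raise ValueError(f"unsupported format: {fmt!r}")
    variable_length = m.group(1) is not None
    n = int(m.group(2))
    if variable_length:
        return 1 <= len(value) <= n
    return len(value) == n


print(matches_format("M", "AN1"))          # FREQ value for a monthly series
print(matches_format("104.7", "AN…15"))    # an OBS_VALUE
print(matches_format("PROD1", "AN4"))      # too long for STS_INDICATOR
```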
37. IPA 2007 - Tirana - INSTAT
Value | Description | Variable
PROD | Production | 110, 115, 116
TOVT | Turnover (total turnover, non-deflated) | 120
TOVD | Turnover, domestic market (non-deflated) | 121
TOVE | Turnover, non-domestic market (non-deflated) | 122
TOVV | Turnover deflated (volume of sales) | 123
TOVX | Turnover, non-domestic market (non-deflated) (non-Euro-zone) | 122
TOVZ | Turnover, non-domestic market (non-deflated) (Euro-zone) | 122
DEFL | Deflator of sales | 330
ORDT | New orders received (total) | 130, 135, 136
ORDD | New orders received, domestic market | 131
ORDE | New orders received, non-domestic market | 132
ORDX | New orders received, non-domestic market (non-Euro-zone) | 132
PRON | Output prices for industry and services (total) | 310
PRIN | Output prices, domestic market | 311
PREN | Output prices, non-domestic market (can be approximated by unit value index, variable 313) | 312, 313
PREX | Output prices, non-domestic market (non-Euro-zone) | 312
PREZ | Output prices, non-domestic market (Euro-zone) | 312
IMPR | Import prices (total) | 340
IMPX | Import prices (non-Euro-zone) | 340
IMPZ | Import prices (Euro-zone) | 340
EMPL | Number of persons employed (can be approximated by number of employees, variable 211) | 210, 211
HOWK | Hours worked | 220
WAGE | Gross wages and salaries | 230
PNUM | Building permits, number of dwellings | 411
PSQM | Building permits: square metres of useful floor area | 412
CSTI | Construction costs (total) | 320
CSTM | Construction costs, material costs | 321
CSTL | Construction costs, labour costs | 322
CSTO | Output prices for construction (approximation for construction costs, variable 320) | 310
CREG | Car registrations (not in STS Regulation) |
46. SDMX-ML: Six standard messages
No. | Name of message | Short description | Schema file
1 | Structure Definition Message | Contains a data structure definition | Fixed
2 | Generic Data Message | Conveys data in a form independent of a data structure definition. It is designed for data provision on websites and in any scenario where applications receiving the data may not have detailed understanding of the data set's structure before they obtain the data set itself. | Fixed
3 | Compact Data Message | Exchange of large data sets in a data structure definition-dependent form | Derived from data structure definition message
4 | Utility Data Message | For schema-based functions, such as validation, in a data structure definition-dependent form | Derived from data structure definition message
5 | Cross-sectional Data Message | Exchange of many observation types in a data structure definition-dependent form | Derived from data structure definition message
6 | Query message | To query a database to obtain an SDMX-ML message as the result | Fixed
47. Cross-Sectional Data Set
<demo:DataSet REV_NUM="1" TAB_NUM="RQFI05V1">
  <demo:Group COUNTRY="FI" FREQ="A" TIME="2005" TIME_FORMAT="P1Y">
    <demo:Section DECI="0" UNIT="PERS" UNIT_MULT="0">
      <demo:ADJT OBS_STATUS="P" SEX="F" value="35" />
      <demo:DEATHST OBS_STATUS="P" SEX="F" value="23871" />
      <demo:LBIRTHST OBS_STATUS="P" SEX="F" value="28345" />
      <demo:NETMT OBS_STATUS="P" SEX="F" value="4187" />
      <demo:PJAN1T OBS_STATUS="P" SEX="F" value="2683230" />
      <demo:PJANT OBS_STATUS="P" SEX="F" value="2674534" />
      <demo:ADJT OBS_STATUS="P" SEX="M" value="131" />
      <demo:DEATHST OBS_STATUS="P" SEX="M" value="24057" />
      <demo:LBIRTHST OBS_STATUS="P" SEX="M" value="29400" />
      <demo:NETMT OBS_STATUS="P" SEX="M" value="4799" />
      <demo:PJAN1T OBS_STATUS="P" SEX="M" value="2572350" />
      <demo:PJANT OBS_STATUS="P" SEX="M" value="2562077" />
      <demo:ADJT OBS_STATUS="P" SEX="T" value="166" />
      <demo:DEATHST OBS_STATUS="P" SEX="T" value="47928" />
      <demo:LBIRTHST OBS_STATUS="P" SEX="T" value="57745" />
      <demo:NETMT OBS_STATUS="P" SEX="T" value="8986" />
      <demo:PJAN1T OBS_STATUS="P" SEX="T" value="5255580" />
      <demo:PJANT OBS_STATUS="P" SEX="T" value="5236611" />
    </demo:Section>
    <demo:Section DECI="0" UNIT="PURE_NUMB" UNIT_MULT="0">
      <demo:DIV OBS_STATUS="P" SEX="T" value="13383" />
      <demo:MAR OBS_STATUS="P" SEX="T" value="29283" />
    </demo:Section>
    <demo:Section DECI="3" UNIT="PURE_NUMB" UNIT_MULT="0">
      <demo:TFRNSI SEX="T" value="1800" />
    </demo:Section>
  </demo:Group>
</demo:DataSet>
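The nesting in the data set above (DataSet, Group, Section, then one element per measure) can be flattened into one record per observation, with each observation inheriting the attributes of the levels above it. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The namespace URI `urn:example:demo` is a placeholder, because the slide does not show the namespace declaration, and the sample is a shortened copy of the slide's data.

```python
import xml.etree.ElementTree as ET

NS = "urn:example:demo"  # placeholder URI; the real one is not shown on the slide

SAMPLE = f"""
<demo:DataSet xmlns:demo="{NS}" REV_NUM="1" TAB_NUM="RQFI05V1">
  <demo:Group COUNTRY="FI" FREQ="A" TIME="2005" TIME_FORMAT="P1Y">
    <demo:Section DECI="0" UNIT="PERS" UNIT_MULT="0">
      <demo:DEATHST OBS_STATUS="P" SEX="F" value="23871" />
      <demo:LBIRTHST OBS_STATUS="P" SEX="F" value="28345" />
    </demo:Section>
    <demo:Section DECI="0" UNIT="PURE_NUMB" UNIT_MULT="0">
      <demo:MAR OBS_STATUS="P" SEX="T" value="29283" />
    </demo:Section>
  </demo:Group>
</demo:DataSet>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def flatten(xml_text):
    """Return one dict per observation, merging attributes from all four levels."""
    dataset = ET.fromstring(xml_text.strip())
    rows = []
    for group in dataset:
        for section in group:
            for obs in section:
                row = {**dataset.attrib, **group.attrib, **section.attrib, **obs.attrib}
                row["measure"] = local_name(obs.tag)  # e.g. DEATHST, MAR
                rows.append(row)
    return rows


for row in flatten(SAMPLE):
    print(row["COUNTRY"], row["TIME"], row["measure"], row["SEX"], row["UNIT"], row["value"])
```

Merging the attribute dictionaries from the outermost level inwards means that a value set on a lower level would override the same attribute set higher up. This is a simplification chosen for the sketch, not a rule taken from the slides.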
58. Thank you for your attention Vincenzo Patruno: [email_address]
Editor's notes
<pagebreak> This diagram depicts one of the most important SDMX Artefacts, the Data Structure Definition (also known as Key Family). NOTE: The three classes above the DSD class (DataSet, DataflowDefinition, Category) are not part of the Data Structure Definition. They are included in this diagram to show how the DSD is used by Dataflows that define DataSets and are linked to Categories within a Category Scheme. Three conceptual levels can be identified within the DSD diagram. The first level is the Structure level, where the DSD is identified (id, version and other attributes not shown in this diagram). The second level comprises ComponentLists. These are conceptual groupings of components (i.e. Key, Groups). The third level includes the Components used by the DSD to define the structure of DataSet(s). Moreover, Components reference Concepts residing in ConceptSchemes and may have specific Roles within the DSD. Finally, a Dimension may be a MeasureType dimension, thus defining a Cross-Sectional Measure (XSMeasure class) for every Code included in the Codelist related to this Dimension.
<pagebreak> The SDMX standard specifies a single Information Model (SDMX-IM) that describes what data and metadata can be exchanged in the context of SDMX, and how. Based on this Information Model, two different formats are available for expressing SDMX messages. The first format is SDMX-EDI, which is equivalent to GESMES/TS. This format is based on an EDIFACT syntax and is time-series oriented. This means that only one type of observation throughout time can be carried in a single DataSet. Moreover, the DataSet messages have only one format. This format covers a subset of the SDMX-IM. For example, non-time-series messages and reference metadata messages are not supported. The second format is SDMX-ML. It is an XML format that covers the whole SDMX-IM. SDMX-ML supports four different (although equivalent) formats for data messages, in order to serve different purposes. It also supports reference metadata messages, messages for querying SDMX Web Services, and Registry Interface messages. Two simple “rules” for going from the SDMX-IM to the SDMX-ML implementation are that (concrete) classes become XML elements and their attributes become XML attributes. Of course, there are exceptions to these “rules”. In the SDMX-ML implementation there are XML elements that do not correspond directly to classes from the SDMX-IM, and vice versa.
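The two “rules” mentioned in this note (classes become XML elements, class attributes become XML attributes) can be illustrated with a small sketch. The `Dimension` class and its two fields are hypothetical stand-ins for a model class. The output is not claimed to be schema-valid SDMX-ML, and the exceptions to the rules are not handled.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


@dataclass
class Dimension:
    """Hypothetical model class used only to illustrate the mapping."""
    conceptRef: str
    codelist: str


def to_element(obj):
    """Rule 1: the class name becomes the element name.
    Rule 2: each attribute of the object becomes an XML attribute."""
    elem = ET.Element(type(obj).__name__)
    for f in fields(obj):
        elem.set(f.name, str(getattr(obj, f.name)))
    return elem


elem = to_element(Dimension(conceptRef="FREQ", codelist="CL_FREQ"))
print(ET.tostring(elem, encoding="unicode"))
```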
<pagebreak> The Cross-Sectional format is the only one capable of expressing non-time-series DSDs. If a TimeDimension is defined in the DSD, the Cross-Sectional format is equivalent to the other formats. In the namespace declarations of this message, a DSD-specific XSD should be included in order to enable syntactical validation of the message. This XSD file can be downloaded from Eurostat’s SDMX Registry (https://webgate.ec.europa.eu/sdmxregistry). The layout of the Cross-Sectional type adopts the nesting used in the Generic format but expresses SDMX Artefacts in the way the Compact format does. A major difference is that the elements have different semantics from those in the previous formats. Although four levels are also defined in this format, only the <DataSet> element is used in the same way. The <Group> element is independent of the Groups included in the DSD. The <Section> element differs from the <Series> element in that it does not contain a time series but a vertical slice across a time series. Finally, the observation values are now included in DSD-specific elements that correspond to the CrossSectionalMeasures defined in the DSD. The SDMX Dimensions and Attributes can be attached at any of the four aforementioned levels, as defined in the DSD. Moreover, the DSD can define more than one possible level per SDMX Component. Of course, only one instance of each SDMX Component should exist in each combination of the four “level” elements. Apart from the capability of reporting data without Time, the Cross-Sectional format is also useful for reporting more than one measure in a single DataSet. Thus, in some cases the produced file is even smaller than the equivalent Compact message.
<pagebreak> The SDMX standard, in its 5th document of specifications (http://www.sdmx.org/docs/2_0/SDMX_2_0%20SECTION_05_RegistrySpecification.pdf), gives details on the interfaces that an SDMX-compliant Registry should implement. Based on these specifications, Eurostat’s SDMX Registry has been developed and is now deployed in the European Commission’s production environment (https://webgate.ec.europa.eu/sdmxregistry/). Eurostat’s SDMX Registry has been developed to be used as a central repository for:
Structural metadata: Code Lists, Concept Schemes, Data Structure Definitions, Metadata Structure Definitions, Category Schemes, Organisation Schemes, Hierarchical Code Lists
Provisioning metadata: Data flows, Metadata flows, Provision Agreements
The major interface of Eurostat’s SDMX Registry is a Web Service implementing the SDMX Registry Interface messages. These are SDMX-ML messages and are specified within the standard and by the following XSD: http://www.sdmx.org/docs/2_0/SDMXRegistry.xsd A Graphical User Interface (GUI) has also been implemented to enable human interaction over the World Wide Web. User authentication is realized using CIRCA accounts. A standalone tool implementing almost all the functionality of the SDMX Registry is the DSW (already presented).