Diese Präsentation wurde erfolgreich gemeldet.
Wir verwenden Ihre LinkedIn Profilangaben und Informationen zu Ihren Aktivitäten, um Anzeigen zu personalisieren und Ihnen relevantere Inhalte anzuzeigen. Sie können Ihre Anzeigeneinstellungen jederzeit ändern.

Cedem16 keynote tambouris

160 Aufrufe

Veröffentlicht am

Cedem16 keynote tambouris

Veröffentlicht in: Internet
  • Als Erste(r) kommentieren

  • Gehören Sie zu den Ersten, denen das gefällt!

Cedem16 keynote tambouris

  1. 1. CeDEM 2016 Keynote 18-20 May 2016, Danube University Krems Multidimensional Open Government Data Efthimios Tambouris Associate Professor, University of Macedonia, Greece tambouris@uom.gr
  2. 2. 18-20 May 2016, Danube University Krems CeDEM16 2  Assistant professor of “Information Systems and eGovernment”, University of Macedonia, Thessaloniki, Greece  Technical academic background  Almost 2 decades of work on eGovernment  Coordination of many eGov projects: FP5 EURO-CITI, FP5 eGOV, FP6 OneStopGov, FP7 OpenCube, H2020 OpenGovIntelligence  Participation in numerous other research projects  Expert for European Center for Standardisation (CEN) and European Commission  More than 150 publications  Co-chair of the IFIP ePart conference About me
  3. 3. 18-20 May 2016, Danube University Krems CeDEM16 3  Statistical Open Government Data  Multidimensional Data - The Data Cube Model  Projects  Technologies and Tools  Current and Future Work 3 Table of Contents
  4. 4. 18-20 May 2016, Danube University Krems CeDEM16 4  More than 180 Open Data portals around the globe provide data that “can be freely used, reused and redistributed by anyone” A lot of data out there …
  5. 5. 18-20 May 2016, Danube University Krems CeDEM16 5
  6. 6. 18-20 May 2016, Danube University Krems CeDEM16 6 A classification of OGD portals A stage model for OGD … and a lot of conceptual research … Kalampokis, E., Tambouris, E., Tarabanis, K.: A Classification Scheme for Open Government Data: Towards Linking Decentralized Data. International Journal of Web Engineering and Technology 6(3), 266–285 (2011) Kalampokis, Ε., Tambouris Ε. and Tarabanis Κ., Open Government Data: A Stage Model. In: M. Janssen et al. (Eds): EGOV2011. LNCS 6846, 235-246, 2011.
  7. 7. 18-20 May 2016, Danube University Krems CeDEM16 7  … first evidence suggests that OGD supply does not automatically bring value  So, let’s examine in more detail the actual data in all these OGD portals … but, where is the value?
  8. 8. 18-20 May 2016, Danube University Krems CeDEM16 8  A big part of Open Data concerns statistics, such as demographics, economic and social data  The role of statistical data is also recognised in EC reports etc.  For example, 6238 out of 8610 datasets of the EU Open Data Portal are statistical data Open Statistical Data
  9. 9. 18-20 May 2016, Danube University Krems CeDEM16 9  We can also do statistical analyses  We can visualise data  We can develop apps  What’s new in this OGD portals environment?  We can now exploit data from one or multiple data portals …  … and do all the above! Open Statistical Data Potential and Benefits
  10. 10. 18-20 May 2016, Danube University Krems CeDEM16 10  Internet usage (Digital Agenda) vs Individuals' level of internet skills (Eurostat)  Y axis - Internet usage: Percentage of households that have daily access to the Internet at home (ages 16-24) [scale 0-100]  X – axis Individuals' level of internet skills: % of the total number of individuals (ages 16-24) [scale 0-1] Example – 1
  11. 11. 18-20 May 2016, Danube University Krems CeDEM16 11  Lifelong learning (Eurostat) vs Internet usage (Digital Agenda)  Lifelong learning: number of persons aged 25 to 64 who stated that they received education or training in the four weeks preceding the survey [scale 0-100]  X axis - Internet usage: Percentage of households that have daily access to the Internet at home (ages 16-74) [scale 0-1] Example – 2
  12. 12. 18-20 May 2016, Danube University Krems CeDEM16 12  Combining data about a single measure from a single OGD portal is non- trivial  For example, search data.gov.uk for “unemployment”  How many files you expect and from how many portals?  More than 1700 files from 9 different portals  Data need to be manually extracted and processed  We need a better alternative! Challenge: Data Integration (One portal)
  13. 13. 18-20 May 2016, Danube University Krems CeDEM16 13  The situation is much worse when we need to exploit data from multiple portals  It seems we are building open data silos Challenge: Data Integration (Many portals)
  14. 14. 18-20 May 2016, Danube University Krems CeDEM16 14  Statistical Open Government Data  Multidimensional Data - The Data Cube Model  Projects  Technologies and Tools  Current and Future Work 14 Table of Contents
  15. 15. 18-20 May 2016, Danube University Krems CeDEM16 15  Data: 7.8  Is it good or bad?  Obviously context is relevant. Relevant questions:  What does it measure?  It measures unemployment  What is the unit of measurement (%, million people etc)  It is %  Which place?  Greece  Can’t be. Something is wrong!  Which year?  2008  What sex? What age group? Etc. etc. Understanding Statistical Data: Let’s take an example
  16. 16. 18-20 May 2016, Danube University Krems CeDEM16 16  Statistical data is often organised as data cubes, where each cell contains a measure described based on a number of dimensions  More than 3 dimensions may exist  Data characterised by many dimensions are also called multidimensional data Modeling Statistical Data
  17. 17. 18-20 May 2016, Danube University Krems CeDEM16 17  In this example:  Measure: unemployment rate  Dimensions: Year, Country, Age group  Value: 10.2  Unit of measurement: % Example
  18. 18. 18-20 May 2016, Danube University Krems CeDEM16 18 Multidimensional data are not new!  Data cubes are essential for Data Mining, Business Intelligence etc.  OLAP Operations:  Roll up (drill up): summarize by climbing up a hierarchy or by dimension reduction, e.g. unemployment at EU level  Drill down (roll down): going down a hierarchy or dimension introduction, e.g. unemployment at regional level or introducing sex as a forth dimension  Dice: select, e.g. only data for France and Greece and for 2000 and 2001 are shown  Slice: project data, e.g. only 2000 data are shown  Pivot: rotate the cube for visualisation purposes  All this knowledge can be exploited!  But, surprisingly, there is even more!!
  19. 19. 18-20 May 2016, Danube University Krems CeDEM16 19  Open Statistical Data integration is now the problem of combining cubes  An interesting case: expanding a cube with data from other cubes  E.g. expand a dataset about unemployment in European countries in different years by reusing cubes with unemployment data in lower geographical level Challenge: Data integration Example 1 E. Kalampokis, E. Tambouris, A. Karamanou, K. Tarabanis (2016) Open Statistics: The Rise of a new Era for Open Data?, EGOV2016, LNCS, Springer [accepted for publication]
  20. 20. 18-20 May 2016, Danube University Krems CeDEM16 20  Another interesting case: create a cube from the intersection of two other cubes  E.g. we integrate two cubes with the same dimensions but with different measures (unemployment and poverty). The resulted cube contains two measures along the same dimentions. Challenges: Data integration Example 2 E. Kalampokis, E. Tambouris, A. Karamanou, K. Tarabanis (2016) Open Statistics: The Rise of a new Era for Open Data?, EGOV2016, LNCS, Springer [accepted for publication]
  21. 21. 18-20 May 2016, Danube University Krems CeDEM16 21  There are even more challenges and opportunities  Study statistical analysis methods and techniques in the context of data cubes and define specific requirements that would enable automated and massive analyses  Open up and connect statistical analyses methods and models to Open Statistical data  This will enable gathering, understanding and comparing different statistical methods and/or data proposed for the same phenomenon Challenges: Data Analysis
  22. 22. 18-20 May 2016, Danube University Krems CeDEM16 22  Statistical Open Government Data  Multidimensional Data - The Data Cube Model  Projects  Technologies and Tools  Current and Future Work 22 Table of Contents
  23. 23. 18-20 May 2016, Danube University Krems CeDEM16 23 Projects
  24. 24. 18-20 May 2016, Danube University Krems CeDEM16 24  OpenCube was a 2-year project funded by the EU within FP7  The project developed and tested processes and tools for managing statistical linked open data  The results:  Facilitated data publishers to create linked data cubes from legacy formats  Empowered data users to browse, visualise, link, expand and analyse data cubes.  Enabled analysis not possible before (merging data cubes at a Web scale) The OpenCube project
  25. 25. 18-20 May 2016, Danube University Krems CeDEM16 25 OpenCube Toolkit  Creating components  TARQL extension  D2RQ /R2RML-QB extension  JSON-stat  Grafter  Exploiting components  OpenCube Browser  OpenCube MapView  R Analysis Chart  Expanding components  OpenCube tools support the whole lifecycle of linked statistical data.  License scheme  OpenCube components are provided under open source licenses  Check http://opencube-toolkit.eu  But, commercial solutions are also offered by consortium members http://opencube-toolkit.eu/
  26. 26. 18-20 May 2016, Danube University Krems CeDEM16 26  Statistical Open Government Data  Multidimensional Data - The Data Cube Model  Projects  Technologies and Tools  Current and Future Work 26 Table of Contents
  27. 27. 18-20 May 2016, Danube University Krems CeDEM16 27  Resource Description Framework (RDF): Data are triples  E.g. dbpedia: RDF version of Wikipedia  Data (RDF) can be processed by applications using SPARQL  E.g. Select from dbpedia the population and area of all European countries  It is now easy to develop applications, do visualisations, perform statistical or other analyses  Many programming languages and even applications (e.g. R) have modules to easily handle RDF Some basics – 1 Internet Data in ONE Linked Open Data store are just a SPARQL query away!
  28. 28. 18-20 May 2016, Danube University Krems CeDEM16 28  You can link concepts in one location with concepts in another location  So, you can have distributed open data portals where data are actually linked to each other Some basics – 2 Data in ALL Linked Open Data stores are just a SPARQL query away! Internet
  29. 29. 18-20 May 2016, Danube University Krems CeDEM16 29 Linked Open Data: TBL 5-star model http://5stardata.info/en/
  30. 30. 18-20 May 2016, Danube University Krems CeDEM16 30 Linked Data and Open Statistics  Linked data has the potential to create value to society, enterprises, and public administration through combining statistics from various sources and performing analytics on top of integrated statistical data.
  31. 31. 18-20 May 2016, Danube University Krems CeDEM16 31 Technical Prerequisites  Metadata for data discovery  Vocabularies  Code lists, concept schemes and classifications  Typed links (e.g. olws:sameAs) between  Dimensions definitions  Values of dimensions  Categories of measures
  32. 32. 18-20 May 2016, Danube University Krems CeDEM16 32  The RDF Data Cube Vocabulary has been proposed for modelling multi-dimensional data as RDF graphs. RDF Data Cube Vocabulary
  33. 33. 18-20 May 2016, Danube University Krems CeDEM16 33  A lifecycle for statistical LD  The lifecycle is divided into three phases: create, expand and exploit (or consume)  The lifecycle prescribes the steps that raw data cubes* should go through in order to create value. Linked Statistical Data Lifecycle E. Tambouris, E. Kalampokis, K. Tarabanis (2015) Processing Linked Open Data Cubes, Electronic Government Volume 9248 of the series Lecture Notes in Computer Science pp 130-143. * We assume statistical data is organized as data cubes, where each cell contains a measure described based on a number of dimensions.
  34. 34. 18-20 May 2016, Danube University Krems CeDEM16 34  Need to transform their data into RDF data cubes  Need to link their concepts (e.g. measures) with the same concepts elsewhere Data Producers
  35. 35. 18-20 May 2016, Danube University Krems CeDEM16 35  Need tools to find, visualise, merge and analyse data Data Consumers
  36. 36. 18-20 May 2016, Danube University Krems CeDEM16 36 Software for Creating data cubes
  37. 37. 18-20 May 2016, Danube University Krems CeDEM16 37 OLAP Browser for Consuming Linked Open Statistical Data
  38. 38. 18-20 May 2016, Danube University Krems CeDEM16 38 OLAP Browser for Linked Open Statistical Data  When the user selects at least one measure and one dimension… The geo dimension has 4 levels
  39. 39. 18-20 May 2016, Danube University Krems CeDEM16 39  When the user selects a second level in a dimension… OLAP Browser (Drill-down & roll-up) 2 levels have been selected
  40. 40. 18-20 May 2016, Danube University Krems CeDEM16 40 OLAP Browser (Select more dimensions)  The third dimension is fixed to a specific value… We set fixed values for the other dimensions
  41. 41. 18-20 May 2016, Danube University Krems CeDEM16 41 OLAP Browser (Add more cubes)
  42. 42. 18-20 May 2016, Danube University Krems CeDEM16 42 OLAP Browser (Add more cubes) A new measure has been added in the initial cube
  43. 43. 18-20 May 2016, Danube University Krems CeDEM16 43 OLAP Browser (Add more cubes) Select another cube Add new values to the Geographical dimension
  44. 44. 18-20 May 2016, Danube University Krems CeDEM16 44 OLAP Browser (Browsing Multiple Cubes) Employment from two pilots (Flemish, Scottish)
  45. 45. 18-20 May 2016, Danube University Krems CeDEM16 45  Visualization of RDF data cubes on a map.  It supports:  Markers  Bubble  Choropleth maps Exploiting data: OpenCube MapView
  46. 46. 18-20 May 2016, Danube University Krems CeDEM16 46  Visualisation of analysis results (charts & tables)  Reuse of analysis results: preserving R output as linked data Exploiting data: Integration with R
  47. 47. 18-20 May 2016, Danube University Krems CeDEM16 47 Exploiting data: Other Visualizations Analytics and ReportingVisualization and Exploration Stock chart
  48. 48. 18-20 May 2016, Danube University Krems CeDEM16 48  Statistical Open Government Data  Multidimensional Data - The Data Cube Model  Projects  Technologies and Tools  Current and Future Work 48 Table of Contents
  49. 49. 18-20 May 2016, Danube University Krems CeDEM16 49  The OpenGovIntelligence project was launched in February 2016. It will run for three years until early 2019.  Better use of multidimensional statistical data can help governments to improve the design and provision of public services  It is funded by the EU under Horizon 2020 Current and Future work
  50. 50. 18-20 May 2016, Danube University Krems CeDEM16 50  Pilots in 6 countries where government authorities will co-provide innovative, data-driven public services to citizens, businesses and public authorities by exploiting multidimensional statistical data Current and Future work
  51. 51. 18-20 May 2016, Danube University Krems CeDEM16 51  URL http://www.opengovintelligence.eu/  Twitter @OpenGovInt Thank you for your attention!! Themis Tambouris tambouris@uom.gr Current and Future work

×