Formats for Open Data
    François Bancilhon
   twitter.com/fbancilhon
   www.data-publica.com

    Share-PSI Workshop
         Brussels
       May 10, 2011
Data Publica
●   Develop the most complete and in-depth
    knowledge of French electronic data. Provide a
    complete directory of public data in France.
●   Set up a DataStore, where people can find
    data provided by us (data hunting) and by
    outside vendors (data reseller)
CAVEAT
●   I strongly support the 10 principles of the
    Sunlight Foundation
●   There is a spectrum from bad to good; I
    support improvement rather than rejection of
    everything that is not perfect
●   This work is derived from the recommendation of
    GFII (Groupement Français de l'Industrie de
    l'Information)
Summary
●   Open formats at the physical level
●   Standard formats at the conceptual level
●   Agreement on anonymization
●   Providing source data with PDF data
●   Favoring XML
●   Definition of exchange formats
Physical level
●   At the physical level (text, image, video, etc.),
    provide
    ●   an open format (a standard for which anyone can
        build tools)
    ●   a format compatible with the commonly used tools
Conceptual level
●   For every vertical, define standards that take
    into account the specificity of the area
●   Standards to be elaborated by researchers,
    users and industry representatives, at the
    European level
●   Examples: INSPIRE, ITS, XBRL, OAI
Anonymization
●   Provide an operational definition of
    anonymization
●   Standards for it and operational qualification
●   Devise ways to anonymize while preserving
    some meaning
●   Need for European standard and technology
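
One way to picture "anonymize while keeping some meaning" is generalization: direct identifiers are dropped and the remaining fields are coarsened. The sketch below is only an illustration under stated assumptions; the field names and the sample record are hypothetical, and generalization alone does not guarantee that a record cannot be re-identified.

```python
def generalize(record):
    """Drop the direct identifier and coarsen the quasi-identifiers.

    Hypothetical record layout: name, age, postal_code, diagnosis.
    """
    decade = (record["age"] // 10) * 10
    return {
        # Exact age becomes a ten-year band.
        "age_band": f"{decade}-{decade + 9}",
        # Full postal code becomes its first two characters.
        "postal_area": record["postal_code"][:2],
        # The field of analytical interest is kept as is.
        "diagnosis": record["diagnosis"],
    }


# Made-up sample record, for illustration only.
sample = {"name": "Jane Doe", "age": 37, "postal_code": "75011", "diagnosis": "asthma"}
print(generalize(sample))
```

The output still supports counts by age band and area, which is the "some meaning" that survives.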
Providing source data with PDF
●   PDF is a good format for consumer display
●   PDF is a bad format for re-use
●   Most of the time PDF is produced from some
    other source format
●   Request that a PDF be provided together with its
    source (not always that simple)
Pushing for XML
●   Principle of improvement: a move to XML
    by organizations that were publishing in
    some other unfriendly format (e.g. PDF) is a
    good thing
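
As a rough illustration of such a move, tabular source data can be turned into XML with the Python standard library. This is a minimal sketch, not a recommended schema: the tag names `dataset` and `record`, the column names and the sample values are all made up, and it assumes every column header is a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_to_xml(csv_text, root_tag="dataset", row_tag="record"):
    """Convert CSV text (with a header row) into a flat XML string."""
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, row_tag)
        for field, value in row.items():
            # Assumes the column header is a valid XML element name.
            ET.SubElement(record, field).text = value
    return ET.tostring(root, encoding="unicode")


# Made-up sample data, for illustration only.
sample_csv = "commune,population\nExampleville,1000\n"
print(rows_to_xml(sample_csv))
```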
Define exchange formats
●   Most open data formats are based on the use
    that the public body is making internally of this
    data
●   Define instead an exchange format based on
    transmission rather than on internal usage
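
To make the distinction concrete, the sketch below maps a record shaped by internal usage (terse column codes, local date format, implicit units) to a record meant for transmission (explicit names, ISO 8601 date, stated unit). Every name here is hypothetical: the internal codes `CD_STN`, `DT_MES`, `VAL_T`, the `Measurement` type and the sample values are assumptions for illustration, not an existing format.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date

# Hypothetical internal row: station code, date as DD/MM/YYYY,
# temperature stored in tenths of a degree Celsius.
INTERNAL_ROW = {"CD_STN": "0042", "DT_MES": "10/05/2011", "VAL_T": "183"}


@dataclass
class Measurement:
    """Hypothetical exchange record with self-describing fields."""
    station_id: str
    measured_on: str  # ISO 8601 date
    temperature_celsius: float


def to_exchange(row):
    """Translate the internal row into the exchange record."""
    day, month, year = (int(part) for part in row["DT_MES"].split("/"))
    return Measurement(
        station_id=row["CD_STN"],
        measured_on=date(year, month, day).isoformat(),
        temperature_celsius=int(row["VAL_T"]) / 10,
    )


print(json.dumps(asdict(to_exchange(INTERNAL_ROW))))
```

A re-user can read the exchange record without knowing the publisher's internal conventions.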
Questions?


francois.bancilhon@data-publica.com
       www.data-publica.com
       twitter.com/fbancilhon
