Richard Akerman NRC-CISTI Presented at Access 2009, Oct. 1, 2009 Will We Command Our Data? From the Petascale to the Personal
Overview Definitions / Assumptions How Big is Data? Four Sources of Data Drivers Activities
Definitions / Assumptions Petabyte = 1000 Terabytes data = datasets “data is”
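To make the scale concrete, here is a minimal sketch of the decimal unit definition above, plus a rough estimate of how long a dataset would take to move over a network. The function names and the 1 Gbit/s link speed are illustrative assumptions, not figures from the talk.

```python
# Decimal (SI) units, matching the slide's "Petabyte = 1000 Terabytes".
TERABYTES_PER_PETABYTE = 1000
BYTES_PER_PETABYTE = 10**15
SECONDS_PER_DAY = 86_400


def petabytes_to_terabytes(petabytes: float) -> float:
    """Convert petabytes to terabytes using the decimal definition."""
    return petabytes * TERABYTES_PER_PETABYTE


def transfer_days(petabytes: float, gigabits_per_second: float) -> float:
    """Estimate days needed to move a dataset over a sustained link.

    Assumes an ideal link with no protocol overhead (hypothetical figure).
    """
    bits = petabytes * BYTES_PER_PETABYTE * 8
    seconds = bits / (gigabits_per_second * 10**9)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(petabytes_to_terabytes(1))        # terabytes in one petabyte
    print(round(transfer_days(1, 1.0), 1))  # days to move 1 PB at 1 Gbit/s
```

Under these assumptions a single petabyte takes roughly three months to move at a sustained 1 Gbit/s, which is one reason physical transport of storage media stays competitive at this scale.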
How Big is Data?  http://www.instructables.com/file/FA9N61CF54HJ6GG/
How Big is Data? http://www.flickr.com/photos/doctorow/2731870631/
How Big is Data? http://en.wikipedia.org/wiki/File:Postduif.jpg
Four Sources of Data Research data Government data Library data Personal data
General Drivers Since 2000, a convergence of factors: Value of sharing Ease of sharing Level of sharing (machine level)
Specific Drivers: Research Data OECD Principles and Guidelines for Access to Research Data from Public Funding (April 2007) The Toronto Statement on prepublication data sharing (September 2009)
OECD Principles “Open access to research data from public funding should be easy, timely, user-friendly and preferably Internet-based.” http://www.flickr.com/photos/ben-zvan-photography/468487548/
Specific Drivers: Open Government Data US Memorandum on Transparency and Open Government (January 2009) US Memorandum on the Freedom of Information Act (January 2009)
Specific Drivers: Open Government Data UK Power of Information Task Force Report (March 2009) Modernise data publishing and reuse http://poit.cabinetoffice.gov.uk/poit/category/data-final/ “public information held by for example the police, health bodies and local authorities is often not available. This is bad for democratic expression, the economy and citizen customers.” Data.gov (May 2009) UK PM Brown meets with Sir Berners-Lee (Sept. 2009)
Specific Drivers: Library Data ILS Customer Bill-of-Rights, John Blyberg (November 2005) “Berkeley Accord” (March 2008)
Specific Drivers: Personal Data Wired cover feature “Living by numbers” (July 2009) “Know Thyself: Tracking Every Facet of Life, from Sleep to Mood to Pain, 24/7/365” “Numbers are making their way into the smallest crevices of our lives. We have pedometers in the soles of our shoes and phones that can post our location as we move around town. We can tweet what we eat into a database and subscribe to Web services that track our finances. There are sites and programs for monitoring mood, pain, blood sugar, blood pressure, heart rate, and prayers.”
Why Libraries Advocates Exemplars Experts
Research Data: DataCite http://www.datacite.org/ “DOIs for data” “The long term vision of the partnership is to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence.”
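As a rough illustration of what "DOIs for data" makes possible, the sketch below turns a dataset DOI into a resolvable link and a citation string. The DOI, the dataset details, and the citation layout (creator, year, title, publisher, identifier) are hypothetical examples chosen for illustration; they are not taken from the talk or from any real record.

```python
from urllib.parse import quote

# Public DOI resolver; a DOI appended to this prefix redirects to the landing page.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Build a resolvable URL for a DOI, percent-encoding unsafe characters."""
    return DOI_RESOLVER + quote(doi, safe="/")


def cite_dataset(creator: str, year: int, title: str, publisher: str, doi: str) -> str:
    """Format a dataset citation (layout is an assumption, not an official style)."""
    return f"{creator} ({year}): {title}. {publisher}. {doi_to_url(doi)}"


if __name__ == "__main__":
    # Hypothetical dataset record, for illustration only.
    print(cite_dataset(
        creator="Example, A.",
        year=2009,
        title="Sample Survey Observations",
        publisher="Example Data Centre",
        doi="10.5555/example-dataset",
    ))
```

Because the identifier is persistent, a citation built this way keeps working even if the dataset's hosting location changes.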
Research Data: Gateway to Data Sets NRC-CISTI, Gateway to (Canadian) Scientific Data Sets http://cisti-icist.nrc-cnrc.gc.ca/eng/services/cisti/scientific-data/data-sets/ e.g. Canadian Astronomy Data Centre (CADC), Large Synoptic Survey Telescope (LSST)
Government Data: Canada - Federal http://geogratis.cgdi.gc.ca/ StatsCan Data Liberation Initiative (DLI) Ontario Data Documentation, Extraction Service and Infrastructure Initiative (ODESI) “The project will target Statistics Canada datasets... The files will be marked-up using DDI, an international, XML-based metadata tagging system which allows data resource discovery, distributed access, extraction and analysis.”
Government Data: Municipal - Vancouver http://data.vancouver.ca/
Government Data: Municipal - SF San Francisco http://datasf.org/
Library Data A million free covers from LibraryThing Open Library http://openlibrary.org/dev/docs/data Talis Connected Commons MESUR – Services http://id.loc.gov/ (LCSH)
APIs vs raw data APIs Always serve up latest data Control over access Tracking/stats Advanced/complex functionality on top of the data Raw data Unconstrained / can do things never imagined by API Hard to track / version Can lose metadata Allows choice of computing
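The trade-offs on this slide can be sketched with a toy example. Everything here is hypothetical: the `DatasetAPI` class, the `demo-key` credential, and the sample records exist only to illustrate the contrast and do not correspond to any real service.

```python
import copy
from collections import Counter


class DatasetAPI:
    """Toy API in front of a dataset: gated access, usage stats, live data."""

    def __init__(self, records, allowed_keys):
        self._records = records              # live reference, so callers see the latest data
        self._allowed_keys = set(allowed_keys)
        self.request_counts = Counter()      # tracking/stats per API key

    def titles_by_year(self, api_key, year):
        """The one query the provider chose to expose."""
        if api_key not in self._allowed_keys:
            raise PermissionError("unknown API key")  # control over access
        self.request_counts[api_key] += 1
        return [r["title"] for r in self._records if r["year"] == year]


# Hypothetical sample records.
records = [
    {"title": "Bike Routes", "year": 2008, "agency": "Transport"},
    {"title": "Park Locations", "year": 2009, "agency": "Parks"},
    {"title": "Street Trees", "year": 2009, "agency": "Parks"},
]

api = DatasetAPI(records, allowed_keys={"demo-key"})
before_update = api.titles_by_year("demo-key", 2009)

# Raw data: a downloaded snapshot the user fully controls.
raw_dump = copy.deepcopy(records)

# The publisher adds a record after the dump was taken.
records.append({"title": "Transit Stops", "year": 2009, "agency": "Transport"})
after_update = api.titles_by_year("demo-key", 2009)

# With raw data, the user can ask a question the API never offered...
datasets_per_agency = Counter(r["agency"] for r in raw_dump)

if __name__ == "__main__":
    print(before_update, after_update)   # the API reflects the new record
    print(datasets_per_agency)           # ...but the snapshot is already stale
    print(api.request_counts)            # the provider knows who asked, and how often
```

The API caller sees the new record immediately and every request is counted, while the raw snapshot supports an analysis the API never anticipated but silently goes out of date.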
Personal Data: Daytum http://www.daytum.com/
Personal Data: Total Recall http://totalrecallbook.com/ (Sept. 2009)
Richard Akerman © 2009 Government of Canada Licensed in the Creative Commons Thank You http://creativecommons.org/licenses/by-nc-sa/2.5/ca/

08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Will We Command Our Data? From the Petascale to the Personal

  • 1. Richard Akerman NRC-CISTI Presented at Access 2009, Oct. 1, 2009 Will We Command Our Data? From the Petascale to the Personal
  • 2. Overview Definitions / Assumptions How Big is Data? Four Sources of Data Drivers Activities
  • 3. Definitions / Assumptions Petabyte = 1000 Terabytes data = datasets “data is”
  • 4. How Big is Data? http://www.instructables.com/file/FA9N61CF54HJ6GG/
  • 5. How Big is Data? http://www.flickr.com/photos/doctorow/2731870631/
  • 6. How Big is Data? http://en.wikipedia.org/wiki/File:Postduif.jpg
  • 7. Four Sources of Data Research data Government data Library data Personal data
  • 8. General Drivers Since 2000, a convergence of factors: Value of sharing Ease of sharing Level of sharing (machine level)
  • 9. Specific Drivers: Research Data OECD Principles and Guidelines for Access to Research Data from Public Funding (April 2007) The Toronto Statement on prepublication data sharing (September 2009)
  • 10. OECD Principles “Open access to research data from public funding should be easy, timely, user-friendly and preferably Internet-based.” http://www.flickr.com/photos/ben-zvan-photography/468487548/
  • 11. Specific Drivers: Open Government Data US Memorandum on Transparency and Open Government (January 2009) US Memorandum on the Freedom of Information Act (January 2009)
  • 12. Specific Drivers: Open Government Data UK Power of Information Task Force Report (March 2009) Modernise data publishing and reuse http://poit.cabinetoffice.gov.uk/poit/category/data-final/ “public information held by for example the police, health bodies and local authorities is often not available. This is bad for democratic expression, the economy and citizen customers.” Data.gov (May 2009) UK PM Brown meets with Sir Berners-Lee (Sept. 2009)
  • 13. Specific Drivers: Library Data ILS Customer Bill-of-Rights, John Blyberg (November 2005) “Berkeley Accord” (March 2008)
  • 14. Specific Drivers: Personal Data Wired cover feature “Living by numbers” (July 2009) “Know Thyself: Tracking Every Facet of Life, from Sleep to Mood to Pain, 24/7/365” “Numbers are making their way into the smallest crevices of our lives. We have pedometers in the soles of our shoes and phones that can post our location as we move around town. We can tweet what we eat into a database and subscribe to Web services that track our finances. There are sites and programs for monitoring mood, pain, blood sugar, blood pressure, heart rate, 
 and prayers.”
  • 15. Why Libraries Advocates Exemplars Experts
  • 16. Research Data: DataCite http://www.datacite.org/ “DOIs for data” “The long term vision of the partnership is to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence.”
  • 17. Research Data: Gateway to Data Sets NRC-CISTI, Gateway to (Canadian) Scientific Data Sets http://cisti-icist.nrc-cnrc.gc.ca/eng/services/cisti/scientific-data/data-sets/ e.g. Canadian Astronomy Data Centre (CADC), Large Synoptic Survey Telescope (LSST)
  • 18. Government Data: Canada - Federal http://geogratis.cgdi.gc.ca/ StatsCan Data Liberation Initiative (DLI) Ontario Data Documentation, Extraction Service and Infrastructure Initiative (ODESI) “The project will target Statistics Canada datasets... The files will be marked-up using DDI, an international, XML-based metadata tagging system which allows data resource discovery, distributed access, extraction and analysis.”
  • 19. Government Data: Municipal - Vancouver http://data.vancouver.ca/
  • 20. Government Data: Municipal - SF San Francisco http://datasf.org/
  • 21. Library Data A million free covers from LibraryThing Open Library http://openlibrary.org/dev/docs/data Talis Connected Commons MESUR – Services http://id.loc.gov/ (LCSH)
  • 22. APIs vs raw data. APIs: Always serve up latest data; Control over access; Tracking/stats; Advanced/complex functionality on top of the data. Raw data: Unconstrained / can do things never imagined by API; Hard to track / version; Can lose metadata; Allows choice of computing
  • 24. Personal Data: Total Recall http://totalrecallbook.com/ (Sept. 2009)
  • 25. Richard Akerman © 2009 Government of Canada Licensed in the Creative Commons Thank You http://creativecommons.org/licenses/by-nc-sa/2.5/ca/
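
The trade-off on slide 22 (APIs vs raw data) can be illustrated with a small sketch. Everything here is hypothetical and not from the talk: the `DatasetAPI` class, the `demo-key` key, and the three sample records are made up for illustration only. The sketch assumes a publisher who posts a CSV snapshot and also runs an API over the live records.

```python
import csv
import io

# Hypothetical raw dump: a CSV snapshot a publisher might post for download.
RAW_DUMP = "id,title,year\n1,Dataset A,2007\n2,Dataset B,2009\n"


class DatasetAPI:
    """Hypothetical API front end over the publisher's live records."""

    def __init__(self, records, allowed_keys):
        self._records = records              # live store, so always the latest data
        self._allowed_keys = set(allowed_keys)
        self.request_count = 0               # tracking/stats

    def get(self, api_key, record_id):
        if api_key not in self._allowed_keys:    # control over access
            raise PermissionError("unknown API key")
        self.request_count += 1
        return self._records.get(record_id)


def load_raw(dump):
    """Parse the raw snapshot; the consumer is free to do anything with the rows."""
    return list(csv.DictReader(io.StringIO(dump)))


# The consumer downloads the snapshot once.
rows = load_raw(RAW_DUMP)

# The publisher keeps a live store (seeded here from the same data) behind the API.
records = {row["id"]: dict(row) for row in rows}
api = DatasetAPI(records, allowed_keys=["demo-key"])

# The publisher adds a record later: the API reflects it, the snapshot does not.
records["3"] = {"id": "3", "title": "Dataset C", "year": "2009"}

# Raw data allows a query the API above never offered (filter by year),
# but it runs against a stale, unversioned copy.
titles_2009 = [r["title"] for r in rows if r["year"] == "2009"]
```

In this sketch the API can count and restrict requests and always sees the newest record, while the raw snapshot supports an unanticipated query but silently misses the later addition.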

Editor's notes

  1. http://en.wikipedia.org/wiki/File:Postduif.jpg (public domain)
  2. http://www.flickr.com/photos/rakerman/2907065239/