SlideShare ist ein Scribd-Unternehmen logo
1 von 1
Downloaden Sie, um offline zu lesen
Jude Yew
                                                                                                                                                                                                                                                                   Ann Zimmerman

                                                                     SEAD: A system to support active and social data curation                                                                                                                                   Magaret Hedstrom



                               SE AD
                                                                                                                                                                                                                                                                    Praveen Kumar
                                                                                                                                                                                                                                                                  Robert McDonald


                                                                     in sustainability science
                                                                                                                                                                                                                                                                      James Myers
                                                                                                                                                                                                                                                                         Beth Plale
Sustainable Environment — Actionable Data
                                                                                                                                                                                                                                                                        University of Michigan
                                                                                                                                                                                                                                                 University of Illinois at Urbana-Champaign
                                                                                                                                                                                                                                                                            Indiana University
                                                                                                                                                                                                                                                            Rensselaer Polytechnic Institute




                  1. Problem/Domain                                                                                                            4. SEAD Strategy
                                                                                                                                                                                                                           Networked data
                                                                               Active curation                                            Social curation




                                                                                                               +                                                                                            -
                                                                                                                                                                                                            -
                                                                                                                                                                                                            -
                                                                                                                                                                                                            -
                                                                                                                                                                                                                Tag and annotate data
                                                                                                                                                                                                                Overlay it with reference data
                                                                                                                                                                                                                Organize it in domain terminology
                                                                                                                                                                                                                Link it to people, papers, projects, conversations



                                                                                                                                                                                                                    Long-term archive solution
                                                                                                                             - Leverage social media for discovery of data,
                                                                         - Move data curation upstream in the data             interest & expertise
                                                                           life-cycle                                        - Support annotation of data by users of data
- Sustainability science is a data-intensive area that focuses on        - Record metadata at ingest                         - Record conversations and comments surrounding
  the complex interactions between nature and human activities.                                                                the data
- Sustainability research requires access to data from the physical                                                          - Make connections between data & researchers                               - Take advantage of existing infrastructures (Institutional
  and social sciences.                                                                                                         through social networking                                                   Repositories, ICPSR) for long-term preservation
- But data are di cult to nd, obtain and use because di erent
  disciplines collect, describe and store their data in di erent ways.

                                                                                                                                                      5. SEAD Use Cases
2. Data Challenges in Sustainability Science                              i) Able to ingest a variety of data types                                 ii) Support data discovery                                             iii) Add value to existing data


       The long tail of scienti c research data:
                  - Small and derived data sets
                  - Heterogeneous data
                  - Multiple sources of data
                                                                                                                                           +                 =
                  - Short-lived data with long-term value
                  - Value of data grows when combined & integrated          - Users can store, manage and share
                                                                              heterogeneous data types (e.g. images,                                                                                              - Users of data can provide additional metadata and
                                                                                                                                         - Provide links between data, people & publications
                                                                              geo-spatial images, sensor data etc.)                                                                                                 annotations

                                                                                                              iv) Create new data                                                           v) Community curation of data

                     3. SEAD Goals
SEAD will address the needs of sustainability researchers to search
for, aggregate, and maintain valuable data for the long term.

To do this, the project seeks to build a prototype that:
                                                                                                                                                                                - The community identi es and curates data of value
                                                                                                                                                                                - These valued data will be moved to existing institutional repositories
- Applies existing tools and services to sustainability research                                  - Combine data from multiple sources and contribute derived                     for long-term storage
- Integrates these services into a generalizable “Active and Social                                 data back to SEAD
  Curation” infrastructure
- Enables researchers to collaborate and share data during active         - The SEAD team will work closely with the community of sustainability scientists to evolve these use cases.
  projects                                                                - In the rst two years of the project, SEAD will collaborate with scientists studying the Upper Great Lakes and Upper Mississippi River Basin.
- Packages and migrates data valued by the users to a                     - Through this collaboration, SEAD will prototype a system that helps researchers manage their data and motivates them to share data and information about their data with others.
  federated repository for long-term preservation
                                                                                                                                                                        SEAD is funded by the National Science Foundation under cooperative agreement #OCI0940824.

Weitere ähnliche Inhalte

Mehr von SEAD

An Overview of Plans for SEAD
An Overview of Plans for SEADAn Overview of Plans for SEAD
An Overview of Plans for SEAD
SEAD
 
NSF DataNet Partners Update at RDAP14
NSF DataNet Partners Update at RDAP14NSF DataNet Partners Update at RDAP14
NSF DataNet Partners Update at RDAP14
SEAD
 
Data Sets, Ensemble Cloud Computing, and the University Library: Getting the ...
Data Sets, Ensemble Cloud Computing, and the University Library:Getting the ...Data Sets, Ensemble Cloud Computing, and the University Library:Getting the ...
Data Sets, Ensemble Cloud Computing, and the University Library: Getting the ...
SEAD
 
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
SEAD
 
SEAD: Opening Data in the "Long Tail" for Active and Social Curation
SEAD: Opening Data in the "Long Tail" for Active and Social CurationSEAD: Opening Data in the "Long Tail" for Active and Social Curation
SEAD: Opening Data in the "Long Tail" for Active and Social Curation
SEAD
 

Mehr von SEAD (11)

An Overview of Plans for SEAD
An Overview of Plans for SEADAn Overview of Plans for SEAD
An Overview of Plans for SEAD
 
Presentation to the UM Library Emergent Research Series
Presentation to the UM Library Emergent Research SeriesPresentation to the UM Library Emergent Research Series
Presentation to the UM Library Emergent Research Series
 
NSF DataNet Partners Update at RDAP14
NSF DataNet Partners Update at RDAP14NSF DataNet Partners Update at RDAP14
NSF DataNet Partners Update at RDAP14
 
Data Sets, Ensemble Cloud Computing, and the University Library: Getting the ...
Data Sets, Ensemble Cloud Computing, and the University Library:Getting the ...Data Sets, Ensemble Cloud Computing, and the University Library:Getting the ...
Data Sets, Ensemble Cloud Computing, and the University Library: Getting the ...
 
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
Changing the Curation Equation: A Data Lifecycle Approach to Lowering Costs a...
 
SEAD Prototype: Data Curation and Preservation for Sustainability Science
SEAD Prototype: Data Curation and Preservation for Sustainability ScienceSEAD Prototype: Data Curation and Preservation for Sustainability Science
SEAD Prototype: Data Curation and Preservation for Sustainability Science
 
SEAD: Opening Data in the "Long Tail" for Active and Social Curation
SEAD: Opening Data in the "Long Tail" for Active and Social CurationSEAD: Opening Data in the "Long Tail" for Active and Social Curation
SEAD: Opening Data in the "Long Tail" for Active and Social Curation
 
Data 2012 -- Presentation by Margaret Hedstrom (Jan 2012
Data 2012 -- Presentation by Margaret Hedstrom (Jan 2012Data 2012 -- Presentation by Margaret Hedstrom (Jan 2012
Data 2012 -- Presentation by Margaret Hedstrom (Jan 2012
 
CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ...
CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ...CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ...
CNI Fall 2011 Meeting Presentation Margaret Hedstrom & Robert McDonald (Dec. ...
 
Digital Library Federation - DataNets Panel presentation (Nov. 1st, 2011)
Digital Library Federation - DataNets Panel presentation (Nov. 1st, 2011)Digital Library Federation - DataNets Panel presentation (Nov. 1st, 2011)
Digital Library Federation - DataNets Panel presentation (Nov. 1st, 2011)
 
SEAD slide set (October 2011)
SEAD slide set (October 2011)SEAD slide set (October 2011)
SEAD slide set (October 2011)
 

Kürzlich hochgeladen

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Kürzlich hochgeladen (20)

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

SEAD: A system to support social and active data curation

  • 1. Jude Yew Ann Zimmerman SEAD: A system to support active and social data curation Magaret Hedstrom SE AD Praveen Kumar Robert McDonald in sustainability science James Myers Beth Plale Sustainable Environment — Actionable Data University of Michigan University of Illinois at Urbana-Champaign Indiana University Rensselaer Polytechnic Institute 1. Problem/Domain 4. SEAD Strategy Networked data Active curation Social curation + - - - - Tag and annotate data Overlay it with reference data Organize it in domain terminology Link it to people, papers, projects, conversations Long-term archive solution - Leverage social media for discovery of data, - Move data curation upstream in the data interest & expertise life-cycle - Support annotation of data by users of data - Sustainability science is a data-intensive area that focuses on - Record metadata at ingest - Record conversations and comments surrounding the complex interactions between nature and human activities. the data - Sustainability research requires access to data from the physical - Make connections between data & researchers - Take advantage of existing infrastructures (Institutional and social sciences. through social networking Repositories, ICPSR) for long-term preservation - But data are di cult to nd, obtain and use because di erent disciplines collect, describe and store their data in di erent ways. 5. SEAD Use Cases 2. Data Challenges in Sustainability Science i) Able to ingest a variety of data types ii) Support data discovery iii) Add value to existing data The long tail of scienti c research data: - Small and derived data sets - Heterogeneous data - Multiple sources of data + = - Short-lived data with long-term value - Value of data grows when combined & integrated - Users can store, manage and share heterogeneous data types (e.g. images, - Users of data can provide additional metadata and - Provide links between data, people & publications geo-spatial images, sensor data etc.) annotations iv) Create new data v) Community curation of data 3. SEAD Goals SEAD will address the needs of sustainability researchers to search for, aggregate, and maintain valuable data for the long term. To do this, the project seeks to build a prototype that: - The community identi es and curates data of value - These valued data will be moved to existing institutional repositories - Applies existing tools and services to sustainability research - Combine data from multiple sources and contribute derived for long-term storage - Integrates these services into a generalizable “Active and Social data back to SEAD Curation” infrastructure - Enables researchers to collaborate and share data during active - The SEAD team will work closely with the community of sustainability scientists to evolve these use cases. projects - In the rst two years of the project, SEAD will collaborate with scientists studying the Upper Great Lakes and Upper Mississippi River Basin. - Packages and migrates data valued by the users to a - Through this collaboration, SEAD will prototype a system that helps researchers manage their data and motivates them to share data and information about their data with others. federated repository for long-term preservation SEAD is funded by the National Science Foundation under cooperative agreement #OCI0940824.