Prepared for:
Integrating Approaches to Privacy across the Research Lifecycle
Sept 2013
Introduction to Research Data
Privacy Use Cases
Micah Altman
<Micah_Altman@alumni.brown.edu>
Director of Research, MIT Libraries
Non-Resident Senior Fellow, Brookings Institution
DISCLAIMER
These opinions are my own; they are not the opinions
of MIT, Brookings, any of the project funders, nor (with
the exception of co-authored previously published
work) my collaborators.
Secondary disclaimer:
“It’s tough to make predictions, especially about the
future!”
-- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill,
Confucius, Disreali [sic], Freeman Dyson, Cecil B. Demille, Albert Einstein, Enrico Fermi,
Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx, Dan Quayle,
George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White,
etc.
About the “use cases”?
Technical definition:
A summary of a pattern of interactions between external actors within a
system under consideration to accomplish a goal.
Working definition:
Who does what, when; and what do they wish to accomplish?
Complemented by:
• User stories – simple generalized descriptions of specific interactions
• Scenarios – variations on a theme
• Examples/fact patterns – real life examples of the abstract use case
Data Input/Output Model
[Figure: DATA → Published Outputs]
Published Outputs:
• Public use sample microdata (e.g. masked records such as "* Jones * * 1961 021*" and "* Jones * * 1972 9404*")
• Summary statistics
• Contingency table
• Information Visualization
• Reported findings (e.g. “The correlation between X and Y was large and statistically significant”)
Information Life Cycle Model
Lifecycle stages:
• Creation/Collection
• Storage/Ingest
• Processing
• Internal Sharing
• Analysis
• External dissemination/publication
• Re-use (Scientometric; Education; Scientific; Policy)
• Long-term access
Surrounding frameworks:
• Research methods
• Data Management Systems
• Legal / Policy Frameworks
• Statistical / Computational Frameworks
Legal/Policy Frameworks
Areas: Contract; Intellectual Property; Access Rights; Confidentiality
• Copyright
• Fair Use
• DMCA
• Database Rights
• Moral Rights
• Intellectual Attribution
• Trade Secret
• Patent
• Trademark
• Common Rule (45 CFR 46)
• HIPAA
• FERPA
• EU Privacy Directive
• Privacy Torts (Invasion, Defamation)
• Rights of Publicity
• Sensitive but Unclassified
• Potentially Harmful (Archeological Sites, Endangered Species, Animal Testing, …)
• Classified
• FOIA
• CIPSEA
• State Privacy Laws
• EAR
• State FOI Laws
• Journal Replication Requirements
• Funder Open Access
• Contract
• License
• Click-Wrap TOU
• ITAR
• Export Restrictions
Example: Stakeholder Concerns Across Lifecycle
Stakeholders:
• Research sources: research subjects; owners of subject material; owners of supplementary data
• Research sponsors: home institution; funding sources
• Project personnel: investigators; research staff
• Research publishers: print publishers; research archives
• Research consumers: readers; secondary researchers
Legal Issues:
Licensing; Copyright; DMCA; Informed Consent; Privacy; Trade secrets; Licensing;
Freedom of Information; Copyright; Copyright; Copyright; Licensing; Fair Use
Concerns:
Information Transfer; Privacy; Confidentiality; Intellectual Property;
Replicable Research; Policy Relevance; Accessibility of Research;
Protect IP; Avoid third party IP/Privacy Issues; Replicable Research; Publish; Promote use of Publications;
Track use; Replicable research; Promote use of their publications; Protect publisher IP; Avoid third party IP/Privacy Issues;
Replicate and extend; Secondary analysis; Link research
Systems Policy Research Questions Deriving from Information Lifecycle Analysis
• Infrastructure requirements analysis
– Data acquisition, storage, dissemination
– Identification, authorization, authentication
– Metadata, protocols
• System design: potential implementation costs of differential privacy
– Information security: hardening
– Information security: certification & auditing
– Model server development, provisioning, maintenance, reliability, availability
• System design: information security tradeoffs of interactive privacy mechanisms
– Availability risks: denial-of-service attacks
– Availability/integrity risks: privacy budget exhaustion attacks
– Integrity risks: modification of delivered results (e.g. man-in-the-middle attacks)
– Secrecy/privacy risks: breach of authentication/authorization layer
• System design: optimizing privacy & utility across the lifecycle
– When does limiting disclosive data collection dominate methods at the data analysis stage?
– When do restricted virtual data enclaves + public synthetic data dominate interactive mechanisms?
• System design: information use/reuse
– Support of scientific analysis use cases (model diagnostics, exploratory data analysis, integration of external data) within interactive privacy systems
– How to align informational assumptions across stages & incorporate informative priors?
– Requirements for scientific replication/verification of results produced by model servers?
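The "privacy budget exhaustion" risk listed among the tradeoffs of interactive privacy mechanisms can be made concrete with a small sketch. This is a toy illustration under stated assumptions, not a vetted differential privacy implementation: the class name `ModelServer`, the exception `BudgetExhausted`, and the simple additive budget accounting are all hypothetical, and the Laplace noise is drawn with ordinary floating-point arithmetic, which production systems should not rely on.

```python
import random


class BudgetExhausted(Exception):
    """Raised when a query would exceed the remaining privacy budget."""


class ModelServer:
    """Toy interactive mechanism: noisy counts under a fixed total epsilon.

    Hypothetical sketch; names and accounting rules are assumptions.
    """

    def __init__(self, records, total_epsilon, seed=None):
        self._records = list(records)
        self._remaining = total_epsilon
        self._rng = random.Random(seed)

    @property
    def remaining_budget(self):
        return self._remaining

    def _laplace(self, scale):
        # The difference of two i.i.d. exponential draws with mean `scale`
        # follows a Laplace(0, scale) distribution.
        rate = 1.0 / scale
        return self._rng.expovariate(rate) - self._rng.expovariate(rate)

    def noisy_count(self, predicate, epsilon):
        """Answer a counting query, spending `epsilon` from the budget."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self._remaining:
            raise BudgetExhausted("no privacy budget left for this query")
        self._remaining -= epsilon
        true_count = sum(1 for r in self._records if predicate(r))
        # A counting query has sensitivity 1, so the noise scale is 1/epsilon.
        return true_count + self._laplace(1.0 / epsilon)


server = ModelServer(range(100), total_epsilon=1.0, seed=0)
first = server.noisy_count(lambda x: x < 10, epsilon=0.5)
second = server.noisy_count(lambda x: x >= 90, epsilon=0.5)
try:
    server.noisy_count(lambda x: True, epsilon=0.5)
    refused = False
except BudgetExhausted:
    refused = True
```

Once the budget reaches zero the server refuses every later query, so anyone able to submit queries can deny service to legitimate analysts simply by spending the budget first. That is why the slide files this under availability/integrity risks.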
Modeling Features
Features - Characteristics
Data - Structure; source; unit of observation; attribute types; dimensionality; number of observations; homogeneity; frequency of updates; quality characteristics
Analytic Results - Form of output; analysis methodology; analysis/inferential goal; utility/loss/quality
Disclosure scenario - Source of threat; areas of vulnerability; attacker objectives, background knowledge, capability; breach criteria/disclosure concept
Stakeholders - Stakeholder types; capacities; trust relationships; budgets
Lifecycle characteristics - Lifecycle stages controlled/in scope; policies used; stakeholders involved at each stage
Current privacy management approach - Regulation/policy; legal controls; statistical/computational disclosure methods; information security controls
Exemplar: Social Media Analysis
Attribute Type - Examples
Data: Structure - Network
Data: Attribute Types - Continuous/Discrete; Scale: ratio/interval/ordinal/nominal
Data: Performance Characteristics - 10M-1B observations; sample from stream of continuously updated corpus; dozens of dimensions/measures
Measurement: Unit of Observation - Individuals; interactions
Measurement: Measurement type - Observational
Measurement: Performance characteristic - High volume; complex network structure; sparsity; systematic and sparse metadata
Management Constraints - License; replication
Analysis methods - Bespoke algorithms (clustering); nonlinear optimization; Bayesian methods
Desired Outputs - Summary scalars (model coefficients); summary table; static/interactive visualization
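One way to keep such exemplars comparable is to record each use case in a common structure. The sketch below is only an illustration: the `UseCase` class and its field names are assumptions made for this example, while the values are taken from the Social Media Analysis exemplar above.

```python
from dataclasses import dataclass, field


@dataclass
class UseCase:
    """Hypothetical record for one research data privacy use case."""

    name: str
    data: dict = field(default_factory=dict)
    measurement: dict = field(default_factory=dict)
    management_constraints: list = field(default_factory=list)
    analysis_methods: list = field(default_factory=list)
    desired_outputs: list = field(default_factory=list)

    def missing_features(self):
        """Return the names of features that have not been filled in yet."""
        return [k for k, v in vars(self).items() if k != "name" and not v]


social_media = UseCase(
    name="Social Media Analysis",
    data={
        "structure": "network",
        "attribute_types": ["continuous", "discrete"],
        "scale": ["ratio", "interval", "ordinal", "nominal"],
        "performance": [
            "10M-1B observations",
            "sample from stream of continuously updated corpus",
            "dozens of dimensions/measures",
        ],
    },
    measurement={
        "unit_of_observation": ["individuals", "interactions"],
        "type": "observational",
        "performance": [
            "high volume",
            "complex network structure",
            "sparsity",
            "systematic and sparse metadata",
        ],
    },
    management_constraints=["license", "replication"],
    analysis_methods=[
        "bespoke algorithms (clustering)",
        "nonlinear optimization",
        "Bayesian methods",
    ],
    desired_outputs=[
        "summary scalars (model coefficients)",
        "summary table",
        "static/interactive visualization",
    ],
)

blank = UseCase(name="Undocumented case")
```

Calling `missing_features()` on a partly documented case shows which parts of the description still need to be filled in before use cases are compared.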
More Information
• Grimmer, Justin, and Gary King. "General purpose computer-
assisted clustering and conceptualization." Proceedings of the
National Academy of Sciences 108.7 (2011): 2643-2650.
• King, Gary, Jennifer Pan, and Molly Roberts. "How censorship in
China allows government criticism but silences collective
expression." APSA 2012 Annual Meeting Paper. 2012.
• Lazer, David, et al. "Life in the network: the coming age of
computational social science." Science (New York, NY) 323.5915
(2009): 721.
Mapping the “Space” of Research Data
Privacy
• Many different types of potentially relevant features
• Many types of stakeholders
• Many lifecycle stages
→ so we can’t be exhaustive
Heuristic: Choose some points -- combinations of characteristics -- that
are near various corners of the (hyper-) space and that represent
substantively important examples. Document these…
Discuss. Think. Repeat.
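The heuristic can be sketched by enumerating the "corners" of a small feature space, that is, every combination of extreme values. The feature names and extreme values below are illustrative assumptions, not a list from the talk; choosing which corners are substantively important remains a judgment call.

```python
from itertools import product

# Hypothetical features, each reduced to two illustrative extreme values.
FEATURE_EXTREMES = {
    "data_structure": ("tabular", "network"),
    "observations": ("small", "very large"),
    "collector_resources": ("well-resourced", "minimal"),
}


def corner_cases(extremes):
    """Return every combination of extreme values as a list of dicts."""
    names = list(extremes)
    combos = product(*(extremes[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


corners = corner_cases(FEATURE_EXTREMES)
```

With k two-valued features there are 2**k corners, which is why the slide suggests choosing only some of them rather than documenting all.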
Example Use Cases
Name/Description - Examples

Comparison case: Official Statistics
Well-resourced data collector summarizes tables/relational data in the form of summary statistics and contingency tables.
Examples:
• U.S. Census dissemination
• European statistical agencies

Privacy-Aware Journal Replication Policies
Scholarly journals adopting policies for deposit and disposition of data for verification and replication. How to balance privacy and replicability without intensive review?
Examples:
• Data Sharing Systems for Open Access Journals
• American Political Science Association Data Access and Research Transparency [DART] Policy Initiative

Long-term Longitudinal Data Collection
Data collections tracking individual subjects (and possibly friends and relations) over decades.
Examples:
• National Longitudinal Study of Adolescent Health (Add Health)
• Framingham Heart Study
• Panel Study of Income Dynamics

Computational Social Science
“Big” data. New forms and sources of data. Cutting-edge analytical methods and algorithms.
Examples: Analyzing …
• Netflix
• Facebook
• Hubway
• GPS
• Blogs
Proposed Discussion Questions
(for tomorrow)
• Characterization.
• Current approaches.
• Enhancing approaches.
• Integrating approaches.
• Utility.
• Privacy.
• Methodological Barriers
• Incentives.
• Future.
• Prior work.
• Are these summaries useful as descriptive models?
• What is missing from the big picture?
• What are the opportunities for research, practice & policy?
(What one wants to know) (What one asks)
Selected Bibliography
• Willenborg, L., and T. de Waal. Elements of Statistical Disclosure Control. Lecture Notes in Statistics, vol. 155. Springer Verlag, New York, NY, 2001.
• Higgins, Sarah. "The DCC curation lifecycle model." International Journal of Digital Curation 3.1 (2008): 134-140. www.dcc.ac.uk/resources/curation-lifecycle-model
• ESSNET. Handbook on Statistical Disclosure Control. 2011. neon.vb.cbs.nl/casc/SDC_Handbook.pdf
• Fung, Benjamin, et al. "Privacy-preserving data publishing: A survey of recent developments." ACM Computing Surveys (CSUR) 42.4 (2010): 14.
• Altman, M. (2012). “Mitigating Threats To Data Quality Throughout the Curation Lifecycle.” In G. Marchionini, C. Lee, & H. Bowden (Eds.), Curating For Quality. datacuration.web.unc.edu
Questions?
E-mail: escience@mit.edu
Web: informatics.mit.edu
Twitter: @drmaltman
Appendix: Full Questions
• Characterization.
– Are there key additional characteristics of the use case that should be noted? How do these characteristics change the analysis and
treatment of privacy in these cases?
• Current approaches.
– How is this use case treated now -- what's the state of the art & practice? How is success measured?
• Enhancing approaches.
– Are any of the approaches discussed yesterday used? How could the tools and approaches mentioned earlier or other existing tools be used
at particular stages of the research lifecycle to enhance utility and privacy?
• Integrating approaches.
– Are approaches that have been developed and used in different communities compatible with each other? How should legal,
computational, policy, and statistical tools be integrated so as to be most effective?
• Utility.
– What things would stakeholders like to do with the data that the toolset doesn't restrict or obstruct? Where is social benefit sub-optimal?
How is utility measured/perceived by the stakeholders?
• Privacy.
– What sorts of data/outputs are considered particularly sensitive? What are the most important real and perceived risks -- what harms could
occur if data is released and reidentified, how severe are these harms and how likely?
• Methodological Barriers
– What are the technical, methodological, computational or infrastructural barriers to improving privacy and utility in the management of this
data? What particular characteristics of the use case contribute to these barriers?
• Incentives.
– If better tools already exist, why aren't they used? What are barriers to adoption of new tools and methods? What are the specific "market
failures" in this area -- such as perverse incentives, lack/asymmetry of information, lack of well-developed market, irrational behavior,
transaction cost, network effects, etc.? What particular characteristics of the use case most influence incentives?
• Future.
– How is this use case likely to evolve over time? What are threats to stability/scalability/robustness/resilience of the proposed/current
solutions?
• Prior work.
– Are there key additional examples of the use case that should be noted? Are there additional key references or writings that should be
noted?

Creative Data Literacy: Bridging the Gap Between Data-Haves and Have-Nots
 
SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...
SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...
SOLARSPELL: THE SOLAR POWERED EDUCATIONAL LEARNING LIBRARY - EXPERIENTIAL LEA...
 
Ndsa 2016 opening plenary
Ndsa 2016 opening plenaryNdsa 2016 opening plenary
Ndsa 2016 opening plenary
 
Making Decisions in a World Awash in Data: We’re going to need a different bo...
Making Decisions in a World Awash in Data: We’re going to need a different bo...Making Decisions in a World Awash in Data: We’re going to need a different bo...
Making Decisions in a World Awash in Data: We’re going to need a different bo...
 
The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...
The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...
The Open Access Network: Rebecca Kennison’s Talk for the MIT Prorgam on Infor...
 
Gary Price, MIT Program on Information Science
Gary Price, MIT Program on Information ScienceGary Price, MIT Program on Information Science
Gary Price, MIT Program on Information Science
 
Attribution from a Research Library Perspective, on NISO Webinar: How Librari...
Attribution from a Research Library Perspective, on NISO Webinar: How Librari...Attribution from a Research Library Perspective, on NISO Webinar: How Librari...
Attribution from a Research Library Perspective, on NISO Webinar: How Librari...
 
Agenda's for Preservation Research
Agenda's for Preservation ResearchAgenda's for Preservation Research
Agenda's for Preservation Research
 

Kürzlich hochgeladen

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 

Kürzlich hochgeladen (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Integrating Approaches to Research Data Privacy

  • 1. Prepared for: Integrating Approaches to Privacy across the Research Lifecycle Sept 2013 Introduction to Research Data Privacy Use Cases Micah Altman <Micah_Altman@alumni.brown.edu> Director of Research, MIT Libraries Non-Resident Senior Fellow, Brookings Institution
  • 2. DISCLAIMER These opinions are my own; they are not the opinions of MIT, Brookings, any of the project funders, nor (with the exception of co-authored previously published work) my collaborators. Secondary disclaimer: “It’s tough to make predictions, especially about the future!” -- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. Demille, Albert Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White, etc.
  • 3. About the “use cases”? Technical definition: A summary of a pattern of interactions between external actors within a system under consideration to accomplish a goal. Working definition: Who does what, when; and what do they wish to accomplish? Complemented by: • User stories – simple generalized descriptions of specific interactions • Scenarios – variations on a theme • Examples/fact patterns – real-life examples of the abstract use case
  • 4. Data InputOutput Model Published Outputs * Jones * * 1961 021* * Jones * * 1961 021* * Jones * * 1972 9404* * Jones * * 1972 9404* * Jones * * 1972 9404* “The correlation between X and Y was large and statistically significant” Summary statistics Contingency table Public use sample microdata Information Visualization Introduction to Research Data Privacy Use Cases DATA DATA
  • 5. Information Life Cycle Model. Stages: Creation/Collection; Storage/Ingest; Processing; Internal Sharing; Analysis; External dissemination/publication; Re-use (Scientometric, Education, Scientific, Policy); Long-term access. Also shown: Research methods; Data Management Systems; Legal / Policy Frameworks; Statistical / Computational Frameworks
  • 6. Legal/Policy Frameworks Contract Intellectual Property Access Rights Confidentiality Copyright Fair Use DMCA Database Rights Moral Rights Intellectual Attribution Trade Secret Patent Trademark Common Rule 45 CFR 46 HIPAA FERPA EU Privacy Directive Privacy Torts (Invasion, Defamation) Rights of Publicity Sensitive but Unclassified Potentially Harmful (Archeological Sites, Endangered Species, Animal Testing, …) Classified FOIA CIPSEA State Privacy Laws EAR State FOI Laws Journal Replication Requirements Funder Open Access Contract License Click-Wrap TOU ITAR Export Restrictions
  • 7. Example: Stakeholder Concerns Across Lifecycle Research sources: - Research Subjects - Owners of subject material - Owners of supplementary data Research sponsors: - Home institution - Funding sources Project Personnel: - Investigators - Research Staff Research Publishers - Print publishers - Research archives Research Consumers - Readers - Secondary researcher Licensing Copyright DMCA Informed Consent Privacy Trade secrets Licensing Freedom of Information Copyright Copyright Copyright Licensing Fair Use Information Transfer Privacy Confidentiality Intellectual Property Replicable Research Policy Relevance Accessibility of Research Protect IP Avoid third party IP/Privacy Issues Replicable Research Publish Promote use of Publications Track use Replicable research Promote use of their publications Protect publisher IP Avoid third party IP/Privacy Issues Replicate and extend Secondary analysis Link research Stakeholder Concerns Legal Issues
  • 8. Research questions deriving from Information Lifecycle Analysis (figure labels: Systems; Policy)
      • Infrastructure requirements analysis
          – Data acquisition, storage, dissemination
          – Identification, authorization, authentication
          – Metadata, protocols
      • System design: potential implementation cost of differential privacy
          – Information security: hardening
          – Information security: certification & auditing
          – Model server development, provisioning, maintenance, reliability, availability
      • System design: information security tradeoffs of interactive privacy mechanisms
          – Availability risks: denial of service attack
          – Availability/integrity risks: privacy budget exhaustion attacks
          – Integrity risks: modification of delivered results (e.g. man-in-the-middle attacks)
          – Secrecy/privacy: breach of authentication/authorization layer
      • System design: optimizing privacy & utility across lifecycle
          – When does limiting disclosive data collection dominate methods at the data analysis stage?
          – When do restricted virtual data enclaves + public synthetic data dominate interactive mechanisms?
      • System design: information use/reuse
          – Support of scientific analysis use cases (model diagnostics, exploratory data analysis, integration of external data) within interactive privacy systems
          – Align informational assumptions across stages & incorporate informative priors?
          – Requirements for scientific replication/verification of results produced by model servers?
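The "privacy budget exhaustion" risk on the slide above can be made concrete with a small sketch. This is an illustration under stated assumptions, not the design of any system discussed in the deck: the `ModelServer` class, its `noisy_count` method, and the single shared budget are hypothetical, and the noise is the standard Laplace mechanism for a count query (sensitivity 1).

```python
import random


class BudgetExhausted(Exception):
    """Raised when a query would exceed the remaining privacy budget."""


class ModelServer:
    """Hypothetical interactive server with one shared epsilon budget."""

    def __init__(self, records, total_epsilon):
        self._records = list(records)
        self.remaining = total_epsilon

    def noisy_count(self, predicate, epsilon):
        """Answer a count query with Laplace noise of scale 1/epsilon."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise BudgetExhausted("no privacy budget left for this query")
        self.remaining -= epsilon
        true_count = sum(1 for r in self._records if predicate(r))
        # The difference of two Exponential(rate=epsilon) draws is
        # Laplace-distributed with scale 1/epsilon.
        noise = random.expovariate(epsilon) - random.expovariate(epsilon)
        return true_count + noise


# One client spends the whole budget; later users are locked out.
server = ModelServer(records=[1961, 1961, 1972, 1972, 1972], total_epsilon=1.0)
server.noisy_count(lambda year: year == 1972, epsilon=0.5)
server.noisy_count(lambda year: year == 1961, epsilon=0.5)
try:
    server.noisy_count(lambda year: year > 1960, epsilon=0.5)
    locked_out = False
except BudgetExhausted:
    locked_out = True
print(locked_out)  # True
```

Because the budget here is shared across all callers, any single client can deny service to everyone else, which is why the slide lists this as an availability/integrity risk rather than a privacy breach.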
  • 9. Modeling Features (Features: Characteristics)
      • Data: Structure; Source; Unit of observation; Attribute types; Dimensionality; Number of observations; Homogeneity; Frequency of updates; Quality characteristics
      • Analytic results: Form of output; Analysis methodology; Analysis/inferential goal; Utility/loss/quality
      • Disclosure scenario: Source of threat; Areas of vulnerability; Attacker objectives, background knowledge, capability; Breach criteria/disclosure concept
      • Stakeholders: Stakeholder types; Capacities; Trust relationships; Budgets
      • Lifecycle characteristics: Lifecycle stages controlled/in scope; Policies used; Stakeholders involved at each stage
      • Current privacy management approach: Regulation/policy; Legal controls; Statistical/computational disclosure methods; Information security controls
  • 10. Exemplar: Social Media Analysis (Attribute type: Examples)
      • Data: Structure: Network
      • Data: Attribute types: Continuous/discrete; Scale: ratio/interval/ordinal/nominal
      • Data: Performance characteristics: 10M-1B observations; Sample from stream of continuously updated corpus; Dozens of dimensions/measures
      • Measurement: Unit of observation: Individuals; Interactions
      • Measurement: Measurement type: Observational
      • Measurement: Performance characteristic: High volume; Complex network structure; Sparsity; Systematic and sparse metadata
      • Management constraints: License; Replication
      • Analysis methods: Bespoke algorithms (clustering); Nonlinear optimization; Bayesian methods
      • Desired outputs: Summary scalars (model coefficients); Summary table; Static/interactive visualization
      More Information
      • Grimmer, Justin, and Gary King. "General purpose computer-assisted clustering and conceptualization." Proceedings of the National Academy of Sciences 108.7 (2011): 2643-2650.
      • King, Gary, Jennifer Pan, and Molly Roberts. "How censorship in China allows government criticism but silences collective expression." APSA 2012 Annual Meeting Paper. 2012.
      • Lazer, David, et al. "Life in the network: the coming age of computational social science." Science (New York, NY) 323.5915 (2009): 721.
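One way to read the two slides above is as a record template (Modeling Features) and one filled-in record (the social media exemplar). The sketch below is only an illustration of that reading: the `UseCaseProfile` class, its method names, and the snake_case group names are hypothetical, and the values are copied from the exemplar slide.

```python
from dataclasses import dataclass, field

# Feature groups taken from the "Modeling Features" slide.
FEATURE_GROUPS = (
    "data",
    "analytic_results",
    "disclosure_scenario",
    "stakeholders",
    "lifecycle_characteristics",
    "current_privacy_management",
)


@dataclass
class UseCaseProfile:
    """Hypothetical record describing one research data privacy use case."""

    name: str
    features: dict = field(default_factory=dict)

    def describe(self, group, **characteristics):
        """Record characteristics under one of the known feature groups."""
        if group not in FEATURE_GROUPS:
            raise KeyError(f"unknown feature group: {group}")
        self.features.setdefault(group, {}).update(characteristics)

    def undocumented_groups(self):
        """List feature groups that have no characteristics recorded yet."""
        return [g for g in FEATURE_GROUPS if g not in self.features]


social_media = UseCaseProfile("Social media analysis")
social_media.describe(
    "data",
    structure="network",
    unit_of_observation=["individuals", "interactions"],
    number_of_observations="10M-1B",
)
social_media.describe(
    "analytic_results",
    methods=["clustering", "nonlinear optimization", "Bayesian methods"],
    outputs=["summary scalars", "summary table", "visualization"],
)
print(social_media.undocumented_groups())
# ['disclosure_scenario', 'stakeholders', 'lifecycle_characteristics',
#  'current_privacy_management']
```

Listing the empty groups shows which parts of the exemplar still need to be documented before use cases can be compared feature by feature.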
  • 11. Mapping the “Space” of Research Data Privacy
      • Many different types of potentially relevant features
      • Many types of stakeholders
      • Many lifecycle stages
      → so can’t be exhaustive
      Heuristic: Choose some points (combinations of characteristics) that are near various corners of the (hyper-)space and that represent substantively important examples. Document these… Discuss. Think. Repeat.
  • 12. Example Use Cases (Name: Description; Examples)
      • Comparison case: Official Statistics: Well-resourced data collector summarizes tables/relational data in the form of summary statistics and contingency tables. Examples: U.S. Census dissemination; European statistical agencies
      • Privacy-Aware Journal Replication Policies: Scholarly journals adopting policies for deposit and disposition of data for verification and replication. How to balance privacy and replicability without intensive review? Examples: Data Sharing Systems for Open Access Journals; American Political Science Association Data Access and Research Transparency [DART] Policy Initiative
      • Long-term Longitudinal Data Collection: Data collections tracking individual subjects (and possibly friends and relations) over decades. Examples: National Longitudinal Study of Adolescent Health (Add Health); Framingham Heart Study; Panel Study of Income Dynamics
      • Computational Social Science: “Big” data. New forms and sources of data. Cutting-edge analytical methods and algorithms. Analyzing …: Netflix; Facebook; Hubway; GPS; Blogs
  • 13. Proposed Discussion Questions (for tomorrow) • Characterization. • Current approaches. • Enhancing approaches. • Integrating approaches. • Utility. • Privacy. • Methodological Barriers. • Incentives. • Future. • Prior work. • Are these summaries useful as descriptive models? • What is missing from the big picture? • What are the opportunities for research, practice & policy? (What one wants to know) (What one asks)
  • 14. Selected Bibliography
      • L. Willenborg and T. de Waal. Elements of Statistical Disclosure Control, volume 155 of Lecture Notes in Statistics. Springer Verlag, New York, NY, 2001.
      • Higgins, Sarah. "The DCC curation lifecycle model." International Journal of Digital Curation 3.1 (2008): 134-140. www.dcc.ac.uk/resources/curation-lifecycle-model
      • ESSNET, Handbook on Statistical Disclosure Control. 2011. neon.vb.cbs.nl/casc/SDC_Handbook.pdf
      • Fung, Benjamin, et al. "Privacy-preserving data publishing: A survey of recent developments." ACM Computing Surveys (CSUR) 42.4 (2010): 14.
      • Altman, M. (2012). “Mitigating Threats To Data Quality Throughout the Curation Lifecycle.” In G. Marchionini, C. Lee, & H. Bowden (Eds.), Curating For Quality. datacuration.web.unc.edu
  • 15. Questions? E-mail: escience@mit.edu Web: informatics.mit.edu Twitter: @drmaltman
  • 16. Appendix: Full Questions
      • Characterization. – Are there key additional characteristics of the use case that should be noted? How do these characteristics change the analysis and treatment of privacy in these cases?
      • Current approaches. – How is this use case treated now -- what's the state of the art & practice? How is success measured?
      • Enhancing approaches. – Are any of the approaches discussed yesterday used? How could the tools and approaches mentioned earlier or other existing tools be used at particular stages of the research lifecycle to enhance utility and privacy?
      • Integrating approaches. – Are approaches that have been developed and used in different communities compatible with each other? How should legal, computational, policy, and statistical tools be integrated so as to be most effective?
      • Utility. – What things would stakeholders like to do with the data that the toolset doesn't restrict or obstruct? Where is social benefit sub-optimal? How is utility measured/perceived by the stakeholders?
      • Privacy. – What sorts of data/outputs are considered particularly sensitive? What are the most important real and perceived risks -- what harms could occur if data is released and reidentified, how severe are these harms and how likely?
      • Methodological Barriers. – What are technical, methodological, computational or infrastructural barriers to improving privacy and utility in the management of this data? What particular characteristics of the use case contribute barriers?
      • Incentives. – If better tools already exist, why aren't they used? What are barriers to adoption of new tools and methods? What are the specific "market failures" in this area -- such as perverse incentives, lack/asymmetry of information, lack of well-developed market, irrational behavior, transaction cost, network effects, etc.? What particular characteristics of the use case most influence incentives?
      • Future. – How is this use case likely to evolve over time? What are threats to stability/scalability/robustness/resilience of the proposed/current solutions?
      • Prior work. – Are there key additional examples of the use case that should be noted? Are there additional key references or writings that should be noted?

Editor's notes

  1. This work by Micah Altman (http://micahaltman.com) is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
  2. The structure and design of digital storage systems is a cornerstone of digital preservation. To better understand ongoing storage practices of organizations committed to digital preservation, the National Digital Stewardship Alliance conducted a survey of member organizations. This talk discusses findings from this survey, common gaps, and trends in this area. (I also have a little fun highlighting the hidden assumptions underlying Amazon Glacier's reliability claims. For more on that see this earlier post: http://drmaltman.wordpress.com/2012/11/15/amazons-creeping-glacier-and-digital-preservation )
  3. Other image source: wikimedia commons