Health data and the re-identification threat: a real-world example
Giske Ursin
Cancer Registry of Norway
March 5, 2018
Seminar on privacy-preserving distributed statistical computation,
Statistics Norway
Norwegian health registries
17 central + 54 clinical registries
Purpose:
- Assess distribution of disease
- Obtain information on how to prevent disease and
death from disease
Other health data:
- Population surveys
- 360+ biobanks
Cancer screening programs: all women 25-69
All these data…..
Let’s link them
Put the data somewhere safe…
Can only access them there….
But….
Kreft i Norge 2015
1. Are the data safe?
National platform
coming….
1. Are the data really SAFE?
http://www.free-bullion-investment-guide.com/homesafes.html
Kreft i Norge 2015
2. Does it matter?
Reidentification threat
Trust
…versus…..
Current systems are based on trust
An example
Month and year of birth
Dates of all cervical exams
Results of each test
Whether or not they got cancer
Cancer diagnosis date
1 million women
Month and year of birth
Dates of all cervical exams
Results of each test
Linked to identifiers
on n = xxx women
What do we do?
Trust?
All data deliveries based on trust
….or
Reduce the reidentification threat
…Exactly HOW?
Anonymization protocols
K-anonymization (categorizing variables)
Creating synthetic datasets
Fuzzification
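To make the k-anonymization idea concrete, here is a minimal sketch, assuming toy records and a hypothetical 5-year banding of birth year (neither is from the talk). It categorizes one variable and then checks that every combination of quasi-identifiers occurs at least k times.

```python
from collections import Counter


def generalize_birth_year(year: int, width: int = 5) -> str:
    """Map an exact birth year to a band such as '1970-1974'."""
    start = (year // width) * width
    return f"{start}-{start + width - 1}"


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values()) >= k


# Hypothetical toy records, not real registry data.
records = [
    {"birth_year": 1970, "result": "normal"},
    {"birth_year": 1971, "result": "normal"},
    {"birth_year": 1973, "result": "normal"},
    {"birth_year": 1960, "result": "abnormal"},
    {"birth_year": 1962, "result": "abnormal"},
]

# With exact birth years every record is unique, so 2-anonymity fails.
raw_ok = is_k_anonymous(records, ["birth_year", "result"], k=2)

# After categorizing birth year into 5-year bands, each group has at least 2 records.
generalized = [
    {**r, "birth_year": generalize_birth_year(r["birth_year"])} for r in records
]
generalized_ok = is_k_anonymous(generalized, ["birth_year", "result"], k=2)
```

The band width and the choice of quasi-identifiers are illustrative; in practice they would be tuned against the analysis the data must still support.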
Synthetic datasets
Reset all dates from reference date
Day of birth = day 0
Day started using drug (before diagnosis) = day 19 345
Day diagnosed with cancer = day 20 693
Challenge:
If some aspect of calendar year is needed
(treatments change over time)
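The date reset described above can be sketched as follows. This is only an illustration: the birth date is hypothetical, and the two event dates are constructed so that they reproduce the day numbers on the slide (19 345 and 20 693).

```python
from datetime import date, timedelta


def to_day_offsets(birth: date, events: dict) -> dict:
    """Replace each calendar date by the number of days since birth (day 0)."""
    return {name: (d - birth).days for name, d in events.items()}


# Hypothetical birth date; the events are placed at the slide's day numbers.
birth = date(1950, 1, 1)
events = {
    "birth": birth,
    "drug_start": birth + timedelta(days=19345),
    "cancer_diagnosis": birth + timedelta(days=20693),
}

offsets = to_day_offsets(birth, events)
# Intervals between events survive, but the calendar year does not.
days_from_drug_to_diagnosis = offsets["cancer_diagnosis"] - offsets["drug_start"]
```

Because only differences are kept, any analysis that depends on calendar period (for example a change in treatment practice) would need some coarse period variable added back, which is the challenge named on the slide.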
Fuzzification – alter the data
- K-anonymization (Categorized variables)
- Excluded some observations (extreme dates/combinations)
- ALTERED all dates:
  - Removed DAY
  - CHANGED month with a random number (fuzzy factor)
  - REMOVED month of birth
Fuzzification of cervix data
Ursin et al., Cancer Epidemiology Biomarkers Prevention 2017
Fuzzification – alter the data
• 5.6 million records
• All cervical exam dates
• Results
• Diagnosis dates of cancer
• 915 000 women
Fuzzification – alter the data
• Removed extreme dates/combinations
• Set day in dates to 15
• Used fuzzy factor on month:
• random value between -4 and +4
• All dates for one individual shifted by the same
random number
Original ID     DOB         Exam 1      Exam 2      Diagnosis date
01071972 23456  1/7/1972    2/8/2000    10/11/2004  21/1/2007
03041960 45678  3/4/1960    5/1/1995    10/2/1998   ----

ID              DOB         Exam 1      Exam 2      Diagnosis date
001             15/7/1972   15/8/2000   15/11/2004  15/1/2007
002             15/4/1960   15/1/1995   15/2/1998   ----

Allocated ID    DOB         Exam 1      Exam 2      Diagnosis date
1023            15/10/1972  15/11/2000  15/2/2005   15/4/2007
4567            15/2/1960   15/11/1994  15/12/1997  ----
Fuzzification – alter the data
Allocated ID    DOB   Exam 1      Exam 2      Diagnosis date
1023            1972  15/11/2000  15/2/2005   15/4/2007
4567            1960  15/11/1994  15/12/1997  ----
FINAL DATA
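The date steps in this example (day set to 15, one random month shift per woman, birth date reduced to year) can be sketched as below. The record layout and function names are hypothetical, ID allocation and removal of extreme records are left out, and taking the birth year from the shifted date of birth is an assumption based on the order of the tables above.

```python
import random
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Set the day to 15 and move the date by a whole number of months."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 15)


def fuzzify(record: dict, rng: random.Random) -> dict:
    """Apply one fuzzy factor (-4..+4 months) to all dates of one individual."""
    fuzzy = rng.randint(-4, 4)  # the same shift is used for every date in the record
    diagnosis = record["diagnosis"]
    return {
        # Assumption: only the year of the shifted date of birth is kept.
        "birth_year": shift_months(record["dob"], fuzzy).year,
        "exams": [shift_months(d, fuzzy) for d in record["exams"]],
        "diagnosis": None if diagnosis is None else shift_months(diagnosis, fuzzy),
    }


# First row of the example table, in a hypothetical record layout.
woman = {
    "dob": date(1972, 7, 1),
    "exams": [date(2000, 8, 2), date(2004, 11, 10)],
    "diagnosis": date(2007, 1, 21),
}
fuzzified = fuzzify(woman, random.Random(2018))
```

Using a single shift per woman keeps the spacing in months between her exams and diagnosis intact, which is what most screening analyses depend on.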
Assessing the risk of reidentification
• ARX tool
• Quantifies risk of re-identification based on
uniqueness
• Prosecutor scenario: assumes the person is known to be in the dataset
• Classify variables as identifiable, quasi-identifiable
or sensitive
Prasser F, Kohlmayer F, Lautenschlager R, Kuhn KA. ARX--A
Comprehensive Tool for Anonymizing Biomedical Data. AMIA Annu
Symp Proc. 2014;2014:984-93.
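A uniqueness-based risk measure of this kind can be sketched in a few lines. This is not the ARX API (ARX is a separate Java tool); it only illustrates the standard prosecutor-scenario idea that a record's risk is one over the number of records sharing its quasi-identifier values. The rows and field names are hypothetical.

```python
from collections import Counter


def prosecutor_risks(rows, quasi_identifiers):
    """Per-record risk = 1 / size of the record's equivalence class."""
    keys = [tuple(row[q] for q in quasi_identifiers) for row in rows]
    sizes = Counter(keys)
    return [1.0 / sizes[key] for key in keys]


def summarize(risks):
    """Report the worst case, the average, and the share of unique records."""
    return {
        "highest_risk": max(risks),
        "average_risk": sum(risks) / len(risks),
        "share_unique": sum(r == 1.0 for r in risks) / len(risks),
    }


# Hypothetical toy rows, not real registry data.
rows = [
    {"birth_year": 1972, "first_exam": "2000-11"},
    {"birth_year": 1972, "first_exam": "2000-11"},
    {"birth_year": 1960, "first_exam": "1994-11"},
    {"birth_year": 1960, "first_exam": "1994-11"},
    {"birth_year": 1960, "first_exam": "1995-03"},
]
summary = summarize(prosecutor_risks(rows, ["birth_year", "first_exam"]))
```

In this toy set the last row is unique on its quasi-identifiers, so it carries the highest possible risk even though the average looks moderate.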
Assessing the reidentification risk
• D1. Realistic dataset
• D2. k-anonymization of dataset D1
• changing all dates in the dataset to 15th of the month
• D3. Fuzzifying the month in D2
• by adding a random factor between -4 and +4 months to each
month.
Fuzzification – WHAT helps?
Ursin et al., Cancer Epidemiology Biomarkers Prevention 2017
Reidentification risk
• A simple step (setting all days to the 15th) reduces the risk of reidentification
• Adding a fuzzy factor makes reidentification even
more difficult
Current regulations
"The degree of personal identification shall not be greater than
necessary for the purpose in question. The degree of personal
identification shall be justified. The supervisory authority may
require the data controller to present the justification."
• Helseregisterloven (Norwegian Health Registry Act) §6
EU – GDPR:
Data Protection Impact Assessment
Article 35
Current practice - examples
Cancer Registry: Restrictive with dates
Helseregisterloven §6
Prescription Registry: Restrictive with dates
§4 "Prohibition of simultaneous access" («Forbud mot samtidig tilgang»)
(Differansedager, i.e. day differences = synthetic dataset)
Statistics Norway: ?
Common guidelines - and
better solutions - needed!
Income?
Large linkages continue
…..still based on trust
Can NOT build a national platform on TRUST alone
For the researchers…….
BALANCE
The researchers need:
Safe analysis
of large linked data
(no reidentification threat)
- Rapid and seamless analyses
- Ability to check individual records
Need national platforms that can do it all!
Thank you
Fuzzy paper:
Mari Nygård
Sagar Sen
Jean-Marie Mottu
Discussions with:
Jan Nygård
Bjørn Møller
Hilde Olav
Johanne Gulbrandsen
Datautleveringsenheten
Livmorhalsprogrammet


  • 1. Health data and the re-identification threat – a real-world example. Giske Ursin, Cancer Registry of Norway. March 5, 2018. Seminar on privacy-preserving distributed statistical computation, Statistics Norway
  • 2. Norwegian health registries 17 central + 54 clinical registries Purpose: - Assess distribution of disease - Obtain information on how to prevent disease and death from disease Other health data: - Population surveys - 360+ biobanks Cancer screening programs: all women 25-69
  • 4. 30/04/2018 Put the data somewhere safe… Can only access them there…. But….
  • 5. Kreft i Norge 2015 1. Are the data safe? National platform coming….
  • 6. 1. Are the data really SAFE? http://www.free-bullion-investment-guide.com/homesafes.html
  • 7. Kreft i Norge 2015 2. Does it matter? Reidentification threat Trust …versus…..
  • 8. Current systems are based on trust
  • 10. An example Month and year of birth Dates of all cervical exams Results of each test Whether or not get cancer Cancer diagnosis date 1 million women
  • 11. An example Month and year of birth Dates of all cervical exams Results of each test Whether or not get cancer Cancer diagnosis date 1 million women Month and year of birth Dates of all cervical exams Results of each test Linked to identifiers on n = xxx women
  • 12. What do we do? Trust? All data deliveries based on trust ….or
  • 13. What do we do? Reduce reidentification threat ………………Exactly HOW??
  • 14. Anonymization protocols K-anonymization (categorizing variables) Creating synthetic datasets Fuzzification
  • 15. Synthetic datasets Reset all dates from reference date Day of birth = day 0 Day started using drug before diagnosis) = day 19 345 Day diagnosed with cancer = day 20 693 Challenge: If need some aspect of calendar year (treatments change)
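
The relative-date idea on slide 15 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the registry's actual procedure: the function name `to_relative_days` and the sample dates are hypothetical.

```python
from datetime import date


def to_relative_days(reference: date, events: dict[str, date]) -> dict[str, int]:
    """Replace each calendar date with its offset in days from the reference date."""
    return {name: (when - reference).days for name, when in events.items()}


# Hypothetical person; the dates are invented for illustration.
birth = date(1950, 3, 10)
events = {
    "birth": birth,
    "drug_start": date(2003, 2, 25),
    "cancer_diagnosis": date(2006, 11, 4),
}

relative = to_relative_days(birth, events)
print(relative)  # birth maps to day 0; the other events become day counts since birth
```

The offsets preserve the intervals between events but discard the calendar period, which is the challenge the slide mentions when treatments change over time.
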
  • 16. Fuzzification – alter the data - K-anonymization (Categorized variables) - Excluded some observations (extreme dates/combinations) - ALTERED all dates: - Removed DAY - CHANGED month – with random number (fuzzy factor) - REMOVED month of birth
  • 17. Fuzzification of cervix data Ursin et al., Cancer Epidemiology Biomarkers Prevention 2017
  • 18. Fuzzification – alter the data • 5.6 million records • All cervical exam dates • Results • Diagnosis dates of cancer • 915 000 women Ursin et al., Cancer Epidemiology Biomarkers Prevention 2017
  • 19. Fuzzification – alter the data • Removed extreme dates/combinations • Set day in dates to 15 • Used fuzzy factor on month: • random value between -4 and +4 • All dates for one individual were changed by the same random number Ursin et al., Cancer Epidemiology Biomarkers Prevention 2017
  • 20. Fuzzification – alter the data
      Original:
        ID               DOB         Exam 1      Exam 2      Diagnosis date
        01071972 23456   1/7/1972    2/8/2000    10/11/2004  21/1/2007
        03041960 45678   3/4/1960    5/1/1995    10/2/1998   ----
      Day set to 15:
        ID               DOB         Exam 1      Exam 2      Diagnosis date
        001              15/7/1972   15/8/2000   15/11/2004  15/1/2007
        002              15/4/1960   15/1/1995   15/2/1998   ----
      Month altered with fuzzy factor:
        Allocated ID     DOB         Exam 1      Exam 2      Diagnosis date
        1023             15/10/1972  15/11/2000  15/2/2005   15/4/2007
        4567             15/1/1960   15/11/1994  15/12/1997  ----
      FINAL DATA:
        Allocated ID     DOB         Exam 1      Exam 2      Diagnosis date
        1023             1972        15/11/2000  15/2/2005   15/4/2007
        4567             1960        15/11/1994  15/12/1997  ----
  • 21. Fuzzification – alter the data
      Original:
        ID               DOB         Exam 1      Exam 2      Diagnosis date
        01071972 23456   1/7/1972    2/8/2000    10/11/2004  21/1/2007
        03041960 45678   3/4/1960    5/1/1995    10/2/1998   ----
      FINAL DATA:
        Allocated ID     DOB         Exam 1      Exam 2      Diagnosis date
        1023             1972        15/11/2000  15/2/2005   15/4/2007
        4567             1960        15/11/1994  15/12/1997  ----
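
The date steps listed on slide 19 (day set to 15, one random month shift per individual, only the year of birth kept) might look roughly like the Python sketch below. It is not the code used in the cited paper. The names `shift_months` and `fuzzify` and the record layout are hypothetical, and two details are assumptions the slides do not settle: whether a shift of 0 is allowed, and whether the birth year is taken before or after the shift. Allocation of a new ID is left out.

```python
import random
from datetime import date
from typing import Optional


def shift_months(d: date, months: int) -> date:
    """Shift a date by a whole number of months, keeping the day of month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, d.day)


def fuzzify(record: dict[str, Optional[date]], rng: random.Random, max_shift: int = 4) -> dict:
    """Set every day to 15, shift all dates by one per-person offset, keep birth year only."""
    # Assumption: the offset is drawn uniformly from -max_shift..+max_shift, 0 included.
    offset = rng.randint(-max_shift, max_shift)
    out = {}
    for name, d in record.items():
        if d is None:  # e.g. no cancer diagnosis
            out[name] = None
            continue
        shifted = shift_months(d.replace(day=15), offset)
        # Assumption: the birth year is read from the shifted date.
        out[name] = shifted.year if name == "dob" else shifted
    return out


# Hypothetical record shaped like the first row of the slide's example table.
person = {
    "dob": date(1972, 7, 1),
    "exam1": date(2000, 8, 2),
    "exam2": date(2004, 11, 10),
    "diagnosis": date(2007, 1, 21),
}
fuzzy = fuzzify(person, random.Random(2018))
```

Because every date of one person moves by the same offset, the intervals between that person's exams and diagnosis are preserved to the month, while the calendar dates no longer match the registry.
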
  • 22. Assessing the risk of reidentification • ARX tool • Quantifies risk of re-identification based on uniqueness • Prosecutor scenario: assumes the targeted person is known to be in the dataset • Classify variables as identifiable, quasi-identifiable or sensitive Prasser F, Kohlmayer F, Lautenschlager R, Kuhn KA. ARX--A Comprehensive Tool for Anonymizing Biomedical Data. AMIA Annu Symp Proc. 2014;2014:984-93.
  • 23. Assessing the reidentification risk • D1. Realistic dataset • D2. k-anonymization of dataset D1 • changing all dates in the dataset to 15th of the month • D3. Fuzzifying the month in D2 • by adding a random factor between -4 and +4 months to each month.
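
The uniqueness-based risk that slides 22 and 23 refer to can be illustrated with a small Python sketch. This is not ARX or its API, only the underlying idea under the prosecutor scenario: records that share the same quasi-identifier values form a group, and each record's risk is 1 divided by the size of its group. The function name `prosecutor_risk`, the field names, and the toy records are hypothetical.

```python
from collections import Counter


def prosecutor_risk(records: list[dict], quasi_identifiers: list[str]) -> dict[str, float]:
    """Summarise re-identification risk from group sizes on the quasi-identifiers."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    sizes = Counter(keys)                      # size of each group of identical keys
    risks = [1 / sizes[k] for k in keys]       # per-record risk = 1 / group size
    return {
        "highest_risk": max(risks),
        "average_risk": sum(risks) / len(risks),
        "share_unique": sum(1 for k in keys if sizes[k] == 1) / len(keys),
    }


# Toy data: two women share birth year and exam month, one is unique.
toy = [
    {"birth_year": 1972, "exam_month": "2000-11"},
    {"birth_year": 1972, "exam_month": "2000-11"},
    {"birth_year": 1960, "exam_month": "1994-11"},
]
summary = prosecutor_risk(toy, ["birth_year", "exam_month"])
```

Coarsening the dates, as in D2 and D3, makes groups larger, which lowers these numbers. Running the same summary on each version of a dataset is one simple way to compare them.
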
  • 26. Fuzzification – WHAT helps? Ursin et al., Cancer Epidemiology Biomarkers Prevention 2017
  • 27. Reidentification risk • Simple step reduces the risk of reidentification • Adding a fuzzy factor makes reidentification even more difficult
  • 28. Current regulations • Helseregisterloven §6: The degree of personal identification shall not be greater than necessary for the purpose in question. The degree of personal identification shall be justified. The supervisory authority may require the data controller to present the justification. • EU – GDPR: Data Protection Impact Assessment, Article 35
  • 29. Current practice - examples Cancer Registry: Restrictive with dates, Helseregisterloven §6 Prescription Registry: Restrictive with dates, §4 "Prohibition against simultaneous access" (difference in days = synthetic dataset) Statistics Norway: ? Income? Common guidelines - and better solutions - needed!
  • 30. Large linkages continue …..still based on trust Can NOT build a national platform on TRUST alone
  • 32. The researchers need: Safe analysis of large linked data (no reidentification threat) - Rapid and seamless analyses - Ability to check individual records Need national platforms that can do it all!
  • 33. Thank you Fuzzy paper: Mari Nygård Sagar Sen Jean-Marie Mottu Discussions with: Jan Nygård Bjørn Møller Hilde Olav Johanne Gulbrandsen Datautleveringsenheten Livmorhalsprogrammet

Editor's notes

  1. We have a lot of health data. First and foremost, many health registries. 17 central health registries: the birth registry, the cause of death registry, the Norwegian Patient Registry, the prescription registry, the cancer registry, and so on. Then quality registries for various diseases. Data collected to map….. In addition, other health data