Presentation about maintaining the privacy of patients while harvesting aggregated data for the improvement of patient treatment and scientific medical research.
5. Stichting PALGA. Foundation founded in 1971. An official medical registration, as described in Dutch privacy law. Helps pathologists connect with colleagues on a case-by-case basis, since medical relevancy for diagnosis is measured in decades. Enabler for statistical medical research from universities on matters that can be observed through pathology reports. Supports national policy development through: Dutch Cancer Registration, Cervical and Breast Cancer Screening Programs, Health Care Evaluation and Epidemiological Research Survey. National coverage since 1990. Patients can opt out through the responsible pathology lab. Slide 4
6. Example scientific questions. How effective is the cervical cancer screening program? Is there an effect of inoculations on specific types of cancer? Is there a relation between being born in the 1944 hunger winter and the risk of colon cancer? Is there a relation between living in specific geographic locations or regions and the risk of cancer? What is the chance of a type of cancer recurring after treatment? Is there an increased risk of having another type of cancer when surviving a specific type of cancer? Slide 5
7. Our privacy challenge. We do not want to know the patient’s identity: directly (name, address, etc.) or indirectly (by combining information). We do want to correlate medical diagnoses across the lifetime of a subject: patients change hospitals when an illness escalates; current “health waiting list mediation” increases patient mobility; people move. Medical relevancy is about 20 years. Slide 6
8. Indirect identification is challenging. Correlating information to real people by combining seemingly innocent information. Researchers in the US have been able to correlate real people with “innocent” information found on the internet using the US public survey data. In the Netherlands we have fewer people per postal code than there are US citizens per ZIP code. Some illnesses or combinations of illnesses are extremely rare. Slide 7
9. Organisational measures. Patients can opt out per investigation through the pathology lab. An external privacy commission evaluates every request made, judging: the legality of the request; the balance between medical relevancy and the potential impact on patient privacy; the privacy of the pathology employees and labs. All personnel are screened and under non-disclosure contracts (including external staff). Operational guidelines aim to escalate requests that in hindsight might harm patient privacy. Operational guidelines prevent sharing any information that can be used for indirect identification. Processes are audited every year. Slide 8
10. Why rebuild? Technology used was 12 years old, without means to upgrade. Contained end-of-life technology in crucial spots (like file processing). Software was tied to dying hardware, reaching technical end of life. Slide 9
11. Why completely re-engineer? Despite being fully compliant with privacy laws, we thought we could do better: stronger pseudonymisation through a Trusted Third Party, preventing mistakes (key collisions occurred too often); a better foundation for potential future requirements; better separation between maintenance personnel and operational users; better separation of concerns; better isolation of high-availability systems; an easier intermediate step towards national electronic patient files (EPD). Slide 10
13. Fundamental design principles. Patient-identifying information is pseudonymised at the source. All communication is encrypted and authenticated. All information is on a need-to-know basis only. If you really need to know: you will only have access to the data when absolutely necessary; we log every access and every move on the data. Only crucial information will be duplicated. Slide 12
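The principle of logging every access can be illustrated with a small sketch. It is not the system's actual logging mechanism: the decorator `audited`, the logger name `audit`, and the function `read_report` are hypothetical names chosen for this example. Only the user, the action, and a record identifier are logged, so the audit trail itself does not carry medical content.

```python
import functools
import logging

# Hypothetical audit logger; a real deployment would ship these entries
# to a store that application administrators cannot modify.
audit_log = logging.getLogger("audit")
audit_log.setLevel(logging.INFO)


def audited(action: str):
    """Decorator that records who performed which action on which record."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(user: str, target, *args, **kwargs):
            # Log before the access happens, so failed attempts are visible too.
            audit_log.info("user=%s action=%s target=%r", user, action, target)
            return func(user, target, *args, **kwargs)
        return wrapper
    return decorator


@audited("read_report")
def read_report(user: str, report_id: int) -> str:
    # Placeholder for the real data access (assumption for this sketch).
    return f"report {report_id}"
```

Every call to `read_report` then leaves one audit entry, regardless of what the function returns.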
14. Implications of this design. Operational users will be granted access only to those databases they really require for their work, through controlled interfaces. Application administrators: will use an administrative interface for day-to-day operations, blocking any data access; will only see data when they need to in order to troubleshoot issues. Technical administrators will never see medical data at all. Slide 13
16. Separation of goals. Needed for a separation of concerns, as well as for meeting availability demands. Needed in order to prevent potential weakening of the pseudonyms. We hope to turn off the direct patient care system someday... Slide 15
18. Technical solution: pseudonymisation. Remove patient-identifying information without losing the ability to reconstruct a chain of medical episodes through history. One-way hash of all patient-identifying information at the source: is a nearly collision-proof identifier for the foreseeable future; is protected against name enumeration attacks. Centralised systems don’t know the underlying algorithm; they just see it as an externally controlled key. Use different pseudonymisation algorithms for different goals. Slide 17
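A minimal sketch of pseudonymisation at the source, under stated assumptions. The slides do not name the algorithm or the fields that are hashed, so HMAC-SHA-256 and the fields `surname`, `birth_date`, and `sex` are illustrative choices, and `normalise` and `pseudonymise` are hypothetical names. A keyed hash is used because an attacker without the key cannot build a dictionary of hashed names, which is one way to resist name enumeration. The `purpose` label stands in for "different algorithms for different goals".

```python
import hashlib
import hmac
import unicodedata


def normalise(value: str) -> str:
    """Fold case, strip accents and collapse whitespace so spelling variants hash alike."""
    decomposed = unicodedata.normalize("NFKD", value)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.casefold().split())


def pseudonymise(surname: str, birth_date: str, sex: str,
                 secret_key: bytes, purpose: str) -> str:
    """Return a hex pseudonym; the same person and purpose always give the same value."""
    # The purpose label is mixed in so that pseudonyms issued for one goal
    # cannot be matched against pseudonyms issued for another.
    message = "|".join([purpose, normalise(surname), birth_date, normalise(sex)])
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()
```

Because the output is deterministic for a given key, reports about the same patient from different labs can still be chained together, while the central system only ever sees the hex string.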
20. Role of ZorgTTP. Second pseudonymisation of patient identifiers used for scientific research. Allows for collaboration between medical registrations, provided there is legal clearance and the go-ahead of the privacy commission. Provides a trusted route for medical researchers with identifying data, provided the privacy commission gives clearance. ZorgTTP is never exposed to medical data, only to “meaningless” identifiers. Slide 19
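The second pseudonymisation step can be sketched as follows. This is an illustration only and does not describe ZorgTTP's actual interface or algorithm: the class `TrustedThirdParty`, the function `send_via_ttp`, and the use of HMAC-SHA-256 are assumptions. The point it shows is the split of the record: only the identifier is handed to the trusted third party, while the medical payload never passes through it.

```python
import hashlib
import hmac


class TrustedThirdParty:
    """Hypothetical stand-in for the TTP: it holds its own key and sees identifiers only."""

    def __init__(self, ttp_key: bytes) -> None:
        self._ttp_key = ttp_key

    def repseudonymise(self, source_pseudonym: str) -> str:
        # Second keyed hash: the result cannot be linked back to the source
        # pseudonym without the TTP's key.
        return hmac.new(self._ttp_key, source_pseudonym.encode("utf-8"),
                        hashlib.sha256).hexdigest()


def send_via_ttp(record: dict, ttp: TrustedThirdParty) -> dict:
    """Replace the source pseudonym; the diagnosis is never passed to the TTP."""
    research_pseudonym = ttp.repseudonymise(record["pseudonym"])
    return {"pseudonym": research_pseudonym, "diagnosis": record["diagnosis"]}
```

Two registrations that route their identifiers through the same TTP end up with matching research pseudonyms, which is what makes collaboration possible without either party learning the other's source pseudonyms.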
22. A separation of powers... Application management: access to database (only if required); monitor application progress; responsible for data quality. Technical management: OS management; system backup management; responsible for user management; responsible for secure logging of application management actions. Slide 21
23. Most challenging aspects. Moving from old to new pseudonymisation without creating a permanent route for attacking the current pseudonymisation. Destruction of old data, especially on backups. Moving to a new hosting center and a new solution, without any disruption in service. Slide 22
24. Conclusion. System is designed to conform to NEN 7510. Reduced identifying information as much as possible, without making the resulting data useless. Minimised exposure of sensitive medical data. Slide 23
25. Open ends. We are 99% there, still fighting for the last 1%. Logging without creating information overload is challenging. Decryption of data without being able to eavesdrop is extremely difficult. Slide 24
26. It is a delicate dynamic balance... Computing power increases, and with it the possibilities for indirect identification. People themselves have become less stringent with personal information on the internet (Facebook, Twitter), unintentionally opening doors for indirect identification. We all learn about new potential ways to attack privacy. The public debate about what is considered an acceptable level of privacy still rages on. Slide 25
These are the images of pathology that are imprinted in all our brains. This is in fact only a tiny portion of their work; the rest is dedicated to keeping people off their table. They are fighting the most deadly diseases in the world, including cancer. In most cases, a quicker and correct diagnosis greatly improves the chances of survival (unlike House M.D.).
Non-identification makes opt-out more difficult
Unfortunately, diagnosis is extremely complex. This raises questions that are crucial for a quick and correct diagnosis: for both prevention and correct diagnosis, statistics have to be collected over the population.
“Upgrading” from a regular hospital to a university hospital or even a specialized hospital like the Antoni van Leeuwenhoek. This means people move about 3 times....
Solution: reduce the resolution of data in order to protect patient privacy
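Reducing resolution can be sketched as generalising the fields that make indirect identification easy. The specific choices here (five-year age bands, the first two characters of the postal code) and the function name `generalise` are assumptions for illustration; the slides do not state which fields are coarsened or by how much.

```python
from datetime import date


def generalise(birth_date: date, postal_code: str, today: date) -> dict:
    """Coarsen quasi-identifiers: exact birth date to an age band, postal code to a region."""
    # Age in whole years; subtract one if the birthday has not occurred yet this year.
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    age = today.year - birth_date.year - (0 if had_birthday else 1)
    band_start = (age // 5) * 5
    return {
        "age_band": f"{band_start}-{band_start + 4}",
        # Keep only the leading characters of the postal code (assumed granularity).
        "region": postal_code.strip()[:2],
    }
```

The coarser the bands, the more people share the same combination of values, at the cost of less precise statistics.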
Although we do have documented cases of opt-out, the level of information dumped on a patient does make you wonder... Some tumors are so rare that asking for them will result in 3 cases in the last 3 decades.
Although Technical Administrators can make themselves a part of the Application Administrators, the technical implementation is such that this will be detected in the user management systems of the hosting party, and it will be logged.
Use two encrypted versions of the same text to break the cipher (please note that it really is a one-way hash...).
Use XML SEC (both AUTH and ENC). Chosen not to expose ZorgTTP to medical data....
Hash + Encryption
Please note that in the research database, the original pseudonyms are replaced by a number.
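Replacing pseudonyms by numbers could look like the sketch below. How the numbers are actually assigned is not described in the notes; here the distinct pseudonyms are shuffled before numbering so that the number reveals nothing about the pseudonym or its order of arrival. The function name `replace_with_study_numbers` and the record fields are hypothetical.

```python
import random


def replace_with_study_numbers(records: list[dict]) -> list[dict]:
    """Swap each pseudonym for a number that is only meaningful within this data set."""
    distinct = sorted({record["pseudonym"] for record in records})
    # Shuffle with the OS random source so numbering does not follow pseudonym order.
    random.SystemRandom().shuffle(distinct)
    study_number = {pseudonym: n for n, pseudonym in enumerate(distinct, start=1)}
    return [
        {"subject": study_number[record["pseudonym"]], "diagnosis": record["diagnosis"]}
        for record in records
    ]
```

Records belonging to the same pseudonym keep the same number, so the chain of medical episodes survives while the pseudonym itself stays out of the research database.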
When discussing design with developers, this role is unclear to many people.....
We need high availability for some systems, and just survivability for others. Please note the location of the backups: it is at the remote location (i.e. not close to the primary location).
Backups are challenging: they tend to cross the line unless you encrypt the database and its dumps.