Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions
CHCDB & CHCDBWEB
Clinical Annotation Database and web interface
Thomas Burguiere
INSERM Unité 674
May 5th, 2011
1 Introduction
2 Relational Database
Principle
Benefits
Interface
3 Clinical Annotation data
Specificities
E.A.V.
4 CHCDB
5 CHCDBWEB
6 Conclusions
• 1500 liver tumor samples
• Malignant (HCC) and benign (HCA) tumors
• Normal Tissue
Existing Data
• Clinical Annotations of malignant tumors (4D)
• Excel files which contain :
• Clinical Annotations of malignant & benign tumors
• Other annotations (mutations, clinical studies, etc.)
• Tissue extractions listings (concentrations / quantities)
Problems
• Clinical Annotations of malignant tumors can only be accessed on a
single machine
• Redundant data among different files
• Duplicated files on different machines
→ Discrepancies between different files
→ Cross-checking data between the different data sources is cumbersome
Principle
Relational Database : Software
• Relational Database Management System : software which contains
and organizes data (Oracle™, MySQL, DB2™, SQL Server™, etc.)
• Client Server architecture :
• Server software, which manages data, installed on a single machine
• Client software, which queries the server, installed on any machine used to
consult the database
[Figure: client-server architecture]
Relational Database : Data
• The data is stored in a set of tables
• One can define a set of constraints regarding the data contained in
the tables
• The tables can be associated with one another through logical links : integrity
constraints
Relational Database : Example
TissueID  TumorType  Sex  Age  Steatosis  nb_adenomas  CRP
CHC358T   HCA        F    ?3   TRUE       1            ?.?
CHC?2?T   HCC        F    ?1
CHC?3?T   HCC        M    5?
Classical table (e.g. Excel sheet)
1 Breaking down data
2 Typing constraints
3 Uniqueness constraints
4 Integrity constraints
Table1
TissueID  TumorType  Sex  Age
CHC358T   HCA        F    ?3
CHC?2?T   HCC        F    ?1
CHC?3?T   HCC        M    5?
Table2
TissueID  Steatosis  nb_adenomas  CRP
CHC358T   TRUE       1            ?.?
CHC81?T   FALSE      ?            ?.?
Column types shown : VARCHAR, INT, BOOLEAN, FLOAT
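The four steps above can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module, not the MySQL server the project targets. The table names table1 and table2 follow the SQL query shown in the talk; the exact constraints, the age value and the tissue IDs CHC001T and CHC999T are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE table1 (
    TissueID  TEXT PRIMARY KEY,                              -- uniqueness constraint
    TumorType TEXT NOT NULL CHECK (TumorType IN ('HCC', 'HCA')),
    Sex       TEXT CHECK (Sex IN ('F', 'M')),
    Age       INTEGER CHECK (Age IS NULL OR typeof(Age) = 'integer')  -- typing constraint
);
CREATE TABLE table2 (
    TissueID    TEXT PRIMARY KEY REFERENCES table1(TissueID),  -- integrity constraint
    Steatosis   INTEGER CHECK (Steatosis IN (0, 1)),
    nb_adenomas INTEGER,
    CRP         REAL
);
""")


def try_insert(sql, params):
    """Run one INSERT and report whether the constraints accepted it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


# Illustrative values only (the age is a placeholder, not real data).
first = try_insert("INSERT INTO table1 VALUES (?, ?, ?, ?)", ("CHC358T", "HCA", "F", 40))
duplicate = try_insert("INSERT INTO table1 VALUES (?, ?, ?, ?)", ("CHC358T", "HCA", "F", 40))
bad_type = try_insert("INSERT INTO table1 VALUES (?, ?, ?, ?)", ("CHC001T", "HCC", "M", "unknown"))
orphan = try_insert("INSERT INTO table2 VALUES (?, ?, ?, ?)", ("CHC999T", 1, 0, 0.5))
```

Only the first insert is accepted: the second breaks the uniqueness constraint, the third the typing constraint, and the fourth the integrity constraint, because no matching tissue exists in table1.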
Benefits
Relational Database : Benefits
• Data centralisation on the server side
• Constraints can, in some instances, prevent data inconsistencies
→ Consistent data
• Efficient : tables containing millions of rows can be easily manipulated
• Querying a correctly structured database allows one to cross-check
data very rapidly
Interface
A database requires a graphical interface
• Data manipulation in a database is done exclusively with queries,
written in SQL (Structured Query Language)
• TissueID  TumorType  Sex  Age  Steatosis  nb_adenomas  CRP
  CHC358T   HCA        F    ?3   TRUE       1            ?.?
  CHC?2?T   HCC        F    ?1
  CHC?3?T   HCC        M    5?
• SELECT table1.TissueID, table1.TumorType,
         table1.Sex, table1.Age, table2.Steatosis,
         table2.nb_adenomas, table2.CRP
  FROM table1 INNER JOIN table2
      ON (table1.TissueID = table2.TissueID)
  WHERE table1.TissueID = 'CHC358T';
• Powerful language, albeit counterintuitive
→ A graphical interface must be associated with the database
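To see what the query on this slide returns, here is a small runnable sketch with Python's built-in sqlite3 module. The two tables are filled with placeholder values (the ages, the CRP value and the tissue ID CHC000T are invented for the example), and the tissue ID is passed as a bound parameter instead of being written into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name

conn.executescript("""
CREATE TABLE table1 (TissueID TEXT PRIMARY KEY, TumorType TEXT, Sex TEXT, Age INTEGER);
CREATE TABLE table2 (TissueID TEXT PRIMARY KEY, Steatosis INTEGER,
                     nb_adenomas INTEGER, CRP REAL);
-- Placeholder values, for illustration only.
INSERT INTO table1 VALUES ('CHC358T', 'HCA', 'F', 40), ('CHC000T', 'HCC', 'M', 60);
INSERT INTO table2 VALUES ('CHC358T', 1, 1, 0.5);
""")

QUERY = """
SELECT table1.TissueID, table1.TumorType, table1.Sex, table1.Age,
       table2.Steatosis, table2.nb_adenomas, table2.CRP
FROM table1 INNER JOIN table2
    ON (table1.TissueID = table2.TissueID)
WHERE table1.TissueID = ?
"""


def lookup(tissue_id):
    """Return the joined annotation rows of one tissue as dictionaries."""
    return [dict(row) for row in conn.execute(QUERY, (tissue_id,))]


rows = lookup("CHC358T")
missing = lookup("CHC000T")  # has no row in table2, so the INNER JOIN drops it
```

The inner join only returns tissues present in both tables, which is one reason such queries can feel counterintuitive to non-specialists.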
Graphical interface : principle
Mechanism
1 The interface receives instructions from the user and transforms them
into SQL queries sent to the server
2 The server receives the SQL queries and sends back results
3 The interface receives the results from the server and displays them
to the user
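The three steps of the mechanism can be sketched in a few lines. This is a toy illustration, not the CHCDBWEB code: the tissue table, the handle_request function and the sample row are hypothetical, and an in-memory SQLite database stands in for the database server.

```python
import sqlite3


def make_server():
    """Stand-in for the database server, holding one placeholder row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tissue (TissueID TEXT PRIMARY KEY, TumorType TEXT, Sex TEXT)")
    conn.execute("INSERT INTO tissue VALUES ('CHC358T', 'HCA', 'F')")
    return conn


def handle_request(conn, tissue_id):
    # Step 1: turn the user's instruction into an SQL query.
    sql = "SELECT TumorType, Sex FROM tissue WHERE TissueID = ?"
    # Step 2: the server runs the query and sends back the result.
    row = conn.execute(sql, (tissue_id,)).fetchone()
    # Step 3: format the result for display to the user.
    if row is None:
        return f"No tissue found with ID {tissue_id}"
    tumor_type, sex = row
    return f"{tissue_id}: tumor type {tumor_type}, sex {sex}"


server = make_server()
found = handle_request(server, "CHC358T")
not_found = handle_request(server, "CHC000T")
```

Passing the tissue ID as a bound parameter (the ? placeholder) keeps user input out of the SQL text itself.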
Interface types
• 2 types of interface : desktop program or web interface
• In our case, we decided to develop a web interface
Web interface
• Software installed on a single machine : the web server
• Accessing the interface only requires a web browser
→ Avoids installation and maintenance issues on the client machines
→ Avoids OS compatibility issues (Mac, Windows, Linux, etc.)
[Figure: web client / server architecture]
Specificities
Specific features of clinical annotation data
Specificities
• New variables are frequently added
• Data regarding the same variable can be entered differently, depending on
sample provenance and type (malignant or benign tumor)
Consequences in a database
• Frequent addition of new columns or sub-tables
• Tables contain a lot of columns, with sparsely filled rows
→ Constant maintenance of the database
Clinical annotation data must be stored in a specific database structure :
the E.A.V. structure
E.A.V.
Principle
• E.A.V. = Entity-Attribute-Value
• An E.A.V. is a subset of tables in a relational database, with a
specific organization
• This data organization is particularly suitable for clinical annotation
data
• In the E.A.V., all clinical annotation data is stored in one three-column
table :
• Entity : contains the identifier of the entity for which an annotation is
stored (in our case, an entity is a tissue)
• Attribute : contains the identifier (e.g. the name) of the annotation variable
• Value : contains the value of the annotation, for a given entity
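A minimal sketch of such a three-column table, again with Python's built-in sqlite3 module. The table name annotation, the column names TissueID, VariableID and Value, the helper functions, and the choice of one value per tissue and variable are assumptions made for the example, not the actual CHCDB schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE annotation (
    TissueID   TEXT NOT NULL,  -- Entity: the tissue being annotated
    VariableID TEXT NOT NULL,  -- Attribute: the name of the annotation variable
    Value      TEXT,           -- Value: stored as text whatever its real type
    PRIMARY KEY (TissueID, VariableID)  -- assumption: one value per tissue and variable
)
""")


def annotate(tissue_id, variable_id, value):
    """Store (or overwrite) one annotation for one tissue."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO annotation VALUES (?, ?, ?)",
            (tissue_id, variable_id, str(value)),
        )


def annotations_of(tissue_id):
    """Return all annotations of a tissue as a {variable: value} dictionary."""
    cur = conn.execute(
        "SELECT VariableID, Value FROM annotation WHERE TissueID = ?",
        (tissue_id,),
    )
    return dict(cur.fetchall())


annotate("CHC358T", "TumorType", "HCA")
annotate("CHC358T", "Sex", "F")
# A brand-new variable needs no ALTER TABLE, only one more row.
annotate("CHC358T", "OralContraception", "TRUE")
result = annotations_of("CHC358T")
```

Adding the OralContraception variable changes the data, not the table structure, which is the point of the E.A.V. layout.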
Example
TissueID  TumorType  Sex  Age
CHC358T   HCA        F    ?3
CHC?2?T   HCC        F    ?1
CHC?3?T   HCC        M    5?
TissueID  Steatosis  nb_adenomas  CRP
CHC358T   TRUE       1            ?.?
TissueID    VariableID    Value
(Entities)  (Attributes)  (Values)
CHC358T     Sex           F
CHC358T     TumorType     HCA
CHC358T     Age           ?3
CHC358T     Steatosis     TRUE
CHC358T     nb_adenomas   1
CHC358T     CRP           ?.?
CHC?2?T     Sex           F
CHC?2?T     Age           ?1
CHC?2?T     TumorType     HCC
CHC?3?T     Sex           M
CHC?3?T     Age           5?
CHC?3?T     TumorType     HCC
Pros & Cons
Pros
• New annotation = new row in the table
• No structural modifications
• No sparsely filled tables
Cons
• A lot of rows in the table
• Very complex queries
• Columns are no longer typed
TissueID  VariableID         Value (VARCHAR)
CHC358T   Sex                F
CHC358T   TumorType          HCA
CHC358T   Age                ?3
CHC358T   Steatosis          TRUE
CHC358T   nb_adenomas        1
CHC358T   CRP                ?.?
CHC358T   OralContraception  TRUE
CHC?2?T   Sex                F
CHC?2?T   Age                ?1
CHC?2?T   TumorType          HCC
CHC?3?T   Sex                M
CHC?3?T   Age                5?
CHC?3?T   TumorType          HCC
CHC?3?T   Edmonson           III
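The "very complex queries" drawback shows up as soon as one wants the familiar one-row-per-tissue view back. The sketch below, using Python's built-in sqlite3 module, pivots an E.A.V. table with one CASE expression per variable. The table name annotation, its column names, the tissue ID CHC000T and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE annotation (TissueID TEXT, VariableID TEXT, Value TEXT)")
conn.executemany(
    "INSERT INTO annotation VALUES (?, ?, ?)",
    [
        # Placeholder rows, for illustration only.
        ("CHC358T", "TumorType", "HCA"),
        ("CHC358T", "Sex", "F"),
        ("CHC358T", "Steatosis", "TRUE"),
        ("CHC000T", "TumorType", "HCC"),
        ("CHC000T", "Sex", "M"),
    ],
)

# One CASE expression per variable: the query grows with every new variable.
PIVOT = """
SELECT TissueID,
       MAX(CASE WHEN VariableID = 'TumorType' THEN Value END) AS TumorType,
       MAX(CASE WHEN VariableID = 'Sex'       THEN Value END) AS Sex,
       MAX(CASE WHEN VariableID = 'Steatosis' THEN Value END) AS Steatosis
FROM annotation
GROUP BY TissueID
ORDER BY TissueID
"""
wide = conn.execute(PIVOT).fetchall()
```

Tissues without a value for a variable come back with NULL (None in Python) in that column, so the sparsity reappears only in the query result, not in storage.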
Metadata
• Losing information regarding variable data types is problematic
→ This information is stored in an ancillary table, the metadata table
VariableID         DataType
Sex                VARCHAR
TumorType          VARCHAR
Age                INT
Steatosis          BOOLEAN
nb_adenomas        INT
CRP                FLOAT
OralContraception  BOOLEAN
Edmonson           VARCHAR
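One way to use such a metadata table is to join it to the annotations and convert each stored text value back to its declared type. The sketch below uses Python's built-in sqlite3 module; the table names metadata and annotation, the CASTERS mapping, the typed_annotations helper and the sample values are assumptions, not the CHCDB implementation.

```python
import sqlite3

# Hypothetical mapping from declared data type to a Python converter.
CASTERS = {
    "VARCHAR": str,
    "INT": int,
    "FLOAT": float,
    "BOOLEAN": lambda text: text.upper() == "TRUE",
}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metadata (VariableID TEXT PRIMARY KEY, DataType TEXT NOT NULL);
CREATE TABLE annotation (
    TissueID   TEXT,
    VariableID TEXT REFERENCES metadata(VariableID),
    Value      TEXT
);
INSERT INTO metadata VALUES
    ('Sex', 'VARCHAR'), ('Age', 'INT'), ('Steatosis', 'BOOLEAN'), ('CRP', 'FLOAT');
-- Placeholder values, for illustration only.
INSERT INTO annotation VALUES
    ('CHC358T', 'Sex', 'F'), ('CHC358T', 'Age', '40'),
    ('CHC358T', 'Steatosis', 'TRUE'), ('CHC358T', 'CRP', '0.5');
""")


def typed_annotations(tissue_id):
    """Return a tissue's annotations with values cast to their declared type."""
    cur = conn.execute(
        """
        SELECT a.VariableID, a.Value, m.DataType
        FROM annotation AS a
        JOIN metadata AS m ON a.VariableID = m.VariableID
        WHERE a.TissueID = ?
        """,
        (tissue_id,),
    )
    return {var: CASTERS[dtype](value) for var, value, dtype in cur}


typed = typed_annotations("CHC358T")
```

The same lookup could be applied before insertion, so that a value which does not convert to its declared type is rejected by the interface.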
CHCDB
Software
• R.D.B.M.S. : MySQL
• Open source & free
• Most widely used open-source R.D.B.M.S.
→ Actively maintained
→ Lots of maintenance and development tools
• The machine hosting the R.D.B.M.S. has yet to be bought
Data
• CHCDB’s tables fall into one of three categories
• Tissue listings
• Clinical annotation data, in the E.A.V. structure
• Extraction data
[Figure: database structure]
A web interface
Peculiarities
• Installed on the server hosting the R.D.B.M.S.
• Can be reached from any machine on the CEPH network
Features
• Consultation and modification of clinical annotations for a given tissue
• Listing of tissues and their annotations
• Listing of tissue extractions
• Management (add/modify/delete) of annotation variables
• Batch import of annotations
• Batch import of extraction data
[Screenshot: consultation & modification of the annotations of a given tissue]
[Screenshot: listing of tissues and annotations]
[Screenshot: listing of tissue extractions]
[Screenshot: annotation variables management]
Missing features in CHCDBWEB
• The tissue management interface is not yet complete
• The batch import interface for annotations and extractions is missing
CHCDB
• Defining a starting set of variables
• Importing existing data into CHCDB
Material
• Acquiring and configuring the machine which will host the database and
the web server
CHCDB and CHCDBWEB should enter the production phase in June 2011.
Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 34 / 35

Weitere ähnliche Inhalte

Andere mochten auch (8)

Programacionlemo
ProgramacionlemoProgramacionlemo
Programacionlemo
 
Concept presentation - Pondy's Adventure
Concept presentation - Pondy's AdventureConcept presentation - Pondy's Adventure
Concept presentation - Pondy's Adventure
 
Researching your biography
Researching your biographyResearching your biography
Researching your biography
 
English
EnglishEnglish
English
 
English
EnglishEnglish
English
 
Long trem disability lawyer
Long trem disability lawyerLong trem disability lawyer
Long trem disability lawyer
 
Just Consult
Just ConsultJust Consult
Just Consult
 
Introducing The Open Business Program
Introducing The Open Business ProgramIntroducing The Open Business Program
Introducing The Open Business Program
 

Ähnlich wie Clinical data eav

The Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to TerminologyThe Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to Terminology
Snow Owl
 
Study start up activities in clinical data management
Study start up activities in clinical data managementStudy start up activities in clinical data management
Study start up activities in clinical data management
soumyapottola
 
2016 Standardization of Laboratory Test Coding - PHI Conference
2016 Standardization of Laboratory Test Coding - PHI Conference2016 Standardization of Laboratory Test Coding - PHI Conference
2016 Standardization of Laboratory Test Coding - PHI Conference
Megan Sawchuk
 
BioAssay Express: Creating and exploiting assay metadata
BioAssay Express: Creating and exploiting assay metadataBioAssay Express: Creating and exploiting assay metadata
BioAssay Express: Creating and exploiting assay metadata
Philip Cheung
 
ICIC 2014 From SureChem to SureChEMBL
ICIC 2014 From SureChem to SureChEMBLICIC 2014 From SureChem to SureChEMBL
ICIC 2014 From SureChem to SureChEMBL
Dr. Haxel Consult
 

Ähnlich wie Clinical data eav (20)

Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...
Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...
Optimizing SPARQL Query Processing On Dynamic and Static Data Based on Query ...
 
The Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to TerminologyThe Logical Model Designer - Binding Information Models to Terminology
The Logical Model Designer - Binding Information Models to Terminology
 
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
Data Warehousing in Pharma: How to Find Bad Data while Meeting Regulatory Req...
 
EFFICIENT EXECUTION METHODS OF PIVOTING FOR BULK EXTRACTION OF ENTITY-ATTRIBU...
EFFICIENT EXECUTION METHODS OF PIVOTING FOR BULK EXTRACTION OF ENTITY-ATTRIBU...EFFICIENT EXECUTION METHODS OF PIVOTING FOR BULK EXTRACTION OF ENTITY-ATTRIBU...
EFFICIENT EXECUTION METHODS OF PIVOTING FOR BULK EXTRACTION OF ENTITY-ATTRIBU...
 
Qa what is_clinical_data_management
Qa what is_clinical_data_managementQa what is_clinical_data_management
Qa what is_clinical_data_management
 
Clinical data management
Clinical data management Clinical data management
Clinical data management
 
Clinical Data Management
Clinical Data ManagementClinical Data Management
Clinical Data Management
 
Study start up activities in clinical data management
Study start up activities in clinical data managementStudy start up activities in clinical data management
Study start up activities in clinical data management
 
Clinical data-management-overview
Clinical data-management-overviewClinical data-management-overview
Clinical data-management-overview
 
Production Readiness Strategies in an Automated World
Production Readiness Strategies in an Automated WorldProduction Readiness Strategies in an Automated World
Production Readiness Strategies in an Automated World
 
2016 Standardization of Laboratory Test Coding - PHI Conference
2016 Standardization of Laboratory Test Coding - PHI Conference2016 Standardization of Laboratory Test Coding - PHI Conference
2016 Standardization of Laboratory Test Coding - PHI Conference
 
BioAssay Express: Creating and exploiting assay metadata
BioAssay Express: Creating and exploiting assay metadataBioAssay Express: Creating and exploiting assay metadata
BioAssay Express: Creating and exploiting assay metadata
 
Designing and launching the Clinical Reference Library
Designing and launching the Clinical Reference LibraryDesigning and launching the Clinical Reference Library
Designing and launching the Clinical Reference Library
 
ICIC 2014 From SureChem to SureChEMBL
ICIC 2014 From SureChem to SureChEMBLICIC 2014 From SureChem to SureChEMBL
ICIC 2014 From SureChem to SureChEMBL
 
Using Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDESUsing Semantic Technology to Drive Agile Analytics - SLIDES
Using Semantic Technology to Drive Agile Analytics - SLIDES
 
10th Annual Utah's Health Services Research Conference - Data Quality in Mult...
10th Annual Utah's Health Services Research Conference - Data Quality in Mult...10th Annual Utah's Health Services Research Conference - Data Quality in Mult...
10th Annual Utah's Health Services Research Conference - Data Quality in Mult...
 
EUGM 2014 - Serge P. Parel (Exquiron): Farewell, PipelinePilot : Migrating th...
EUGM 2014 - Serge P. Parel (Exquiron): Farewell, PipelinePilot : Migrating th...EUGM 2014 - Serge P. Parel (Exquiron): Farewell, PipelinePilot : Migrating th...
EUGM 2014 - Serge P. Parel (Exquiron): Farewell, PipelinePilot : Migrating th...
 
Clinical Healthcare Data Analytics
Clinical Healthcare Data AnalyticsClinical Healthcare Data Analytics
Clinical Healthcare Data Analytics
 
Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)Metabolomic Data Analysis Workshop and Tutorials (2014)
Metabolomic Data Analysis Workshop and Tutorials (2014)
 
Enhancing Reproducibility and Transparency in Clinical Research through Seman...
Enhancing Reproducibility and Transparency in Clinical Research through Seman...Enhancing Reproducibility and Transparency in Clinical Research through Seman...
Enhancing Reproducibility and Transparency in Clinical Research through Seman...
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 

Kürzlich hochgeladen (20)

Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 

Clinical data eav

  • 1. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions CHCDB & CHCDBWEB Clinical Annotation Database and web interface Thomas Burguiere INSERM Unit´e 674 May 5th, 2011 Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 1 / 35
  • 2. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions 1 Introduction 2 Relational Database Principle Benefits Interface 3 Clinical Annotation data Specificities E.A.V. 4 CHCDB 5 CHCDBWEB 6 Conclusions Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 2 / 35
  • 3. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions • 1500 liver tumor samples • Malignant (HCC) and benign (HCA) tumors • Normal Tissue Existing Data • Clinical Annotations of malignant tumors (4D) • Excel files which contains : • Clinical Annotations of malignant & benign tumors • Other annotations (mutations, clinical studies, etc.) • Tissue extractions listings (concentrations / quantities) Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 4 / 35
  • 4. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Existing Data • Clinical Annotations of malignant tumors (4D) • Excel files which contains : • Clinical Annotations of malignant & benign tumors • Other annotations (mutations, clinical studies, etc.) • Tissue extractions listings (concentrations / quantities) Problems • Clinical Annotations of malignant tumours can only be accessed on single machine • Redundant data among di↵erent files • Duplicated files on di↵erent machines ,! Discrepancies between di↵erent files ,! Cross-checking data between the di↵erent data source is cumbersome Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 5 / 35
  • 5. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions 1 Introduction 2 Relational Database Principle Benefits Interface 3 Clinical Annotation data Specificities E.A.V. 4 CHCDB 5 CHCDBWEB 6 Conclusions Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 6 / 35
  • 6. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Principle Relational Database : Software • Relational Database Management System : software which contains and organizes data (OracleTM , MySQL, DB2TM , SQL ServerTM , etc.) • Client Server architecture : • Server software, which manages data, installed on a single machine • Client software, which queries the server, installed on any machine used to consult the database Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 7 / 35
  • 7. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Principle Client Server architecture Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 8 / 35
  • 8. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Principle Relational Database : Data • The data is stored in a set of tables • One can define a set of constraints regarding the data contained in the tables • The tables can be associated to one another by logical links : integrity constraints Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 9 / 35
  • 9. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Principle Relational Database : Example !"##$%&' !$()*!+,% -%. /0% -1%21)#"# 34526%3)(2# 789 !"!#$%& "!' ( )# &*+, - ./. !"!)01& "!! ( 2- !"!)#.& "!! 3 $. Classical table (e.g. Excel sheet) Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 10 / 35
  • 10. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Principle Relational Database : Example 1 Breaking down data 2 Typing constraints 3 Unicity constraints 4 Integrity constraints !"##$%&' !$()*!+,% -%. /0% !"!#$%& "!' ( )# !"!)01& "!! ( 2- !"!)#.& "!! 3 $. !"##$%&' -1%21)#"# 34526%3)(2# 789 !"!#$%& &*+, - ./. !"!%-)& ('=E, ) ./. &4567- &45670 8'*!"'* 9:& 9:&;<<=,': (=<'&8'*!"'* Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 11 / 35
  • 11. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Benefits Relational Database : Benefits • Data centralisation on the server side • Constraints allow, in some instances, to avoid data inconsistencies ,! Consistent data • E cient : tables containing millions of rows can be easily manipulated • Querying a correctly structured database allows one to cross-check data very rapidly* Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 12 / 35
  • 12. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Interface A database requires graphical interface • Data manipulation in a database is done exclusively with queries, written in SQL (Structured Query Language) • !"##$%&' !$()*!+,% -%. /0% -1%21)#"# 34526%3)(2# 789 !"!#$%& "!' ( )# &*+, - ./. !"!)01& "!! ( 2- !"!)#.& "!! 3 $. • SELECT t a b l e 1 . t i s s u e I D , t a b l e 1 . TumorType , t a b l e 1 . Sex , t a b l e 1 . Age , t a b l e 2 . S t e a t o s i s , t a b l e 2 . nb adenomas , t a b l e 2 .CRP FROM t a b l e 1 INNER JOIN t a b l e 2 ON ( t a b l e 1 . t i s s u e I D = t a b l e 2 . t i s s u e I D ) WHERE t a b l e 1 . TissueID = ’CHC358T ’ ; • Powerful language, albeit counterintuitive ,! A graphical interface must be associated to the database Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 13 / 35
  • 13. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Interface Graphical interface : principle Mecanism 1 The interface receives instruction from the user, and transform them into SQL queries sent to the server 2 The server receives the SQL queries, and sends back results 3 The interface receives the results from the server, and displays the results to the user Interface types • 2 types of interface : desktop program or web interface • In our case, we decided to develop a web interface Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 14 / 35
  • 14. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Interface Web interface • Software installed on a single machine : the web server • Accessing the interface only requires a web browser ,! Avoids installation and maintenance issues on the client machines ,! Avoids OS compatibility issues (Mac, Windows, Linux, etc. . .) Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 15 / 35
  • 15. Introduction Relational Database Clinical Annotation data CHCDB CHCDBWEB Conclusions Interface Web client / server architecture Thomas Burguiere (INSERM Unit´e 674) CHCDB & CHCDBWEB May 5th, 2011 16 / 35
  • 17. Specificities: Specific features of clinical annotation data
      Specificities
      • New variables are frequently added
      • Data regarding the same variable can be entered differently, depending on sample provenance and type (malignant or benign tumor)
      Consequences in a database
      • Frequent addition of new columns or sub-tables
      • Tables contain a lot of columns, with sparsely filled rows
      ↪ Constant maintenance of the database
      Clinical annotation data must be stored in a specific database structure: the E.A.V. structure
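The two consequences named on this slide, schema changes and sparsely filled rows, can be reproduced in a few lines. The sketch below is illustrative only: the `annotations` table, the tissue IDs `T1` to `T3`, and all values are hypothetical, and `sqlite3` stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE annotations (TissueID TEXT PRIMARY KEY, Sex TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO annotations VALUES (?, ?, ?)",
    [("T1", "F", 40), ("T2", "M", 60), ("T3", "F", 50)],
)

# A new variable forces a structural change to the table...
conn.execute("ALTER TABLE annotations ADD COLUMN nb_adenomas INTEGER")
# ...and, if it only applies to some tissues, most rows stay empty (NULL).
conn.execute("UPDATE annotations SET nb_adenomas = 2 WHERE TissueID = 'T1'")

total = conn.execute("SELECT COUNT(*) FROM annotations").fetchone()[0]
# COUNT(column) skips NULLs, so it counts only the rows that were filled in.
filled = conn.execute("SELECT COUNT(nb_adenomas) FROM annotations").fetchone()[0]
print(f"{filled} of {total} rows have a value for nb_adenomas")
```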
  • 18. E.A.V.: Principle
      • E.A.V. = Entity Attribute Value
      • An E.A.V. is a subset of tables in a relational database, with a specific organization
      • This data organization is particularly suitable for clinical annotation data
      • In the E.A.V., all clinical annotation data is stored in one three-column table:
          • Entity: contains the identifier of the entity for which an annotation is stored (in our case, an entity is a tissue)
          • Attribute: contains the identifier (e.g. the name) of the annotation variable
          • Value: contains the value of the annotation, for a given entity
  • 19. E.A.V.: Example (? marks a digit that is illegible in the source)
      TissueID  TumorType  Sex  Age
      CHC358T   HCA        F    ?3
      CHC???T   HCC        F    ??
      CHC?3?T   HCC        M    5?

      TissueID  Steatosis  nb_adenomas  CRP
      CHC358T   TRUE       ?            ?.?
  • 20. E.A.V.: Example (continued). The same two tables as on the previous slide, each annotated with the labels Entities, Attributes, and Values
  • 21. E.A.V.: Example (continued). The same data in a single E.A.V. table (? marks a digit that is illegible in the source)
      TissueID  VariableID   Value
      CHC358T   Sex          F
      CHC358T   TumorType    HCA
      CHC358T   Age          ?3
      CHC358T   Steatosis    TRUE
      CHC358T   nb_adenomas  ?
      CHC358T   CRP          ?.?
      CHC???T   Sex          F
      CHC???T   Age          ??
      CHC???T   TumorType    HCC
      CHC?3?T   Sex          M
      CHC?3?T   Age          5?
      CHC?3?T   TumorType    HCC
      Column roles: Entities (TissueID), Attributes (VariableID), Values (Value)
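The move from one-row-per-tissue tables to an E.A.V. table can be sketched as a small conversion function. This is an illustration under stated assumptions: the function name `to_eav`, the tissue ID `CHC001T`, and all ages are hypothetical. Values are stored as strings, and empty cells simply produce no row.

```python
def to_eav(rows, id_column="TissueID"):
    """Turn wide rows (one dict per tissue) into (entity, attribute, value) triples."""
    triples = []
    for row in rows:
        entity = row[id_column]
        for attribute, value in row.items():
            # The identifier is the entity itself; missing values produce no row.
            if attribute == id_column or value is None:
                continue
            triples.append((entity, attribute, str(value)))
    return triples

# Illustrative data: only the ID 'CHC358T' comes from the slides.
wide_rows = [
    {"TissueID": "CHC358T", "Sex": "F", "TumorType": "HCA", "Age": 40, "Steatosis": True},
    {"TissueID": "CHC001T", "Sex": "M", "TumorType": "HCC", "Age": 60, "Steatosis": None},
]
eav = to_eav(wide_rows)
for triple in eav:
    print(triple)
```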
  • 22. E.A.V.: Pros & Cons
      Pros
      • New annotation = new row in the table
      • No structural modifications
      • No sparsely filled tables
      Cons
      • A lot of rows in the table
      • Very complex queries
      • Columns are no longer typed
      Example table (? marks a digit that is illegible in the source); the Value column is typed VARCHAR:
      TissueID  VariableID         Value
      CHC358T   Sex                F
      CHC358T   TumorType          HCA
      CHC358T   Age                ?3
      CHC358T   Steatosis          TRUE
      CHC358T   nb_adenomas        ?
      CHC358T   CRP                ?.?
      CHC358T   OralContraception  TRUE
      CHC???T   Sex                F
      CHC???T   Age                ??
      CHC???T   TumorType          HCC
      CHC?3?T   Sex                M
      CHC?3?T   Age                5?
      CHC?3?T   TumorType          HCC
      CHC?3?T   Edmonson           III
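Both sides of this trade-off can be shown on a toy E.A.V. table. The sketch below is illustrative: tissue IDs `T1` and `T2` and their values are hypothetical, `sqlite3` stands in for MySQL, and the pivot uses conditional aggregation, which is one common way to rebuild wide rows and not necessarily what CHCDB does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (TissueID TEXT, VariableID TEXT, Value TEXT)")
conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", [
    ("T1", "Sex", "F"), ("T1", "Age", "40"),
    ("T2", "Sex", "M"), ("T2", "Age", "60"),
])

# Pro: a brand-new variable is just one more row, with no ALTER TABLE.
conn.execute("INSERT INTO eav VALUES ('T1', 'Steatosis', 'TRUE')")

# Con: rebuilding one row per tissue needs one aggregate expression per variable.
pivot = """
    SELECT TissueID,
           MAX(CASE WHEN VariableID = 'Sex' THEN Value END) AS Sex,
           MAX(CASE WHEN VariableID = 'Age' THEN Value END) AS Age,
           MAX(CASE WHEN VariableID = 'Steatosis' THEN Value END) AS Steatosis
    FROM eav
    GROUP BY TissueID
    ORDER BY TissueID
"""
rows = conn.execute(pivot).fetchall()
print(rows)
```

Note that `Age` comes back as the string `'40'`, not a number: this is the "columns are no longer typed" drawback in practice.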
  • 23. E.A.V.: Metadata
      • Losing information regarding variable data types is problematic
      ↪ This information is stored in an ancillary table, the metadata table
      The E.A.V. table is shown again, unchanged from the previous slide, next to the metadata table:
      VariableID         DataType
      Sex                VARCHAR
      TumorType          VARCHAR
      Age                INT
      Steatosis          BOOLEAN
      nb_adenomas        INT
      CRP                FLOAT
      OralContraception  BOOLEAN
      Edmonson           VARCHAR
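One way a metadata table can restore typing is to look up each variable's declared type and convert the stored string on the way out. The sketch below is only an illustration: the `CASTERS` mapping, the `typed_value` helper, and the sample values are assumptions, and the variable-to-type dictionary is a small illustrative subset.

```python
# Converters for each declared data type; stored values are always strings.
CASTERS = {
    "VARCHAR": str,
    "INT": int,
    "FLOAT": float,
    "BOOLEAN": lambda s: s.strip().upper() == "TRUE",
}

# Illustrative metadata: variable name -> declared data type.
metadata = {
    "Sex": "VARCHAR",
    "Age": "INT",
    "Steatosis": "BOOLEAN",
    "CRP": "FLOAT",
}

def typed_value(variable_id, raw):
    """Convert a raw E.A.V. string using the type declared in the metadata."""
    return CASTERS[metadata[variable_id]](raw)

print(typed_value("Age", "40"))          # an int
print(typed_value("Steatosis", "TRUE"))  # a bool
print(typed_value("CRP", "1.5"))         # a float
```

The same lookup could be used in the other direction, to reject a value that does not parse as the declared type before it is stored.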
  • 25. CHCDB
      Software
      • R.D.B.M.S.: MySQL
          • Open source & free
          • Most widely used open-source R.D.B.M.S.
            ↪ Actively maintained
            ↪ Lots of maintenance and development tools
      • The machine hosting the R.D.B.M.S. has yet to be bought
      Data
      • CHCDB's tables fall into one of three categories:
          • Tissue listings
          • Clinical annotation data, in the E.A.V. structure
          • Extraction data
  • 28. CHCDBWEB: A web interface
      Peculiarities
      • Installed on the server hosting the R.D.B.M.S.
      • Can be reached from any machine on the CEPH network
      Features
      • Consultation and modification of clinical annotations for a given tissue
      • Listing of tissues and their annotations
      • Listing of tissue extractions
      • Management (add/modify/delete) of annotation variables
      • Batch import of annotations
      • Batch import of extraction data
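A batch import of annotations into an E.A.V. table could look roughly like the sketch below. Everything here is an assumption made for illustration: the slides do not describe the import file format, so a CSV with one row per tissue is assumed, and the function name, table layout, and sample data are hypothetical.

```python
import csv
import io
import sqlite3

def import_annotations(conn, csv_text, id_column="TissueID"):
    """Load a wide CSV (one row per tissue) into the E.A.V. table; returns rows inserted."""
    reader = csv.DictReader(io.StringIO(csv_text))
    triples = [
        (row[id_column], variable, value)
        for row in reader
        for variable, value in row.items()
        # Skip the identifier column and any empty cell.
        if variable != id_column and value
    ]
    with conn:  # commit all rows together, or roll back if an insert fails
        conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", triples)
    return len(triples)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (TissueID TEXT, VariableID TEXT, Value TEXT)")

# Hypothetical input: T2 has no Age, so no row is stored for it.
csv_text = "TissueID,Sex,Age\nT1,F,40\nT2,M,\n"
n = import_annotations(conn, csv_text)
print(n, "annotations imported")
```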
  • 29. Consultation & modification of the annotations of a given tissue
  • 30. Consultation & modification of the annotations of a given tissue
  • 31. Listing of tissues and annotations
  • 32. Listing of tissue extractions
  • 36. Conclusions
      Missing features in CHCDBWEB
      • The tissue management interface is not yet complete
      • The batch import interface for annotations and extractions is missing
      CHCDB
      • Defining a starting set of variables
      • Importing existing data into CHCDB
      Hardware
      • Acquiring and configuring the machine that will host the database and the web server
      CHCDB and CHCDBWEB should enter the production phase in June 2011.