www.helmholtz-hzi.de
Combination of informative
biomarkers in small pilot studies and estimation
of sample size for extended studies
Amani Al-Mekhlafi1,2,
Frank Klawonn1,3
Figure 1. HAUCA Curve starting from 0.85 AUC value for hip
infection dataset
Aims:
• Finding the optimal combination of biomarkers to maximize the AUC
• Estimating the sample size for extended studies
1Department of Biostatistics, Helmholtz Centre for Infection Research
2PhD student Epidemiology, Braunschweig-Hannover
3Department of Computer Science, Ostfalia University of Applied Sciences
Method:
Data:
A pilot study by Omar et al. [5] with 24 patients in total (12 with chronic
periprosthetic hip infection and 12 with aseptic hip prosthesis loosening) and
50,416 biomarker candidates (hip infection dataset)
Feature Selection Approach:
The classification criterion is based on the area under the receiver operating characteristic
curve (AUC)
Calculation of p-value:
Based on the same statistic as the Wilcoxon-Mann-Whitney U test [6]
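The link between the AUC and the Wilcoxon-Mann-Whitney U statistic can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the exact null distribution assumes no tied values, and the two-sided convention is an assumption.

```python
from functools import lru_cache
from math import comb


def auc(pos, neg):
    """AUC = U / (n_pos * n_neg); a tie between a positive and a negative counts one half."""
    u = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return u / (len(pos) * len(neg))


@lru_cache(maxsize=None)
def u_counts(n_pos, n_neg):
    """Number of rank orderings giving each U in 0..n_pos*n_neg (no ties)."""
    if n_pos == 0 or n_neg == 0:
        return (1,)
    out = [0] * (n_pos * n_neg + 1)
    # Largest value belongs to a positive case: it beats all n_neg negatives.
    for u, c in enumerate(u_counts(n_pos - 1, n_neg)):
        out[u + n_neg] += c
    # Largest value belongs to a negative case: it adds nothing to U.
    for u, c in enumerate(u_counts(n_pos, n_neg - 1)):
        out[u] += c
    return tuple(out)


def two_sided_p(u, n_pos, n_neg):
    """Exact two-sided p-value for an integer U under the null hypothesis."""
    counts = u_counts(n_pos, n_neg)
    u_extreme = max(u, n_pos * n_neg - u)
    tail = sum(counts[u_extreme:])
    return min(1.0, 2 * tail / comb(n_pos + n_neg, n_pos))


# Example: 12 vs 12 patients, U = 137 of 144 possible pairs.
print(round(137 / 144, 3), two_sided_p(137, 12, 12))
```

With 12 patients per group and U = 137, the sketch gives an AUC of about 0.951 and a p-value of about 3.33e-05, which agrees with the first row of Table 1. The two-sided convention is inferred from that agreement rather than stated on the poster.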
Correction of p-value:
Holm-Bonferroni correction
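A minimal sketch of the Holm-Bonferroni step-down adjustment is given below. The function name and the `cap` parameter are hypothetical. Capping adjusted values at 1 is the usual convention; Table 1 lists corrected values above 1, which suggests the poster reports uncapped products.

```python
def holm_adjust(p_values, cap=True):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max) if cap else running_max
    return adjusted


print(holm_adjust([0.01, 0.04, 0.03]))
```

The running maximum keeps the adjusted values monotone in the ordering of the raw p-values, so a larger raw p-value never receives a smaller adjusted value.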
Background:
Biomarker candidates are defined as measurable molecules found in biological media. According to the Biomarkers Definitions Working Group (2001) [1], biomarkers cover a rather
wide range of parameters. Biomarkers have recently come into wide use in medical research, but single biomarkers may not possess the desired cause-effect association for
disease classification and outcome prediction [2]. Researchers therefore currently focus on combining biomarkers. With new technologies such as microarrays, next-generation
sequencing, and mass spectrometry, researchers can obtain biomarker candidates numbering in the tens of thousands or more [3]. To avoid wasting money and time, it
is suggested to limit the number of patients strictly. However, pilot studies usually have low statistical power, which reduces the chance of detecting a true effect [4].
Step I:
HAUCA Curve:
A method indicating how many good biomarkers a
data set contains compared to pure random effects [7]
• Count the number of biomarkers that exceed
specific values of AUC:
  - in the real dataset
  - in a random dataset
• Compute the 95% quantile of the binomial distribution
at each AUC value to obtain a confidence bound
In the hip infection data, the association between the
biomarker candidates and the disease is stronger than
expected from random effects alone, so the dataset
merits further study.
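The counting and quantile steps above can be sketched as follows. This is an illustration under stated assumptions, not the HAUCA reference implementation: AUC values are folded to at least 0.5, biomarkers are treated as independent under the null hypothesis, there are no ties, and all function names are hypothetical.

```python
from functools import lru_cache
from math import ceil, comb, exp, lgamma, log, log1p


@lru_cache(maxsize=None)
def u_counts(n_pos, n_neg):
    """Number of rank orderings giving each Mann-Whitney U (no ties)."""
    if n_pos == 0 or n_neg == 0:
        return (1,)
    out = [0] * (n_pos * n_neg + 1)
    for u, c in enumerate(u_counts(n_pos - 1, n_neg)):
        out[u + n_neg] += c
    for u, c in enumerate(u_counts(n_pos, n_neg - 1)):
        out[u] += c
    return tuple(out)


def null_tail(threshold, n_pos, n_neg):
    """P(folded AUC >= threshold) for a biomarker without any real effect."""
    u_min = max(ceil(threshold * n_pos * n_neg - 1e-9), 0)
    tail = sum(u_counts(n_pos, n_neg)[u_min:])
    return min(1.0, 2 * tail / comb(n_pos + n_neg, n_pos))


def binom_quantile(q, n, p):
    """Smallest k with P(X <= k) >= q for X ~ Binomial(n, p)."""
    if p <= 0.0:
        return 0
    if p >= 1.0:
        return n
    cdf = 0.0
    for k in range(n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log1p(-p))
        cdf += exp(log_pmf)
        if cdf >= q:
            return k
    return n


def hauca_points(aucs, n_pos, n_neg, thresholds, level=0.95):
    """Rows of (threshold, observed count, expected random count, quantile)."""
    m = len(aucs)
    folded = [max(a, 1.0 - a) for a in aucs]
    rows = []
    for t in thresholds:
        observed = sum(a >= t for a in folded)
        p_t = null_tail(t, n_pos, n_neg)
        rows.append((t, observed, m * p_t, binom_quantile(level, m, p_t)))
    return rows


# Toy data: 98 uninformative biomarkers and 2 strong ones, 12 vs 12 patients.
print(hauca_points([0.5] * 98 + [0.96, 0.97], 12, 12, [0.85, 0.9, 0.95]))
```

An observed count above the binomial quantile at a given threshold indicates more high-AUC biomarkers than random effects alone would be expected to produce.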
      Biomarker            AUC     p-value     Corrected p-value
1     Bio. with high AUC   0.951   3.328e-05   1.678
2     Bio. with high AUC   0.944   4.955e-05   2.498
5     Bio. with high AUC   0.931   1.028e-04   5.183
4     Bio. with high AUC   0.924   1.442e-04   7.271
6     Bio. with high AUC   0.917   2.012e-04   10.142
9     Bio. with high AUC   0.910   2.744e-04   13.834
12    Bio. with high AUC   0.903   3.713e-04   18.718
Table 1. Top biomarkers with the highest AUC values, their p-values, and corrected p-values
Step III:
Estimate the Sample size:
• Specify the AUC value to be validated
• Specify the prevalence of the positive cases
• Specify the number of hypothesis tests
The sample size n and the number of positive cases n+ are increased gradually
until the target AUC value yields a significant corrected p-value.
In the hip infection data, a sample size of 60 is needed
to validate the 0.85 AUC value.
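The search over n and n+ can be sketched as below. This is a rough illustration only: the exact two-sided test, the Bonferroni-style multiplication by the number of tests, the significance level of 0.05, and all function names are assumptions. The poster does not state the number of tests behind its result of 60, so the sketch is not expected to reproduce that figure.

```python
from functools import lru_cache
from math import comb, floor


@lru_cache(maxsize=None)
def u_counts(n_pos, n_neg):
    """Number of rank orderings giving each Mann-Whitney U (no ties)."""
    if n_pos == 0 or n_neg == 0:
        return (1,)
    out = [0] * (n_pos * n_neg + 1)
    for u, c in enumerate(u_counts(n_pos - 1, n_neg)):
        out[u + n_neg] += c
    for u, c in enumerate(u_counts(n_pos, n_neg - 1)):
        out[u] += c
    return tuple(out)


def two_sided_p(u, n_pos, n_neg):
    """Exact two-sided p-value for an integer U under the null hypothesis."""
    u_extreme = max(u, n_pos * n_neg - u)
    tail = sum(u_counts(n_pos, n_neg)[u_extreme:])
    return min(1.0, 2 * tail / comb(n_pos + n_neg, n_pos))


def required_sample_size(target_auc, prevalence, n_tests, alpha=0.05, n_max=200):
    """Smallest (n, n_pos) at which target_auc is significant after correction.

    Returns None if no n up to n_max is sufficient.
    """
    for n in range(2, n_max + 1):
        n_pos = round(n * prevalence)
        n_neg = n - n_pos
        if n_pos < 1 or n_neg < 1:
            continue
        # Largest U whose AUC does not exceed the target (conservative).
        u = floor(target_auc * n_pos * n_neg + 1e-9)
        corrected = min(1.0, two_sided_p(u, n_pos, n_neg) * n_tests)
        if corrected < alpha:
            return n, n_pos
    return None


print(required_sample_size(0.85, 0.5, n_tests=1))
print(required_sample_size(0.85, 0.5, n_tests=1000))
```

Increasing the number of hypothesis tests makes the corrected threshold stricter, so the required sample size can only stay the same or grow.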
Step II:
Combination of Biomarkers:
• Select the top k features according to the AUC
• Calculate within the groups:
  - the difference in means for each feature
  - the variance-covariance matrices between the combined
    features
• Calculate the AUC of the combination of possibly correlated
biomarkers according to Demler et al. [8]
• Estimate the lower confidence bound for this combination by
bootstrapping at different levels (0.025, 0.05, 0.1)
In the same dataset, when 10 biomarkers are combined, the AUC value
comes close to 1 (0.993) and the lower confidence bounds at all
three levels are at least 0.95.
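The effect of correlation on the combined AUC can be illustrated with the closed form for a linear combination of normally distributed features with a common covariance matrix, AUC = Φ(sqrt(Δμᵀ Σ⁻¹ Δμ / 2)). The sketch assumes this is the form referred to via Demler et al.; the pooled covariance, the function names, and the toy numbers are all assumptions. The bootstrap step is not covered.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def combined_auc(delta_mu, cov):
    """AUC of the best linear combination under the normality assumption."""
    w = solve(cov, delta_mu)
    mahalanobis_sq = sum(d * wi for d, wi in zip(delta_mu, w))
    return normal_cdf(sqrt(mahalanobis_sq / 2.0))


# Two features with unit variance and a mean difference of 1 each.
print(combined_auc([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]]))  # independent
print(combined_auc([1.0, 1.0], [[1.0, 0.5], [0.5, 1.0]]))  # correlation 0.5
```

In this toy case the positively correlated pair yields a lower combined AUC than the independent pair, which shows why assuming independence can overstate the performance of a combination.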
No. of combined biomarkers   AUC value
1                            0.906186
2                            0.952129
3                            0.935715
4                            0.955792
5                            0.944372
6                            0.958933
7                            0.965342
8                            0.978783
9                            0.986358
10                           0.993131
Figure 2. Curve of AUCs of the combination of the
top 20 biomarkers
Figure 3. Sample size to validate each AUC value
Table 2. Top 10 combined AUC values
The LEGaTO project has received funding from the European Union’s Horizon 2020 research and innovation programme under the grant agreement No 780681.
References
1. Biomarkers Definitions Working Group. Biomarkers and surrogate endpoints: Preferred definitions and conceptual framework. Clin. Pharmacol. Ther., 2001; 69: 89–95.
2. Yan, L., Tian, L., and Liu, S. Combining large number of weak biomarkers based on AUC. Stat Med, 2015; 34(29): 3811–3830.
3. Soon, W.W., Hariharan, M., and Snyder, M.P. High-throughput sequencing for biology and medicine. Molecular Systems Biology, 2013; 9: 640.
4. Button, K.S., Ioannidis, J.P., Mokrysz, C., Nosek, B.A., Flint, J., Robinson, E.S., and Munafo, M.R. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci, 2013; 14(5): 365–376.
5. Omar, M., Klawonn, F., Brand, S., Stiesch, M., Krettek, C., and Eberhard, J. Transcriptome wide high-density microarray analysis reveals differential gene transcription in periprosthetic tissue from hips with low-grade infection versus aseptic loosening. Journal of Arthroplasty, 2017; 32: 234–240, 2016.
6. Mason, S.J. and Graham, N.E. Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: Statistical significance and interpretation. Quarterly Journal of the Royal Meteorological Society, 2002; 128(584): 2145–2166.
7. Klawonn, F., Wang, J., Koch, I., Eberhard, J., and Omar, M. HAUCA curves for the evaluation of biomarker pilot studies with small sample sizes and large numbers of features. Advances in Intelligent Data Analysis XV, 2016; 356–367.
8. Demler, O., Pencina, M., and D’Agostino, R.S. Impact of correlation on predictive ability of biomarkers. Statistics in Medicine, 2013; 32: 4196–421
Conclusion:
• The AUC was used as the performance measure not only because it is well established but also because it yields closed-form solutions for the required calculations and therefore
allows fast computation. However, other measures such as entropy, misclassification rate, or mutual information might be good alternatives to the AUC.
• The correlation between biomarkers may influence the performance of their combination. It has therefore been taken into consideration, and the variance-covariance matrices
have been calculated to avoid the overoptimistic performance estimates that result when independence is assumed.
• In order to statistically validate biomarker candidates from pilot studies, it is necessary to estimate the required larger sample size.
Result:


Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 

Combination of informative biomarkers in small pilot studies and estimation of sample size for extended studies

www.helmholtz-hzi.de

Amani Al-Mekhlafi¹,², Frank Klawonn¹,³
¹Department of Biostatistics, Helmholtz Centre for Infection Research
²PhD student Epidemiology, Braunschweig-Hannover
³Department of Computer Science, Ostfalia University of Applied Sciences

Background:
Biomarker candidates are defined as measurable molecules found in biological media. According to the Biomarkers Definitions Working Group, 2001 [1], biomarkers cover a rather wide range of parameters. Biomarkers are now widely used in medical research, but a single biomarker may not possess the desired cause-effect association for disease classification and outcome prediction [2]. Researchers therefore currently focus on combining biomarkers. With new technologies such as microarrays, next-generation sequencing and mass spectrometry, the number of biomarker candidates can exceed tens of thousands [3]. To avoid wasting money and time, it is advisable to strictly limit the number of patients. However, pilot studies usually have low statistical power, which reduces the chance of detecting a true effect [4].

Aims:
• Finding the optimal combination of biomarkers to maximize the AUC
• Estimating the sample size for extended studies

Figure 1. HAUCA curve starting from an AUC value of 0.85 for the hip infection dataset

Method:
Data:
A pilot study by Omar et al. [5] with a total of 24 patients (12 patients with chronic periprosthetic hip infection and 12 patients with aseptic hip prosthesis loosening) and 50,416 biomarker candidates (hip infection dataset)
Feature Selection Approach:
The classification criterion is based on the area under the receiver operating characteristic curve (AUC)
Calculation of p-value:
Based on the same statistic that is used for the Wilcoxon-Mann-Whitney U test [6]
Correction of p-value:
Holm-Bonferroni correction
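The Method items above (AUC as the selection criterion, a p-value from the Wilcoxon-Mann-Whitney statistic, and Holm-Bonferroni correction) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`auc`, `u_null_counts`, `exact_p_value`, `holm`) and the `n_tests` parameter are hypothetical, the p-value is the exact two-sided one without ties, and the Holm adjustment is capped at 1. The corrected values in Table 1 of the poster appear to be uncapped products (for example 3.328e-05 × 50,416 ≈ 1.678).

```python
from math import comb


def auc(pos, neg):
    """AUC = U / (n_pos * n_neg); a tie between groups counts one half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def u_null_counts(n_pos, n_neg):
    """counts[u] = number of group assignments giving U statistic u (no ties)."""
    total = n_pos + n_neg
    max_sum = sum(range(n_neg + 1, total + 1))  # largest rank sum of positives
    # ways[k][s] = number of ways to pick k ranks that sum to s
    ways = [[0] * (max_sum + 1) for _ in range(n_pos + 1)]
    ways[0][0] = 1
    for rank in range(1, total + 1):
        for k in range(min(rank, n_pos), 0, -1):
            for s in range(max_sum, rank - 1, -1):
                ways[k][s] += ways[k - 1][s - rank]
    offset = n_pos * (n_pos + 1) // 2  # U = rank sum - offset
    return ways[n_pos][offset:]


def exact_p_value(u, n_pos, n_neg):
    """Exact two-sided p-value of an integer U statistic under the null."""
    counts = u_null_counts(n_pos, n_neg)
    center = n_pos * n_neg / 2
    extreme = sum(c for v, c in enumerate(counts)
                  if abs(v - center) >= abs(u - center))
    return extreme / comb(n_pos + n_neg, n_pos)


def holm(p_values, n_tests=None):
    """Holm-Bonferroni adjusted p-values, capped at 1.

    n_tests (hypothetical parameter) is the total number of hypotheses when
    only the smallest p-values of a larger family are passed in.
    """
    m = n_tests if n_tests is not None else len(p_values)
    order = sorted(range(len(p_values)), key=p_values.__getitem__)
    adjusted = [0.0] * len(p_values)
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


# With 12 vs 12 patients, U = 137 corresponds to AUC = 137/144 ≈ 0.951.
p = exact_p_value(137, 12, 12)
print(f"{p:.3e}")  # prints 3.328e-05
```

With 12 patients per group, the exact p-value for U = 137 reproduces the 3.328e-05 listed for the top biomarker, which suggests (but does not prove) that the poster uses an exact two-sided test.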
Step I: HAUCA Curve:
A method indicating how many good biomarkers a data set contains compared to pure random effects [7]
• Count the number of biomarkers that exceed specific AUC values:
  - in the real dataset
  - in a random dataset
• Compute the 95% quantile of the binomial distribution for each AUC value to obtain a confidence interval

Step II: Combination of Biomarkers:
• Select the top k features according to the AUC
• Calculate within the groups:
  - the difference in means for each feature
  - the variance-covariance matrices between combined features
• Calculate the AUC of the combination of possibly correlated biomarkers according to Demler et al. [8]
• Measure the lower confidence bound for this combination by bootstrapping with different levels (0.025, 0.05, 0.1)

Step III: Estimate the Sample Size:
• Specify the AUC value to be validated
• Specify the prevalence of the positive cases
• Specify the number of hypothesis tests
n and n+ are increased gradually until the desired AUC value achieves a significant corrected p-value

Result:
In the hip infection data, there is more than random association between the biomarker candidates and the disease, so the study is worth extending.

      Biomarker            AUC     p-value      Corrected p-value
 1    Bio. with high AUC   0.951   3.328e-05     1.678
 2    Bio. with high AUC   0.944   4.955e-05     2.498
 5    Bio. with high AUC   0.931   1.028e-04     5.183
 4    Bio. with high AUC   0.924   1.442e-04     7.271
 6    Bio. with high AUC   0.917   2.012e-04    10.142
 9    Bio. with high AUC   0.910   2.744e-04    13.834
12    Bio. with high AUC   0.903   3.713e-04    18.718
Table 1. Top biomarkers with the highest AUC values, their p-values, and corrected p-values

In the same dataset, when 10 biomarkers are combined, the AUC value comes close to 1 and the lower confidence bounds at all levels are not less than 0.95.

No. of combined biomarkers   AUC value
 1                           0.906186
 2                           0.952129
 3                           0.935715
 4                           0.955792
 5                           0.944372
 6                           0.958933
 7                           0.965342
 8                           0.978783
 9                           0.986358
10                           0.993131
Table 2. Top 10 combined AUC values

Figure 2. Curve of AUCs of the combination of the top 20 biomarkers

In the hip infection data, a sample size of 60 is needed to validate the 0.85 AUC value.

Figure 3. Sample size to validate each AUC value

The LEGaTO project has received funding from the European Union's Horizon 2020 research and innovation programme under the grant agreement No 780681.

References
1. Biomarkers Definitions Working Group. Biomarkers and surrogate endpoints: Preferred definitions and conceptual framework. Clin. Pharmacol. Ther., 2001; 69: 89-95.
2. Yan, L., Tian, L., and Liu, S. Combining large number of weak biomarkers based on AUC. Stat Med, 2015; 34(29): 3811-3830.
3. Soon, W.W., Hariharan, M., and Snyder, M.P. High-throughput sequencing for biology and medicine. Molecular Systems Biology, 2013; 9: 640.
4. Button, K.S., Ioannidis, J.P., Mokrysz, C., Nosek, B.A., Flint, J., Robinson, E.S., and Munafo, M.R. Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci, 2013; 14(5): 365-376.
5. Omar, M., Klawonn, F., Brand, S., Stiesch, M., Krettek, C., and Eberhard, J. Transcriptome wide high-density microarray analysis reveals differential gene transcription in periprosthetic tissue from hips with low-grade infection versus aseptic loosening. Journal of Arthroplasty, 2017; 32: 234-240.
6. Mason, S.J., and Graham, N.E. Areas beneath the relative operating characteristics (ROC) and relative operating levels (ROL) curves: Statistical significance and interpretation. Quarterly Journal of the Royal Meteorological Society, 2002; 128(584): 2145-2166.
7. Klawonn, F., Wang, J., Koch, I., Eberhard, J., and Omar, M. HAUCA curves for the evaluation of biomarker pilot studies with small sample sizes and large numbers of features. Advances in Intelligent Data Analysis XV, 2016; 356-367.
8. Demler, O., Pencina, M., and D'Agostino, R.S. Impact of correlation on predictive ability of biomarkers. Statistics in Medicine, 2013; 32: 4196-421

Conclusion:
• The AUC has been used as the performance measure not only because it is well established but also because it yields closed-form solutions for the required calculations and therefore fast computation. However, other measures such as entropy, misclassification rate or mutual information might be very good alternatives to the AUC.
• The correlation between biomarkers may influence the performance of their combination. It has therefore been taken into account, and the variance-covariance matrices have been calculated to reduce the overoptimistic performance estimate obtained when independence is assumed.
• To statistically validate biomarker candidates from pilot studies, it is necessary to estimate the required larger sample size.
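To illustrate why the correlation between biomarkers matters for the combined AUC, here is a minimal sketch of a closed-form calculation. It assumes multivariate normal biomarkers with a common covariance matrix Σ in both groups, in which case the optimal linear combination has AUC = Φ(√(Δᵀ Σ⁻¹ Δ / 2)), where Δ is the vector of mean differences. This is the standard result for linear discriminant analysis; whether it is exactly the formula applied on the poster (following Demler et al.) is an assumption. The names `solve` and `combined_auc` are hypothetical helpers.

```python
from math import erf, sqrt


def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def combined_auc(delta, sigma):
    """AUC of the optimal linear combination under the normal model above.

    delta: mean differences between the groups, one entry per biomarker
    sigma: common covariance matrix of the biomarkers (list of rows)
    """
    weights = solve(sigma, delta)                      # Sigma^{-1} * delta
    m2 = sum(d * w for d, w in zip(delta, weights))    # squared Mahalanobis distance
    return 0.5 * (1.0 + erf(sqrt(m2) / 2.0))           # Phi(sqrt(m2 / 2))


# Two biomarkers with unit variance and a mean difference of 1 each.
independent = combined_auc([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
correlated = combined_auc([1.0, 1.0], [[1.0, 0.5], [0.5, 1.0]])
print(independent, correlated)
```

In this toy example the positively correlated pair yields a lower combined AUC than the independent pair, which is the overoptimism the conclusion refers to when independence is wrongly assumed.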