First Steps Towards a Risk of Bias Corpus
of Randomized Controlled Trials
Presenter – Anjani Dhrangadhariya
MIE2023 - Göteborg, Sweden, 23.05.23
Authors: Anjani Dhrangadhariya, Roger Hilfiker, Martin Sattelmayer, Katia
Giacomino, Rahel Caliesch, Simone Elsig, Nona Naderi, Henning Müller
Randomized Controlled Trial
• In theory, an RCT accurately measures intervention effects on patient
outcomes, but in practice, biases can enter at several stages:
• Design/Planning
• Execution
• Analysis
• Outcomes reporting
• Systematic Reviews
• Utility
• Medical professionals
• Health policies
• Surgeons
Risk of Bias (RoB)
• Risk of bias refers to systematic errors in the design, conduct, or
reporting of a study that can cause the measured effect to deviate
from the true effect.
• RoB assessment guidelines
Example RoB assessment guidelines                                   Year
Physiotherapy Evidence Database (PEDro)                             1999
Risk of Bias Assessment Tool for Nonrandomized Studies (RoBANS)     2004
Cochrane Risk of Bias assessment guidelines                         2008
Risk of Bias in Non-randomized Studies of Interventions (ROBINS-I)  2016
Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2)        2017
Newcastle-Ottawa Scale (NOS)                                        2018
Revised Cochrane Risk of Bias for RCTs 2.0 tool (RoB 2)             2019
RoB information extraction
• Thorough assessment
• Manual assessment
• Time-consuming
• Cognitively demanding
• Two experts for manual assessment
• Third, for conflict resolution
• Automation imperative
Related Work
• RoB labelled corpus
• Wang et al. 2022
• Preclinical animal
studies
• Human RCTs
• RobotReviewer
• PDF highlights
• Freely-available
• Closed-access data
• Cochrane RoB v1
• RoB 2.0?
• RoB automation
• Marshall et al. 2015
• Millard et al. 2016
• Cochrane Database
(CDSR)
• Closed access
Motivation
1
No RoB text annotation
guidelines exist
2
No RoB annotated RCTs
exist
Revised Cochrane RoB 2.0 tool
• Can the guidelines be used to
annotate a text corpus?
• Extensive guidelines
• Step-by-step instructions
• Divides RoB into 5 domains
• Each domain is assessed using several
signalling questions
Randomization
process
Deviations from
intended
interventions
Missing
outcomes data
Outcomes
measurement
Selection of
reported result
Sterne, J.A., Savović, J., Page, M.J., Elbers, R.G., Blencowe, N.S., Boutron, I., Cates, C.J., Cheng, H.Y., Corbett, M.S., Eldridge, S.M. and Emberson, J.R., 2019. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ, 366.
Revised Cochrane RoB 2.0 tool
• Reviewers manually go through the RCT to identify text describing the
answer to a signalling question.
• Based on the answer to the signalling question, select one of the five
response judgements:
Yes, Probably Yes, Probably No, No, No Information
Revised Cochrane RoB 2.0 tool
• 2.1 - Were the participants aware of their assigned intervention
during the trial?
2.1 No Good
Risk domains: 5
Signalling questions: 22
Annotation schema
• Follow the revised Cochrane RoB 2.0
• 110 span labels
• 1.1 Yes Good
• 1.1 Probably Yes Good
• 1.1 Probably No Bad
• 1.1 No Bad
• 1.1 No Information
• 1.2 Yes Good
• 1.2 Probably Yes Good
• 1.2 Probably No Bad
• …
Example label: 1.1 Yes Good
• Risk domain: 1
• Signalling question: 1.1
• SQ response: Yes
• Direction: Good
Good = low risk
Bad = high risk
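The label space above can be enumerated programmatically. The following is a minimal sketch, not the authors' code: the per-domain counts of signalling questions (3, 7, 4, 5, 3) are an assumption taken from the RoB 2 tool rather than from these slides, and the `yes_is_low_risk` flag is a hypothetical parameter standing in for a per-question polarity table. The direction suffix is derived separately because it depends on how each question is worded (compare "1.1 Yes Good" with "2.1 No Good").

```python
# Assumption: number of signalling questions per RoB 2 domain (not stated on the slides).
SQ_PER_DOMAIN = {1: 3, 2: 7, 3: 4, 4: 5, 5: 3}
RESPONSES = ["Yes", "Probably Yes", "Probably No", "No", "No Information"]


def signalling_questions():
    """Return identifiers such as '1.1', '1.2', ..., '5.3'."""
    return [f"{d}.{q}" for d, n in SQ_PER_DOMAIN.items() for q in range(1, n + 1)]


def span_labels():
    """Return every 'signalling question + response' combination."""
    return [f"{sq} {resp}" for sq in signalling_questions() for resp in RESPONSES]


def direction(response, yes_is_low_risk):
    """Map a response to 'Good' (low risk) or 'Bad' (high risk).

    'No Information' carries no direction, so None is returned.
    yes_is_low_risk is a hypothetical per-question flag.
    """
    if response == "No Information":
        return None
    says_yes = response in ("Yes", "Probably Yes")
    return "Good" if says_yes == yes_is_low_risk else "Bad"


labels = span_labels()
print(len(signalling_questions()), len(labels))
print(direction("Yes", yes_is_low_risk=True))   # question worded like 1.1
print(direction("No", yes_is_low_risk=False))   # question worded like 2.1
```

Under these assumptions, 22 signalling questions with five responses each give the 110 span labels mentioned in the schema.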
Pilot Annotation
• Ten RCT full-text PDFs
• 2000-2019
• Four annotators
• 2 scientists
• 1 doctoral student
• 1 scientific collaborator
• Two NLP experts
• 1 professor
• 1 doctoral student
• tagtog PDF annotation tool
https://www.tagtog.com/
Evaluation
• F1-measure as Inter-annotator agreement
• Disregards out-of-the-span tokens (unannotated tokens)
1. IAA_SQ
Do the annotator pairs annotate
the same text span to answer a
signalling question (SQ)?
2. IAA_response
If the annotator pairs annotate
the same text to answer a
signalling question, do they also
select the same response
judgment?
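The two agreement levels can be sketched in a few lines. This is an illustrative reconstruction, not the evaluation script used in the study: it assumes each annotator's work is stored as a dictionary mapping a hypothetical `(token_index, signalling_question)` key to the chosen response, so unannotated tokens never appear and are disregarded automatically. The token indices and responses below are made up.

```python
def pairwise_f1(a, b):
    """F1 between two sets of annotated items.

    Returns None when neither annotator marked anything.
    """
    if not a and not b:
        return None
    return 2 * len(a & b) / (len(a) + len(b))


def iaa_sq(ann_a, ann_b):
    """Level 1: did both annotators mark the same tokens for the same SQ?"""
    return pairwise_f1(set(ann_a), set(ann_b))


def iaa_response(ann_a, ann_b):
    """Level 2: on tokens both marked for the same SQ, do the responses match?"""
    shared = set(ann_a) & set(ann_b)
    a = {(key, ann_a[key]) for key in shared}
    b = {(key, ann_b[key]) for key in shared}
    return pairwise_f1(a, b)


# Hypothetical annotations: (token index, signalling question) -> response
annotator_a = {(10, "4.1"): "No", (11, "4.1"): "No", (12, "4.1"): "No", (40, "1.1"): "Yes"}
annotator_b = {(11, "4.1"): "No", (12, "4.1"): "Probably No", (40, "1.1"): "Yes", (41, "1.1"): "Yes"}

print(iaa_sq(annotator_a, annotator_b))
print(iaa_response(annotator_a, annotator_b))
```

For two sets, precision is the overlap divided by the size of one set and recall is the overlap divided by the size of the other, so F1 reduces to twice the overlap divided by the sum of the two set sizes.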
Results - IAA_SQ
• Zero or no Annotation
• Domain 2 - 52%
• Domain 3 - 54%
• Domain 4 - 50%
• Domain 5 - 61% (protocol)
• Less subjective questions
• Better IAA
The table details the interpretation of pairwise F1-measure.
Results - IAA_response
• IAA - SQ response judgment
• Averaged over all annotator pairs
• Zero agreement: 52.63%
• No annotation: 22%
• Combined: ~75%
The table details the interpretation of pairwise F1-measure.
Error Inspection – 1. Text span disagreement
• Annotators were not restricted to
a single span granularity
• Phrases vs. full sentences
4.1 Was the method of measuring the outcome
inappropriate?
Sentence:
…The primary outcome measure was a 0–10
NRS pain score, which reflected the average
pain experienced by the patient for ten days
prior to follow-up…
Phrase:
…a 0–10 NRS pain score…
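A short sketch shows why this granularity mismatch lowers agreement even when both annotators point at the same evidence. It assumes a simple whitespace tokenisation with trailing punctuation stripped and a token-level F1. Both are illustrative choices, not necessarily those of the study.

```python
import re


def tokens(text):
    """Split on whitespace and drop commas, full stops and semicolons (an assumption)."""
    return re.findall(r"[^\s,.;]+", text)


def find_span(haystack, needle):
    """Return the token indices at which needle first occurs in haystack."""
    n = len(needle)
    for start in range(len(haystack) - n + 1):
        if haystack[start:start + n] == needle:
            return set(range(start, start + n))
    return set()


def f1(a, b):
    """Token-level F1 between two sets of token indices."""
    if not a and not b:
        return None
    return 2 * len(a & b) / (len(a) + len(b))


sentence = tokens(
    "The primary outcome measure was a 0–10 NRS pain score, which reflected "
    "the average pain experienced by the patient for ten days prior to follow-up"
)
phrase = tokens("a 0–10 NRS pain score")

sentence_span = set(range(len(sentence)))   # one annotator marks the whole sentence
phrase_span = find_span(sentence, phrase)   # the other marks only the phrase

score = f1(sentence_span, phrase_span)
print(f"{score:.2f}")
```

With these assumptions the phrase covers 5 of the 25 sentence tokens, so the pairwise F1 is only about one third although the two annotators agree on the evidence.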
Error Inspection – 2. Different sections
• Annotators use different regions
(Methods section, Results section,
Table, …) of full text to come to
identical labels.
• Same judgment, different parts of
text evidence
2.6 Was an appropriate analysis used to estimate
the effect of assignment to intervention?
…This study was guided by the HAPA, which
has been widely used to address the gap
between intention to change and a person’s
actual change in behaviour [25-27]…
…intention-to-treat analysis was done with
missing data substituted by the last-
observation-carried-forward procedure…
2.1 Yes Good
Error Inspection – 3. Polarity disagreement
… 71 allocated routine services, 67 allocated
intervention service, 69 assessed at 8 weeks,
64 assessed at 8 week...
3.1 Were data for the outcome of interest
available for all, or nearly all, participants
randomized?
• Selecting response judgment
options with different polarities
• Yes vs. No
• Three of the four annotators
responded to 3.1 with Yes, but
one chose Probably no.
• All or nearly all (cut-off?)
Error Inspection – 4. Degree disagreement
• Lenient - definitive
• Yes
• No
• Stringent
• Probably yes
• Probably no
1.1 Was a random sequence generation
method used to assign participants to
intervention groups?
…Patients were randomly allocated to either
intervention by a computer-generated
schedule stratified by sex and attendance at
a day hospital…
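The distinction between polarity and degree disagreement can be made mechanical. The sketch below is an illustration only: the function name and the extra "information" category (for pairs in which one annotator chose No Information) are assumptions, not part of the slides.

```python
# Which side of the Yes/No divide each response falls on.
POLARITY = {"Yes": "yes", "Probably Yes": "yes", "Probably No": "no", "No": "no"}


def classify_disagreement(response_a, response_b):
    """Label a pair of response judgments.

    Returns 'agreement', 'polarity', 'degree', or 'information'
    (the last one is a hypothetical category for 'No Information').
    """
    a = response_a.strip().title()
    b = response_b.strip().title()
    if a == b:
        return "agreement"
    if "No Information" in (a, b):
        return "information"
    if POLARITY[a] != POLARITY[b]:
        return "polarity"   # e.g. Yes vs. Probably No
    return "degree"         # e.g. Yes vs. Probably Yes


print(classify_disagreement("Yes", "Probably no"))
print(classify_disagreement("Yes", "Probably yes"))
```

Separating the two cases is useful when refining guidelines: degree disagreements leave the direction of the label unchanged, whereas polarity disagreements flip it.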
Conclusions
1. RoB 2.0 assessment guidelines cannot be directly used as RoB
corpus annotation guidelines.
2. RoB assessment and RoB text annotation tasks are both highly
subjective, but the annotation guidelines can be refined with an
iterative process to improve both.
Future Directions
1. Instructional placards as
annotation guidelines
2. Larger annotated corpus
of RCTs
Annotation team
Dr. Roger Hilfiker
Dr. Martin Sattelmayer
Rahel Caliesch
Katia Giacomino
Dr. Nona Naderi
References
1. Wang, Q., Liao, J., Lapata, M., & Macleod, M. (2022). Risk of bias assessment in preclinical literature using natural language processing. Research Synthesis Methods, 13(3), 368-380.
2. Macleod, M. R., O’Collins, T., Howells, D. W., & Donnan, G. A. (2004). Pooling of animal experimental data reveals influence of study design and publication bias. Stroke, 35(5), 1203-1208.
3. Deleger, L., Li, Q., Lingren, T., Kaiser, M., Molnar, K., Stoutenborough, L., Kouril, M., Marsolo, K., & Solti, I. (2012). Building gold standard corpora for medical natural language processing tasks. In AMIA Annual Symposium Proceedings (Vol. 2012, p. 144). American Medical Informatics Association.
4. Sterne, J.A., Savović, J., Page, M.J., Elbers, R.G., Blencowe, N.S., Boutron, I., Cates, C.J., Cheng, H.Y., Corbett, M.S., Eldridge, S.M. and Emberson, J.R. (2019). RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ, 366.
Thank You
Questions?
Dataset: https://zenodo.org/record/7698941#.ZEGhXexBzzU
Email: anjani.k.dhrangadhariya@gmail.com
LinkedIn: https://www.linkedin.com/in/anjani-dhrangadhariya/

20240609 QFM020 Irresponsible AI Reading List May 2024
 
Taking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdfTaking AI to the Next Level in Manufacturing.pdf
Taking AI to the Next Level in Manufacturing.pdf
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
UI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentationUI5 Controls simplified - UI5con2024 presentation
UI5 Controls simplified - UI5con2024 presentation
 
OpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - AuthorizationOpenID AuthZEN Interop Read Out - Authorization
OpenID AuthZEN Interop Read Out - Authorization
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial IntelligenceAI 101: An Introduction to the Basics and Impact of Artificial Intelligence
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 

MIE20232.pptx

  • 1. First Steps Towards a Risk of Bias Corpus of Randomized Controlled Trials Presenter – Anjani Dhrangadhariya MIE2023 - Göteborg, Sweden, 23.05.23 Authors: Anjani Dhrangadhariya, Roger Hilfiker, Martin Sattelmayer, Katia Giacomino, Rahel Caliesch, Simone Elsig, Nona Naderi, Henning Müller
  • 2. Randomized Controlled Trial • In theory, an RCT accurately measures intervention effects on patient outcomes, but in practice, biases enter • Design/Planning • Execution • Analysis • Outcomes reporting • Systematic Reviews • Utility • Medical professionals • Health policies • Surgeons
  • 3. • The risk of bias specifically pertains to systematic errors in the design, conduct, or reporting of a study that can potentially lead to a deviation from the true effect being measured. • RoB assessment guidelines Risk of Bias (RoB) Example RoB assessment guidelines Year Physiotherapy Evidence Database (PEDro) 1999 Risk of Bias Assessment Tool for Nonrandomized Studies (RoBANS) 2004 Cochrane Risk of Bias assessment guidelines 2008 Risk of Bias in Non-randomized Studies of Interventions (ROBINS-I) 2016 Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) 2017 Newcastle-Ottawa Scale (NOS) 2018 Revised Cochrane Risk of Bias for RCTs 2.0 tool (RoB 2) 2019
  • 4. RoB information extraction • Thorough assessment • Manual assessment • Time-consuming • Cognitively demanding • Two experts for manual assessment • Third, for conflict resolution • Automation imperative
  • 5. Related Work • RoB labelled corpus • Wang et al. 2022 • Preclinical animal studies • Human RCTs • RobotReviewer • PDF highlights • Freely-available • Closed access data • Cochrane RoB v1 • RoB 2.0? • RoB automation • Marshall et al. 2015 • Millard et al. 2016 • Cochrane Database (CDSR) • Closed access
  • 6. Motivation 1 No RoB text annotation guidelines exist 2 No RoB annotated RCTs exist
  • 7. Revised Cochrane RoB 2.0 tool • Can you use the guidelines to annotate text corpus? • Extensive guidelines • Step-by-step instructions • Divides RoB into 5 domains • Each domain is assessed using several signalling questions Randomization process Deviations from intended interventions Missing outcomes data Outcomes measurement Selection of reported result Sterne, J.A., Savović, J., Page, M.J., Elbers, R.G., Blencowe, N.S., Boutron, I., Cates, C.J., Cheng, H.Y., Corbett, M.S., Eldridge, S.M. and Emberson, J.R., 2019. RoB 2: a revised tool for assessing risk of bias in randomised trials. bmj, 366.
  • 8. Revised Cochrane RoB 2.0 tool • Reviewers manually go through the RCT to identify text describing the answer to a signalling question. • Based on the answer to the signalling question, select one of the five response judgements: Yes Probably Yes Probably No No No Information
  • 9. Revised Cochrane RoB 2.0 tool • 2.1 - Were the participants aware of their assigned intervention during the trial? 2.1 No Good Risk domains Signalling questions 5 22
  • 10. Annotation schema • Follow the revised Cochrane RoB 2.0 • 110 span labels • 1.1 Yes Good • 1.1 Probably Yes Good • 1.1 Probably No Bad • 1.1 No Bad • 1.1 No Information • 1.2 Yes Good • 1.2 Probably Yes Good • 1.2 Probably No Bad • … 1.1 Yes Good Risk domain Signalling question SQ response Direction Good = low risk Bad = high risk
  • 11. Pilot Annotation • Ten RCT full-text PDFs • 2000-2019 • Four annotators • 2 scientists • 1 doctoral student • 1 scientific collaborator • Two NLP experts • 1 professor • 1 doctoral student • tagtog PDF annotation tool https://www.tagtog.com/
  • 12. Evaluation • F1-measure as Inter-annotator agreement • Disregards out-of-the-span tokens (unannotated tokens) 1. IAASQ Do the annotator pairs annotate the same text span to answer a signalling question (SQ)? 2. IAAresponse If the annotator pairs annotate the same text to answer a signalling question, do they also select same response judgment?
  • 13. Results - IAASQ • Zero or no Annotation • Domain 2 - 52% • Domain 3 - 54% • Domain 4 - 50% • Domain 5 - 61% (protocol) • Less subjective questions • Better IAA The table details the interpretation of pairwise F1-measure.
  • 14. Results - IAAresponse • IAA - SQ response judgment • Averaged over all annotator pairs • Zero agreement - 52.63% • No annotation – 22% ~75% The table details the interpretation of pairwise F1-measure.
  • 15. Error Inspection – 1. Text span disagreement • Not limiting the annotators to annotating • phrases vs full sentences 4.1 Was the method of measuring the outcome inappropriate? …The primary outcome measure was a 0–10 NRS pain score, which reflected the average pain experienced by the patient for ten days prior to follow-up… …a 0–10 NRS pain score… Phrase! Sentence
  • 16. Error Inspection – 2. Different sections • Annotators use different regions (Methods section, Results section, Table, …) of full text to come to identical labels. • Same judgment, different parts of text evidence 2.6 Was an appropriate analysis used to estimate the effect of assignment to intervention? …This study was guided by the HAPA, which has been widely used to address the gap between intention to change and a person’s actual change in behaviour [25-27]… …intention-to-treat analysis was done with missing data substituted by the last- observation-carried-forward procedure… 2.1 Yes Good
  • 17. Error Inspection – 3. Polarity disagreement … 71 allocated routine services, 67 allocated intervention service, 69 assessed at 8 weeks, 64 assessed at 8 week... 3.1 Were data for the outcome of interest available for all, or nearly all, participants randomized? • Selecting response judgment options with different polarities • Yes vs. No • Three of the four annotators responded to 3.1 with Yes, but one chose Probably no. • All or nearly all (cut-off?)
  • 18. Error Inspection – 4. Degree disagreement • Lenient - definitive • Yes • No • Stringent • Probably yes • Probably no 1.1 Was a random sequence generation method used to assign participants to intervention groups? …Patients were randomly allocated to either intervention by a computer-generated schedule stratified by sex and attendance at a day hospital…
  • 19. Conclusions 1. RoB 2.0 assessment guidelines cannot be directly used as RoB corpus annotation guidelines. 2. RoB assessment and RoB text annotation tasks are both highly subjective, but the annotation guidelines can be refined with an iterative process to improve both.
  • 20. Future Directions 1. Instructional placards as annotation guidelines 2. Larger annotated corpus of RCTs
  • 21. Dr. Roger Hilfiker Dr. Martin Sattelmayer Rahel Caliesch Katia Giacomino Dr. Nona Naderi Annotation team
  • 22. References 1. Wang, Q., Liao, J., Lapata, M., & Macleod, M. (2022). Risk of bias assessment in preclinical literature using natural language processing. Research Synthesis Methods, 13(3), 368-380. 2. Macleod, M. R., O’Collins, T., Howells, D. W., & Donnan, G. A. (2004). Pooling of animal experimental data reveals influence of study design and publication bias. Stroke, 35(5), 1203-1208. 3. Deleger L, Li Q, Lingren T, Kaiser M, Molnar K, Stoutenborough L, Kouril M, Marsolo K, Solti I. Building gold standard corpora for medical natural language processing tasks. InAMIA Annual Symposium Proceedings 2012 (Vol. 2012, p. 144). American Medical Informatics Association. 4. Sterne, J.A., Savović, J., Page, M.J., Elbers, R.G., Blencowe, N.S., Boutron, I., Cates, C.J., Cheng, H.Y., Corbett, M.S., Eldridge, S.M. and Emberson, J.R., 2019. RoB 2: a revised tool for assessing risk of bias in randomised trials. bmj, 366.
  • 23. Thank You Questions? Dataset: https://zenodo.org/record/7698941#.ZEGhXexBzzU Email: anjani.k.dhrangadhariya@gmail.com LinkedIn: https://www.linkedin.com/in/anjani-dhrangadhariya/
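
The evaluation slide describes a pairwise F1-measure used as inter-annotator agreement that disregards unannotated tokens. The following is a minimal sketch of that idea, not the authors' implementation: it assumes each annotator's answer to one signalling question is represented as a set of token indices, and the function names `pairwise_f1` and `mean_pairwise_f1` are hypothetical. How the original study scores the case where neither annotator marked any text is not stated, so the sketch simply skips such pairs.

```python
from itertools import combinations


def pairwise_f1(tokens_a, tokens_b):
    """Token-level F1 between two annotators for one signalling question.

    Only annotated tokens are counted, so the large mass of unannotated
    tokens in a full-text article does not inflate the agreement.
    Returns None when neither annotator marked anything (assumption:
    such pairs are treated as undefined rather than as agreement).
    """
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return None
    overlap = len(a & b)
    if overlap == 0:
        return 0.0
    precision = overlap / len(b)  # annotator A treated as the reference
    recall = overlap / len(a)
    return 2 * precision * recall / (precision + recall)


def mean_pairwise_f1(annotations):
    """Average F1 over all annotator pairs, skipping undefined pairs.

    `annotations` maps an annotator name to the token indices that
    annotator marked for a single signalling question.
    """
    scores = [
        pairwise_f1(annotations[x], annotations[y])
        for x, y in combinations(sorted(annotations), 2)
    ]
    scores = [s for s in scores if s is not None]
    return sum(scores) / len(scores) if scores else None
```

F1 is symmetric in the two annotators, so it does not matter which one is treated as the reference. A phrase-versus-sentence disagreement like the one on slide 15 shows up here as a low but non-zero score, because the shorter span is fully contained in the longer one.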

Editor's notes

  1. Randomized controlled trials, or RCTs, aim to accurately measure treatment effects on patient outcomes. In theory, they minimize bias, but in practice, biases tend to creep into any of the trial stages. When RCTs with such biases are used to write systematic reviews, they reduce the validity and utility of the review.
  2. Now, biases cannot be measured directly from RCT studies, but the risk of bias can be estimated by identifying the systematic flaws in study design, planning, execution or even outcomes reporting. There are several risk-of-bias assessment guidelines that help thoroughly assess several bias risks in RCT literature. The latest published guidelines are the revised Cochrane RoB 2.0 guidelines.
  3. These guidelines help you thoroughly assess biases from RCT full texts, but the process of manual RoB assessment is extremely time-consuming, resource-intensive and cognitively demanding. Manual bias assessment cannot keep pace with the rapidly rising number of published RCTs, and therefore automatic RoB information extraction is imperative.
  4. There has been some work on automating RoB information extraction by Marshall et al. and Millard et al., but the dataset used to train their machine learning models is closed access. Later they developed a tool called RobotReviewer, which is freely available but is built on closed-access data that isn't available to the community, and it automates the older risk of bias guidelines. Recently, a RoB labelled corpus was released by Wang et al., but the corpus is based on preclinical animal studies and not human RCTs.
  5. So currently, we do not have any open-access corpus annotated with risk of bias judgments, and neither do we have guidelines to build one. These gaps prompted us to conduct this pilot project.
  6. RoB 2 is a set of really extensive, instructional guidelines that help you assess, step by step, the overall risk of bias of any RCT study. So before building our own annotation guidelines, we thought maybe we could use the RoB 2 tool to annotate a text corpus as well. To understand whether we can use RoB 2 for this purpose, we need to examine how it structures the bias assessment procedure. It divides the biases into 5 domains, each domain loosely corresponding to one of the trial stages. Each domain is assessed using several signalling questions.
  7. The reviewers manually go through each signalling question as it appears in the guidelines and try to identify text in the RCT they are assessing that answers it. Once an answer text is found, they use it to judge the small portion of risk corresponding to that signalling question. Based on this judgment, they choose one of the five response options. For most questions, "Yes" means the answer suggests there is a risk of bias and "No" means there is none. For some questions, however, "Yes" means everything is alright and there is no risk of bias.
  8. Take, for example, the signalling question 2.1. It asks whether the participants were aware of their assigned intervention during the trial. The reviewers identify the answer to this question in the text, and let's say they found that the participants were properly blinded and were unaware of the assigned intervention, meaning the risk of bias is low and all is good for this signalling question. The reviewers need to do this for all 22 signalling questions in the RoB 2 tool, so the manual procedure shown here could be translated directly into an annotation process.
  9. We need an annotation schema before starting to annotate the corpus. We keep our annotation scheme very similar to how the assessment is structured in the RoB 2 guidelines. Each of our span labels contains information about the domain the text is labelled for, the signalling question and also the response judgment. As the overall task of RoB assessment and annotation is very complex, we wanted to ensure that the label design makes annotation easier for the annotators.
  10. We then had four experts with varied RoB assessment expertise annotate 10 full-text RCTs.
  11. This signalling question asks whether the outcome data were available for all, or nearly all, participants randomized, but it does not clarify the exact cut-off for how many participant dropouts increase the risk. Therefore, the annotators make subjective response judgments depending on what percentage of participant dropout they consider acceptable in their experience.
  12. The references, and...
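
The label scheme described in the notes (risk domain, signalling question, response judgment) can be enumerated programmatically. The sketch below is illustrative only and is not taken from the released dataset. The slides state 5 domains, 22 signalling questions and 110 span labels; the per-domain question counts in `QUESTIONS_PER_DOMAIN` are an assumption based on the RoB 2 tool, and the Good/Bad direction suffix is left out because it depends on the wording of each question.

```python
# The five response judgments listed on the slides.
RESPONSES = ["Yes", "Probably Yes", "Probably No", "No", "No Information"]

# Assumed number of signalling questions per RoB 2 domain (sums to 22).
# The slides give only the totals, so treat this split as hypothetical.
QUESTIONS_PER_DOMAIN = {1: 3, 2: 7, 3: 4, 4: 5, 5: 3}


def build_span_labels():
    """Return every '<domain>.<question> <response>' span label."""
    labels = []
    for domain, n_questions in QUESTIONS_PER_DOMAIN.items():
        for question in range(1, n_questions + 1):
            for response in RESPONSES:
                labels.append(f"{domain}.{question} {response}")
    return labels


if __name__ == "__main__":
    labels = build_span_labels()
    print(len(labels), "labels, e.g.", labels[0])
```

With 22 signalling questions and 5 responses each, the enumeration yields the 110 span labels mentioned on the annotation schema slide.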