SlideShare ist ein Scribd-Unternehmen logo
1 von 20
The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010) 19th World Wide Web Conference WWW-2010 26-30 April 2010: Raleigh Conference Center, Raleigh, NC, USA Animal Disease EventRecognition and Classification Svitlana Volkova, Doina Caragea, William H. Hsu, Swathi Bujuru Laboratory for Knowledge Discovery in Databases Department of Computing and Information Sciences Kansas State University Sponsor: K-State National Agricultural Biosecurity Center (NABC)
Agenda The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)   Overview Related Work Methodology Entity Recognition Event Sentence Classification Event Tuple Generation Experiment Summary and Future Work
Animal Infectious Disease Outbreaks influence on the travel and trade cause economic crises, political instability; diseases, zoonotic in type can cause loss of life. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Animal Disease-related Data Online Structured Data Unstructured Data Official reports by different organizations: state and federal laboratories, bioportals;  health care providers; governmental agricultural or environmental agencies. Web-pages News E-mails (e.g., ProMed-Mail) Blogs Medical literature (e.g., books) Scientific papers (e.g., PubMed) The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Infectious Disease Informatics (IDI) Event example: “On 12 September 2007, a new foot-and-mouth disease outbreak was confirmed in Egham, Surrey” *Event is an occurrence of a disease within a particular time and space range. **Outbreak is a set of disease-related events which are constrained in space and have temporal overlap The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Problem Statement The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)   How to recognize animal disease-related events from unstructured web documents? How to accurately extract event-related entities? Disease Names, Viruses, Serotypes Locations Species Dates How to classify event-related and non-related sentences? How to define the confirmation status of the event? How to generate true event tuples?
Related Work BioCaster - http://biocaster.nii.ac.jp/  + crawls news and uses ontology pattern matching approaches to recognize disease-location-verb pairs;  -  does not classify events as suspected or confirmed. HealthMap - http://healthmap.org/en  + crawls data from Google News and the ProMED-Mail;  -  manually supported web system. Pattern-based Understanding and Learning System (PULS) - http://sysdb.cs.helsinki.fi/puls/jrc/all  - does not classify events and does not report past outbreaks. Our approach: Consider such event attributes as: disease, date, location, species and confirmation status; Classify events into two categories such as: suspected or confirmed Identify events in the historical data, e.g. medical literature, papers. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Methodology Step 1. Entity recognition from raw text. Step 2. Sentence classification from which entities are extracted as being related to an event or not; if they are related to an event we classify them as confirmed or suspected. Step 3. Combination of entities within an event sentence into the structured tuples and aggregation of tuples related to the same event into one comprehensive tuple. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Domain Meta-data Domain-independent knowledge Domain-specific knowledge Location hierarchy names of countries, states, cities; Time hierarchy canonical dates. Medical ontology diseases, serotypes, and viruses. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Step1: Entity Recognition The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)   Locate and classify atomic elements into predefined categories: Disease names:“foot and mouth disease”, “rift valley fever”; viruses: “picornavirus”; serotypes: “Asia-1”; Species: “sheep”, “pigs”, “cattle” and “livestock”; Locationsof events specified at different levels of geo-granularity: “United Kingdom", “eastern provinces of Shandong and Jiangsu, China”; Datesin different formats: “last Tuesday”, “two month ago”.
Entity Recognition Tools Animal Disease Extractor* relies on a medical ontology, automatically-enriched with synonyms and causative viruses. Species Extractor*  pattern matching on a stemmed dictionary of animal names from Wikipedia. Location Extractor Stanford NER Tool** (uses conditional random fields); NGA GEOnet Names Database (GNS)*** for location disambiguation and retrieving latitude/longitude. Date/Time Extractor set of regular expressions. *KDD KSU DSEx - http://fingolfin.user.cis.ksu.edu:8080/diseaseextractor/ **Stanford NER - http://nlp.stanford.edu/ner/index.shtml ***GNS - http://earth-info.nga.mil/gns/html/ The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Step 2: Event Sentence Classification  Constraint: True events should include a disease name together with a status verb (eliminate event non-related sentences). “Foot and mouth disease is[V] a highly pathogenic animal disease”. Confirmed status verbs “happened” and verb phrases “strike out” “On 9 Jun 2009, the farm's owner reported[V] symptoms of FMD in more than 30 hogs”. Suspected status verbs “catch” and verb phrases “be taken in” “RVF is suspected[V] in Saudi Arabia in September 2000”. Extend the Initial List with synonyms from GoogleSets* and WordNet** 	*GoogleSets - http://labs.google.com/sets		**WordNet - http://wordnet.princeton.edu/ The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Step 3: Event Tuple Generation The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)   Event attributes: disease date location species confirmation status Event tuple: Eventi  = < disease; date; location; species; status > =  			<FMD, 9 Jun 2009, Taoyuan, hog, confirmed> Event tuple with missing attributes: Eventj = <FMD, ?, ?, ?, confirmed>
Event Recognition Workflow Step 1: Entity Recognition Foot-and-mouth disease[DIS]on hog[SP] farm in Taoyuan[LOC].  Taiwan's TVBS television station reports that agricultural authorities confirmed foot-and-mouth disease[DIS] on a hog[SP] farm in Taoyuan[LOC]. On 9 Jun 2009[DT], the farm's owner reported symptoms of FMD[DIS] in more than 30 hogs[SP]. Subsequent testing confirmed FMD[DIS]. Agricultural authorities asked the farmer to strengthen immunization. The outbreak has not affected other farms. Authorities stipulated that the affected hog[SP] farm may not sell pork for 2 weeks. Step 2: Sentence Classification YES      1. Foot-and-mouth disease[DIS]on hog[SP] farm in Taoyuan[LOC].  YES      2.Taiwan's TVBS television station reports that agricultural authorities confirmedfoot-and-mouth disease[DIS]on a hog[SP] farm in Taoyuan[LOC].  YES      3. On 9 Jun 2009[DT], the farm's owner reported symptoms of FMD[DIS] in more than 30 hogs[SP].  YES      4. Subsequent testing confirmedFMD[DIS]. NO        5. Agricultural authorities asked the farmer to strengthen immunization. NO        6. The outbreak has not affected other farms.	 NO        7. Authorities stipulated that the affected hog[SP] farm may not sell pork for 2 weeks. Step 3a: Tuple Generation E1 = <Foot-and-mouth disease, ?, Taoyuan, hog, ?>	E3 = <FMD, 9 Jun 2009,?, hog, reported> E2 = <Foot-and-mouth disease, ?, Taoyuan, hog, confirmed >	E4 = <FMD, ?, ?, ?, confirmed> Step 3b: Tuple Aggregation E = <disease, date, location, species, status> = <Foot-and-mouth disease, 9 Jun 2009, Taoyuan, hog, confirmed >  The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Rule-based event recognition approach Step 1. Entity Recognition Step 2. Event Sentence Classification Step 3. Event Tuple Generation & Aggregation Tuple aggregation is based on the set of rules:   - disease name is the main attribute for the event tuple;   - optimally combine the event tuples around a disease     name (event tuple should have max number of event     attributes);   - if tuple has a new disease, then form the next tuple. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Experiment The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)   ~100 event-related documents Foot-and-mouth disease (FMD) Rift valley fever (RVF) Manually created 2 sets of summaries for 100 docs DUCView Pyramid Scoring Tool* – Score [0..1] relies on multiple summaries to assign the significance weights to summarization content units (i.e., entities) to compare automatically generated event tuples with entities from human summaries. Scorei = < wddisease; wtdate; wllocation; wsspecies; wcstatus… >, subject to disease + status = 2
Event Score Distribution by Range We interpret the Pyramid score values as an event extraction accuracy: # of unique contributing entities (TP); # of entities not in the summary (FP); # of extra contributing entities from summary (FN). multiple summaries – majority voting for annotation. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Experimental Results The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Summary & Future Work Sentence-based Event Recognition and Classification Entity and Confirmation Status Extraction Methods apply several lists of verbs for confirmation status extraction: initial list, lists augmented using GoogleSets and WordNet Pyramid method and DUCView Tool for scoring automatically generated event tuples. Future Work: Deeper syntactic analysis of the sentence Co-reference resolution Multilingual event recognition and classification The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)
Thank you! The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx  2010)

Weitere ähnliche Inhalte

Andere mochten auch

Cyber bulling panagopoulos_georgie
Cyber bulling panagopoulos_georgieCyber bulling panagopoulos_georgie
Cyber bulling panagopoulos_georgiegeorgepana
 
Linked in Social Selling 5 Easy Steps to Convert Connections to New Sales - ...
Linked in Social Selling  5 Easy Steps to Convert Connections to New Sales - ...Linked in Social Selling  5 Easy Steps to Convert Connections to New Sales - ...
Linked in Social Selling 5 Easy Steps to Convert Connections to New Sales - ...Social Jack
 
5 Easy Steps to Access SBA Business Financing Bridgeview Bank - Tom Meyer -...
5 Easy Steps to Access SBA Business Financing   Bridgeview Bank - Tom Meyer -...5 Easy Steps to Access SBA Business Financing   Bridgeview Bank - Tom Meyer -...
5 Easy Steps to Access SBA Business Financing Bridgeview Bank - Tom Meyer -...Social Jack
 
人生沒有彩排 (With music)
人生沒有彩排 (With music)人生沒有彩排 (With music)
人生沒有彩排 (With music)Dhamma Jata
 
Grace Hopper Celebration 2010
Grace Hopper Celebration 2010Grace Hopper Celebration 2010
Grace Hopper Celebration 2010Svitlana volkova
 
Webinaarimateriaali: Terveystalon OmaTerveys ja mobiili asiakakaskokemus
Webinaarimateriaali: Terveystalon OmaTerveys ja mobiili asiakakaskokemusWebinaarimateriaali: Terveystalon OmaTerveys ja mobiili asiakakaskokemus
Webinaarimateriaali: Terveystalon OmaTerveys ja mobiili asiakakaskokemusJarno Malaprade
 
NAHJ Mobile Journalism Live Video
NAHJ Mobile Journalism Live VideoNAHJ Mobile Journalism Live Video
NAHJ Mobile Journalism Live VideoMo Krochmal
 
Informática aplicada a Ed. Física
Informática aplicada a Ed. Física Informática aplicada a Ed. Física
Informática aplicada a Ed. Física João Filho
 
Capítulo1 - Introdução a Sistemas Distribuídos - Coulouris
Capítulo1 - Introdução a Sistemas Distribuídos - CoulourisCapítulo1 - Introdução a Sistemas Distribuídos - Coulouris
Capítulo1 - Introdução a Sistemas Distribuídos - CoulourisWindson Viana
 

Andere mochten auch (12)

Cyber bulling panagopoulos_georgie
Cyber bulling panagopoulos_georgieCyber bulling panagopoulos_georgie
Cyber bulling panagopoulos_georgie
 
Linked in Social Selling 5 Easy Steps to Convert Connections to New Sales - ...
Linked in Social Selling  5 Easy Steps to Convert Connections to New Sales - ...Linked in Social Selling  5 Easy Steps to Convert Connections to New Sales - ...
Linked in Social Selling 5 Easy Steps to Convert Connections to New Sales - ...
 
Egames
EgamesEgames
Egames
 
5 Easy Steps to Access SBA Business Financing Bridgeview Bank - Tom Meyer -...
5 Easy Steps to Access SBA Business Financing   Bridgeview Bank - Tom Meyer -...5 Easy Steps to Access SBA Business Financing   Bridgeview Bank - Tom Meyer -...
5 Easy Steps to Access SBA Business Financing Bridgeview Bank - Tom Meyer -...
 
人生沒有彩排 (With music)
人生沒有彩排 (With music)人生沒有彩排 (With music)
人生沒有彩排 (With music)
 
MS Thesis Short
MS Thesis ShortMS Thesis Short
MS Thesis Short
 
Grace Hopper Celebration 2010
Grace Hopper Celebration 2010Grace Hopper Celebration 2010
Grace Hopper Celebration 2010
 
Webinaarimateriaali: Terveystalon OmaTerveys ja mobiili asiakakaskokemus
Webinaarimateriaali: Terveystalon OmaTerveys ja mobiili asiakakaskokemusWebinaarimateriaali: Terveystalon OmaTerveys ja mobiili asiakakaskokemus
Webinaarimateriaali: Terveystalon OmaTerveys ja mobiili asiakakaskokemus
 
Web Intelligence 2010
Web Intelligence 2010Web Intelligence 2010
Web Intelligence 2010
 
NAHJ Mobile Journalism Live Video
NAHJ Mobile Journalism Live VideoNAHJ Mobile Journalism Live Video
NAHJ Mobile Journalism Live Video
 
Informática aplicada a Ed. Física
Informática aplicada a Ed. Física Informática aplicada a Ed. Física
Informática aplicada a Ed. Física
 
Capítulo1 - Introdução a Sistemas Distribuídos - Coulouris
Capítulo1 - Introdução a Sistemas Distribuídos - CoulourisCapítulo1 - Introdução a Sistemas Distribuídos - Coulouris
Capítulo1 - Introdução a Sistemas Distribuídos - Coulouris
 

Ähnlich wie MedEx'10

Improving Disease Surveillance in the United States Using Companion Animal Data
Improving Disease Surveillance in the United States Using Companion Animal DataImproving Disease Surveillance in the United States Using Companion Animal Data
Improving Disease Surveillance in the United States Using Companion Animal DataPamela Okerholm
 
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...RIFF - A Social Network and Collaborative Platform For Public Health Disease ...
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...InSTEDD
 
Riff: A Social Network and Collaborative Platform for Public Health Disease S...
Riff: A Social Network and Collaborative Platform for Public Health Disease S...Riff: A Social Network and Collaborative Platform for Public Health Disease S...
Riff: A Social Network and Collaborative Platform for Public Health Disease S...Taha Kass-Hout, MD, MS
 
Livestock Disease Prediction System
Livestock Disease Prediction SystemLivestock Disease Prediction System
Livestock Disease Prediction Systemvivatechijri
 
Evolve: InSTEDD's Global Early Warning and Response System
Evolve: InSTEDD's Global Early Warning and Response SystemEvolve: InSTEDD's Global Early Warning and Response System
Evolve: InSTEDD's Global Early Warning and Response SystemTaha Kass-Hout, MD, MS
 
Exploiting NLP for Digital Disease Informatics
Exploiting NLP for Digital Disease InformaticsExploiting NLP for Digital Disease Informatics
Exploiting NLP for Digital Disease InformaticsNigel Collier
 
GenomeTrakr: Whole-Genome Sequencing for Food Safety and A New Way Forward in...
GenomeTrakr: Whole-Genome Sequencing for Food Safety and A New Way Forward in...GenomeTrakr: Whole-Genome Sequencing for Food Safety and A New Way Forward in...
GenomeTrakr: Whole-Genome Sequencing for Food Safety and A New Way Forward in...ExternalEvents
 
Exploiting NLP for Digital Disease Informatics
Exploiting NLP for Digital Disease InformaticsExploiting NLP for Digital Disease Informatics
Exploiting NLP for Digital Disease InformaticsNigel Collier
 
Using the internet to search for clinical trial information
Using the internet to search for clinical trial informationUsing the internet to search for clinical trial information
Using the internet to search for clinical trial informationEURORDIS - Rare Diseases Europe
 
POS DecJan 2016 - Welton
POS DecJan 2016 - WeltonPOS DecJan 2016 - Welton
POS DecJan 2016 - WeltonRonelle Welton
 
BLENCH Michael - Global Public Health Intelligence Network (GPHIN)
BLENCH Michael - Global Public Health Intelligence Network (GPHIN)BLENCH Michael - Global Public Health Intelligence Network (GPHIN)
BLENCH Michael - Global Public Health Intelligence Network (GPHIN)Global Risk Forum GRFDavos
 
A generic model for disease outbreak
A generic model for disease outbreakA generic model for disease outbreak
A generic model for disease outbreakijcsit
 
Dr. Tim Snider - PEDV-Warning Shot for National Biosecurity and Foreign Anima...
Dr. Tim Snider - PEDV-Warning Shot for National Biosecurity and Foreign Anima...Dr. Tim Snider - PEDV-Warning Shot for National Biosecurity and Foreign Anima...
Dr. Tim Snider - PEDV-Warning Shot for National Biosecurity and Foreign Anima...John Blue
 
Introduction to Epidemiology and Surveillance
Introduction to Epidemiology and SurveillanceIntroduction to Epidemiology and Surveillance
Introduction to Epidemiology and SurveillanceGeorge Moulton
 
K Bobyk - %22A Primer on Personalized Medicine - The Imminent Systemic Shift%...
K Bobyk - %22A Primer on Personalized Medicine - The Imminent Systemic Shift%...K Bobyk - %22A Primer on Personalized Medicine - The Imminent Systemic Shift%...
K Bobyk - %22A Primer on Personalized Medicine - The Imminent Systemic Shift%...Kostyantyn Bobyk
 
Epidemic Alert System: A Web-based Grassroots Model
Epidemic Alert System: A Web-based Grassroots ModelEpidemic Alert System: A Web-based Grassroots Model
Epidemic Alert System: A Web-based Grassroots ModelIJECEIAES
 
AI transmission risks: Analysis of biosecurity measures and contact structure
AI transmission risks: Analysis of biosecurity measures and contact structureAI transmission risks: Analysis of biosecurity measures and contact structure
AI transmission risks: Analysis of biosecurity measures and contact structureHarm Kiezebrink
 
Take Home PortionDirections Problems (150pts). This part, i.e. th.docx
Take Home PortionDirections Problems (150pts). This part, i.e. th.docxTake Home PortionDirections Problems (150pts). This part, i.e. th.docx
Take Home PortionDirections Problems (150pts). This part, i.e. th.docxssuserf9c51d
 

Ähnlich wie MedEx'10 (20)

Improving Disease Surveillance in the United States Using Companion Animal Data
Improving Disease Surveillance in the United States Using Companion Animal DataImproving Disease Surveillance in the United States Using Companion Animal Data
Improving Disease Surveillance in the United States Using Companion Animal Data
 
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...RIFF - A Social Network and Collaborative Platform For Public Health Disease ...
RIFF - A Social Network and Collaborative Platform For Public Health Disease ...
 
Riff: A Social Network and Collaborative Platform for Public Health Disease S...
Riff: A Social Network and Collaborative Platform for Public Health Disease S...Riff: A Social Network and Collaborative Platform for Public Health Disease S...
Riff: A Social Network and Collaborative Platform for Public Health Disease S...
 
Livestock Disease Prediction System
Livestock Disease Prediction SystemLivestock Disease Prediction System
Livestock Disease Prediction System
 
Evolve: InSTEDD's Global Early Warning and Response System
Evolve: InSTEDD's Global Early Warning and Response SystemEvolve: InSTEDD's Global Early Warning and Response System
Evolve: InSTEDD's Global Early Warning and Response System
 
Exploiting NLP for Digital Disease Informatics
Exploiting NLP for Digital Disease InformaticsExploiting NLP for Digital Disease Informatics
Exploiting NLP for Digital Disease Informatics
 
GenomeTrakr: Whole-Genome Sequencing for Food Safety and A New Way Forward in...
GenomeTrakr: Whole-Genome Sequencing for Food Safety and A New Way Forward in...GenomeTrakr: Whole-Genome Sequencing for Food Safety and A New Way Forward in...
GenomeTrakr: Whole-Genome Sequencing for Food Safety and A New Way Forward in...
 
Exploiting NLP for Digital Disease Informatics
Exploiting NLP for Digital Disease InformaticsExploiting NLP for Digital Disease Informatics
Exploiting NLP for Digital Disease Informatics
 
Evolve
EvolveEvolve
Evolve
 
Using the internet to search for clinical trial information
Using the internet to search for clinical trial informationUsing the internet to search for clinical trial information
Using the internet to search for clinical trial information
 
POS DecJan 2016 - Welton
POS DecJan 2016 - WeltonPOS DecJan 2016 - Welton
POS DecJan 2016 - Welton
 
BLENCH Michael - Global Public Health Intelligence Network (GPHIN)
BLENCH Michael - Global Public Health Intelligence Network (GPHIN)BLENCH Michael - Global Public Health Intelligence Network (GPHIN)
BLENCH Michael - Global Public Health Intelligence Network (GPHIN)
 
A generic model for disease outbreak
A generic model for disease outbreakA generic model for disease outbreak
A generic model for disease outbreak
 
Dr. Tim Snider - PEDV-Warning Shot for National Biosecurity and Foreign Anima...
Dr. Tim Snider - PEDV-Warning Shot for National Biosecurity and Foreign Anima...Dr. Tim Snider - PEDV-Warning Shot for National Biosecurity and Foreign Anima...
Dr. Tim Snider - PEDV-Warning Shot for National Biosecurity and Foreign Anima...
 
Introduction to Epidemiology and Surveillance
Introduction to Epidemiology and SurveillanceIntroduction to Epidemiology and Surveillance
Introduction to Epidemiology and Surveillance
 
Artificial Intelligence in Medicine
Artificial Intelligence in MedicineArtificial Intelligence in Medicine
Artificial Intelligence in Medicine
 
K Bobyk - %22A Primer on Personalized Medicine - The Imminent Systemic Shift%...
K Bobyk - %22A Primer on Personalized Medicine - The Imminent Systemic Shift%...K Bobyk - %22A Primer on Personalized Medicine - The Imminent Systemic Shift%...
K Bobyk - %22A Primer on Personalized Medicine - The Imminent Systemic Shift%...
 
Epidemic Alert System: A Web-based Grassroots Model
Epidemic Alert System: A Web-based Grassroots ModelEpidemic Alert System: A Web-based Grassroots Model
Epidemic Alert System: A Web-based Grassroots Model
 
AI transmission risks: Analysis of biosecurity measures and contact structure
AI transmission risks: Analysis of biosecurity measures and contact structureAI transmission risks: Analysis of biosecurity measures and contact structure
AI transmission risks: Analysis of biosecurity measures and contact structure
 
Take Home PortionDirections Problems (150pts). This part, i.e. th.docx
Take Home PortionDirections Problems (150pts). This part, i.e. th.docxTake Home PortionDirections Problems (150pts). This part, i.e. th.docx
Take Home PortionDirections Problems (150pts). This part, i.e. th.docx
 

Mehr von Svitlana volkova

Multimodal Information Extraction: Disease, Date and Location Retrieval
Multimodal Information Extraction: Disease, Date and Location RetrievalMultimodal Information Extraction: Disease, Date and Location Retrieval
Multimodal Information Extraction: Disease, Date and Location RetrievalSvitlana volkova
 
Multilingual Ner Using Wiki
Multilingual Ner Using WikiMultilingual Ner Using Wiki
Multilingual Ner Using WikiSvitlana volkova
 
Project Proposal Topics Modeling (Ir)
Project Proposal    Topics Modeling (Ir)Project Proposal    Topics Modeling (Ir)
Project Proposal Topics Modeling (Ir)Svitlana volkova
 
Methods Of Reliability Analysis
Methods Of Reliability AnalysisMethods Of Reliability Analysis
Methods Of Reliability AnalysisSvitlana volkova
 
Ukraine Presentation at Kansas State University
Ukraine Presentation at Kansas State UniversityUkraine Presentation at Kansas State University
Ukraine Presentation at Kansas State UniversitySvitlana volkova
 

Mehr von Svitlana volkova (13)

EACL'12 Poster
EACL'12 PosterEACL'12 Poster
EACL'12 Poster
 
Multimodal Information Extraction: Disease, Date and Location Retrieval
Multimodal Information Extraction: Disease, Date and Location RetrievalMultimodal Information Extraction: Disease, Date and Location Retrieval
Multimodal Information Extraction: Disease, Date and Location Retrieval
 
Multilingual Ner Using Wiki
Multilingual Ner Using WikiMultilingual Ner Using Wiki
Multilingual Ner Using Wiki
 
WiML Poster
WiML PosterWiML Poster
WiML Poster
 
Topics Modeling
Topics ModelingTopics Modeling
Topics Modeling
 
Project Proposal Topics Modeling (Ir)
Project Proposal    Topics Modeling (Ir)Project Proposal    Topics Modeling (Ir)
Project Proposal Topics Modeling (Ir)
 
Social Networks
Social NetworksSocial Networks
Social Networks
 
Methods Of Reliability Analysis
Methods Of Reliability AnalysisMethods Of Reliability Analysis
Methods Of Reliability Analysis
 
Ohio Project
Ohio ProjectOhio Project
Ohio Project
 
Ukraine Presentation
Ukraine PresentationUkraine Presentation
Ukraine Presentation
 
Ukraine Presentation at Kansas State University
Ukraine Presentation at Kansas State UniversityUkraine Presentation at Kansas State University
Ukraine Presentation at Kansas State University
 
Communicatons Fulbright
Communicatons FulbrightCommunicatons Fulbright
Communicatons Fulbright
 
Communications Ternopil
Communications TernopilCommunications Ternopil
Communications Ternopil
 

Kürzlich hochgeladen

Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxCarlos105
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxnelietumpap1
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptxmary850239
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 

Kürzlich hochgeladen (20)

Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Q4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptxQ4 English4 Week3 PPT Melcnmg-based.pptx
Q4 English4 Week3 PPT Melcnmg-based.pptx
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx4.18.24 Movement Legacies, Reflection, and Review.pptx
4.18.24 Movement Legacies, Reflection, and Review.pptx
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 

MedEx'10

  • 1. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010) 19th World Wide Web Conference WWW-2010 26-30 April 2010: Raleigh Conference Center, Raleigh, NC, USA Animal Disease EventRecognition and Classification Svitlana Volkova, Doina Caragea, William H. Hsu, Swathi Bujuru Laboratory for Knowledge Discovery in Databases Department of Computing and Information Sciences Kansas State University Sponsor: K-State National Agricultural Biosecurity Center (NABC)
  • 2. Agenda The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010) Overview Related Work Methodology Entity Recognition Event Sentence Classification Event Tuple Generation Experiment Summary and Future Work
  • 3. Animal Infectious Disease Outbreaks influence on the travel and trade cause economic crises, political instability; diseases, zoonotic in type can cause loss of life. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 4. Animal Disease-related Data Online Structured Data Unstructured Data Official reports by different organizations: state and federal laboratories, bioportals; health care providers; governmental agricultural or environmental agencies. Web-pages News E-mails (e.g., ProMed-Mail) Blogs Medical literature (e.g., books) Scientific papers (e.g., PubMed) The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 5. Infectious Disease Informatics (IDI) Event example: “On 12 September 2007, a new foot-and-mouth disease outbreak was confirmed in Egham, Surrey” *Event is an occurrence of a disease within a particular time and space range. **Outbreak is a set of disease-related events which are constrained in space and have temporal overlap The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 6. Problem Statement The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010) How to recognize animal disease-related events from unstructured web documents? How to accurately extract event-related entities? Disease Names, Viruses, Serotypes Locations Species Dates How to classify event-related and non-related sentences? How to define the confirmation status of the event? How to generate true event tuples?
  • 7. Related Work BioCaster - http://biocaster.nii.ac.jp/ + crawls news and uses ontology pattern matching approaches to recognize disease-location-verb pairs; - does not classify events as suspected or confirmed. HealthMap - http://healthmap.org/en + crawls data from Google News and the ProMED-Mail; - manually supported web system. Pattern-based Understanding and Learning System (PULS) - http://sysdb.cs.helsinki.fi/puls/jrc/all - does not classify events and does not report past outbreaks. Our approach: Consider such event attributes as: disease, date, location, species and confirmation status; Classify events into two categories such as: suspected or confirmed Identify events in the historical data, e.g. medical literature, papers. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 8. Methodology Step 1. Entity recognition from raw text. Step 2. Sentence classification from which entities are extracted as being related to an event or not; if they are related to an event we classify them as confirmed or suspected. Step 3. Combination of entities within an event sentence into the structured tuples and aggregation of tuples related to the same event into one comprehensive tuple. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 9. Domain Meta-data Domain-independent knowledge Domain-specific knowledge Location hierarchy names of countries, states, cities; Time hierarchy canonical dates. Medical ontology diseases, serotypes, and viruses. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 10. Step1: Entity Recognition The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010) Locate and classify atomic elements into predefined categories: Disease names:“foot and mouth disease”, “rift valley fever”; viruses: “picornavirus”; serotypes: “Asia-1”; Species: “sheep”, “pigs”, “cattle” and “livestock”; Locationsof events specified at different levels of geo-granularity: “United Kingdom", “eastern provinces of Shandong and Jiangsu, China”; Datesin different formats: “last Tuesday”, “two month ago”.
  • 11. Entity Recognition Tools Animal Disease Extractor* relies on a medical ontology, automatically-enriched with synonyms and causative viruses. Species Extractor* pattern matching on a stemmed dictionary of animal names from Wikipedia. Location Extractor Stanford NER Tool** (uses conditional random fields); NGA GEOnet Names Database (GNS)*** for location disambiguation and retrieving latitude/longitude. Date/Time Extractor set of regular expressions. *KDD KSU DSEx - http://fingolfin.user.cis.ksu.edu:8080/diseaseextractor/ **Stanford NER - http://nlp.stanford.edu/ner/index.shtml ***GNS - http://earth-info.nga.mil/gns/html/ The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 12. Step 2: Event Sentence Classification Constraint: True events should include a disease name together with a status verb (eliminate event non-related sentences). “Foot and mouth disease is[V] a highly pathogenic animal disease”. Confirmed status verbs “happened” and verb phrases “strike out” “On 9 Jun 2009, the farm's owner reported[V] symptoms of FMD in more than 30 hogs”. Suspected status verbs “catch” and verb phrases “be taken in” “RVF is suspected[V] in Saudi Arabia in September 2000”. Extend the Initial List with synonyms from GoogleSets* and WordNet** *GoogleSets - http://labs.google.com/sets **WordNet - http://wordnet.princeton.edu/ The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 13. Step 3: Event Tuple Generation The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010) Event attributes: disease date location species confirmation status Event tuple: Eventi = < disease; date; location; species; status > = <FMD, 9 Jun 2009, Taoyuan, hog, confirmed> Event tuple with missing attributes: Eventj = <FMD, ?, ?, ?, confirmed>
  • 14. Event Recognition Workflow Step 1: Entity Recognition Foot-and-mouth disease[DIS]on hog[SP] farm in Taoyuan[LOC]. Taiwan's TVBS television station reports that agricultural authorities confirmed foot-and-mouth disease[DIS] on a hog[SP] farm in Taoyuan[LOC]. On 9 Jun 2009[DT], the farm's owner reported symptoms of FMD[DIS] in more than 30 hogs[SP]. Subsequent testing confirmed FMD[DIS]. Agricultural authorities asked the farmer to strengthen immunization. The outbreak has not affected other farms. Authorities stipulated that the affected hog[SP] farm may not sell pork for 2 weeks. Step 2: Sentence Classification YES 1. Foot-and-mouth disease[DIS]on hog[SP] farm in Taoyuan[LOC]. YES 2.Taiwan's TVBS television station reports that agricultural authorities confirmedfoot-and-mouth disease[DIS]on a hog[SP] farm in Taoyuan[LOC]. YES 3. On 9 Jun 2009[DT], the farm's owner reported symptoms of FMD[DIS] in more than 30 hogs[SP]. YES 4. Subsequent testing confirmedFMD[DIS]. NO 5. Agricultural authorities asked the farmer to strengthen immunization. NO 6. The outbreak has not affected other farms. NO 7. Authorities stipulated that the affected hog[SP] farm may not sell pork for 2 weeks. Step 3a: Tuple Generation E1 = <Foot-and-mouth disease, ?, Taoyuan, hog, ?> E3 = <FMD, 9 Jun 2009,?, hog, reported> E2 = <Foot-and-mouth disease, ?, Taoyuan, hog, confirmed > E4 = <FMD, ?, ?, ?, confirmed> Step 3b: Tuple Aggregation E = <disease, date, location, species, status> = <Foot-and-mouth disease, 9 Jun 2009, Taoyuan, hog, confirmed > The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 15. Rule-based event recognition approach Step 1. Entity Recognition Step 2. Event Sentence Classification Step 3. Event Tuple Generation & Aggregation Tuple aggregation is based on the set of rules: - disease name is the main attribute for the event tuple; - optimally combine the event tuples around a disease name (event tuple should have max number of event attributes); - if tuple has a new disease, then form the next tuple. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 16. Experiment The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010) ~100 event-related documents Foot-and-mouth disease (FMD) Rift valley fever (RVF) Manually created 2 sets of summaries for 100 docs DUCView Pyramid Scoring Tool* – Score [0..1] relies on multiple summaries to assign the significance weights to summarization content units (i.e., entities) to compare automatically generated event tuples with entities from human summaries. Scorei = < wddisease; wtdate; wllocation; wsspecies; wcstatus… >, subject to disease + status = 2
  • 17. Event Score Distribution by Range We interpret the Pyramid score values as an event extraction accuracy: # of unique contributing entities (TP); # of entities not in the summary (FP); # of extra contributing entities from summary (FN). multiple summaries – majority voting for annotation. The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 18. Experimental Results The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 19. Summary & Future Work Sentence-based Event Recognition and Classification Entity and Confirmation Status Extraction Methods apply several lists of verbs for confirmation status extraction: initial list, lists augmented using GoogleSets and WordNet Pyramid method and DUCView Tool for scoring automatically generated event tuples. Future Work: Deeper syntactic analysis of the sentence Co-reference resolution Multilingual event recognition and classification The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)
  • 20. Thank you! The First International Workshop on Web Science and Information Exchange in the Medical Web (MedEx 2010)