Multilingual Ontology for Plant Health Threats Media Monitoring

Development and testing of the media monitoring tool MedISys for the early identification and reporting of existing and emerging plant health threats guided by a plant health threats ontology

Published in: Technology

  1. 1. GRIHO Research Group, INSPIRES Research Centre, Universitat de Lleida • Roberto García, Josep Maria Brunetti*, Rosa Gil, Jordi Virgili, Toni Granollers • Multilingual Ontology for Plant Health Threats Media Monitoring (A Smart Data Approach)
  2. 2. Media Monitoring for New and (Re)Emerging Plant Health Threats • Project: development and testing of the media monitoring tool MedISys for the early identification and reporting of existing and emerging plant health threats • Timing (duration): January 2014 – June 2016 (2.5 years) • Funding: EFSA • Coordination: Universitat de Lleida (UdL) • Partners: IRTA and UdL • Other participants: Joint Research Centre (European Commission) • Objectives: • Collate new and appropriate media information sources • Multilingual ontology for the global identification of emerging new plant health threats to be appended to MedISys • English, Spanish, Italian, French, Dutch, German, Portuguese, Russian, Chinese and Arabic • Develop and test strategies to monitor re-emerging plant health threats on global and regional scale • Analyse and test approaches to report identified signals to EFSA Units and experts through MedISys
  3. 3. Approach • Ontology: key component of the developed system that structures and provides knowledge about plant health threats • Knowledge captured from existing sources and experts • Guides applications for • Knowledge capture • Indirect sources search • Terms translation • Media monitoring categories generation • Definition: an ontology is a formal, explicit specification of a shared conceptualisation • [Diagram annotations: conceptualisation = abstract model of a portion of the world; formal = machine-readable and understandable; shared = based on a consensus; explicit = expressed in terms of concepts, properties,...]
  4. 4. Ontology Skeleton • Collected 140 pests/diseases from EPPO Alerts, 2000/29-1-A-1 and EU Emergency Control Measures • 117 linked to UniProt Taxonomy: • Taxonomic information, scientific/common/other names,… • 47 also linked to Wikipedia • Common names in multiple languages
  5. 5. Plant Health Threats Ontology • Enrich ontology with affected crops, hosts, vectors, symptom expressions…
  6. 6. Plant Health Threats Ontology • All concepts linked to labels in different languages • Extract as keywords for MedISys or Web search filters,… • Example: “Maladie de Pierce” OR ( “grapevine” AND “sharpshooter” ) • [Diagram: Xylella fastidiosa] • subClassOf: Gammaproteobacteria • host: Nerium oleander, Prunus salicina, Medicago sp., Sorghum halepense,… • vector: Homalodisca coagulata, Graphocephala sp., Oncometopia sp., Draeculacephala sp.,… • crop: Grapevine, Citrus, Olive, Almond, Peach, Coffee,… • Threat labels: “Pierce's disease”, “Citrus variegated chlorosis” (en); “Maladie de Pierce” (fr); “葉緣焦枯病菌” (zh) • Vector labels: “Glassy-winged sharpshooter”, “Spittlebugs”, “Froghoppers”, “Planthoppers”,… (en) • Crop labels: “vite” (it),… …
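
The keyword extraction on slide 6 can be illustrated with a minimal Python sketch. It assumes a simplified dictionary stands in for one ontology entry; the field names (`names`, `crops`, `vectors`) and the helper functions are hypothetical and are not part of the project's ontology or of MedISys. Only the terms themselves come from the slide.

```python
# Hypothetical, simplified stand-in for one ontology entry.
# Field names are assumptions; the terms are taken from the slide.
XYLELLA = {
    "names": {
        "en": ["Pierce's disease", "Citrus variegated chlorosis"],
        "fr": ["Maladie de Pierce"],
    },
    "crops": {"en": ["grapevine"]},
    "vectors": {"en": ["sharpshooter"]},
}


def any_of(terms):
    """Quote each term and join with OR (parenthesised if more than one)."""
    quoted = [f'"{t}"' for t in terms]
    if len(quoted) == 1:
        return quoted[0]
    return "( " + " OR ".join(quoted) + " )"


def build_query(threat, name_lang, concept_lang):
    """Build: threat names OR ( crop terms AND vector terms )."""
    parts = []
    names = threat["names"].get(name_lang, [])
    if names:
        parts.append(any_of(names))
    crops = threat["crops"].get(concept_lang, [])
    vectors = threat["vectors"].get(concept_lang, [])
    if crops and vectors:
        parts.append(f"( {any_of(crops)} AND {any_of(vectors)} )")
    return " OR ".join(parts)


print(build_query(XYLELLA, "fr", "en"))
```

With French threat names and English crop and vector terms, this reproduces the shape of the example query on the slide. A real exporter would presumably iterate over all languages in the ontology.
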
  7. 7. Ontology Editor • Assist experts during the knowledge capture process • http://indagus.udl.cat/medisys/editor/
  8. 8. Ontology Editor – forms with assistance
  9. 9. Ontology Editor - autocomplete
  10. 10. Ontology Editor - symptoms form
  11. 11. Semi-automatic Translation
  12. 12. Multilingual Ontology • Threat names • 1609 terms • 27 languages • Terms per language: Not available 617 (38%); Latin 375 (23%); English 262 (16%); French 81 (5%); German 68 (4%); Spanish 65 (4%); Japanese 21 (1%); Dutch 17 (1%); Italian 16 (1%); Portuguese 15 (1%); Finnish 8 (1%); Chinese 7 (1%); Russian 6 (1%); Other 51 (3%)
  13. 13. Ontology - symptom expression • Symptom Expression = symptom + plant part • Set of symptoms and plant parts from CABI form and Plant Ontology • 37 symptoms: – abnormal fall, premature fall – abnormal patterns, chlorotic rings – abnormal shape, malformation, distortion – boring, drilling, internal feeding, mining, tunnelling – canker – chlorosis – colour inversion – curling, curl – dieback – discoloration, discolouration – dwarfing – early senescence, premature senescence – empty – feeding – frass – gummosis – lesion, lesions – mottled, mottle – mummification, wrinkled, hard skin – dead, death, necrosis – odour – premature drop – premature ripening – reddening – reduced size, smaller – resinosis – roll, rolling – rosetting – rot, rotting – burn, scorch – splitting – stunting – thicker – fallen, toppled, falling – rooted out, uprooted – wilt, wilting – yellowing • 356 terms for symptoms
  14. 14. Ontology - symptom expression • Symptom Expression = Symptom + Plant Part • 6 Plant Parts: – fruit – plant, tree, whole plant – bud, sprout – stem – seed, seeds – leaf, leaves • Examples: – Whole Plant Dwarfing – Leaf Scorch – Stems Stunting – Leaf Reddening – Fruit Premature Drop – Seeds Discoloration – Leaf Mottle • 96 plant part terms
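
As a rough illustration of "Symptom Expression = Symptom + Plant Part", the sketch below pairs each plant part with each symptom and expands both through their synonym lists. The dictionary layout and function names are assumptions made for this example; only a small subset of the terms from slides 13 and 14 is included.

```python
from itertools import product

# Small subsets of the synonym lists on slides 13 and 14.
# The dictionary layout is an assumption made for this sketch.
SYMPTOMS = {
    "scorch": ["burn", "scorch"],
    "dwarfing": ["dwarfing"],
}
PLANT_PARTS = {
    "leaf": ["leaf", "leaves"],
    "whole plant": ["plant", "tree", "whole plant"],
}


def expression_label(part, symptom):
    """Human-readable label, e.g. 'Leaf Scorch'."""
    return f"{part} {symptom}".title()


def expression_terms(part, symptom):
    """All synonym pairings for one symptom expression."""
    return [f"{p} {s}" for p, s in product(PLANT_PARTS[part], SYMPTOMS[symptom])]


print(expression_label("whole plant", "dwarfing"))
print(expression_terms("leaf", "scorch"))
```

The synonym expansion is what makes the term count grow quickly: two plant-part synonyms and two symptom synonyms already yield four pairings for a single expression.
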
  15. 15. Ontology Browser • Complex queries • Example: “all threats with symptoms affecting the leaves” http://indagus.udl.cat/plantHealthThreats/
  16. 16. Identification of Information Sources to Monitor • Objective: collect relevant information sources to be monitored by MedISys • Methodology • Identify information sources already known by experts, previous research projects, official sources like EPPO, journals,… → Direct Sources • Identify previously unknown web information sources (newspapers, blogs, websites, etc.) discovered using search engines and ontology terms → Indirect Sources • Analyse and evaluate all collected sources using an Information Quality measure • First, filter duplicates, irrelevant, non-monitorable, etc.
  17. 17. Methodology • [Flow diagram: Plant Health Threats Sources Inventory] • Known Sources: reference resources (expert knowledge); existing projects related to pest and food/feed risks (EFSA); MedISys sources (JRC) • Web Search: search mechanisms (query process) guided by the ontology • Filtering and evaluation process → list of relevant sources (one per branch) • Filtering process (avoid duplicates & evaluation) → final list • 1956 sources (72 known + 1884 web search)
  18. 18. Monitor Known Threats • Known threats: explicit mention of the threat name • Generated automatically from the ontology • MedISys category for each threat with list of keywords (terms) with threshold • 117 categories for known threats: • Bacteria: Xylella fastidiosa, Acidovorax citrulli,… (6) • Fungi: Ceratocystis fagacearum, Diplocarpon mali,… (18) • Insects: Agrilus coxalis auroguttatus, Agrilus planipennis,… (54) • Mollusks: Pomacea (1) • Nematodes: Bursaphelenchus xylophilus, Nacobbus aberrans,… (7) • Oomycetes: Phytophthora ramorum (1) • Phytoplasma: Elm yellows phytoplasma, Candidatus Phytoplasma pruni,… (7) • Viroid: Tomato apical stunt viroid, Potato spindle tuber viroid (2) • Virus: Andean potato latent virus, Andean potato mottle virus,… (21) • http://medisys.newsbrief.eu/medisys/groupedition/en/PlantHealthAll.html • Keyword sources and thresholds: scientific names 100; common names (all languages) 100; other names 100
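
The automatic generation of one category per known threat might look roughly like the following Python sketch. The record layout, the `build_category` and `mentions_threat` helpers, and the plain substring matching are all assumptions for illustration. The slides do not describe how MedISys applies the threshold value, so the value 100 is simply stored next to each keyword.

```python
THRESHOLD = 100  # value shown on the slide for all three keyword sources

# Hypothetical record for one threat; field names are assumptions.
XYLELLA = {
    "scientific_names": ["Xylella fastidiosa"],
    "common_names": ["Pierce's disease", "Maladie de Pierce"],
    "other_names": [],
}


def build_category(threat):
    """Collect every name of a threat as a keyword with its threshold."""
    keywords = {}
    for source in ("scientific_names", "common_names", "other_names"):
        for term in threat.get(source, []):
            keywords[term.lower()] = THRESHOLD
    return {"name": threat["scientific_names"][0], "keywords": keywords}


def mentions_threat(category, text):
    """Simplified check: does any keyword occur in the text?"""
    lowered = text.lower()
    return any(keyword in lowered for keyword in category["keywords"])


category = build_category(XYLELLA)
print(category["name"], len(category["keywords"]))
print(mentions_threat(category, "New outbreak of Xylella fastidiosa in olive groves"))
```

Because the keywords are the threat's own names, an item that describes the symptoms without naming the threat is not selected, which is the limitation the unknown-threat categories try to address.
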
  19. 19. Monitor Unknown Threats • Unknown Threats: name not explicitly mentioned • Approach 1: manual generation of MedISys categories by experts • http://medisys.newsbrief.eu/medisys/filteredition/en/EFSAUnknownPestFilteredEmailAlert.html • Example category definition: a combination of Combinations (Proximity: 15): at least one of alien, danger, dangerous, deadly… and at least one of agricultural, agriculture, almond… and at least one of bacteria, bacterial, crop+failure,… but none of allergies, allergy, animal+abuse,…
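
To make the structure of such an expert-defined category concrete, here is a minimal Python sketch of a matcher with required word groups, excluded words and a proximity window. It is not the MedISys implementation. It assumes that "Proximity: 15" means one hit from every required group must fall within a span of at most 15 tokens, and it handles single words only, so multi-word terms such as crop+failure are left out. The word lists are the truncated ones shown on the slide.

```python
import re
from itertools import product

# Word groups as far as they are listed on the slide (the lists are truncated there).
REQUIRED_GROUPS = [
    {"alien", "danger", "dangerous", "deadly"},
    {"agricultural", "agriculture", "almond"},
    {"bacteria", "bacterial"},
]
EXCLUDED = {"allergies", "allergy"}


def matches_combination(text, required_groups, excluded, proximity=15):
    """True if no excluded word occurs and one hit from every required
    group lies within a span of at most `proximity` tokens (assumed meaning)."""
    tokens = re.findall(r"\w+", text.lower())
    if any(token in excluded for token in tokens):
        return False
    # Token positions of the hits for each required group.
    positions = [
        [i for i, token in enumerate(tokens) if token in group]
        for group in required_groups
    ]
    if any(not hits for hits in positions):
        return False
    # Try every choice of one hit per group and test its span.
    return any(max(combo) - min(combo) <= proximity for combo in product(*positions))


headline = "A deadly bacterial outbreak threatens almond orchards"
print(matches_combination(headline, REQUIRED_GROUPS, EXCLUDED))
```

Trying every choice of hits is fine for short news items; a production matcher would likely use a sliding window instead.
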
  20. 20. Monitor Unknown Threats • Approach 2: automatic generation from ontology (multilingual) • Concepts associated with the threats (but not their names) • Affected crops, vectors, hosts, symptoms, plant parts,... • Currently, the ontology models the symptoms for just 7 threats: • Phytophthora ramorum, Anoplophora glabripennis, Bactrocera tryoni, Agrilus planipennis, Xylella fastidiosa, Candidatus Liberibacter and Rhynchophorus ferrugineus • http://medisys.newsbrief.eu/medisys/alertedition/en/AgrilusPlanipennis-PHT-Symptoms.html • http://medisys.newsbrief.eu/medisys/alertedition/en/AnoplophoraGlabripennis-PHT-Symptoms.html • http://medisys.newsbrief.eu/medisys/alertedition/en/BactroceraTryoni-PHT-Symptoms.html • http://medisys.newsbrief.eu/medisys/alertedition/en/CandidatusLiberibacter-PHT-Symptoms.html • http://medisys.newsbrief.eu/medisys/alertedition/en/PhytophthoraRamorum-PHT-Symptoms.html • http://medisys.newsbrief.eu/medisys/alertedition/en/RhynchophorusFerrugineus-PHT-Symptoms.html • http://medisys.newsbrief.eu/medisys/alertedition/en/XylellaFastidiosa-PHT-Symptoms.html • Combinations tree (Proximity 10): Affected crop AND Symptom AND Plant Part (example: “walnut” AND “necrosis” AND “tree”) OR Affected crop AND Vectors (example: “lime” AND “asian citrus psyllid”)
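
The combinations tree can be sketched in Python as two families of rules generated from a threat record: crop AND symptom AND plant part, and crop AND vector. The record layout and function names are hypothetical, and the two example records only reuse the terms from the slide without naming the threats they belong to. Proximity is ignored here and matching is by plain substring, both simplifications.

```python
from itertools import product

# Hypothetical records reusing the example terms from the slide.
EXAMPLE_A = {"crops": ["walnut"], "symptoms": ["necrosis"],
             "plant_parts": ["tree"], "vectors": []}
EXAMPLE_B = {"crops": ["lime"], "symptoms": [],
             "plant_parts": [], "vectors": ["asian citrus psyllid"]}


def combination_rules(threat):
    """Each rule is a tuple of terms that must all occur (logical AND);
    the list of rules as a whole is a logical OR."""
    rules = list(product(threat["crops"], threat["symptoms"], threat["plant_parts"]))
    rules += list(product(threat["crops"], threat["vectors"]))
    return rules


def matches_any(rules, text):
    """Simplified check without proximity. Substring matching also accepts
    'tree' inside 'street', one way ambiguous terms can add noise."""
    lowered = text.lower()
    return any(all(term in lowered for term in rule) for rule in rules)


print(combination_rules(EXAMPLE_A))
print(matches_any(combination_rules(EXAMPLE_B),
                  "Asian citrus psyllid found on lime trees"))
```

None of the rules contains a threat name, so an item can be selected even when the threat itself is never mentioned.
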
  21. 21. Results • Known threats • MedISys categories using threat names as keywords are very effective • Example Xylella fastidiosa: • 5078 relevant news items selected from February 2015 to May 2016 (16 months) • However, they miss items not explicitly mentioning the threat • Unknown threats • Manually defined categories by experts • 80% of items relevant • 10 items per day • Categories generated automatically using symptoms, crops, vectors… • 60% of items relevant • Just 7 per week • A lot of noise, term ambiguity • Added negative words to filter false positives, but this increased false negatives • Anyway, just preliminary work (just 7 threats modelled)…
  22. 22. Future work • Build a Disease-Symptom network like for human health? • Zhou, X., Menche, J., Barabási, A. L., & Sharma, A. (2014). Human symptoms–disease network. Nature Communications, 5
  23. 23. Thank you very much for your attention • Questions? • Roberto García • rgarcia@diei.udl.cat • http://rhizomik.net/~roberto/
