Extracting, representing and mining Semantic Metadata from text: Facilitating Knowledge Discovery in Biomedicine
  Cartic Ramakrishnan
  Advisor: Dr. Amit Sheth
  Committee Members: Dr. Michael Raymer, Dr. Guozhu Dong, Dr. Thaddeus Tarpey, Dr. Vasant Honavar, Dr. Shaojun Wang
Overview
What Knowledge Discovery is NOT
What is knowledge discovery?
Element of surprise – Swanson’s discoveries
  Figure labels: Magnesium, Migraine, PubMed, ?, Stress, Spreading Cortical Depression, Calcium Channel Blockers
  Swanson’s discoveries: associations were discovered through keyword searches followed by manual analysis of the text to establish possibly relevant relationships.
  11 possible associations found.
Knowledge Discovery in AI – The robot scientist
  Planned search over a well-defined (axiomatic) space leading to knowledge discovery.
  Knowledge discovery by humans is done in non-axiomatic, ill-defined spaces over multi-modal data.
  Scientific literature is an ill-defined and loosely structured source of data used in scientific investigations.
  Assigning structure and interpretation to text (Semantics): Syntax, Structure, Semantics
Knowledge Discovery = Extraction + Heuristic Aggregation
  Undiscovered Public Knowledge
Information Extraction & Text Mining
  Example sentence: "This MEK dependency was observed in BRAF mutant cells regardless of tissue lineage, and correlated with both downregulation of cyclin D1 protein expression and the induction of G1 arrest."
  Segmentation and classification:
    * MEK dependency ISA Dependency_on_an_Organic_chemical
    * BRAF mutant cells ISA Cell_type
    * downregulation of cyclin D1 protein expression ISA Biological_process
    * tissue lineage ISA Biological_concept
    * induction of G1 arrest ISA Biological_process
  Named relationship extraction:
    MEK dependency, observed in, BRAF mutant cells
    MEK dependency, correlated with, downregulation of cyclin D1 protein expression
    MEK dependency, correlated with, induction of G1 arrest
  Information Extraction = segmentation + classification + association + mining
  Text mining = entity identification + named relationship extraction + discovering association chains…
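The output of segmentation, classification and named relationship extraction for the MEK example sentence can be held in two small data structures. The sketch below is only an illustration: the variable and function names are hypothetical, and attaching both "correlated with" relationships to "MEK dependency" is an assumption read off the sentence rather than something the slide states explicitly.

```python
# Phrase -> class assignments, as listed on the slide (segmentation + classification).
ENTITY_CLASSES = {
    "MEK dependency": "Dependency_on_an_Organic_chemical",
    "BRAF mutant cells": "Cell_type",
    "downregulation of cyclin D1 protein expression": "Biological_process",
    "tissue lineage": "Biological_concept",
    "induction of G1 arrest": "Biological_process",
}

# Named relationships as (subject, predicate, object) triples.
# Assumption: "MEK dependency" is the subject of both "correlated with" relations.
TRIPLES = [
    ("MEK dependency", "observed in", "BRAF mutant cells"),
    ("MEK dependency", "correlated with",
     "downregulation of cyclin D1 protein expression"),
    ("MEK dependency", "correlated with", "induction of G1 arrest"),
]


def lift_to_classes(triples, classes):
    """Replace each entity phrase by its class to get class-level triples."""
    return [(classes[s], p, classes[o]) for s, p, o in triples]


CLASS_LEVEL = lift_to_classes(TRIPLES, ENTITY_CLASSES)

if __name__ == "__main__":
    for triple in CLASS_LEVEL:
        print(triple)
```

Lifting instance-level triples to the class level is one simple way to see which relationship patterns recur across sentences; here the two "correlated with" triples collapse to the same class-level pattern.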
Overview
Knowledge Discovery over text
  Figure labels: Text; Extraction of Semantics from text (assigning interpretation to text); Semantic metadata in the form of semi-structured data
  Semantic Metadata Guided Knowledge Explorations; Semantic Metadata Guided Knowledge Discovery
  Triple-based Semantic Search; Semantic browser; Subgraph discovery
Ontology-enabled Information Extraction
  Cartic Ramakrishnan, Krys Kochut, Amit P. Sheth: A Framework for Schema-Driven Relationship Discovery from Unstructured Text. International Semantic Web Conference 2006: 583-596
Comparison with standard IE
Information Extraction via Ontology assisted text mining – Relationship extraction
  UMLS Semantic Network (figure labels): Biologically active substance, Lipid, Disease or Syndrome; relationships affects, causes, complicates
  MeSH (figure labels): Fish Oils (instance_of), Raynaud’s Disease (instance_of); relationship between them: ???????
  PubMed (figure labels): 9284 documents, 4733 documents, 5 documents
Background knowledge and Data used
Method – Parse Sentences in PubMed
  SS-Tagger (University of Tokyo), SS-Parser (University of Tokyo)
  (TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )
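The bracketed parse above is a nested structure that a few lines of code can read back into a tree. The following is a minimal sketch, not the tooling used in the work: it parses the bracketed string with the standard library only and lists the noun phrases and the verb. The function names are hypothetical.

```python
PARSE = (
    "(TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) "
    "(JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) "
    "(VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) "
    "(PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )"
)


def parse_bracketed(text):
    """Parse a bracketed constituency string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        assert tokens[pos] == "("
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())       # nested constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    return read()


def leaves(tree):
    """Return the words under a subtree, in sentence order."""
    words = []
    for child in tree[1]:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


def phrases(tree, wanted):
    """Yield the text of every subtree labelled `wanted`, outermost first."""
    if tree[0] == wanted:
        yield " ".join(leaves(tree))
    for child in tree[1]:
        if isinstance(child, tuple):
            yield from phrases(child, wanted)


TREE = parse_bracketed(PARSE)
NOUN_PHRASES = list(phrases(TREE, "NP"))
VERBS = list(phrases(TREE, "VBZ"))

if __name__ == "__main__":
    print(NOUN_PHRASES)
    print(VERBS)
```

The nested and overlapping noun phrases this returns (for example "adenomatous hyperplasia" inside "adenomatous hyperplasia of the endometrium") are the kind of candidates from which modified and composite entities are later identified.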
Method – Identify entities and relationships in Parse Tree
  Figure: parse tree of the example sentence (TOP, S, NP, VP, PP, ADJP nodes over the words "excessive endogenous or exogenous stimulation by estrogen induces adenomatous hyperplasia of the endometrium"), with annotations MeSHID D004967, MeSHID D006965, MeSHID D004717 and UMLS ID T147
  Figure legend: Modifiers, Modified entities, Composite Entities
Representation – Resulting RDF
  Figure legend: Modifiers, Modified entities, Composite Entities
Preliminary Results
Paths between Migraine and Magnesium
  Paths are considered interesting if they contain one or more named relationships other than hasPart or hasModifiers.
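The interestingness criterion can be expressed as a filter over paths in a triple graph. This is a minimal sketch under stated assumptions: the toy triples, the placeholder node names (`entity_1`, `entity_2`) and the relationship `rel_a` are invented purely for illustration and are not extracted data, and edges are traversed in both directions, which the slide does not specify.

```python
STRUCTURAL = {"hasPart", "hasModifiers"}

# Toy graph with placeholder nodes; NOT data from the dissertation.
TOY_TRIPLES = [
    ("Migraine", "hasModifiers", "entity_1"),
    ("entity_1", "hasPart", "Magnesium"),
    ("Migraine", "rel_a", "entity_2"),
    ("entity_2", "hasPart", "Magnesium"),
]


def simple_paths(triples, start, goal, max_edges=4):
    """Enumerate paths without repeated nodes, treating triples as undirected edges."""
    adjacency = {}
    for s, p, o in triples:
        adjacency.setdefault(s, []).append((p, o))
        adjacency.setdefault(o, []).append((p, s))

    found = []

    def walk(node, visited, edges):
        if node == goal:
            found.append(edges)
            return
        if len(edges) == max_edges:
            return
        for predicate, neighbour in adjacency.get(node, []):
            if neighbour not in visited:
                walk(neighbour, visited | {neighbour},
                     edges + [(node, predicate, neighbour)])

    walk(start, {start}, [])
    return found


def is_interesting(path):
    """True if at least one edge is a named relationship, not a structural one."""
    return any(predicate not in STRUCTURAL for _, predicate, _ in path)


PATHS = simple_paths(TOY_TRIPLES, "Migraine", "Magnesium")
INTERESTING = [path for path in PATHS if is_interesting(path)]

if __name__ == "__main__":
    for path in INTERESTING:
        print(path)
```

In the toy graph, the path that uses only hasModifiers and hasPart is discarded, while the path through `rel_a` is kept.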
An example of such a path
Interesting Observations from this preliminary work
Observations – Sentence characteristics and Parsing
  (TOP (S (NP (NP (DT An) (JJ excessive) (ADJP (JJ endogenous) (CC or) (JJ exogenous) ) (NN stimulation) ) (PP (IN by) (NP (NN estrogen) ) ) ) (VP (VBZ induces) (NP (NP (JJ adenomatous) (NN hyperplasia) ) (PP (IN of) (NP (DT the) (NN endometrium) ) ) ) ) ) )
Observations – Complex entities with nesting and overlapping structure
Possible Strategies
Unsupervised Joint Extraction of Compound Entities and Relationship
  Cartic Ramakrishnan, Pablo N. Mendes, Shaojun Wang and Amit P. Sheth: "Unsupervised Discovery of Compound Entities for Relationship Extraction". EKAW 2008 - 16th International Conference on Knowledge Engineering and Knowledge Management Knowledge Patterns
Joint Extraction approach
  amod = adjectival modifier; nsubjpass = nominal subject in passive voice
  Figure labels: governor, dependent
Stanford Dependency Hierarchy
Hierarchy used to generalize the rules
Algorithm
  Figure labels: Relationship head, Subject head, Object head, Object head
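The details of the algorithm did not survive in this transcript, so the following is not the author's method. It is a minimal sketch of the general idea behind the labels "relationship head", "subject head" and "object head": start from typed dependencies written as rel(governor, dependent), find a verb with a subject and an object, and grow each head into a compound entity using its modifiers. The dependency list is partial and hand-written for the example sentence (an assumption, not parser output), and the rule sets are illustrative choices.

```python
SENTENCE = ("An excessive endogenous or exogenous stimulation by estrogen "
            "induces adenomatous hyperplasia of the endometrium")
ORDER = SENTENCE.split()  # word order, used to print modifiers before heads

# (relation, governor, dependent); partial and hand-written for illustration.
DEPS = [
    ("nsubj", "induces", "stimulation"),
    ("dobj", "induces", "hyperplasia"),
    ("amod", "stimulation", "excessive"),
    ("amod", "hyperplasia", "adenomatous"),
    ("det", "endometrium", "the"),
]

SUBJECT_RELS = {"nsubj", "nsubjpass"}
OBJECT_RELS = {"dobj"}
MODIFIER_RELS = {"amod", "nn"}


def compound(head, deps, order):
    """Head word plus its direct amod/nn dependents, in sentence order."""
    words = [head] + [dep for rel, gov, dep in deps
                      if gov == head and rel in MODIFIER_RELS]
    return " ".join(sorted(words, key=order.index))


def extract_triples(deps, order):
    """Pair every subject head with every object head of the same relationship head."""
    triples = []
    for rel, verb, subject in deps:
        if rel not in SUBJECT_RELS:
            continue
        for rel2, gov, obj in deps:
            if gov == verb and rel2 in OBJECT_RELS:
                triples.append((compound(subject, deps, order), verb,
                                compound(obj, deps, order)))
    return triples


EXTRACTED = extract_triples(DEPS, ORDER)

if __name__ == "__main__":
    print(EXTRACTED)
```

Because entity boundaries and the relationship are read from the same dependency structure, the compound entities and the triple are produced together rather than in separate passes.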
Preliminary results
Extracted Triples
Analysis of compound entities
Predicting the constituents to compound entities
What is Mutual information?
Dependency-based mutual information
  Counts shown: rel(w_i, w_j), rel(w_j, w_i), rel(w_i, *), rel(*, w_i)
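The exact formula used on the slide is not recoverable from this transcript. As a reference point, standard pointwise mutual information is PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ). The sketch below applies that definition to dependency counts for a single relation type, with the joint count playing the role of rel(w_i, w_j) and the marginal counts the roles of rel(w_i, *) and rel(*, w_j). This mapping is an assumption, and the word pairs are placeholders rather than corpus data.

```python
import math
from collections import Counter


def dependency_pmi(pairs, w_i, w_j):
    """PMI of w_i as governor and w_j as dependent of one relation type.

    pairs: list of (governor, dependent) occurrences of a single relation,
    for example every amod edge observed in a corpus.
    """
    joint = Counter(pairs)                 # count of rel(w_i, w_j)
    as_governor = Counter(g for g, _ in pairs)   # count of rel(w_i, *)
    as_dependent = Counter(d for _, d in pairs)  # count of rel(*, w_j)
    total = len(pairs)
    if joint[(w_i, w_j)] == 0:
        return float("-inf")  # the pair was never observed together
    return math.log2(joint[(w_i, w_j)] * total
                     / (as_governor[w_i] * as_dependent[w_j]))


# Placeholder observations: w1 always governs w2, and w3 always governs w4.
ASSOCIATED = [("w1", "w2"), ("w1", "w2"), ("w3", "w4"), ("w3", "w4")]
# Placeholder observations where governors and dependents mix freely.
INDEPENDENT = [("w1", "w2"), ("w1", "w4"), ("w3", "w2"), ("w3", "w4")]

if __name__ == "__main__":
    print(dependency_pmi(ASSOCIATED, "w1", "w2"))
    print(dependency_pmi(INDEPENDENT, "w1", "w2"))
```

A positive score means the two words occur in the relation more often than their individual frequencies would predict, which is the kind of signal that could suggest they belong to the same compound entity; a score near zero suggests no particular association.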
Predicting constituents
Results
Evaluations
Evaluation
Demo of Evaluation tool
Evaluation conducted using this tool
Results of Manual Evaluation
Applications
Semantic Metadata Guided Knowledge Explorations and Discovery
Supporting Knowledge Discovery
Discovering complex connection patterns
Schema-driven edge weight assignment
  Schema (figure labels): company, Entertainment company, Manufacturing company, Oil company, Automotive company, Electronics company, Sporting goods company
  Instances (figure labels): Ford Motors, Cartic’s Company
  Edge weights shown in the figure: 1.0 (six edges), 0.67, 0.33, 0.33, <0.5
Heuristics used to bias edge weights
Algorithm
Algorithm
Results
Overview
Past work
Other major papers influencing my work
Overview
Hypothesis Driven retrieval of Scientific Literature
  PubMed keyword query: Migraine[MH] + Magnesium[MH]
  Complex query (figure labels): Migraine, Stress, Patient, Magnesium, Calcium Channel Blockers; relationships affects, isa, inhibit
  Supporting document sets retrieved
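The keyword query on this slide can be issued programmatically through the NCBI E-utilities ESearch endpoint. The sketch below only builds the request URL and makes no network call. Reading the "+" on the slide as a boolean AND is an assumption, and the helper names are hypothetical; `[MH]` is PubMed's field tag for MeSH terms.

```python
from urllib.parse import urlencode

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def mesh_query(*headings):
    """Combine MeSH headings into one PubMed query (assumes AND semantics)."""
    return " AND ".join(f"{heading}[MH]" for heading in headings)


def esearch_url(term, retmax=20):
    """Build an ESearch URL for PubMed; the caller decides how to fetch it."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)


QUERY = mesh_query("Migraine", "Magnesium")
URL = esearch_url(QUERY)

if __name__ == "__main__":
    print(QUERY)
    print(URL)
```

A keyword query of this kind returns documents that mention both headings. The complex query on the slide goes further by asking for documents that support specific relationships between the concepts.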
Strength of a connection
Mechanistic Models
Publications
Publications
Publications
Experiences
Acknowledgements
On a lighter note

From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 

Kürzlich hochgeladen (20)

Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 

Cartic Ramakrishnan's dissertation defense

  • 1. Extracting, representing and mining Semantic Metadata from text: Facilitating Knowledge Discovery in Biomedicine. Cartic Ramakrishnan. Advisor: Dr. Amit Sheth. Committee Members: Dr. Michael Raymer, Dr. Guozhu Dong, Dr. Thaddeus Tarpey, Dr. Vasant Honavar, Dr. Shaojun Wang.
  • 5. Element of surprise – Swanson’s discoveries. Figure labels: Magnesium, Migraine, PubMed, ?, Stress, Spreading Cortical Depression, Calcium Channel Blockers. Swanson’s discoveries: associations discovered based on keyword searches followed by manual analysis of text to establish possible relevant relationships. 11 possible associations found.
  • 6. Knowledge Discovery in AI – the robot scientist: planned search over a well-defined (axiomatic) space leading to knowledge discovery. Knowledge discovery by humans is done in non-axiomatic, ill-defined spaces over multi-modal data. Scientific literature is an ill-defined and loosely structured source of data used in scientific investigations. Assigning structure and interpretation to text (Semantics): Syntax → Structure → Semantics
  • 7. Knowledge Discovery = Extraction + Heuristic Aggregation. Undiscovered Public Knowledge.
  • 8. Information Extraction & Text Mining. Example sentence: “This MEK dependency was observed in BRAF mutant cells regardless of tissue lineage, and correlated with both downregulation of cyclin D1 protein expression and the induction of G1 arrest.” Segmentation and classification: MEK dependency ISA Dependency_on_an_Organic_chemical; BRAF mutant cells ISA Cell_type; downregulation of cyclin D1 protein expression ISA Biological_process; tissue lineage ISA Biological_concept; induction of G1 arrest ISA Biological_process. Named relationship extraction: MEK dependency [observed in] BRAF mutant cells; MEK dependency [correlated with] downregulation of cyclin D1 protein expression; MEK dependency [correlated with] induction of G1 arrest. Information Extraction = segmentation + classification + association + mining. Text mining = entity identification + named relationship extraction + discovering association chains….
  • 10. Knowledge Discovery over text. Figure labels: Extraction of Semantics from text; Semantic Metadata Guided Knowledge Explorations; Assigning interpretation to text; Semantic Metadata Guided Knowledge Discovery; Triple-based Semantic Search; Semantic browser; Subgraph discovery; Semantic metadata in the form of semi-structured data; Text.
  • 11. Ontology-enabled Information Extraction. Cartic Ramakrishnan, Krys Kochut, Amit P. Sheth: A Framework for Schema-Driven Relationship Discovery from Unstructured Text. International Semantic Web Conference 2006: 583-596
  • 13. Information Extraction via Ontology assisted text mining – Relationship extraction. Figure labels: Biologically active substance; Lipid; Disease or Syndrome; affects; causes; affects; causes; complicates; Fish Oils; Raynaud’s Disease; ???????; instance_of; instance_of. Sources: UMLS Semantic Network, MeSH, PubMed. Document counts: 9284 documents, 4733 documents, 5 documents.
  • 16. Method – Identify entities and relationships in Parse Tree. Parse-tree figure for the phrase “the excessive endogenous or exogenous stimulation by estrogen induces adenomatous hyperplasia of the endometrium” (node labels: TOP, S, NP, VP, PP, ADJP, DT, JJ, NN, CC, IN, VBZ). Annotations: MeSHID D004967, MeSHID D006965, MeSHID D004717, UMLS ID T147. Legend: Modifiers, Modified entities, Composite Entities.
  • 17. Representation – Resulting RDF. Legend: Modifiers, Modified entities, Composite Entities.
  • 19. Paths between Migraine and Magnesium. Paths are considered interesting if they contain one or more named relationships other than hasPart or hasModifiers.
  • 21. Interesting Observations from this preliminary work
  • 25. Unsupervised Joint Extraction of Compound Entities and Relationship. Cartic Ramakrishnan, Pablo N. Mendes, Shaojun Wang and Amit P. Sheth, "Unsupervised Discovery of Compound Entities for Relationship Extraction", EKAW 2008 - 16th International Conference on Knowledge Engineering and Knowledge Management: Knowledge Patterns
  • 29. Algorithm. Figure labels: Relationship head, Subject head, Object head, Object head.
  • 42. Results of Manual Evaluation
  • 44. Semantic Metadata Guided Knowledge Explorations and Discovery
  • 47. Schema-driven edge weight assignment. Schema: company; Entertainment company; Manufacturing company; Oil company; Automotive company; Electronics company; Sporting goods company. Instances: Ford Motors; Cartic’s Company. Edge weights shown: 0.67, 0.33, <0.5, 1.0 (six edges), 0.33.
  • 56. Hypothesis Driven retrieval of Scientific Literature. PubMed keyword query: Migraine[MH] + Magnesium[MH]. Complex query: supporting document sets retrieved. Figure labels: Migraine; Stress; Patient; affects; isa; Magnesium; Calcium Channel Blockers; inhibit.
  • 64. On a lighter note
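
The path criterion stated on slide 19 (a path is interesting if it contains at least one named relationship other than hasPart or hasModifiers) can be illustrated with a minimal Python sketch. The triple representation, the function names, and the placeholder entities below are assumptions made for illustration only; they are not taken from the slides or from the author's implementation.

```python
from typing import List, Tuple

# A path is modeled here as a list of (subject, predicate, object) triples.
# This representation is an assumption for illustration.
Triple = Tuple[str, str, str]

# Structural predicates named on slide 19; they do not make a path interesting.
STRUCTURAL_PREDICATES = {"hasPart", "hasModifiers"}


def is_interesting(path: List[Triple]) -> bool:
    """Return True if the path has at least one non-structural predicate."""
    return any(pred not in STRUCTURAL_PREDICATES for _, pred, _ in path)


def interesting_paths(paths: List[List[Triple]]) -> List[List[Triple]]:
    """Keep only the paths that satisfy the slide-19 criterion."""
    return [p for p in paths if is_interesting(p)]


# Hypothetical placeholder paths (not data from the dissertation).
structural_only = [
    ("entity_a", "hasPart", "entity_b"),
    ("entity_b", "hasModifiers", "modifier_1"),
]
with_named_relationship = [
    ("entity_a", "hasPart", "entity_b"),
    ("entity_b", "induces", "entity_c"),
]

kept = interesting_paths([structural_only, with_named_relationship])
print(len(kept))
```

Only the second placeholder path passes the filter, because "induces" is a named relationship outside the structural set.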