2. Automatically identify “interesting” concepts
◦ In a graph of semantic predications
◦ Extracted from biomedical research literature
◦ Using graph features
Support discovery browsing
◦ Information retrieval and knowledge discovery
Compare statistical and rule-based models
Train and test on PubMed query logs
3. An extraordinary amount of
digital text is available on
the Web, for example MEDLINE
Ongoing research to extract
valuable information from
text
Exploit this information
through literature-based
discovery (LBD)
Based on semantic
predications
Image retrieved from http://jasonpriem.org/2010/10/medline-literature-growth-chart/
4. Logical subject-predicate-object triples
whose elements are drawn from the Unified
Medical Language System knowledge sources
SemRep extracts semantic predications from
biomedical text
Textual content is represented as
predications consisting of UMLS
Metathesaurus concepts as arguments and
UMLS Semantic Network relations as
predicates
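As a minimal sketch of this data structure, a predication can be held as a subject-predicate-object triple. The class name `Predication`, its field names, and the helper `concepts` are illustrative assumptions, not part of SemRep; the example triples reuse predications quoted elsewhere in this deck.

```python
from typing import NamedTuple


class Predication(NamedTuple):
    """One semantic predication (hypothetical container, not a SemRep type)."""
    subject: str    # UMLS Metathesaurus concept
    predicate: str  # UMLS Semantic Network relation
    obj: str        # UMLS Metathesaurus concept


predications = [
    Predication("phthalate", "STIMULATES", "PPAR gamma"),
    Predication("DEHP", "METABOLIZES_TO", "MEHP"),
    Predication("PPAR gamma", "DECREASES", "Inflammation"),
]


def concepts(preds):
    """Return every concept that appears as a subject or an object."""
    return {c for p in preds for c in (p.subject, p.obj)}


print(sorted(concepts(predications)))
```

Keeping the predicate as its own field lets later steps either use it (counting distinct predicates) or ignore it (counting edges).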
5.
6. [Figure: excerpts from MEDLINE abstracts on inflammation and cancer, overlaid with example predications extracted from the text: beta catenin CAUSES Inflammation; cytokine CAUSES Tumorigenesis; Inflammation Mediators DISRUPTS T-Lymphocyte; Pharmacotherapy TREATS Patients; Inflammation PROCESS_OF Individual; and an AFFECTS predication with the argument "biological process".]
7.
8. Discovery browsing involves iterative search
and seek behavior
Discovery browsing involves identifying
interesting concepts in a graph of semantic
predications
◦ Poorly understood relationships explored through
novel points of view
◦ Potentially interesting relationships need not be
known ahead of time
9. Web application based on SemRep predications
Combines
◦ PubMed search in MEDLINE citations
◦ Automatic summarization of predications extracted
◦ Graphical display
Facilitates iterative search for knowledge
discovery
◦ Identify “interesting” concept in graph
◦ Combined into another search
10. Cairelli et al. elucidate the obesity paradox
◦ Using Semantic MEDLINE for discovery browsing
Obesity normally leads to increased mortality
But, increased obesity predicts decreased
morbidity and mortality in intensive care
11.
# | Search | Resulting citations | Resulting predications | Summarized predications | Interesting term | Interesting relationship
1 | obesity | 20,000 | 118,325 | 22,378 | PPAR gamma | PPAR gamma in inflammation cluster
2 | obesity and PPAR gamma | 1,346 | 13,224 | 6,733 | phthalate | Adipose tissue LOCATION_OF PPAR gamma; phthalate STIMULATES PPAR gamma
3 | PPAR gamma and phthalate | 32 | 368 | 135 | MEHP | MEHP STIMULATES PPAR gamma; DEHP METABOLIZES_TO MEHP
4 | MEHP and intensive care unit | 7 | 51 | 6 | | PVC CONTAINS DEHP; ICU interventions INCREASE PVC exposure
5 | PPAR gamma and intensive care unit | 12 | 150 | 28 | | PPAR gamma DECREASES Inflammation
12.
13. Relevant to user
Relevant to topic
◦ Unexpected
◦ Uncommon
◦ Unfamiliar
Discovery browsing
◦ Requires manual identification of interesting
concepts
14. Explore three models
Naive Bayes
◦ Probabilistic model, attribute independence
Support Vector Machines
◦ Non-probabilistic model
◦ LibSVM
Rule-Induction
◦ Decision tree
◦ RIPPER algorithm
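For reference, the attribute-independence assumption named under Naive Bayes is usually written as below. Here c is the class (interesting or uninteresting) and x_1, ..., x_n are the feature values of a concept; treating the graph features as the x_i is an assumption about how the model is applied in this work.

$$
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
$$

Each feature contributes its own factor, so the model ignores any correlation between features.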
15. Extract SemRep predications for training case
◦ Alzheimer’s disease
Represent predications as a graph
Identify graph metrics to be used as features
Train algorithms on PubMed query logs
Run on test graph data
Evaluate on PubMed query logs
16.
17. Database of semantic predications
Extracted from MEDLINE using SemRep
23.1 million citations
69.3 million predications
Citations from 1865 onwards
40,125 predications extracted on Alzheimer’s
disease
18. [Figure: example graph G containing a SEED node and nodes A, B, C, D and E, with SEED labels marked I and NI]
19. Degree centrality
◦ Connectivity of a node
◦ Suggests importance in a
network
Frequency of occurrence
◦ Instances of an edge
◦ Suggests
commonality/familiarity
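The two metrics above can be sketched on a small set of triples. This is only an illustration: the triples are hypothetical (the repeated one stands for the same predication extracted from a second citation), edge direction is ignored when counting neighbors, and the degree is left unnormalized.

```python
from collections import Counter, defaultdict

# Hypothetical (subject, predicate, object) triples, for illustration only.
triples = [
    ("phthalate", "STIMULATES", "PPAR gamma"),
    ("phthalate", "STIMULATES", "PPAR gamma"),  # repeated instance of the same edge
    ("MEHP", "STIMULATES", "PPAR gamma"),
    ("DEHP", "METABOLIZES_TO", "MEHP"),
    ("PPAR gamma", "DECREASES", "Inflammation"),
]


def degree(triples):
    """Degree of each node: number of distinct neighbors, ignoring direction."""
    neighbors = defaultdict(set)
    for subj, _, obj in triples:
        neighbors[subj].add(obj)
        neighbors[obj].add(subj)
    return {node: len(adj) for node, adj in neighbors.items()}


def edge_frequency(triples):
    """Number of instances of each (subject, predicate, object) edge."""
    return Counter(triples)


deg = degree(triples)
freq = edge_frequency(triples)
print(deg["PPAR gamma"])                                # 3
print(freq[("phthalate", "STIMULATES", "PPAR gamma")])  # 2
```

A high degree marks a well-connected concept, while a high edge frequency marks a relationship that is stated often in the literature.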
20. Features based on nodes themselves
Total Predication Frequency:
$TPF(v_i) = \sum_{j=0}^{n} \mathit{node}(v_i, v_j)$
Total Unique Predicates:
$TUP(v_i) = \sum_{j=0}^{n} \mathit{predicate}!(v_i, v_j)$
Features based on neighboring nodes
Neighboring Node Predication Frequency:
$NNPF(v_i) = \sum_{j=0}^{n} \sum_{k=0}^{m} \mathit{edge}(v_j, v_k)$
Neighboring Node Total Connectedness:
$NNTC(v_i) = \sum_{j=0}^{n} \sum_{k=0}^{m} \mathit{node}!(v_k) \in (v_j, v_k)$
Neighboring Node Unique Predications:
$NNUP(v_i) = \sum_{j=0}^{n} \sum_{k=0}^{m} \mathit{edge}!(v_j, v_k)$
21. Total Predication Frequency (TPF): for a given node, the
total number of edges connected to the seed node
(disregards predicate). TPF(Seed-A) = 3
Total Unique Predicates (TUP): the total number of
predicates (edge type) connected to the seed node.
TUP(Seed-A) = 2
Neighboring Node Predication Frequency (NNPF): the
sum of edges for all neighboring nodes. NNPF(B) = 4
Neighboring Node Total Connectedness (NNTC): the sum
of the unique nodes connected to each neighboring
node. NNTC(B) = 2
Neighboring Node Unique Predications (NNUP): the total
number of unique edges of neighboring nodes. NNUP(B)
= 3
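The five definitions above can be sketched in code under one possible reading. Assumptions: TPF and TUP are counted on the edges between a candidate node and the seed, following the wording above; the neighbor-based features are counted over all edges of the candidate's neighbors, including edges back to the candidate; direction is ignored; and a "unique edge" is a distinct (predicate, other node) pair. The toy graph and the function names are hypothetical and do not reproduce the graph shown in the slides.

```python
from collections import defaultdict

# Hypothetical toy graph of (subject, predicate, object) triples.
triples = [
    ("Seed", "TREATS", "A"),
    ("Seed", "TREATS", "A"),   # repeated instance of the same predication
    ("Seed", "AFFECTS", "A"),
    ("Seed", "CAUSES", "B"),
    ("A", "AFFECTS", "C"),
    ("B", "CAUSES", "C"),
]


def build_index(triples):
    """Map each node to the (predicate, other_node) edge instances touching it."""
    incident = defaultdict(list)
    for subj, pred, obj in triples:
        incident[subj].append((pred, obj))
        incident[obj].append((pred, subj))
    return incident


def features(node, seed, incident):
    """Compute the five graph features for one candidate node."""
    edges = incident[node]
    to_seed = [(pred, other) for pred, other in edges if other == seed]
    neighbors = {other for _, other in edges}
    return {
        # Edge instances between the node and the seed, predicate ignored.
        "TPF": len(to_seed),
        # Distinct predicates between the node and the seed.
        "TUP": len({pred for pred, _ in to_seed}),
        # Edge instances summed over all neighbors of the node.
        "NNPF": sum(len(incident[n]) for n in neighbors),
        # Distinct nodes connected to each neighbor, summed.
        "NNTC": sum(len({other for _, other in incident[n]}) for n in neighbors),
        # Distinct edges of each neighbor, summed.
        "NNUP": sum(len(set(incident[n])) for n in neighbors),
    }


incident = build_index(triples)
print(features("A", "Seed", incident))
# {'TPF': 3, 'TUP': 2, 'NNPF': 6, 'NNTC': 4, 'NNUP': 5}
```

Other readings of the definitions, for example excluding the edge back to the candidate node, would change the neighbor-based counts.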
23. Provides access to largest biomedical
literature database in the world, about 21
million citations
Audience: one-third general public and two-thirds
healthcare professionals and
researchers
ASSUMPTION: terms in search query are
interesting to the user
25. Concepts extracted that exist both in SemMedDB
and PubMed query logs (after processing) are
retained.
For each threshold, concepts above threshold are
marked Interesting (represented as 1) and the
remaining marked as uninteresting (represented
as 0)
Threshold at -0.2 standard deviations and above
(104 of 255 concepts)
Threshold at -0.15 standard deviations and
above (84 of 255)
Threshold at -0.02 standard deviations and
above (46 of 255)
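The labeling step can be sketched as follows. The slides do not state which quantity is standardized, so this sketch assumes each concept has a query-log frequency that is converted to a z-score using the population standard deviation. The function name and the counts are hypothetical.

```python
from statistics import mean, pstdev


def label_interesting(scores, threshold_sd):
    """Label a concept 1 if its z-score is at or above the threshold, else 0."""
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population SD; an assumption, not from the slides
    if sigma == 0:
        raise ValueError("all scores are equal; z-scores are undefined")
    return {
        concept: int((score - mu) / sigma >= threshold_sd)
        for concept, score in scores.items()
    }


# Hypothetical query-log counts for three concepts.
query_counts = {"Curcumin": 40, "Melatonin": 25, "Anosognosia": 5}

labels = label_interesting(query_counts, threshold_sd=-0.2)
print(labels)  # {'Curcumin': 1, 'Melatonin': 1, 'Anosognosia': 0}
```

Raising the threshold toward zero marks fewer concepts as interesting, which matches the shrinking counts listed above.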
26. All three models
◦ Naïve Bayes, rule induction, SVM
All three thresholds
◦ -0.2 SD, -0.15 SD, -0.02 SD
On PubMed query logs for Alzheimer’s
28. Rule induction model
Threshold of -0.2 SD
Three test data sets
◦ Schizophrenia
◦ Diabetes
◦ Colitis
29.
30. Rule-based model outperforms SVM
◦ Better overall performance
Low Recall
◦ Precision is important, truly interesting concepts
are being captured
Performance best with Schizophrenia (73%
precision & 36% recall)
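For reference, precision and recall on binary interesting/uninteresting labels can be computed as in this sketch. The function name and the example labels are hypothetical and are not the study's data.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for binary labels, where 1 means interesting."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # correctly flagged
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # flagged but not interesting
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # interesting but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels: the model flags few concepts, and most flags are correct.
predicted = [1, 1, 1, 0, 0, 0, 0, 0]
actual = [1, 1, 0, 1, 1, 1, 0, 0]

precision, recall = precision_recall(predicted, actual)
print(round(precision, 2), round(recall, 2))  # 0.67 0.4
```

A model with high precision and low recall, as reported above, suggests few concepts but is usually right about the ones it suggests.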
32. Individual concepts were suggestive of
several categories of users:
active researcher at basic or clinical science level,
practicing specialist, practicing primary care clinician,
caregiver/family member of patient, and lay user
concerned with prevention of Alzheimer disease.
33. Hydroxymethylglutaryl-CoA Reductase
Inhibitors
◦ Fairly new to the investigation for treatment options
◦ Likely neurologists or psychiatrists treating
Alzheimer patients or clinical researchers actively
investigating this area
Supplements (e.g. Curcumin, Melatonin)
◦ Likely lay users interested in prevention but maybe
also researchers and primary care providers
Advanced terminology (e.g. Anosognosia)
◦ Suggests expert, such as a clinical specialist or
scientist or possibly trainees in these areas
34. Novel approach of identifying interestingness
in graph of semantic predications
Positive correlation established between
PubMed query log and graph metrics derived
from predications
Implications of interestingness on discovery
browsing
Domain expert's analysis shows
categorization of user is possible
35. Additional graph features
◦ Semantic class of nodes (e.g. drug, disease)
◦ Semantic class of edge (e.g. TREATS, INHIBITS)
◦ Other graph features (e.g. betweenness centrality)
Larger time period of log
Separate by class of user
36. Committee Members
Dr. Amit Sheth Dr. Thomas C. Rindflesch Dr. Michael J. Cairelli
37. PREDOSE team
◦ Delroy Cameron
◦ Alan Gary Smith
◦ Nishita Jaykumar
◦ Revathy Krishnamurthy
◦ Lu Chen
◦ Swapnil Soni
SemRep team
◦ Dr. Elizabeth Workman
◦ Dr. Halil Kilicoglu
◦ Dr. Dongwook Shin
◦ Dr. Marcelo Fiszman
◦ Dr. Graciela Rosemblat
Family
Friends
◦ Aja Hamilton
◦ Arif Canakoglu
◦ Ashutosh Jadhav
◦ Hemant Purohit
◦ Jaccard Welch
◦ Pavan Kapanipathi
◦ Pramod Ananthram
◦ Sanjaya Wijeratne
◦ Sarasi Lalithsena
◦ Shreyansh Bhatt
◦ Sujan Perera
◦ Surendra Marupudi
◦ Vinh Nguyen
◦ Wenbo Wang