PHARMACOGENOMIC DATA MINING
with Hierarchical Clustering Algorithms
Ohene Z. Frank
CSC 576 Data Warehousing and Mining
Final Report
Frank | PGX Data Mining 1
PHARMACOGENOMIC DATA MINING WITH HIERARCHICAL CLUSTERING ALGORITHMS
Designer drugs, individualized drugs and personalized medicine are a few of the
buzzwords proliferating across the biotech information superhighway. They are
widely used by pharmaceutical scientists, clinical scientists, researchers and medical
humanitarians when referring to pharmacogenomics. Malorye Branca of Bio-IT
World stated, “One of the most seductive lures of the genomic revolution is the
promise of personalized medicine” [4]. Pharmacogenomics is the study of how one’s
genetic makeup affects the body’s response to drugs, and hence sits at the
intersection of genetics, pharmacodynamics and pharmacokinetics. Pharmacogenetics
is widely used synonymously with pharmacogenomics. Conceptually, the two terms
are interchangeable, but from a purist view, pharmacogenomics is the technology
whereas pharmacogenetics is the science. Genaissance Pharmaceuticals defined
pharmacogenomics as the application of genome science (genomics) to the study
of human variability in drug response.
So, what’s the real tumult? In the United States, there are at least 100,000 deaths
annually due to adverse reactions (side effects) to prescription drugs. Moreover,
millions of people are treated with drugs that are ineffective or have very little
pharmacological effect: beta-blockers given to reduce blood pressure are
ineffective in one-third of patients, and many antidepressants are ineffective in
half of the people who take them [1].
The culpability for the lack of efficacy and the intolerance of many drugs lies mainly
with our genes, which help to determine how the body reacts to, absorbs,
distributes, metabolizes and excretes drugs. Small genetic variations between
people (known as polymorphisms) can alter the behavior of proteins that carry a
drug to its target cells or tissues, neutralize the enzymes that activate a drug or aid in
the excretion process, or alter the structure of the receptor to which a drug is
supposed to bind [1]. Variation in immune-system genes can also influence how
particular drugs are tolerated. These slight genetic variations mean that the dose at
which a drug will work may vary hugely from person to person; hence, one-size-fits-all
drug development and prescribing can lead to life-threatening adverse
reactions or, in some cases, death.
On the path forward, the genomics revolution has given us the tools to identify
people who don't fit the standard prescribing mold. Genomics is the use of
high-throughput molecular biology technologies to study large numbers of genes and
gene products simultaneously in whole cells, whole tissues, or whole organisms [2].
The genome is all of the genetic material in a cell or an organism. According to the
U.S. Department of Energy, the genome is an organism’s complete set of DNA. In
the human genome, DNA is arranged into 24 distinct chromosomes, which are
physically separate molecules that range in length from about 50 million to 250
million base pairs [3]. Each chromosome is a single, very long DNA double-helix
molecule (as illustrated in Figure 1).
Figure 1 (footnote 1): Illustration of a chromosome replicating its DNA before a cell divides.
Single nucleotide polymorphisms (SNPs) are single-letter variations in the genetic
code that are scattered throughout the genome. Most SNPs are benign, with
absolutely no effect on gene structure or expression; however, a subset of these
variations provides crucial links to disease-causing genes, either because they
directly alter a gene's activity or aid in pinpointing the location of a disease-related
gene [1].
Footnote 1: Figure courtesy of Genaissance Pharmaceuticals, Inc.
The abundance of SNPs and the ease with which they can be identified make them ideal
biomarkers for clinical studies. SNPs are also found in genes for drug-metabolizing
enzymes, where they influence an individual's ability to process a drug properly.
Many companies have compiled large collections of SNPs with the intention of
developing diagnostic and prognostic tests, as well as guiding the development of
a new generation of drugs that would target genetically determined subsets of
patients [1]. All in all, this type of genomic technology, which aims to identify the
best possible medications for individuals while maximizing efficacy and minimizing
toxicity, is known as pharmacogenomics.
Given the gravity and promise of pharmacogenomics, several genomics companies
are manufacturing DNA microarrays to identify common SNPs that influence the
activity of various enzymes. Ultimately, these chips could help to
prevent life-threatening reactions to drugs, identify appropriate drug doses, and
select the right drug combination (or concomitant medications) for
patients with complex conditions.
To realize this promise at a faster pace, one can apply data mining techniques,
in particular hierarchical clustering algorithms, to a clinical data warehouse that
contains both clinical trials data and genomic data (anonymized genotyping and
microarray data).
The data mining technique most widely used for the analysis of gene expression
data is hierarchical clustering. This type of clustering algorithm has the advantage
of being relatively simple, and its results can be easily visualized. Hierarchical
clustering is an agglomerative approach in which single expression profiles are
joined to form groups, which are further joined until the process has been carried to
completion, forming a single hierarchical tree [5].
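The agglomerative procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in [5]: the function names, the choice of average linkage as the default, and the four expression profiles are assumptions made for the example, and the naive search over all cluster pairs is only practical for small data sets.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(profiles, linkage=average_linkage):
    """Repeatedly merge the closest pair of clusters until one remains.

    Clusters are tuples of profile indices. The returned merge history,
    a list of (cluster_a, cluster_b, distance) tuples, describes the tree.
    """
    clusters = [(i,) for i in range(len(profiles))]  # one profile per cluster
    history = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: linkage(pair[0], pair[1], profiles))
        history.append((a, b, linkage(a, b, profiles)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return history

# Hypothetical log2 expression ratios for four genes across three samples.
profiles = [
    [2.0, 2.1, 1.9],
    [1.8, 2.0, 2.2],
    [-1.5, -1.7, -1.6],
    [-1.4, -1.9, -1.5],
]

for a, b, d in agglomerate(profiles):
    print(a, "+", b, "at distance", round(d, 3))
```

With these made-up profiles, the two down-regulated genes are joined first, then the two up-regulated genes, and the final merge joins the two groups into a single tree.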
There are six main hierarchical clustering algorithms (single-linkage, complete-linkage,
average-linkage, weighted pair-group average, within-groups and Ward’s
method) that can be applied to gene expression profiling (microarray) data analysis.
These clustering algorithms differ in how distances are
calculated between the growing clusters and the remaining members (including
other clusters) in the data set [5].
Single-linkage Clustering: This method is also referred to as the minimum, or
nearest-neighbor, method. The distance between two clusters, x and y, is
calculated as the minimum distance between a member of cluster x and a
member of cluster y. This method tends to produce “loose” clusters, because two
clusters can be joined if any two of their members are close together. It often results
in the sequential addition of single samples to an existing cluster, which in turn
produces trees with many long, single-addition branches representing clusters
that have grown by accumulation.
Complete-linkage Clustering: This method is also referred to as the maximum
or furthest-neighbor method. The distance between two clusters is calculated
as the greatest distance between members of the relevant clusters. This
method tends to produce very compact clusters of elements and the clusters
are often very similar in size.
Average-linkage Clustering: This method is also referred to as unweighted
pair-group method average. The distance between two clusters is calculated as
the average of the distances between each point in one cluster and every point in
the other cluster. The two clusters with the lowest average distance are joined
together to form a new cluster.
Weighted Pair-group Average: This method is identical to average-linkage
clustering (as described above), except that the size of the respective clusters
is used as a weight in the computations. This method should be used when
the cluster sizes are suspected to be greatly uneven.
Within-groups Clustering: This method is also similar to average-linkage clustering,
except that the clusters are merged and a cluster average is used for
further calculations instead of the individual cluster elements. This method
tends to produce tighter clusters than average-linkage clustering.
Ward's Method: In this method, the total sum of squared deviations from the
mean of each cluster is calculated, and clusters are joined so that each merge
produces the smallest possible increase in the sum of squared errors.
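The difference between these criteria is easiest to see by evaluating them on the same pair of clusters. The sketch below covers four of the six methods (single, complete and average linkage, plus the increase in the sum of squared errors that Ward's method minimizes); the weighted and within-groups variants are left out. The function names and the two small two-dimensional clusters are hypothetical and chosen only for illustration.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def pairwise(x, y):
    """All Euclidean distances between members of cluster x and cluster y."""
    return [math.sqrt(sq_dist(a, b)) for a in x for b in y]

def single_linkage(x, y):
    """Nearest-neighbor distance: the closest pair across the two clusters."""
    return min(pairwise(x, y))

def complete_linkage(x, y):
    """Furthest-neighbor distance: the most distant pair across the clusters."""
    return max(pairwise(x, y))

def average_linkage(x, y):
    """Mean of all pairwise distances across the two clusters."""
    d = pairwise(x, y)
    return sum(d) / len(d)

def sse(points):
    """Sum of squared deviations of the points from their cluster mean."""
    mean = [sum(col) / len(points) for col in zip(*points)]
    return sum(sq_dist(p, mean) for p in points)

def ward_increase(x, y):
    """Increase in the total sum of squared errors if x and y were merged."""
    return sse(x + y) - sse(x) - sse(y)

# Two hypothetical clusters of two-dimensional points.
x = [[0.0, 0.0], [0.0, 2.0]]
y = [[3.0, 0.0], [3.0, 2.0]]

print("single  :", single_linkage(x, y))
print("complete:", complete_linkage(x, y))
print("average :", average_linkage(x, y))
print("ward    :", ward_increase(x, y))
```

For any pair of clusters the single-linkage distance is never larger than the average-linkage distance, which in turn is never larger than the complete-linkage distance. This ordering is one way to see why single linkage chains clusters together while complete linkage keeps them compact.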
Figure 3 (footnote 2): Hierarchical Clustering Demonstration
Figure 3 is a representation of gene expression data that were subjected to
average-linkage, complete-linkage and single-linkage hierarchical clustering using a
Euclidean distance metric, with gene-expression families (A–J) color coded
for comparison. Genes that are up-regulated appear in red, and those that are
down-regulated appear in green, with the relative log2(ratio) reflected by the
intensity of the color [5].
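As a small worked example of the two quantities mentioned in this description, the sketch below computes log2 ratios and the Euclidean distance between two expression profiles. The intensity values and function names are hypothetical, and the sample-to-reference ratio is an assumption about how the ratio is formed.

```python
import math

def log2_ratio(sample, reference):
    """log2(sample / reference): positive means up-regulated (red),
    negative means down-regulated (green), zero means unchanged."""
    return math.log2(sample / reference)

def euclidean(p, q):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical (sample, reference) intensities for two genes on three arrays.
gene_a = [log2_ratio(s, r) for s, r in [(400, 100), (200, 100), (100, 100)]]
gene_b = [log2_ratio(s, r) for s, r in [(50, 100), (100, 100), (25, 100)]]

print(gene_a)                     # log2 ratios for gene A
print(gene_b)                     # log2 ratios for gene B
print(euclidean(gene_a, gene_b))  # distance used when clustering the profiles
```

Working on the log2 scale makes a two-fold increase and a two-fold decrease equally far from zero (+1 and -1), which is why the color intensity in such displays is symmetric around the unchanged level.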
Footnote 2: Courtesy of Nature Reviews, Nature Publishing Group
The aim and allure of pharmacogenomic data mining is to discover knowledge
from a clinical genomic data warehouse (comprising both genomic and clinical
data), in order to identify and prescribe the most effective and least toxic drug for
an individual based on the person’s genetic makeup and the targeted disease.
References
[1] Abbott, A., Nature 425, 760-762 (23 October 2003).
[2] Genaissance Pharmaceuticals, Inc., Online Glossary (2004).
[3] US Department of Energy, Human Genome Information Project,
Pharmacogenomics (2004).
[4] Branca, M., The New, New Pharmacogenomics, Bio-IT World (Sept. 9, 2002).
[5] Quackenbush, J., Nature Reviews Genetics 2, 418-427 (2001).
[6] Brown, M., Essentials of Medical Genomics, 163-198 (2003).
[7] Hollinger, M.A., Introduction to Pharmacology 2, 288-290 (2003).

Halmar dropshipping via API with DroFx
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 

PGX Data Mining

  • 1. PHARMACOGENOMIC DATA MINING with Hierarchical Clustering Algorithms Ohene Z. Frank CSC 576 Data Warehousing and Mining Final Report
  • 2. PHARMACOGENOMIC DATA MINING WITH HIERARCHICAL CLUSTERING ALGORITHMS Designer drugs, individualized drugs and personalized medicine are a few of the buzzwords proliferating on the biotech information superhighway; they are widely used by pharmaceutical scientists, clinical scientists, researchers and medical humanitarians when referring to pharmacogenomics. Malorye Branca of Bio-IT World stated, “One of the most seductive lures of the genomic revolution is the promise of personalized medicine” [4]. Pharmacogenomics is the study of how one’s genetic makeup affects the body’s response to drugs, and hence sits at the intersection of genetics, pharmacodynamics and pharmacokinetics. Pharmacogenetics is widely used as a synonym for pharmacogenomics. Conceptually, the two terms are interchangeable, but from a purist’s view, pharmacogenomics is the technology whereas pharmacogenetics is the science. Genaissance Pharmaceuticals defined pharmacogenomics as the application of genome science (genomics) to the study of human variability in drug response. So, what’s the real tumult? In the United States, there are at least 100,000 deaths annually due to adverse reactions (side effects) to prescription drugs. Moreover, millions of people are being treated with drugs that are ineffective or have very little pharmacological effect: beta-blockers given to reduce blood pressure are ineffective in one-third of patients, and many antidepressants are ineffective in half of the people who take them [1]. The culpability for the lack of efficacy and the intolerance of many drugs lies mainly with our genes, which help to determine how our bodies react to, absorb, distribute, metabolize and excrete drugs.
  • 3. Small genetic variations between people (known as polymorphisms) can alter the behavior of proteins that carry a drug to its target cells or tissues, neutralize the enzymes that activate a drug or aid in its excretion, or alter the structure of the receptor to which a drug is supposed to bind [1]. Variation in immune-system genes can also influence how particular drugs are tolerated. These slight genetic variations mean that the dose at which a drug works may vary hugely from person to person; hence, one-size-fits-all drug development and prescribing can lead to life-threatening adverse reactions or, in some cases, death. On the right path forward, the genomics revolution has given us the tools to identify people who don't fit the standard prescribing mold. Genomics is the use of high-throughput molecular biology technologies to study large numbers of genes and gene products simultaneously in whole cells, whole tissues, or whole organisms [2]. The genome is all of the genetic material in a cell or an organism. According to the U.S. Department of Energy, the genome is an organism’s complete set of DNA. In the human genome, DNA is arranged into 24 distinct chromosomes, which are physically separate molecules that range in length from about 50 million to 250 million base pairs [3]. Each chromosome is a single, very long DNA double helix (as illustrated in Figure 1).
  • 4. Figure 1: Illustration of a chromosome replicating its DNA before a cell divides. (Figure courtesy of Genaissance Pharmaceuticals, Inc.) Single nucleotide polymorphisms (SNPs) are single-letter variations in the genetic code that are scattered throughout the genome. Most SNPs are benign, with no effect on gene structure or expression; however, a subset of these variations provides crucial links to disease-causing genes, either because they directly alter a gene's activity or because they help pinpoint the location of a disease-related gene [1].
  • 5. The profusion of SNPs and the ease with which they can be identified make them ideal biomarkers for clinical studies. SNPs are also found in genes for drug-metabolizing enzymes, where they influence an individual's ability to process a drug properly. Many companies have compiled large collections of SNPs with the intention of developing diagnostic and prognostic tests, as well as guiding the development of a new generation of drugs that would target genetically determined subsets of patients [1]. All in all, this type of genomic technology, which aims to identify the best possible medications for individuals while maximizing efficacy and minimizing toxicity, is known as pharmacogenomics. Due to the gravity and promise of pharmacogenomics, several genomics companies are manufacturing DNA microarrays to identify common SNPs that influence the activity of various enzymes. Ultimately, these chips could help to prevent life-threatening reactions to drugs, identify appropriate drug doses, and select the right drug combination (or concomitant medications) for patients with complex conditions. For this to come to fulfillment at a faster pace, one can apply data mining techniques, in particular hierarchical clustering algorithms, to a clinical data warehouse that contains both clinical trial data and genomic data (anonymized genotyping and microarray data).
  • 6. The data mining technique most widely utilized for the analysis of gene expression data is hierarchical clustering. This type of clustering algorithm has the advantage of being relatively simple, and its result can be easily visualized. Hierarchical clustering is an agglomerative approach in which single expression profiles are joined to form groups, which are further joined until the process has been carried to completion, forming a single hierarchical tree [5]. There are six main hierarchical clustering algorithms (single-linkage, complete-linkage, average-linkage, weighted pair-group average, within-groups and Ward’s method) that can be applied to the analysis of gene expression profiling (microarray) data. These algorithms differ in how distances are calculated between the growing clusters and the remaining members (including other clusters) of the data set [5]. Single-linkage Clustering: This method is also referred to as the minimum, or nearest-neighbor, method. The distance between two clusters, x and y, is calculated as the minimum distance between a member of cluster x and a member of cluster y. This method tends to produce “loose” clusters, because two clusters can be joined if any two of their members are close together. It often results in the sequential addition of single samples to an existing cluster, which in turn produces trees with many long, single-addition branches representing clusters that have grown by accumulation. Complete-linkage Clustering: This method is also referred to as the maximum, or furthest-neighbor, method. The distance between two clusters is calculated as the greatest distance between members of the relevant clusters.
  • 7. This method tends to produce very compact clusters of elements, and the clusters are often very similar in size. Average-linkage Clustering: This method is also referred to as the unweighted pair-group method average. The distance between two clusters is calculated as the average of the distances between each point in one cluster and every point in the other cluster. The two clusters with the lowest average distance are joined together to form a new cluster. Weighted Pair-group Average: This method is identical to average-linkage clustering (as described above), except that the size of the respective clusters is used as a weight in the computations. This method should be used when the cluster sizes are suspected to be greatly uneven. Within-groups Clustering: This method is also similar to average-linkage clustering, except that the clusters are merged and a cluster average is used for further calculations instead of the individual cluster elements. This method tends to produce tighter clusters than average-linkage clustering. Ward's Method: In this method, clusters are joined so that each merge produces the smallest possible increase in the total sum of squared deviations from the cluster means (the sum of squared errors).
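The linkage rules described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the procedure used in the cited work: the function names (`single_linkage`, `complete_linkage`, `average_linkage`, `ward_increase`, `agglomerate`) and the toy profiles are hypothetical, Euclidean distance is assumed throughout, and the naive merge loop is meant for small examples only.

```python
from itertools import combinations, product
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(a, b):
    """Minimum distance between any member of a and any member of b."""
    return min(dist(p, q) for p, q in product(a, b))


def complete_linkage(a, b):
    """Greatest distance between any member of a and any member of b."""
    return max(dist(p, q) for p, q in product(a, b))


def average_linkage(a, b):
    """Mean of all pairwise distances between members of a and members of b."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


def centroid(cluster):
    """Coordinate-wise mean of the profiles in a cluster."""
    return tuple(sum(coords) / len(cluster) for coords in zip(*cluster))


def ward_increase(a, b):
    """Increase in the total sum of squared errors caused by merging a and b."""
    weight = len(a) * len(b) / (len(a) + len(b))
    return weight * dist(centroid(a), centroid(b)) ** 2


def agglomerate(points, linkage, n_clusters=1):
    """Repeatedly join the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical toy "expression profiles" (two measurements per gene).
profiles = [(0.0, 0.0), (2.0, 0.0), (5.0, 0.0), (9.0, 0.0)]
left, right = profiles[:2], profiles[2:]
```

On these toy profiles, single linkage chains the third profile onto the first pair, whereas complete linkage pairs it with the fourth, which mirrors the "loose" versus "compact" behavior described in the text.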
  • 8. Figure 3: Hierarchical Clustering Demonstration. (Figure courtesy of Nature Reviews, Nature Publishing Group.) Figure 3 shows gene expression data that were subjected to average-linkage, complete-linkage and single-linkage hierarchical clustering using a Euclidean distance metric, with gene-expression families (A–J) color coded for comparison. Genes that are up-regulated appear in red, and those that are down-regulated appear in green, with the relative log2(ratio) reflected by the intensity of the color [5].
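The quantities behind such a figure can be illustrated with a short sketch. The names `log2_ratio` and `heatmap_color`, the saturation value of 3.0 and the sample intensities are assumptions made for illustration only; the actual color scale used in the cited figure is not specified in the text.

```python
from math import dist, log2


def log2_ratio(sample, reference):
    """log2 of the sample-to-reference intensity ratio (positive = up-regulated)."""
    return log2(sample / reference)


def heatmap_color(value, saturation=3.0):
    """Map a log2 ratio to an (r, g, b) color: red for up, green for down.

    Intensity grows with |value| and is capped at the assumed saturation level;
    a ratio of exactly zero maps to black.
    """
    level = min(abs(value) / saturation, 1.0)
    intensity = round(255 * level)
    return (intensity, 0, 0) if value > 0 else (0, intensity, 0)


# Hypothetical intensities: two genes, each measured under two conditions.
gene_a = (log2_ratio(800, 100), log2_ratio(50, 100))    # (3.0, -1.0)
gene_b = (log2_ratio(100, 100), log2_ratio(800, 100))   # (0.0, 3.0)

# Euclidean distance between the two expression profiles.
distance = dist(gene_a, gene_b)
```

Profiles expressed as log2 ratios treat a two-fold increase and a two-fold decrease symmetrically, which is why they are a natural input for the Euclidean distance used by the clustering step.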
  • 9. The aim and allure of pharmacogenomic data mining is to discover knowledge from a clinical genomic data warehouse (comprising both genomic and clinical data) in order to identify and prescribe the most effective and least toxic drug for an individual, based on the person’s genetic makeup and the targeted disease. References [1] Abbott, A., Nature 425, 760-762 (23 October 2003). [2] Genaissance Pharmaceuticals, Inc., Online Glossary (2004). [3] US Department of Energy, Human Genome Information Project, Pharmacogenomics (2004). [4] Branca, M., The New, New Pharmacogenomics, Bio-IT World (Sept. 9, 2002). [5] Quackenbush, J., Nature Reviews Genetics 2, 418-427 (2001). [6] Brown, M., Essentials of Medical Genomics, 163-198 (2003). [7] Hollinger, M.A., Introduction to Pharmacology 2, 288-290 (2003).