4th International Conference on Bioinformatics and Computational Biology (BICoB)

Detecting STR peaks in degraded
DNA samples
Emanuela Marasco, Arun Ross, Jeremy Dawson, Tina Moroose, Tanya Ambrose
Lane Department of Computer Science and Electrical Engineering
&
Forensic and Investigative Science Program

March 13, 2012 - Las Vegas, Nevada, USA

Thanks to
Prof. Tom O’Haver

Outline
• DNA typing
• Short Tandem Repeat (STR)
• DNA analysis via GeneMapperID
• The proposed signal processing approach
• Results
• Conclusions and future research

DNA typing
• Technologies used for DNA analysis differ in their ability to
distinguish two individuals, in two respects:
• Speed
• Sensitivity
• STR typing offers the best trade-off between the two
• Our contribution: improve the sensitivity of DNA processing
methodologies in the presence of degraded samples
J. Butler, Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers, Second Edition

The human genome
• The human genome, contained in every nucleated cell, consists of
23 pairs of chromosomes
• Chromosomes are dense packets of DNA and proteins
• Four bases make up DNA: Adenine (A), Guanine (G),
Cytosine (C) and Thymine (T)
• Most human identity testing is performed using markers on
autosomal chromosomes
• A DNA marker, or locus, refers to a position on a chromosome
J. Butler, Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers, Second Edition

Alleles
• About 99.7% of human DNA is the same from person to person;
the remaining 0.3% makes each individual unique
• DNA variation is exhibited in the form of different alleles;
an allele is a variant of the DNA sequence or length at a given locus
• Two kinds of variation: sequence polymorphism and length polymorphism

What is a Short Tandem Repeat (STR)?
ATCTTCTAACACATGACCGATCATGCATGCATGCATGCATGC
ATGCATGCATGCATGCATGCATGTTCCATGATAGCACAT
• An STR is a section of the sequence in which a short unit
(2-5 base pairs) is repeated in tandem
• STRs are short, so they are fast to process and several can be
processed together
• Human identification from DNA is typically based on 13 core loci
(the American system)
• They are highly discriminative among individuals
• The loci do not overlap

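The notion of a short unit repeated in tandem can be illustrated with a minimal Python sketch. It is not part of the presented method: the function name `longest_tandem_run` and the toy sequence are hypothetical, chosen only to show how a repeat count (the quantity that varies between alleles at a length-polymorphic locus) could be read off a sequence.

```python
import re
from typing import Tuple

def longest_tandem_run(sequence: str, motif: str) -> Tuple[int, int]:
    """Return (start_index, repeat_count) of the longest uninterrupted run of `motif`.

    Returns (-1, 0) when the motif does not occur at all.
    """
    best_start, best_count = -1, 0
    # A non-capturing group repeated one or more times matches a whole tandem run.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    for match in pattern.finditer(sequence):
        count = len(match.group()) // len(motif)
        if count > best_count:
            best_start, best_count = match.start(), count
    return best_start, best_count

# Hypothetical toy sequence: three flanking bases, five copies of GATA, three more bases.
toy_sequence = "AAC" + "GATA" * 5 + "TTC"
start, count = longest_tandem_run(toy_sequence, "GATA")
print(start, count)  # 3 5
```

Two individuals carrying different numbers of copies of the same unit at a locus would yield different counts here, which is what makes the fragment lengths, and hence the peak positions, differ.
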
Steps in DNA sample processing
Biological perspective:
DNA extraction → DNA quantitation → PCR amplification of multiple STR markers → PCR products
Technological perspective:
Separation and detection of PCR products (STR alleles) → Sample genotype determination → DNA profile
• Amplified data are displayed as fluorescent peaks

DNA analysis via GeneMapperID
• The Internal Lane Standard (ILS) is used to assign a size to the peaks
• Only peaks above an analytical threshold are considered real data
(lower peaks are treated as noise)
• Peaks are regarded as actual peaks only when they are also higher
than a stochastic threshold

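The two-threshold rule described above can be sketched as a small classifier. This is an illustration only, not GeneMapperID's implementation: the function name, the labels, and the stochastic value of 200 are hypothetical, while the analytical value of 100 mirrors the threshold used for GeneMapperID in the results slides.

```python
def classify_peak(height: float, analytical: float, stochastic: float) -> str:
    """Label a peak height using an analytical and a stochastic threshold."""
    if analytical > stochastic:
        raise ValueError("the analytical threshold should not exceed the stochastic one")
    if height < analytical:
        return "noise"                       # below the analytical threshold
    if height < stochastic:
        return "below stochastic threshold"  # real data, but not yet reliable
    return "reliable"

ANALYTICAL = 100.0  # threshold value used for GeneMapperID in the results
STOCHASTIC = 200.0  # hypothetical value, for illustration only

for peak_height in (40.0, 150.0, 900.0):
    print(peak_height, classify_peak(peak_height, ANALYTICAL, STOCHASTIC))
```

With fixed thresholds like these, a degraded sample whose true peaks have dropped below the analytical value is reported as noise, which is the weakness the proposed approach targets.
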
Effects of degradation on STR peaks
• Degradation occurs rapidly when samples are exposed to
environmental conditions: UV light, humidity, high temperature
and bacterial contamination
• As samples age, DNA begins to break down (or degrade)
• Degradation can reduce the height of some peaks or make
them disappear entirely
• Peaks are shifted
Figure panels: Nondegraded, 45 secs, 75 secs
http://www.bioforensics.com/articles/champion1/champion1.html

The proposed signal processing scheme
• A DNA sample is represented by a DNA signal in which the x-axis
indicates the data point and the y-axis the amplitude
• A degraded DNA sample is represented by a weak signal
confounded by several noise sources
Input:
• Let x = {x_j}, j = 1, …, N, be the input DNA signal
• Let NC = {NC_j}, j = 1, …, N, be the signal due to the Negative Control
• Let t = {t_j}, j = 1, …, N, be the instants at which the signals are sampled
Figure panels: Positive Control (Reagents, DNA, ILS); Negative Control (Reagents, ILS)

Differentiation of signals
• Given a peak-type signal such as those used in STR analysis,
the location of the maximum can be computed as the location of
the downward zero-crossing in its first derivative
• Given two peaks having the same height, the wider peak results
in a lower amplitude in the first derivative
Figure panels: peak-type signal and its derivative

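Both properties of the first derivative can be checked numerically. The sketch below is an illustration under assumptions: the peaks are modelled as Gaussians (the slides do not state a peak shape), the derivative is taken by central differences, and all names are hypothetical.

```python
import math

def gaussian(n, center, width, height=1.0):
    """Sampled Gaussian peak: height * exp(-((j - center) / width)^2), j = 0..n-1."""
    return [height * math.exp(-((j - center) / width) ** 2) for j in range(n)]

def first_derivative(y):
    """Central differences inside the signal, one-sided differences at the ends."""
    d = [0.0] * len(y)
    d[0] = y[1] - y[0]
    d[-1] = y[-1] - y[-2]
    for j in range(1, len(y) - 1):
        d[j] = (y[j + 1] - y[j - 1]) / 2.0
    return d

def downward_zero_crossings(d):
    """Indices where the derivative goes from positive to non-positive."""
    return [j for j in range(1, len(d)) if d[j - 1] > 0 and d[j] <= 0]

# Two peaks of equal height centred on sample 100; only the width differs.
narrow = gaussian(200, center=100, width=4)
wide = gaussian(200, center=100, width=16)
d_narrow = first_derivative(narrow)
d_wide = first_derivative(wide)

print(downward_zero_crossings(d_narrow), downward_zero_crossings(d_wide))
print(max(d_narrow), max(d_wide))
```

The zero-crossing falls on the peak maximum in both cases, while the maximum of the derivative is several times larger for the narrow peak, so a threshold on the derivative can favour narrow peaks over broad features of the same height.
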
Our proposed method
1. De-noise the DNA signal x by subtracting the Negative Control: PC-NC
The Negative Control (NC) contains only reagents and no DNA
2. Compute the first derivative of the enhanced signal: Diff(PC-NC)
Figure annotation: False Positive

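Steps 1 and 2 can be sketched in a few lines of Python. The sketch assumes that the sample signal and the Negative Control are sampled at the same N instants and that Diff denotes the forward difference between consecutive samples; the function names and the toy numbers are hypothetical.

```python
def subtract_negative_control(x, nc):
    """Step 1: point-wise subtraction of the Negative Control signal."""
    if len(x) != len(nc):
        raise ValueError("both signals must have the same number of samples")
    return [xj - ncj for xj, ncj in zip(x, nc)]

def forward_difference(y):
    """Step 2: first derivative as the difference between consecutive samples."""
    return [y[j + 1] - y[j] for j in range(len(y) - 1)]

# Toy signals (invented values): a small peak sitting on a reagent baseline.
sample = [5, 6, 10, 30, 11, 6, 5]
negative_control = [5, 5, 6, 5, 5, 5, 5]

enhanced = subtract_negative_control(sample, negative_control)
derivative = forward_difference(enhanced)
print(enhanced)    # [0, 1, 4, 25, 6, 1, 0]
print(derivative)  # [1, 3, 21, -19, -5, -1]
```

The sign change in the derivative (from 21 to -19) marks the peak, and the forward difference returns N-1 values for N samples.
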
Our proposed method
3. Smooth the derivative of the signal:
each point in the signal is replaced with the average of
m adjacent points (m is the smooth width)
http://terpconnect.umd.edu/~toh/spectrum/PeakFindingandMeasurement.htm

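The smoothing in step 3 is a moving average. A minimal sketch follows, assuming an odd smooth width m and a window that shrinks at the two ends of the signal; the edge handling is an assumption, since the slides do not specify it.

```python
def moving_average(y, m):
    """Replace each point with the mean of the m points centred on it.

    m must be odd; near the ends the window shrinks to the samples available.
    """
    if m < 1 or m % 2 == 0:
        raise ValueError("the smooth width m must be a positive odd integer")
    half = m // 2
    smoothed = []
    for j in range(len(y)):
        window = y[max(0, j - half):min(len(y), j + half + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed

print(moving_average([0, 0, 3, 0, 0], 3))  # [0.0, 1.0, 1.0, 1.0, 0.0]
```

A larger m suppresses more noise in the derivative but also flattens narrow peaks, so the smooth width trades noise rejection against resolution.
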
Our proposed method
4. Peak detection, based on an amplitude threshold and a slope threshold
5. Estimate peak details (location, height and width) on the
un-smoothed signal by using a curve fitting function:
a polynomial of degree 2 is fitted through the detected peaks
In the fit, μ and σ are the mean and standard deviation of a set of
points in the vicinity of the detected peak

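Steps 4 and 5 can be sketched as follows. This is a simplified illustration, not the code used in the work: it assumes Gaussian-shaped peaks, detects downward zero-crossings of the derivative that pass a slope and an amplitude test, and fits a degree-2 polynomial to the logarithm of the three samples around each detection. All function names and threshold values are hypothetical, and the demo skips the smoothing step because the toy signal is noise-free.

```python
import math

def central_difference(y):
    """First derivative by central differences (one-sided at the ends)."""
    d = [0.0] * len(y)
    d[0] = y[1] - y[0]
    d[-1] = y[-1] - y[-2]
    for j in range(1, len(y) - 1):
        d[j] = (y[j + 1] - y[j - 1]) / 2.0
    return d

def find_peaks(y, d, amp_threshold, slope_threshold):
    """Indices where d crosses zero downward steeply enough and y is high enough."""
    peaks = []
    for j in range(1, len(y) - 1):
        crosses_down = d[j - 1] > 0 and d[j] <= 0
        steep_enough = (d[j - 1] - d[j]) > slope_threshold
        if crosses_down and steep_enough and y[j] > amp_threshold:
            peaks.append(j)
    return peaks

def fit_log_parabola(y, j):
    """Fit a parabola to ln(y) at j-1, j, j+1; return (location, height, fwhm).

    Exact for a noise-free Gaussian; requires positive samples around j.
    """
    l0, l1, l2 = (math.log(y[k]) for k in (j - 1, j, j + 1))
    a = (l0 - 2.0 * l1 + l2) / 2.0   # curvature, negative at a maximum
    b = (l2 - l0) / 2.0              # slope at the middle sample
    location = j - b / (2.0 * a)
    height = math.exp(l1 - b * b / (4.0 * a))
    fwhm = 2.0 * math.sqrt(math.log(2.0) / -a)
    return location, height, fwhm

# Toy signal: one peak of height 50 centred at 30.3 and one weak bump of height 3 at 10.
y = [50.0 * math.exp(-((k - 30.3) / 5.0) ** 2)
     + 3.0 * math.exp(-((k - 10.0) / 2.0) ** 2) for k in range(60)]
d = central_difference(y)
peaks = find_peaks(y, d, amp_threshold=0.37 * max(y), slope_threshold=0.5)
location, height, fwhm = fit_log_parabola(y, peaks[0])
print(peaks, round(location, 2), round(height, 2), round(fwhm, 2))
```

The weak bump also produces a zero-crossing but fails the amplitude test, while the fit recovers a sub-sample location for the main peak even though the detection index is an integer.
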
Dataset 1: ultraviolet degradation
• The first dataset was collected at the WVU Department of
Forensic and Investigative Sciences by performing a controlled
artificial DNA degradation using ultraviolet radiation
• Positive Control
• Degraded samples obtained after UV exposures of
35 secs, 75 secs, 150 secs and 240 secs
Pang B. C. M. and Cheung B. K. K. One-step generation of degraded DNA by UV irradiation.
Analytical Biochemistry 2007; 360: 163-165

Dataset 2: Low Copy Number (LCN) data
• Low Copy Number (LCN) samples contain less than 100 pg of DNA template
• One of the causes of LCN is degradation
• The second dataset was provided by NIST and was obtained
by varying the cycle count of the PCR processing step
• Positive Control: 1 ng/μL
• Low Copy Number (LCN): 100 pg/μL, 30 pg/μL and 10 pg/μL,
at 28 cycles and 31 cycles

Results on degraded data with UV
GeneMapper results
In the presence of degraded samples, for Th = 100:
• 1 peak detected for the sample obtained after a UV exposure of 75 secs
• 0 peaks detected for samples obtained after an exposure of 150 secs onward
Peak finding results
For Th = 0.37*max(x(t)):
• 7 peaks detected for the sample obtained after a UV exposure of 150 secs
• 3 peaks detected for the sample obtained after a UV exposure of 240 secs

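The difference between the fixed threshold Th = 100 and the relative threshold Th = 0.37*max(x(t)) can be illustrated with invented numbers. The signal values below are hypothetical and are not taken from the datasets; only the two threshold rules come from the slides.

```python
def relative_threshold(signal, fraction=0.37):
    """Amplitude threshold defined as a fraction of the signal maximum."""
    return fraction * max(signal)

def samples_above(signal, threshold):
    """Indices of the samples that exceed the threshold."""
    return [j for j, value in enumerate(signal) if value > threshold]

# Invented amplitudes for a heavily degraded sample: the true peak is only 60 high.
degraded = [0, 5, 60, 8, 0]

print(samples_above(degraded, 100))                           # []
print(samples_above(degraded, relative_threshold(degraded)))  # [2]
```

Because the relative threshold scales with the signal, a weak peak can still be detected, but the same scaling means that anything above 37% of the maximum passes, which is why the derivative and slope tests are needed alongside it.
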
Results on Low Copy Number (LCN) data
Figure panels: GeneMapper results (Th = 100); Peak finding results
• The success rate of the GeneMapper typing system decreases as
the amount of DNA in the analyzed samples decreases
• The amount of DNA in the sample has no significant impact on
the detection performance of the peak finding algorithm

Obtained improvement
• Under UV degradation
Sample        PC      35 secs   75 secs   150 secs   240 secs
Improvement   0%      0%        80%       100%       100%
• In the presence of LCN
Sample        1 ng     100 pg   30 pg    10 pg
Improvement   -50.0%   3.1%     34.4%    25.0%
• Results are reported for blue dye data by setting the
GeneMapperID threshold to 100 and averaging over two
samples for LCN data
• The peak detection rate improves in the presence of
degraded samples since the algorithm has been designed
to deal with noisy signals

Conclusions
• Strengths of the proposed approach:
– it uses an adaptive threshold (a fraction of the signal maximum)
– differentiation provides high discrimination power against wider peaks
• Our experiments show the robustness of the proposed peak
finding algorithm to high levels of UV degradation and to
critical amounts of DNA (less than 100 pg)
• Limitation: the peak detection algorithm uses a global threshold
• Coming up:
– modeling the degradation process
– designing a local threshold
– since the adopted derivative was first-order, we will carry out
experiments with higher-order derivatives

Thanks!
Any questions?
Acknowledgements
• The work was funded by CITeR
• Many thanks to Raghunandan Pasula, Lane Department of Computer
Science and Electrical Engineering, West Virginia University, for his
assistance during the development of the project
• Prof. Thomas C. O’Haver, Department of Chemistry and Biochemistry,
University of Maryland, for his assistance with our queries regarding
peak detection
• National Forensic Science Technology Center (NFSTC) for providing
scientific training services
• The Peak Finding code developed by Prof. O’Haver was used in this work
Detecting STR Peaks in Degraded DNA samples

  • 1. 4th International Conference on Bioinformatics and Computational Biology (BICoB) Detecting STR peaks in degraded DNA samples Emanuela Marasco, Arun Ross, Jeremy Dawson, Tina Moroose, Tanya Ambrose Lane Department of Computer Science and Electrical Engineering & Forensic and Investigative Science Program March 13, 2012 - Las Vegas, Nevada, USA Thanks to Prof. Tom O’Haver 1
  • 2. Outline • • • • • • DNA typing Short Tandem Repeat (STR) DNA analysis via GeneMapperID The proposed signal processing approach Results Conclusions and future research 2
  • 3. DNA typing • Technologies used for performing DNA analysis differ in their ability to differentiate two individuals for 2 aspects: • The speed • The sensitivity • STR typing offers the best trade-off • Our contribution: improve the sensitivity of the DNA processing methodologies in the presence of degraded samples J. Butler, Forensic DNA typing, Biology, Technology, and Genetic of STR markers, Second Edition 3
  • 4. The human genome • The human genome contained in every cell consists of 23 pairs chromosomes • Chromosomes which are dense packets of DNA and proteins • Four bases make up DNA: Adenine (A), Guanine (G), Cytosine (C) and Thymine (T) • Most human identity testing is performed using markers on autosomal chromosomes • A DNA marker or locus refers to the chromosomal position J. Butler, Forensic DNA typing, Biology, Technology, and Genetic of STR markers, Second Edition
  • 5. Alleles • About 99.7% of our DNA is the same between people; the remaining 0.3% makes us unique individuals • DNA variation is exhibited in the form of different alleles; an allele is a variant of the DNA sequence or length at a given locus • Figure panels: sequence polymorphism, length polymorphism
  • 6. What is a Short Tandem Repeat (STR)? ATCTTCTAACACATGACCGATCATGCATGCATGCATGCATGC ATGCATGCATGCATGCATGCATGTTCCATGATAGCACAT • An STR is the repetitive section of the sequence (repeat unit of 2-5 base pairs) • STRs are short and fast to process (several loci can be processed at the same time) • Human identification from DNA is typically based on 13 core loci (American system) • They are very discriminative among individuals • There is no overlap among loci
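To make the idea of a tandem repeat concrete, the following minimal sketch counts back-to-back copies of a motif in the example sequence from slide 6. The helper name `longest_tandem_run`, the choice of `ATGC` as the motif, and joining the two sequence fragments across the line break are illustrative assumptions, not part of the presentation.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    # Each match is a maximal run of consecutive motif copies.
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Example sequence from the slide, joined across the line break (assumption).
slide_sequence = (
    "ATCTTCTAACACATGACCGATCATGCATGCATGCATGCATGC"
    "ATGCATGCATGCATGCATGCATGTTCCATGATAGCACAT"
)

# The motif is assumed here; the slide does not name the repeat unit.
repeat_count = longest_tandem_run(slide_sequence, "ATGC")
print(repeat_count)
```

Under these assumptions, the repeat count is the kind of length-based quantity that distinguishes alleles at an STR locus.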
  • 7. Steps in DNA sample processing • Biological perspective: DNA extraction, DNA quantitation, PCR amplification of multiple STR markers, PCR products • Technological perspective: separation and detection of PCR products (STR alleles), sample genotype determination, DNA profile • Amplified data are displayed as fluorescent peaks
  • 8. DNA analysis via GeneMapperID • The Internal Lane Standard (ILS) is used to assign sizes to the peaks • Only peaks above an analytical threshold are considered real data (lower peaks are treated as noise) • Actual peaks are higher than a stochastic threshold
  • 9. Effects of degradation on STR peaks • Degradation occurs rapidly when samples are exposed to environmental conditions such as UV light, humidity, high temperature and bacterial contamination • As samples age, DNA begins to break down (or degrade) • Degradation can reduce the height of some peaks or make them disappear entirely • Peaks are shifted • Figure panels: nondegraded, 45 secs, 75 secs (http://www.bioforensics.com/articles/champion1/champion1.html)
  • 10. The proposed signal processing scheme • A DNA sample is represented by a DNA signal in which the x-axis indicates the data point and the y-axis the amplitude • A degraded DNA sample is represented by a weak signal confounded by several noise sources • Input: let x = {x_j}, j = 1, …, N, be the input DNA signal; let NC = {NC_j}, j = 1, …, N, be the signal due to the Negative Control; let t = {t_j}, j = 1, …, N, be the instants at which the signals are sampled • Positive Control: reagents, DNA, ILS • Negative Control: reagents, ILS
  • 11. Differentiation of signals • Given a peak-type signal such as those used in STR analysis, the location of its maximum can be computed as the location of the zero-crossing in its first derivative • Given two peaks having the same height, the wider peak results in a lower amplitude in the first derivative • Figure panels: peak-type signal, derivative
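Both properties on slide 11 can be checked numerically. The sketch below uses synthetic Gaussian peaks and a plain forward difference as the derivative; the peak shape, the widths, and all function names are assumptions chosen for illustration.

```python
import math


def gaussian_peak(center, width, height, n=201):
    """Sample a Gaussian peak at points 0..n-1 (width is the standard deviation)."""
    return [height * math.exp(-0.5 * ((j - center) / width) ** 2) for j in range(n)]


def first_difference(signal):
    """Forward difference: d[j] = signal[j + 1] - signal[j]."""
    return [b - a for a, b in zip(signal, signal[1:])]


def downward_zero_crossings(derivative):
    """Indices of signal maxima, found where the derivative turns from positive to non-positive."""
    return [
        j + 1
        for j in range(len(derivative) - 1)
        if derivative[j] > 0 and derivative[j + 1] <= 0
    ]


# Two synthetic peaks with the same height but different widths.
narrow = gaussian_peak(center=100, width=5, height=1.0)
wide = gaussian_peak(center=100, width=15, height=1.0)

narrow_d = first_difference(narrow)
wide_d = first_difference(wide)

print(downward_zero_crossings(narrow_d))  # location of the maximum
print(max(abs(v) for v in narrow_d), max(abs(v) for v in wide_d))
```

The second print shows the derivative amplitude of the narrow peak next to that of the wide one, which is the property the method relies on to discriminate against wide peaks.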
  • 12. Our proposed method • 1. De-noise the DNA signal x by subtracting the Negative Control: PC-NC (the Negative Control (NC) contains only reagents and no DNA) • 2. Compute the first derivative of the enhanced signal: Diff(PC-NC) • Figure label: False Positive
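Steps 1 and 2 amount to a point-by-point subtraction followed by a difference. A minimal sketch, assuming both signals are sampled at the same N points; the sample values and function names are hypothetical.

```python
def subtract_negative_control(pc, nc):
    """Step 1: remove the reagent-only background, point by point (PC - NC)."""
    if len(pc) != len(nc):
        raise ValueError("PC and NC must have the same number of samples")
    return [p - n for p, n in zip(pc, nc)]


def diff(signal):
    """Step 2: first derivative approximated by adjacent differences."""
    return [b - a for a, b in zip(signal, signal[1:])]


# Hypothetical amplitudes: one peak sitting on a small reagent background.
pc = [5, 6, 30, 80, 31, 7, 5]
nc = [5, 5, 6, 5, 5, 6, 5]

enhanced = subtract_negative_control(pc, nc)
derivative = diff(enhanced)
print(enhanced)
print(derivative)
```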
  • 13. Our proposed method • 3. Smooth the derivative of the signal: each point in the signal is replaced with the average of m adjacent points (m is the smooth width) • http://terpconnect.umd.edu/~toh/spectrum/PeakFindingandMeasurement.htm
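The smoothing in step 3 can be sketched as a centered moving average. Two details below are assumptions the slide does not specify: m is restricted to odd values so the window is symmetric, and the window is simply shortened at the ends of the signal.

```python
def smooth(signal, m):
    """Replace each point with the mean of the m adjacent points centered on it.

    Assumptions: m is odd, and the window is truncated at the signal edges.
    """
    if m < 1 or m % 2 == 0:
        raise ValueError("smooth width m must be a positive odd integer")
    half = m // 2
    smoothed = []
    for j in range(len(signal)):
        window = signal[max(0, j - half): j + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# A single spike is spread over m = 3 points.
print(smooth([0, 0, 3, 0, 0], 3))
```

A larger m suppresses more noise in the derivative but also flattens narrow peaks, so the smooth width is a trade-off.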
  • 14. Our proposed method • 4. Detect peaks based on an amplitude threshold and a slope threshold • 5. Estimate peak details (location, height and width) on the un-smoothed signal by using a curve fitting function: a polynomial of degree 2 is fitted through the detected peaks, using the mean and standard deviation of a set of points in the vicinity of the detected peak
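A simplified sketch of steps 4 and 5 follows. It is not the Peak Finding code credited in the acknowledgements: smoothing is omitted, the parabola is fitted through only three points, and width estimation is left out. The function names and sample values are hypothetical.

```python
def detect_peaks(signal, amp_threshold, slope_threshold):
    """Step 4: keep downward zero-crossings of the derivative that pass both thresholds."""
    d = [b - a for a, b in zip(signal, signal[1:])]
    peaks = []
    for j in range(len(d) - 1):
        crosses_down = d[j] > 0 and d[j + 1] <= 0
        steep_enough = d[j] - d[j + 1] > slope_threshold
        if crosses_down and steep_enough and signal[j + 1] > amp_threshold:
            peaks.append(j + 1)
    return peaks


def refine_with_parabola(signal, j):
    """Step 5 (simplified): vertex of the degree-2 polynomial through points j-1, j, j+1."""
    y0, y1, y2 = signal[j - 1], signal[j], signal[j + 1]
    denom = y0 - 2 * y1 + y2
    if denom == 0:  # the three points are collinear, so there is no vertex
        return float(j), float(y1)
    offset = 0.5 * (y0 - y2) / denom
    height = y1 - 0.25 * (y0 - y2) * offset
    return j + offset, height


signal = [0, 1, 4, 9, 10, 7, 2, 0, 0, 1, 0, 0]
peaks = detect_peaks(signal, amp_threshold=3, slope_threshold=1)
print(peaks)
print([refine_with_parabola(signal, j) for j in peaks])
```

In this toy signal the small bump near the end is rejected by the amplitude threshold, while the fitted vertex places the main peak between two sample points.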
  • 15. Dataset 1: ultraviolet degradation • The first dataset was collected at the WVU Department of Forensic and Investigative Sciences by performing a controlled artificial DNA degradation using ultraviolet radiation • Positive Control • Degraded samples obtained after UV exposures of 35 secs, 75 secs, 150 secs and 240 secs • Pang B. C. M. and Cheung B. K. K. One-step generation of degraded DNA by UV irradiation. Analytical Biochemistry 2007; 360: 163-165
  • 16. Dataset 2: Low Copy Number (LCN) data • Low Copy Number (LCN) samples contain less than 100 pg of DNA template • One of the reasons for LCN is degradation • The second dataset was provided by NIST and was obtained by varying the cycle count of the PCR processing step • Positive Control: 1 ng/μL • Low Copy Number (LCN): 100 pg/μL, 30 pg/μL and 10 pg/μL at 28 cycles and 31 cycles
  • 17. Results on degraded data with UV • GeneMapper results, in the presence of degraded samples, for th=100: 1 detected peak for the degradation level obtained after a UV exposure of 75 secs; 0 detected peaks for samples obtained after an exposure of 150 secs onward • Peak finding results, for Th = 0.37*max(x(t)): 7 peaks detected for the sample obtained after a UV exposure of 150 seconds; 3 peaks for the sample obtained after a UV exposure of 240 seconds
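The difference between the fixed threshold (th=100) and the signal-relative threshold Th = 0.37*max(x(t)) can be illustrated on a weak signal. The amplitudes below are hypothetical and the simple local-maximum test stands in for the full peak finder.

```python
def adaptive_threshold(signal, fraction=0.37):
    """Threshold proportional to the strongest point: Th = fraction * max(x)."""
    return fraction * max(signal)


def local_maxima_above(signal, threshold):
    """Indices of interior local maxima whose amplitude exceeds the threshold."""
    return [
        j
        for j in range(1, len(signal) - 1)
        if signal[j - 1] < signal[j] >= signal[j + 1] and signal[j] > threshold
    ]


# Hypothetical degraded signal: every peak is far below the fixed threshold of 100.
degraded = [2, 3, 40, 4, 2, 60, 3, 2, 15, 2]

fixed_hits = local_maxima_above(degraded, 100)
adaptive_hits = local_maxima_above(degraded, adaptive_threshold(degraded))
print(fixed_hits, adaptive_hits)
```

Because the threshold scales with the strongest point in the signal, weak peaks in a degraded sample can still be retained, while the smallest bump is still rejected.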
  • 18. Results on Low Copy Number (LCN) data • Figure panels: GeneMapper results (Th = 100), peak finding results • The success rate of the GeneMapper typing system decreases as the amount of DNA in the analyzed samples decreases • The amount of DNA in the sample has no significant impact on the detection performance of the peak finding algorithm
  • 19. Obtained improvement • Under UV degradation: PC 0%; 35 secs 0%; 75 secs 80%; 150 secs 100%; 240 secs 100% • In the presence of LCN: 1 ng -50.0%; 100 pg 3.1%; 30 pg 34.4%; 10 pg 25.0% • Results are reported for blue dye data, with the GeneMapperID threshold set to 100 and averaged over two samples for LCN data • The peak detection rate improves in the presence of degraded samples since the algorithm has been designed to deal with noisy signals
  • 20. Conclusions • Strengths of the proposed approach: – it uses an adaptive threshold – high discrimination power against wider peaks, provided by differentiation • Our experiments show the robustness of the proposed peak finding algorithm to high levels of UV degradation and to critical amounts of DNA (less than 100 pg) • Limitation: the peak detection algorithm uses a global threshold • Coming up: – modeling the degradation process – designing a local threshold – since the adopted derivative was first-order, we will carry out experiments with higher-order derivatives
  • 21. Thanks! Any questions? Acknowledgements • The work was funded by Citer • Many thanks to Raghunandan Pasula, Lane Department of Computer Science and Electrical Engineering, West Virginia University, for his assistance during the development of the project • Prof. Thomas C. O’Haver, Department of Chemistry and Biochemistry, University of Maryland, for his assistance with our queries regarding peak detection • National Forensic Science Technology Center (NFSTC) for providing scientific training services • The Peak Finding code developed by Prof. O’Haver was used in this work