Brief explanation of
“Integrating dilution-based sequencing and population genotypes for single individual haplotyping”
Hirotaka Matsumoto
INTRODUCTION
Single individual haplotyping (SIH)
•  Infer haplotypes from sequence fragments (SNP fragments).
Dilution-based sequencing
•  SIH needs long DNA sequencing reads.
•  Dilution-based sequencing can produce long reads:
–  Fosmid pool-based NGS
–  Long fragment technology
–  Dilution-amplification-based sequencing
Process of dilution-based sequencing
(i) DNA fragments are separated into multiple low-concentration dilutions.
(ii) After sequencing and mapping an aliquot, the mapped reads form clusters, which correspond to DNA fragments.
(iii) Clusters are merged into read fragments (SNP fragments).
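Steps (ii) and (iii) can be illustrated with a minimal sketch. It assumes that mapped reads from one aliquot are given as (start, end) intervals on a single chromosome and that reads separated by at most a hypothetical `max_gap` bases belong to the same cluster. The function name and the threshold are illustrative and are not taken from the slides.

```python
def cluster_reads(reads, max_gap=1000):
    """Group mapped reads (start, end) from one aliquot into clusters.

    A read joins the current cluster when its start lies within `max_gap`
    bases of the cluster's end. `max_gap` is an illustrative assumption.
    Returns a list of (cluster_start, cluster_end, read_count) tuples.
    """
    clusters = []
    for start, end in sorted(reads):
        if clusters and start - clusters[-1][1] <= max_gap:
            # Extend the current cluster so that it covers this read.
            clusters[-1][1] = max(clusters[-1][1], end)
            clusters[-1][2] += 1
        else:
            # Gap too large: this read starts a new cluster.
            clusters.append([start, end, 1])
    return [tuple(c) for c in clusters]
```

Position-based clustering of this kind cannot tell apart two overlapping DNA fragments that sit in the same aliquot: their reads end up in one cluster regardless of which chromosome copy they came from.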
  
	
  
Chimeric fragment (CF)
•  Problem: chimeric fragments (CFs) are produced
–  When an aliquot happens to contain several long DNA fragments derived from the same region, reads with different chromosomal origins are regarded as one cluster and merged into a single fragment.
–  CFs significantly decrease the accuracy of SIH.
METHOD
Target: detection of CFs
Detection of CFs
•  Basis of our strategy
–  A CF corresponds to an artificially recombinant haplotype and differs from the biological haplotypes in the population.
PHASE
•  Statistical phasing method
–  Infers haplotypes from population genotypes.
–  The diversity of haplotypes is limited, and there are conserved haplotypes.
•  We use PHASE to obtain the haplotype candidates.
–  Example of output: a haplotype candidate and its probability.
CF detection model
•  We model the probabilities that a SNP fragment is a normal fragment and that it is a chimeric fragment.
•  With these probabilities we develop an indicator, “CSP”, which evaluates the chimerity of a SNP fragment.
NF probability
•  The probability that a SNP fragment is a normal fragment (NF).
•  Calculated from the consistency between the statistically phased haplotypes and the fragment.
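The slides do not give the formula, so the following is only a sketch of one way such a consistency could be computed. It assumes that the phasing step yields a list of (haplotype, probability) candidates, that alleles are coded as '0'/'1' with '-' for uncovered sites, and that sequencing errors occur independently per site at a hypothetical `error_rate`. The name `nf_probability` is illustrative.

```python
def nf_probability(fragment, candidates, error_rate=0.01):
    """Sketch: probability of a fragment if it came from a single haplotype.

    fragment   -- string of '0'/'1' alleles, '-' for uncovered sites
    candidates -- list of (haplotype_string, probability) pairs
    error_rate -- assumed per-site sequencing error (illustrative value)
    """
    total = 0.0
    for hap, prob in candidates:
        likelihood = 1.0
        for f, h in zip(fragment, hap):
            if f == '-':
                continue  # uncovered site carries no information
            likelihood *= (1 - error_rate) if f == h else error_rate
        # Weight each candidate by its phasing probability.
        total += prob * likelihood
    return total
```

Under these assumptions, a fragment that matches a high-probability candidate at every covered site receives a value close to that candidate's probability, while a fragment that matches no candidate well receives a value near zero.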
  
CF probability
•  The probability that a SNP fragment is a chimeric fragment (CF).
•  The left and right parts of the fragment are derived from different haplotypes.
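As a rough illustration of the "left and right parts from different haplotypes" idea, the sketch below averages over every possible switch position and every ordered pair of distinct candidates. The uniform prior over switch positions, the per-site `error_rate`, and the normalisation are all assumptions made for this example; the slides do not specify them.

```python
from itertools import permutations

def match_likelihood(observed, haplotype, error_rate):
    """Probability of the observed alleles given one haplotype segment."""
    likelihood = 1.0
    for f, h in zip(observed, haplotype):
        if f == '-':
            continue  # uncovered site carries no information
        likelihood *= (1 - error_rate) if f == h else error_rate
    return likelihood

def cf_probability(fragment, candidates, error_rate=0.01):
    """Sketch: probability of a fragment if it switches haplotype once.

    candidates -- list of (haplotype_string, probability) pairs
    """
    n = len(fragment)
    if n < 2 or len(candidates) < 2:
        return 0.0
    total, weight = 0.0, 0.0
    for (h1, p1), (h2, p2) in permutations(candidates, 2):
        weight += p1 * p2
        for b in range(1, n):  # sites [0, b) from h1, sites [b, n) from h2
            left = match_likelihood(fragment[:b], h1[:b], error_rate)
            right = match_likelihood(fragment[b:], h2[b:], error_rate)
            total += p1 * p2 * left * right
    if weight == 0.0:
        return 0.0
    # Normalise over haplotype pairs and over the n - 1 switch positions.
    return total / (weight * (n - 1))
```

With candidates `00000` and `11111`, the fragment `00011` scores far higher here than `00000`, because only the former is well explained by a single switch.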
CSP
•  Chimerity based on statistical phasing (CSP)
•  A low CSP value means that
–  the fragment corresponds to a recombinant of statistically phased haplotypes;
–  the fragment is suspected of being a CF.
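The slides do not state how CSP is computed from the two probabilities. Purely to illustrate the "low value means suspected CF" reading, the sketch below assumes a log-ratio of the NF probability to the CF probability and a hypothetical cut-off. The names `csp_score` and `flag_suspected_cfs`, the `floor`, and the `threshold` are stand-ins, not the authors' definition.

```python
import math

def csp_score(p_nf, p_cf, floor=1e-300):
    """Hypothetical chimerity score: log(P_NF) - log(P_CF).

    Lower values mean the chimeric explanation fits better.
    `floor` only guards against log(0).
    """
    return math.log(max(p_nf, floor)) - math.log(max(p_cf, floor))

def flag_suspected_cfs(scores, threshold=0.0):
    """Return the indices of fragments whose score is below `threshold`."""
    return [i for i, s in enumerate(scores) if s < threshold]
```

Any monotone combination of the two probabilities would support the same kind of thresholding; the log-ratio is chosen here only because it is symmetric around zero.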
Sliding-window approach
•  The running time of PHASE increases with SNP fragment size.
–  The complexity of population haplotypes increases exponentially.
•  We use a sliding-window approach (W = 5).
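The windowing itself can be sketched as follows, assuming windows of W consecutive SNP sites that advance one site at a time. The step size and the handling of fragments shorter than W are assumptions; the slides state only W = 5 and do not say how per-window results are combined.

```python
def sliding_windows(n_sites, width=5):
    """Yield (start, end) index pairs for windows of `width` consecutive sites.

    Fragments with at most `width` sites produce one window covering all
    of them (an assumption made for this sketch).
    """
    if n_sites <= width:
        yield (0, n_sites)
        return
    for start in range(n_sites - width + 1):
        yield (start, start + width)

# Example: split a 7-site SNP fragment into windows of 5 sites.
fragment = "0001101"
windows = [fragment[s:e] for s, e in sliding_windows(len(fragment))]
print(windows)
```

Each window involves only five SNP sites, so the phasing step works on a bounded number of sites per call regardless of how long the fragment is.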
RESULT
Dataset
•  Dilution-based sequencing
–  Kaper’s data
–  Duitama’s data
•  True haplotypes
–  Trio-based haplotypes
•  True NFs and CFs
–  Defined by the true haplotypes
CSP distribution
•  The CSP of CFs is lower than that of NFs.
•  Theoretical lowest value (W = 5)
–  The haplotype origin changes at the second or third site.
–  Fragment:   00011
–  Haplotypes: 00000 / 11111
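The example fragment can be checked by hand or with a few lines of code. The sketch below counts mismatches between the fragment and every single-switch mosaic of the two haplotypes. The function name is illustrative, and counting mismatches is a simplification of the probabilistic model.

```python
def best_switch_points(fragment, hap_a, hap_b):
    """Return (min_mismatches, switch_positions) for one switch from hap_a to hap_b.

    A switch position b means sites [0, b) are compared with hap_a and
    sites [b, n) with hap_b.
    """
    n = len(fragment)
    mismatches = {}
    for b in range(1, n):
        mosaic = hap_a[:b] + hap_b[b:]
        mismatches[b] = sum(f != m for f, m in zip(fragment, mosaic))
    best = min(mismatches.values())
    return best, [b for b, m in mismatches.items() if m == best]

print(best_switch_points("00011", "00000", "11111"))  # (0, [3])
```

The fragment 00011 differs from 00000 at two sites and from 11111 at three, but a switch after the third site reproduces it with no mismatch. Among the 5-site fragments that a single switch explains perfectly, those switching in the middle of the window are the least consistent with either haplotype alone.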
CF detection
•  CSP is a highly efficient measure for detecting CFs.
SIH accuracy after removing CFs
•  The accuracy of SIH increased significantly after removing the CFs detected by CSP.
CONCLUSION
•  CSP is a highly efficient measure for detecting chimeric fragments.
•  SIH accuracy increased significantly after removing the CF candidates detected using CSP.
