HMMER 3 & Community Profiling
Morgan Langille, UC Davis

HMMER 3: What’s new?
  • Much faster: roughly 100x HMMER 2, about the speed of BLAST
  • More sensitive

What’s new? Alignment column confidence
  • Each residue is given a posterior probability (PP) annotation
      * = 95-100%
      9 = 85-95%
      8 = 75-85%
      etc.
  • Example (fn3 model aligned to 7LESS_DROME; the last line is the PP annotation):

            fn3   2 saPenlsvsevtstsltlsWsppkdgggpitgYeveyqekgegeewqevtvprtttsvtltgLepgteYefrVqavngagegp 84
                    saP   ++ +  ++ l ++W p +  +gpi+gY++++++++++  + e+ vp+ s+   +++L++gt+Y++ +  +n++gegp
    7LESS_DROME 439 SAPVIEHLMGLDDSHLAVHWHPGRFTNGPIEGYRLRLSSSEGNA-TSEQLVPAGRGSYIFSQLQAGTNYTLALSMINKQGEGP 520
                    78999999999*****************************9998.**********************************9997 PP

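The PP key above can be read mechanically. The following is a minimal sketch, assuming the digit pattern continues below 8 the same way (7 = 65-75%, and so on down to 0 = 0-5%) and that "." marks a gap column; the function names are illustrative and not part of HMMER.

```python
def pp_range(ch):
    """Map one PP character to its (low, high) posterior probability in percent.

    Returns None for '.', which marks a gap column with no residue.
    """
    if ch == "*":
        return (95, 100)
    if ch == ".":
        return None
    if ch in "0123456789":
        d = int(ch)
        # Assumed pattern: '0' covers 0-5%, any other digit d covers
        # (10d - 5)% to (10d + 5)%, which matches 9 = 85-95% and 8 = 75-85%.
        return (0, 5) if d == 0 else (10 * d - 5, 10 * d + 5)
    raise ValueError(f"unexpected PP character: {ch!r}")


def low_confidence_columns(pp_line, min_low=85):
    """Return 0-based columns whose lower probability bound is under min_low."""
    cols = []
    for i, ch in enumerate(pp_line):
        rng = pp_range(ch)
        if rng is not None and rng[0] < min_low:
            cols.append(i)
    return cols


# A shortened PP string in the style of the example alignment.
print(low_confidence_columns("78999*9998.*9997"))  # prints [0, 1, 9, 15]
```

In the example alignment, the columns flagged this way would be the ends of the domain and the residues next to the gap, which is where alignment placement is least certain.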
What’s new? Sequence scores, not alignment scores
  • Scoring just a single best alignment can break down when the target is a remote homolog
  • HMMER 3 scores sequences by integrating over alignment uncertainty

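The difference between the two scoring ideas can be shown with a toy calculation. This is only a sketch under a strong simplification: it assumes the log-probabilities of every candidate alignment are already listed, whereas a real profile HMM search sums over alignments with dynamic programming (the Forward algorithm) rather than by enumeration. The numbers are made up.

```python
import math


def best_alignment_score(log_probs):
    """Score a sequence by its single most probable alignment."""
    return max(log_probs)


def integrated_score(log_probs):
    """Score a sequence by summing probability over all alignments (log-sum-exp)."""
    m = max(log_probs)
    return m + math.log(sum(math.exp(lp - m) for lp in log_probs))


# Close homolog: one alignment dominates, so the two scores nearly agree.
clear = [-10.0, -20.0, -21.0]
# Remote homolog: many alignments are about equally plausible, so the
# single best one understates the total evidence.
remote = [-30.0, -30.2, -30.1, -30.3]

for name, lps in (("clear", clear), ("remote", remote)):
    gain = integrated_score(lps) - best_alignment_score(lps)
    print(f"{name}: integrating adds {gain:.3f} nats over the best alignment")
```

The gain is close to zero in the first case and more than one nat in the second, which is the situation the slide describes for remote homologs.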
Single Sequence Queries
  • phmmer ≈ BLASTP: search a sequence against a sequence database
  • jackhmmer ≈ PSI-BLAST: iteratively search a sequence against a sequence database
  • Internally, both build a profile HMM from the query sequence and then run an HMM search

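As a rough illustration of how the two programs are invoked, the sketch below only builds the command lines; it does not run them. The file names are hypothetical, and the iteration limit is passed with jackhmmer's -N option.

```python
import shlex


def phmmer_cmd(query_fasta, seq_db):
    """Command for a single search of a query sequence against a sequence database."""
    return ["phmmer", query_fasta, seq_db]


def jackhmmer_cmd(query_fasta, seq_db, max_iterations=5):
    """Command for an iterative search, capped at max_iterations rounds."""
    return ["jackhmmer", "-N", str(max_iterations), query_fasta, seq_db]


# Hypothetical input files, used only to show the argument order.
for cmd in (phmmer_cmd("query.fasta", "proteins.fasta"),
            jackhmmer_cmd("query.fasta", "proteins.fasta", max_iterations=3)):
    print(shlex.join(cmd))
```

With HMMER installed, either list could be handed to subprocess.run(cmd, check=True).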
Small Changes
  • hmmpfam -> hmmscan: search a sequence against a profile HMM database
  • hmmcalibrate -> built into hmmbuild
  • hmmpress: creates binary HMM files so hmmscan is faster (similar idea to formatting BLAST databases with formatdb)
  • New output format options
      --tblout (sequence score, best domain score)
      --domtblout (sequence score, all domain scores with coordinates)
      Both give tabular (whitespace-delimited) output without alignments, about 1/5 the file size of the regular output

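A minimal sketch of working with --tblout output follows. It assumes the column layout described in the HMMER 3 user guide (target name, accession, query name, accession, then full-sequence E-value, score and bias, then best-domain E-value, score and bias, and so on, with a free-text description last); beta releases may differ. The file names and the sample line are made up for illustration.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    target: str            # profile name when the table comes from hmmscan
    query: str             # sequence name when the table comes from hmmscan
    seq_evalue: float
    seq_score: float
    best_dom_score: float


def hmmscan_cmd(hmm_db, seq_file, tbl_path, domtbl_path):
    """Build an hmmscan command that writes both tabular outputs."""
    return ["hmmscan", "--tblout", tbl_path, "--domtblout", domtbl_path,
            hmm_db, seq_file]


def parse_tblout(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield one Hit per data line, skipping '#' comment lines and blanks."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        # 18 fixed columns, then a description that may contain spaces.
        f = line.split(None, 18)
        yield Hit(target=f[0], query=f[2], seq_evalue=float(f[4]),
                  seq_score=float(f[5]), best_dom_score=float(f[8]))


sample = [
    "# target name  accession  query name  accession  E-value  score ...",
    "fn3 PF00041.13 7LESS_DROME - 1.9e-57 178.0 0.4 1.2e-16 47.2 0.9 "
    "9.4 4 1 1 9 9 6 6 Fibronectin type III domain",
]
for hit in parse_tblout(sample):
    print(hit.target, hit.query, hit.seq_score, hit.best_dom_score)
```

Splitting on any whitespace with a fixed maximum keeps the multi-word description in one field.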
Upcoming changes
  • Parallelization: multi-threaded, MPI (cluster), GPU
  • Translated comparisons: BLASTX, TBLASTN, TBLASTX
  • More input sequence formats: GenBank, EMBL, etc.; Clustal format

Problems/Issues: hmmconvert
  • Used to convert HMMER2 profiles into HMMER3 profiles
  • Only converts the file format
      Good: get HMMER3 speedup
      Bad: get HMMER2 sensitivity/specificity
  • Should rebuild old HMMER2 HMMs using hmmbuild

Glocal vs. local alignments
  • Local: any portion of the HMM can align to any portion of the sequence
  • Glocal: the entire HMM is aligned to any portion of the sequence
  • HMMER2: had both, but local was not as sensitive as glocal
  • HMMER3: local was improved to the point that glocal was thought unnecessary (and was not included in HMMER3)
  • However, some models do very poorly: short, extremely diverse seed alignments, such as zinc finger transcription factors, may be missed

Community Profiling

Phylogenetic profiling
  • Wu, et al., PLOS Genetics, 2005
  • For C. hydrogenoformans, identified presence or absence of homologs in all other completely sequenced genomes
  • Identified many hypothetical proteins that had the same profile as sporulation proteins

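The idea of matching profiles can be sketched in a few lines. This is an illustration only, not the method of the cited paper: it assumes each gene has a presence/absence vector over the same ordered list of genomes and groups genes whose vectors are identical. Gene names and values are invented.

```python
from collections import defaultdict


def group_by_profile(presence):
    """Group genes that share an identical presence/absence profile.

    presence maps gene name -> sequence of 0/1 values, one per genome,
    in the same genome order for every gene.
    """
    groups = defaultdict(list)
    for gene, profile in presence.items():
        groups[tuple(profile)].append(gene)
    # Keep only profiles shared by at least two genes.
    return sorted(sorted(genes) for genes in groups.values() if len(genes) > 1)


# Hypothetical genes scored against five hypothetical genomes.
presence = {
    "known_sporulation_gene": [1, 1, 0, 0, 1],
    "hypothetical_0001":      [1, 1, 0, 0, 1],
    "hypothetical_0002":      [0, 1, 1, 1, 0],
    "housekeeping_gene":      [1, 1, 1, 1, 1],
}
print(group_by_profile(presence))
```

Here the first hypothetical protein shares its profile with the known sporulation gene, which is the kind of association used to suggest a function.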
Community Profiling
  • KEGG, COG
  • DeLong, et al., Science, 2006

Community Profiling
  • Look across multiple metagenomic samples
  • Gene families that have similar profiles may have similar functions
  • Similar to using co-expression to identify genes with similar functions

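One way to make "similar profiles" concrete is a correlation between the abundance of two gene families across the same samples. The slides do not say which similarity measure was used, so Pearson correlation here is an assumption, and the family names and counts are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no co-variation signal
    return cov / (sx * sy)


# Hypothetical hit counts for three gene families in five samples.
profiles = {
    "family_A": [12, 30, 5, 44, 9],
    "family_B": [10, 28, 6, 40, 11],   # rises and falls with family_A
    "family_C": [40, 8, 35, 3, 30],    # roughly the opposite pattern
}
for other in ("family_B", "family_C"):
    r = pearson(profiles["family_A"], profiles[other])
    print(f"family_A vs {other}: r = {r:.2f}")
```

Families with a high correlation would be candidates for a shared function, in the same spirit as co-expression analysis.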
So what have I done?
  • Downloaded the GOS peptide file
      41M sequences, 80 samples
      43 GB -> 7 GB, by removing extra information
      Split into ~100 smaller files
  • Downloaded HMMER 3 Pfams (email request)
      Containing 11098 Pfams
  • Ran hmmscan on genbeo
      4 days later: 12.5M Pfam predictions
      Some sequences contain >1 Pfam
      9643 Pfams
  • Used “cluster” to group genes and samples

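The preprocessing steps above (dropping extra header information and splitting the file) might look like the sketch below. How the original file was actually trimmed and split is not stated, so keeping only the first word of each header and dealing records out round-robin are both assumptions, and the example headers are invented.

```python
def read_fasta(lines):
    """Yield (seq_id, sequence) pairs, keeping only the first word of each header."""
    seq_id, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(parts)
            words = line[1:].split()
            seq_id, parts = (words[0] if words else ""), []
        elif seq_id is not None:
            parts.append(line)
    if seq_id is not None:
        yield seq_id, "".join(parts)


def split_round_robin(records, n_chunks):
    """Deal records into n_chunks lists of nearly equal size."""
    chunks = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(records):
        chunks[i % n_chunks].append(rec)
    return chunks


# Invented records with verbose headers, standing in for the peptide file.
fasta = [
    ">pep_1 /sample=S1 /length=6 /note=extra", "MKTAYI",
    ">pep_2 /sample=S1 /length=8", "MSLN", "PQRT",
    ">pep_3 /sample=S2 /length=5", "MAAGK",
]
chunks = split_round_robin(read_fasta(fasta), 2)
for i, chunk in enumerate(chunks):
    print(i, chunk)
```

Each chunk could then be written to its own file and passed to a separate hmmscan run.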
Results
  [Heat map: Pfams by GOS metagenomic samples]
  • Red = above avg. number of Pfams
  • Green = below avg. number of Pfams
  • Have not normalized for number of sequences per sample or for number of Pfams

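To show why the missing normalization matters, here is a small sketch that divides each count by the number of sequences in its sample and then colors cells relative to the row average. The exact coloring rule behind the heat map is not given in the slides, so this rule, the identifiers, and the numbers are all assumptions.

```python
def normalize(counts, seqs_per_sample):
    """Convert counts[pfam][sample] into hits per sequence in that sample."""
    return {
        pfam: {s: c / seqs_per_sample[s] for s, c in row.items()}
        for pfam, row in counts.items()
    }


def color_row(row):
    """Label each sample red (above the row mean) or green (below it)."""
    mean = sum(row.values()) / len(row)
    return {
        s: "red" if v > mean else "green" if v < mean else "black"
        for s, v in row.items()
    }


# Invented example: sample S2 has four times as many sequences as S1.
counts = {"pfam_X": {"S1": 10, "S2": 20}}
seqs_per_sample = {"S1": 100, "S2": 400}

raw_colors = color_row(counts["pfam_X"])
norm_colors = color_row(normalize(counts, seqs_per_sample)["pfam_X"])
print("raw:       ", raw_colors)
print("normalized:", norm_colors)
```

On raw counts S2 looks enriched simply because it is a larger sample; after normalization the enrichment is in S1.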
Example of phage Pfams clustering together

Future Community Profiling
  • Include other (all) metagenomic samples
  • Try to group Pfams by GO category to see how strong the correlation is between branch length and function
  • Examine whether some functional categories are more easily predicted by this profiling strategy (e.g. HGTs)
  • Identify novel gene families and sub-families
      Clustering genes, building HMMs, scanning, ... repeat
      Community profiling may help in annotation of these
