This document discusses assembly tools and visualization software. It describes assemblers like ABySS and SOAPdenovo that use de Bruijn graphs to assemble genomes from short reads. It also discusses tools for visualizing assemblies, like Tablet and ABySS-Explorer. Finally, it covers read mapping with SAM/BAM formats and tools like BWA, and visualization of mappings with Artemis and IGV.
11. Read mapping
http://samtools.sourceforge.net/SAM1.pdf
• SAM / BAM
• Sequence Alignment / Map format (SAM)
• Binary form of SAM (BAM)
• Generic format
• Flexible and simple
• Compact (BAM)
• Allow indexing
• Load regions
• Support streaming
11 25.04.11 Assemblers
12. SAM
• Header
• File format version information
• Sequence dictionary (name/length/..)
• Read group (platform/library/...)
• Program info
• Body
• Alignment information
13. SAM Header
• '@' followed by record type (two characters)
@HD VN:1.0
@SQ SN:chr20 LN:62435964
@RG ID:L1 PU:SC_1_10 LB:SC_1 SM:NA12891
@RG ID:L2 PU:SC_2_12 LB:SC_2 SM:NA12891
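The header layout above can be read with a few lines of Python. This is a minimal sketch, not part of any SAM library: the function name `parse_sam_header_line` is hypothetical, and it assumes the fields are tab-separated `TAG:VALUE` pairs as in the example lines (free-text `@CO` comment lines would need separate handling).

```python
def parse_sam_header_line(line):
    """Split one SAM header line into (record_type, {tag: value})."""
    if not line.startswith("@"):
        raise ValueError("SAM header lines start with '@'")
    fields = line.rstrip("\n").split("\t")
    record_type = fields[0][1:]  # two-character type, e.g. "HD", "SQ", "RG"
    tags = {}
    for field in fields[1:]:
        # Split on the first colon only, so values may contain colons.
        tag, _, value = field.partition(":")
        tags[tag] = value
    return record_type, tags


# The header lines from the slide, written with explicit tab separators.
header = [
    "@HD\tVN:1.0",
    "@SQ\tSN:chr20\tLN:62435964",
    "@RG\tID:L1\tPU:SC_1_10\tLB:SC_1\tSM:NA12891",
    "@RG\tID:L2\tPU:SC_2_12\tLB:SC_2\tSM:NA12891",
]
records = [parse_sam_header_line(line) for line in header]

# Collect the read groups and the sample each one belongs to.
read_groups = {tags["ID"]: tags["SM"] for rtype, tags in records if rtype == "RG"}
print(read_groups)
```

Both read groups in the example point to the same sample (`SM`), which is how several lanes or libraries of one individual are kept apart in a single file.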
14. SAM Alignment
• Tab delimited lines
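As an illustration of the tab-delimited layout, the sketch below splits one alignment line into its columns. The slide does not list the columns; the eleven mandatory field names used here are taken from the SAM specification (older versions of the specification name some of them differently), and the example record is invented for illustration only.

```python
from collections import namedtuple

# The eleven mandatory columns, in file order (names per the SAM specification).
SAM_FIELDS = ["qname", "flag", "rname", "pos", "mapq", "cigar",
              "rnext", "pnext", "tlen", "seq", "qual"]
INT_FIELDS = {"flag", "pos", "mapq", "pnext", "tlen"}

Alignment = namedtuple("Alignment", SAM_FIELDS + ["optional"])


def parse_alignment(line):
    """Parse one tab-delimited SAM alignment line into an Alignment tuple."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < len(SAM_FIELDS):
        raise ValueError("expected at least 11 tab-separated columns")
    values = [int(col) if name in INT_FIELDS else col
              for name, col in zip(SAM_FIELDS, cols)]
    # Anything after column 11 is an optional TAG:TYPE:VALUE field.
    return Alignment(*values, optional=cols[11:])


# Hypothetical record, made up for this example.
line = "read_1\t99\tchr20\t100\t60\t8M\t=\t250\t158\tACGTACGT\tIIIIIIII\tNM:i:0"
aln = parse_alignment(line)

is_paired = bool(aln.flag & 0x1)    # bit 0x1: read is part of a pair
is_reverse = bool(aln.flag & 0x10)  # bit 0x10: read maps to the reverse strand
print(aln.rname, aln.pos, aln.cigar, is_paired, is_reverse)
```

The FLAG column is a bit field, so individual properties are tested with a bitwise AND rather than by comparing the whole number.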
17. BWA
• Burrows-Wheeler Alignment Tool
• Map (single-end/paired-end/long) reads to a sequence
• Index database
• bwa index -a bwtsw database.fasta
• Align reads
• bwa aln database.fasta short_read.fastq > aln_sa.sai
• Generate alignments
• bwa sampe database.fasta aln_sa1.sai aln_sa2.sai read1.fq read2.fq > aln.sam
• Long reads
• bwa bwasw database.fasta long_read.fastq > aln.sam
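The paired-end steps above can be chained as follows. This is a sketch that only assembles the command lines already shown on the slide; it does not run BWA. The function name `bwa_paired_end_commands` is hypothetical, and it assumes that `bwa aln` is run once per read file to produce the two `.sai` files that `bwa sampe` expects.

```python
import shlex


def bwa_paired_end_commands(reference, reads1, reads2, out_sam="aln.sam"):
    """Return (argv, stdout_file) pairs for a paired-end BWA run.

    stdout_file is None when the command writes no result to stdout.
    """
    sai1, sai2 = "aln_sa1.sai", "aln_sa2.sai"
    return [
        # 1. Index the reference once.
        (["bwa", "index", "-a", "bwtsw", reference], None),
        # 2. Align each read file separately.
        (["bwa", "aln", reference, reads1], sai1),
        (["bwa", "aln", reference, reads2], sai2),
        # 3. Combine both alignments into paired-end SAM output.
        (["bwa", "sampe", reference, sai1, sai2, reads1, reads2], out_sam),
    ]


commands = bwa_paired_end_commands("database.fasta", "read1.fq", "read2.fq")
for argv, stdout_file in commands:
    shell_line = " ".join(shlex.quote(arg) for arg in argv)
    if stdout_file is not None:
        shell_line += " > " + shlex.quote(stdout_file)
    print(shell_line)
```

Keeping each command as an argument list makes it straightforward to hand the same data to `subprocess.run` later, with the second element opened as the output file.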
18. SAM tools
• Utilities for SAM format
• samtools <command> ...
• Commands:
• view: SAM <-> BAM
• sort: sort BAM file
• index: build BAM file index
• merge: merge multiple BAM files
• pileup: alignment in the pileup format
• tview: integrated Text alignment viewer
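A typical use of `view`, `sort` and `index` is to turn the SAM output of a mapper into a sorted, indexed BAM file that a viewer can load by region. The sketch below only builds the command lines and does not run them. The function name `samtools_commands` and the file names are hypothetical, and the option syntax assumes a samtools 1.x release; older releases, such as those current when these slides were written, take an output prefix for `sort` instead of `-o`.

```python
def samtools_commands(sam_path, prefix="aln"):
    """Return argv lists for SAM -> BAM -> sorted BAM -> index (samtools 1.x syntax)."""
    bam = prefix + ".bam"
    sorted_bam = prefix + ".sorted.bam"
    return [
        # view: convert SAM to BAM (-b selects BAM output).
        ["samtools", "view", "-b", "-o", bam, sam_path],
        # sort: order alignments by reference position.
        ["samtools", "sort", "-o", sorted_bam, bam],
        # index: build the index for the sorted file.
        ["samtools", "index", sorted_bam],
    ]


steps = samtools_commands("aln.sam")
for argv in steps:
    print(" ".join(argv))
```

The index is built on the sorted file because region lookups depend on alignments being ordered by position.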
19. Visualisation: Integrative Genomics Viewer
http://www.broadinstitute.org/igv/
• IGV
• Good integration
• Formats
• DAS
• BAM
• GFF
• ...
• Tools
• Run scripts
• Export region
• ...