Taming Snakemake
1/27/14
Some notes about transitioning from Make to Snakemake and its benefits.

Published in: Technology

Transcript of "Taming Snakemake"

  1. Taming Snakemake, 1/27/14
  2. Why Make?
     What are Make's advantages (over Perl and shell scripts)?
     • Make forces you to think about file transformation in terms of inputs and outputs, recipes and rules. In Perl you are forced to think at the level of variables, conditionals, and loops. In shell you are forced to think like a caveman.
     • Unfortunately, bioinformatics is still largely about files and their suffixes. Make has a very powerful syntax based almost entirely around file suffixes.
     • Make knows what's been made and what hasn't. Make can be interrupted and restarted safely, without overwriting finished work.
     • Make knows what's changed and what hasn't. If an input is newer than an output, it will attempt to rebuild the output.
     • Make allows you to add new input files without worrying about overwriting old ones.
     • Make is well supported. There are 1333 Make questions on Stack Overflow alone.
     • When people see a Makefile, they immediately know how to run it.
     • Make does not force you to wrap shell statements in quotes.
     • Make is a DSL. It will attempt to validate your syntax.
     • Make is ancient, ubiquitous, and reliable.
     • Make can parallelize with --jobs.
     • Make recipes encourage reuse.
     https://share.chop.edu/pages/viewpage.action?pageId=138478819
  3. Make review
     http://github.research.chop.edu/BiG/err_chip_seq/blob/master/Makefile
  4. Pipelines and Workflows
  5. Other pipelines
     • Ruffus
     • Queue
     • GKNO
  6. Why Snakemake?
     • Addresses Makefile weaknesses without throwing out the good stuff:
         • Difficult to implement control flow
         • No cluster support
         • Inflexible wildcards
         • Too much reliance on sentinel files
         • No reporting mechanism
     Johannes Köster
  7. Syntax
     • Make
     • Variables
     • Targets
     • Rules
     • Snakemake
  8. Utilities
     • Logs: wire them up manually
     • Cluster support pretty decent
         source /nas/is1/leipzig/martin/variome-env/bin/activate
         snakemake --directory /nas/is1/leipzig/martin/snake-env --snakefile /nas/is1/leipzig/martin/snake-env/Snakefile -c qsub -j 16
         source /mnt/isilon/cbmi/variome/leipzig/martin/respublica-env/bin/activate
         snakemake --directory /mnt/isilon/cbmi/variome/leipzig/martin/snake-env --snakefile /mnt/isilon/cbmi/variome/leipzig/martin/snake-env/Snakefile -c qsub -j 16
     • Cores/jobs/resources
  9. Useful stuff
     • dry-runs
     • keep-going
     • touch
     • version changes
     • workflow diagrams
  10. Python legal
  11. Client websites with Jekyll
      • Jekyll is a templating engine for blogs that accepts Markdown
      • Layouts use the Liquid markup
      http://mitomap.org/martin-rnaseq/
  12. A workflow that reports itself
  13. Avoiding Sweave-Hell
  14. The bad way
  15. Caching chunks?
  16. Avoiding Sweave-Hell
  17. Avoiding Sweave-Hell
  18. R/Snakemake integration
      git submodule add git@github.research.chop.edu:BiG/rna-seq-common-functions.git common/rna-seq
  19. Leave a paper trail
  20. Reproducible Checklist
      • repository github.research.chop.edu
      • workflow of some kind from beginning to end
      • website at mybic.chop.edu
  21. Ties that bind
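
The "Why Make?" slide says Make knows what has been made and what has changed: an output is rebuilt when it is missing or when an input is newer than it. A minimal Python sketch of that timestamp rule follows. It is an illustration only, not Make's or Snakemake's actual implementation; the function names `needs_rebuild` and `demo` and the file names `sample.fastq` and `sample.bam` are hypothetical.

```python
import os
import tempfile


def needs_rebuild(output, inputs):
    """Return True if `output` is missing or older than any file in `inputs`."""
    if not os.path.exists(output):
        return True
    out_mtime = os.path.getmtime(output)
    # Rebuild only when at least one input is strictly newer than the output.
    return any(os.path.getmtime(src) > out_mtime for src in inputs)


def demo():
    """Walk one input/output pair through the missing, fresh, and stale states."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "sample.fastq")  # hypothetical input file
        out = os.path.join(tmp, "sample.bam")    # hypothetical output file

        with open(src, "w") as fh:
            fh.write("reads\n")
        os.utime(src, (1000, 1000))              # pin the input's mtime
        missing = needs_rebuild(out, [src])      # output does not exist yet

        with open(out, "w") as fh:
            fh.write("alignments\n")
        os.utime(out, (2000, 2000))              # output is newer than input
        fresh = needs_rebuild(out, [src])

        os.utime(src, (3000, 3000))              # input changed after the build
        stale = needs_rebuild(out, [src])

        return missing, fresh, stale


if __name__ == "__main__":
    print(demo())
```

Timestamps are set explicitly with `os.utime` so the three states do not depend on filesystem clock resolution. The same rule is what makes an interrupted run safe to restart: finished outputs are newer than their inputs and are skipped.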