This presentation covers processing protein structure data with Ruby. It touches on protein sequences and structures, raw distance distribution function data for a particle, and the limitations of existing software. It then shows how Ruby can wrap external programs and write shell scripts to run them, and how Rake can define relationships between program inputs and outputs so that each program launches once its dependencies are satisfied. Ruby supports both quick prototypes and more robust frameworks for processing and analyzing protein data.
6. Existing Software
Svergun group @ EMBL
http://www.embl-hamburg.de/ExternalInfo/Research/Sax/software.html
Works well, but...
requires running each program multiple times
“interactive” interfaces
not easily scriptable
no really... you have to see it to believe it
7. Help from Ruby
We want to use Linux clusters with hundreds of CPUs
Ruby
wrap external programs
write shell scripts to run external programs
Rake
define relationships between inputs/outputs of different programs
launch external programs after dependencies are satisfied
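A minimal sketch of the Rake idea on this slide, assuming a two-step pipeline in which the output of one program is the input of the next. The file names and the stand-in command are hypothetical: the stand-in is a one-line Ruby script playing the role of an external analysis program, so the example runs without any SAXS software installed. `file`, `sh`, and `Rake::Task#invoke` are standard Rake.

```ruby
require "rake"
require "rbconfig"
require "tmpdir"

# Stand-in for an external analysis program (an assumption for illustration):
# it copies its input file to its output file and appends a marker line.
# A real pipeline would put the actual executable and its arguments here.
stand_in = [RbConfig.ruby, "-e",
            'File.write(ARGV[1], File.read(ARGV[0]) + "# processed\n")']

workdir = Dir.mktmpdir("pipeline")
raw     = File.join(workdir, "sample.dat")    # hypothetical raw data file
profile = File.join(workdir, "sample.out")    # hypothetical output of step 1
model   = File.join(workdir, "sample.model")  # hypothetical output of step 2

File.write(raw, "raw data\n")

# Each file task names an output, the input it depends on,
# and the command that produces one from the other.
file profile => raw do |t|
  sh(*stand_in, t.prerequisites.first, t.name)
end

file model => profile do |t|
  sh(*stand_in, t.prerequisites.first, t.name)
end

# Asking for the final file makes Rake run the earlier step first.
Rake::Task[model].invoke
```

Asking for the last file is enough: Rake works backwards through the declared dependencies, and a file task is skipped when its output already exists and is newer than its input.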
8. Do more with Ruby
quick and dirty...
Define input parameters in a script
Define common tasks in a library
more robust...
Ruby API for running commands
More sophisticated information processing
Evolve towards a micro-framework
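One way the "Ruby API for running commands" could look, as a sketch only. The class name `ExternalProgram`, the parameter names, and the stand-in program are all hypothetical and not taken from the talk. The sketch assumes the program is "interactive" in the sense that it reads its answers from standard input, so the wrapper feeds the answers defined in the script through `Open3.capture3` (standard library) instead of requiring someone to type them.

```ruby
require "open3"
require "rbconfig"

# Hypothetical wrapper: one object per external program.
class ExternalProgram
  Result = Struct.new(:stdout, :stderr, :status) do
    def success?
      status.success?
    end
  end

  def initialize(executable, *fixed_args)
    @executable = executable
    @fixed_args = fixed_args
  end

  # Runs the program without a shell and answers its prompts
  # by writing one answer per line to its standard input.
  def run(answers)
    stdin_text = answers.map { |a| "#{a}\n" }.join
    out, err, status = Open3.capture3(@executable, *@fixed_args,
                                      stdin_data: stdin_text)
    Result.new(out, err, status)
  end
end

# Input parameters defined in the script (hypothetical names and values).
params = { input_file: "sample.dat", max_distance: 80 }

# Stand-in for an interactive program: reads two answers, prints a summary.
prompt_reader = 'name = $stdin.gets.chomp; dmax = $stdin.gets.chomp; ' \
                'puts "fitting #{name} with max distance #{dmax}"'
program = ExternalProgram.new(RbConfig.ruby, "-e", prompt_reader)

result = program.run([params[:input_file], params[:max_distance]])
puts result.stdout if result.success?
```

Keeping the parameters in a plain hash is the "quick and dirty" end of the slide; collecting wrappers like this in a shared library is a step towards the more robust end.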
9. Acknowledgements
Lab (Scripps Research Institute)
John Tainer
Scott Williams
Chris Putnam
Data Collection
Beamline 12.3.1, The Advanced Light Source (ALS, LBNL)
Funding
NIH, DOE, NCI