A Kanterakis - PyPedia: a python crowdsourcing development environment for bioinformatics and computational biology
1. PyPedia
The free programming environment
that anyone can edit!
Alexandros Kanterakis
Genomics Coordination Center, Department of Genetics,
University Medical Center, Groningen, The Netherlands
3. How not to be a bioinformatician
• Stay low level at every level
• Be open source without being open
• Make tools that make no sense to scientists
• Do not ever share your results and do not reuse
• Never maintain your databases and web services
• Be unreachable and isolated
4. So, you think you can be a
bioinformatician…
• Imagine you only have: A personal computer
with a browser and an Internet connection
• Answer the following question:
- Who is the current prime minister of Latvia?
5. SYTYCBAB
• Imagine you only have: A personal computer with
a browser and an Internet connection
• Answer the following question:
Compute the Hardy-Weinberg equilibriums of a set of
genotypes
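As a reference point for the exercise above, here is a minimal Python sketch of a Hardy-Weinberg check for a biallelic marker. It is not PyPedia's implementation: the function name `hardy_weinberg_chi2` and the genotype encoding (pairs of allele letters) are assumptions made for illustration, and the code targets Python 3.

```python
from collections import Counter


def hardy_weinberg_chi2(genotypes):
    """Return (expected_counts, chi_square) for a biallelic marker.

    genotypes: list of 2-tuples of allele symbols, e.g. ('A', 'G').
    (Hypothetical helper, not part of the PyPedia library.)
    """
    n = len(genotypes)
    # Count every allele copy; each genotype contributes two.
    allele_counts = Counter(allele for g in genotypes for allele in g)
    if len(allele_counts) != 2:
        raise ValueError("expected exactly two distinct alleles")
    a1, a2 = sorted(allele_counts)
    p = allele_counts[a1] / (2 * n)  # frequency of the first allele
    q = 1 - p                        # frequency of the second allele

    # Observed genotype counts, ignoring allele order within a genotype.
    observed = Counter(tuple(sorted(g)) for g in genotypes)

    # Expected counts under Hardy-Weinberg: p^2, 2pq, q^2 times n.
    expected = {
        (a1, a1): p * p * n,
        (a1, a2): 2 * p * q * n,
        (a2, a2): q * q * n,
    }
    chi2 = sum((observed.get(k, 0) - e) ** 2 / e for k, e in expected.items())
    return expected, chi2


# A sample that sits exactly at equilibrium (p = q = 0.5).
sample = [("A", "A")] * 25 + [("A", "G")] * 50 + [("G", "G")] * 25
expected, chi2 = hardy_weinberg_chi2(sample)
print(expected, chi2)
```

A chi-square close to zero means the observed genotype counts match the Hardy-Weinberg expectation; turning the statistic into a p-value would need a chi-square distribution with one degree of freedom.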
Execute
Source
Documentation
6. Execute
Source
Documentation
But what about…
? Web environment, online execution
? Open Source
? Integrate with other tools
? Edit a method and share it
? Examples and Unit tests
? Deploy in the cloud
? Frequency of new releases
7. A python sandbox to the rescue
From:
http://wiki.python.org/moin/SandboxedPython
So:
Google App Engine + MediaWiki = PyPedia
12. Executing a method in a remote computer
• Edit your user page and add an “ssh” section:
==ssh==
host=ec2-107-22-59-115.compute-1.amazonaws.com
username=JohnDoe
path=/home/JohnDoe/runPyPedia
• This content is NOT shown to anyone
• Install the PyPedia client on the remote
computer (details on pypedia.com)
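To make the format of the “ssh” section concrete, the following sketch reads the `key=value` lines under a `==ssh==` heading into a dictionary. How PyPedia itself parses the section is not stated on the slide; the function name `parse_ssh_section` and the parsing rules are assumptions for illustration (Python 3).

```python
def parse_ssh_section(page_text):
    """Extract key=value settings from the "==ssh==" section of a user page.

    Hypothetical helper: it assumes MediaWiki-style "==Heading==" lines
    delimit sections and that settings are plain key=value lines.
    """
    settings = {}
    in_section = False
    for raw_line in page_text.splitlines():
        line = raw_line.strip()
        # A heading line switches the current section on or off.
        if line.startswith("==") and line.endswith("==") and len(line) > 4:
            in_section = line.strip("=").strip().lower() == "ssh"
            continue
        if in_section and "=" in line:
            # Split on the first "=" only, so values may contain "=".
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


page = """==ssh==
host=ec2-107-22-59-115.compute-1.amazonaws.com
username=JohnDoe
path=/home/JohnDoe/runPyPedia
"""
print(parse_ssh_section(page))
```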
13. “Execute on remote computer”
Example:
Fixed_point_user_JohnDoe
The cloud instance contains:
numpy, scipy, matplotlib
Like SAGE but with custom
execution environments
(e.g. BioPython, PyCogent, …)
14. Cool, but I want to call the function from my local computer..
• Install the PyPedia python library:
git clone git://github.com/kantale/pypedia.git
• Load the function in python:
import pypedia
from pypedia import Pairwise_linkage_disequilibrium
Pairwise_linkage_disequilibrium([('A','A'), ('A','G'), ('G','G'),
('G','A')], [('A','A'), ('A','G'), ('G','G'), ('A','A')])
{'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG',
2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498,
0.3125), ('GG', 0.37499999997393502, 0.1875)], 'R_sq':
0.59999999983318408, 'Dprime': 0.99999999986098675}
• You can call the method of any user and your method can be
called by anyone.
• Edit locally, push changes.
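The `R_sq` and `Dprime` values in the output above can be checked by hand from the haplotype frequencies. The sketch below applies the standard definitions of D, D' and r² to those frequencies, rounded to 0.5, 0.0, 0.125 and 0.375. It does not estimate haplotype frequencies from unphased genotypes, which the PyPedia method appears to do, and the function name `ld_from_haplotypes` is hypothetical (Python 3).

```python
def ld_from_haplotypes(hap_freqs):
    """Compute (D, D', r^2) from two-locus haplotype frequencies.

    hap_freqs maps two-letter haplotypes (allele at locus 1, allele at
    locus 2) to frequencies that sum to 1. Hypothetical helper.
    """
    # Use the alphabetically first allele at each locus as the reference.
    a = min(h[0] for h in hap_freqs)
    b = min(h[1] for h in hap_freqs)
    p_a = sum(f for h, f in hap_freqs.items() if h[0] == a)
    p_b = sum(f for h, f in hap_freqs.items() if h[1] == b)

    # D: observed haplotype frequency minus the product of allele frequencies.
    d = hap_freqs.get(a + b, 0.0) - p_a * p_b

    # D' normalises D by its maximum possible magnitude given p_a and p_b.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_sq = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_sq


# Haplotype frequencies from the output above, rounded.
freqs = {"AA": 0.5, "AG": 0.0, "GA": 0.125, "GG": 0.375}
d, d_prime, r_sq = ld_from_haplotypes(freqs)
print(d, d_prime, r_sq)
```

With these inputs D is 0.1875, D' is 1 and r² is about 0.6, in line with the `Dprime` and `R_sq` values printed on the slide.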
15. • On the top of each article there is a button:
• Creates a personalized version of the article that only
you can edit.
• This is similar to GitHub’s “fork” feature.
16. Using PyPedia for open science
• A complete analysis can be hosted in PyPedia
• Any finding generated or published should be
easily shared and reproduced.
• The reproduction of a finding takes time even
when the source code is released.
17. Reproducible science
• PyPedia offers a REST interface:
• www.pypedia.com/index.php?
b_timestamp=YYYYMMDDHHMMSS&
get_code=python code
• Gets the most recent version of the python
code that was edited before the timestamp.
• Reproduce the analysis by sharing a single URL:
http://www.pypedia.com/index.php?b_timestamp=20120102101010&get_code=print
Pairwise_linkage_disequilibrium([('A','A'), ('A','G'), ('G','G'), ('G','A')], [('A','A'), ('A','G'), ('G','G'),
('A','A')])
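A shareable URL of this kind can be assembled programmatically. The sketch below only builds the URL string and makes no network request. It assumes the two parameters named on the slide (`b_timestamp`, `get_code`) are ordinary query parameters that accept standard URL encoding; the helper name `build_reproduce_url` is hypothetical (Python 3).

```python
from datetime import datetime
from urllib.parse import urlencode

BASE_URL = "http://www.pypedia.com/index.php"


def build_reproduce_url(code, before):
    """Build a URL that asks for `code` to be run against the article
    versions that existed before the datetime `before`.

    Hypothetical helper; the parameter names come from the slide, the
    URL-encoding of the code is an assumption.
    """
    params = {
        # Timestamp format from the slide: YYYYMMDDHHMMSS.
        "b_timestamp": before.strftime("%Y%m%d%H%M%S"),
        "get_code": code,
    }
    return BASE_URL + "?" + urlencode(params)


code = (
    "print Pairwise_linkage_disequilibrium("
    "[('A','A'), ('A','G'), ('G','G'), ('G','A')], "
    "[('A','A'), ('A','G'), ('G','G'), ('A','A')])"
)
print(build_reproduce_url(code, datetime(2012, 1, 2, 10, 10, 10)))
```

Pinning the timestamp is what makes the analysis reproducible: later edits to the article do not change what the URL runs.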
19. Meta-webserver
• HTML injection is allowed
and encouraged!
http://www.pypedia.com/index.php/Draw_face_user_Kantale
• Example: run HTML code
posted in a gist:
http://www.pypedia.com/index.php?
run_code=
import urllib2
print urllib2.urlopen(
'https://raw.github.com/gist/2689822/bbea0c43b278d7c4c04b3f7a23ba43f558fba98b/index_full.html').read()
Click me!
20. • All content is under the Simplified BSD License
• Two namespaces:
– Validated articles, e.g. Minor_allele_frequency
• Safe, only admins can edit
– User articles, e.g. Minor_allele_frequency_user_John
• Unsafe, edited by the individual user
– High-quality articles from the User namespace are
promoted to the Validated namespace
– Validated articles cannot call User articles (duh..)
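The two namespaces and the calling rule can be expressed in a few lines. This sketch infers the naming convention (`<method>_user_<username>`) from the article names shown in these slides; the helper names `split_article_name` and `may_call` are hypothetical and do not describe how PyPedia enforces the rule internally (Python 3).

```python
USER_MARKER = "_user_"


def split_article_name(name):
    """Return (method_name, username); username is None for a
    Validated article. Hypothetical helper based on the naming
    pattern seen in the examples."""
    base, marker, user = name.rpartition(USER_MARKER)
    if marker and base and user:
        return base, user
    return name, None


def may_call(caller, callee):
    """Validated articles may not call User articles; every other
    combination is allowed."""
    caller_user = split_article_name(caller)[1]
    callee_user = split_article_name(callee)[1]
    return not (caller_user is None and callee_user is not None)


print(split_article_name("Minor_allele_frequency_user_John"))
print(may_call("Minor_allele_frequency", "Minor_allele_frequency_user_John"))
print(may_call("Minor_allele_frequency_user_John", "Minor_allele_frequency"))
```

Keeping Validated code free of User dependencies means an admin-reviewed method cannot change behaviour because an individual user edited their own article.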
22. Some thoughts
(in the embarrassing occasion I have some minutes left)
Code as wiki, program as wiki concept
• Multidimensional expansion
• As Mao said: Let a thousand flowers scripts bloom (and
some of them rot in hell)
• Minimize the distance:
D_sanity(SCRIPT_made_by_IT_guy, SCRIPT_useful_to_biologists)
• Encyclopedialize™ your scripts because open source isn’t
enough!
Future steps:
• Attract editors, make communities!
• If it can be done in python, why not Ruby, …?