SlideShare ist ein Scribd-Unternehmen logo
1 von 23
PyPedia
The free programming environment
       that anyone can edit!
       AlexandrosKanterakis



        Genomics Coordination Center, Department of Genetics,
        University Medical Center, Groningen, The Netherlands
Introduction
How not to be a bioinformatician
•   Stay low level at every level
•   Be open source without being open
•   Make tools that make no sense to scientists
•   Do not ever share your results and do not reuse
•   Never maintain your databases and web services
•   Be unreachable and isolated
So, you think you can be a
             bioinformatician…
• Imagine you only have: A personal computer
  with a browser and an Internet connection
• Answer the following question:
     - Who is the current prime minister of Latvia?
SYTYCBAB
• Imagine you only have: A personal computer with
  a browser and an Internet connection
• Answer the following question:
 Compute the Hardy-Weinberg equilibriums of a set of
 genotypes
                                                Execute
                                                 Source
                                                Documentation


                                                Execute
                                                 Source
                                                Documentation



                                                Execute
                                                 Source
                                                
                                                Documentation
Execute
 Source
 Documentation
But what about…
? Web environment, online execution
? Open Source
? Integrate with other tools
? Edit a method and share it
? Examples and Unit tests
? Deploy in the cloud
? Frequency of new releases
Apython sandbox to the rescue
From:
http://wiki.python.org/moin/SandboxedPython




So:
Google App Engine + MediaWiki = PyPedia
www.pypedia.com
Code as wiki
HTML input as wiki
Executing a method in a remote computer

• Edit your user page and add an “ssh” section:

                      ==ssh==
                      host=ec2-107-22-59-115.compute-1.amazonaws.com
                      username=JohnDoe
                      path=/home/JohnDoe/runPyPedia




• This content is NOT shown to anyone
• Install the PyPedia client on remote
  computer(details on pypedia.com)
“Execute on remote computer”

Example:
Fixed_point_user_JohnDoe


The cloud instance contains:
numpy, scipy, matplotlib


Like SAGE but with custom
execution environments
(i.eBioPython, PyCogent, …)
Cool, but I want to call the function from my local computer..

• Install the PyPedia python library:
git clone git://github.com/kantale/pypedia.git

• Load the function in python:
 import pypedia
 from pypedia importPairwise_linkage_disequilibrium
Pairwise_linkage_disequilibrium([(A,A), (A,G), (G,G),
   (G,A)], [(A,A), (A,G), (G,G), (A,A)])

   {'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG',
   2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498,
   0.3125), ('GG', 0.37499999997393502, 0.1875)], 'R_sq':
   0.59999999983318408, 'Dprime': 0.99999999986098675}


• You can call the method of any user and your method can be
  called by anyone.
• Edit locally, push changes.
• On the top of each article there is a button:

• Creates a personalized version of the article that only
  you can edit.

• This is similar to the Github’s “fork” feature.
Using PyPedia for open science
• A complete analysis can be hosted in PyPedia

• Any finding generated or published should be
  easily shared and reproduced.

• The reproduction of a finding takes time even
  when the source code is released.
Reproducible science
• PyPedia offers a REST interface:
• www.pypedia.com/index.php?
     b_timestamp=YYYYMMDDHHMMSS
get_code=python code
• Get the most recent version of the python
  code that is edited before the timestamp.

• Reproduce the analysis by sharing a single URL:
  http://www.pypedia.com/index.php?b_timestamp=20120102101010get_code=print
  Pairwise_linkage_disequilibrium([(A,A), (A,G), (G,G), (G,A)], [(A,A), (A,G), (G,G),
  (A,A)])
Reproducing an experiment
# curl 
--data-urlencode 'b_timestamp=20120501010101' 
--data-urlencode 'get_code=print
Pairwise_linkage_disequilibrium([(A,A), (A,G), (G,G),
(G,A)], [(A,A), (A,G), (G,G), (A,A)])' 
http://www.pypedia.com/index.php 
--output code.py

# python code.py
{'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG',
    2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498, 0.3125),
    ('GG', 0.37499999997393502, 0.1875)], 'R_sq': 0.59999999983318408,
    'Dprime': 0.99999999986098675}
Meta-webserver
• HTML injection is allowed
  and encouraged!
http://www.pypedia.com/index.php/Draw_face_user_Kantale



• Example run an HTML code
  posted on gist:
    http://www.pypedia.com/index.php?
      run_code=
            import urllib2
            print urllib2.urlopen(
                ‘https://raw.github.com/gist/2689822/bbea0c43b278d7c4c04
                b3f7a23ba43f558fba98b/index_full.html’).read()
      Click me!
• All content is under the Simplified BSD License
• Two namespaces:
  – Validated articles. i.e: Minor_allele_frequency
     • Safe, only admins can edit
  – User articles. i.e: Minor_allele_frequency_user_John
     • Unsafe, edited by individual user
  – Qualitative articles from User namespace is
    promoted to the Validated namespace
  – Validated articles cannot call User articles (duh..)
Some thoughts
    (in the embarrassing occasion I have some minutes left)

Code as wiki, program as wiki concept
• Multidimensional expansion
• As Mao said: Let a thousand flowers scripts bloom (and
   some of them rot in hell)
• Minimize the distance:
Dsanity(SCRIPTmade_by_IT_guy, SCRIPTuseful_to_biologists)
• Encyclopedialize™ your scripts because open source isn’t
   enough!

Future steps:
• Attract editors, make communities!
• If it can be done in python, why not Ruby, …?
• Contact: admin@pypedia.com
• Source code license: GPL v3
• Content license: Simplified BSD license
• Join us in google groups:
  http://groups.google.com/group/pypedia
• Twitter: @PyPedia

• PyPedia’s source code:
    – Mediawiki extension:
       https://github.com/kantale/PyPedia_server
    – Python library:
    https://github.com/kantale/pypedia

• Acknowledgements:
    – Despoina Antonakaki
    – Kostas Tselios                               Posters:
    – Morris A. Swertz                                 BOSC: 11
                                                       ISMB: E12

Weitere ähnliche Inhalte

Was ist angesagt?

PLOTCON NYC: The Architecture of Jupyter: Protocols for Interactive Data Expl...
PLOTCON NYC: The Architecture of Jupyter: Protocols for Interactive Data Expl...PLOTCON NYC: The Architecture of Jupyter: Protocols for Interactive Data Expl...
PLOTCON NYC: The Architecture of Jupyter: Protocols for Interactive Data Expl...Plotly
 
Make an Instant Website with Webhooks
Make an Instant Website with WebhooksMake an Instant Website with Webhooks
Make an Instant Website with WebhooksAnne Gentle
 
Puppet DSL: back to the basics
Puppet DSL: back to the basicsPuppet DSL: back to the basics
Puppet DSL: back to the basicsJulien Pivotto
 
What is version control software and why do you need it?
What is version control software and why do you need it?What is version control software and why do you need it?
What is version control software and why do you need it?Leonid Mamchenkov
 
Introduction to IPython & Notebook
Introduction to IPython & NotebookIntroduction to IPython & Notebook
Introduction to IPython & NotebookAreski Belaid
 
Do you know all of Puppet?
Do you know all of Puppet?Do you know all of Puppet?
Do you know all of Puppet?Julien Pivotto
 
C# - Raise the bar with functional & immutable constructs (Dutch)
C# - Raise the bar with functional & immutable constructs (Dutch)C# - Raise the bar with functional & immutable constructs (Dutch)
C# - Raise the bar with functional & immutable constructs (Dutch)Rick Beerendonk
 
Open Source Tools for Leveling Up Operations FOSSET 2014
Open Source Tools for Leveling Up Operations FOSSET 2014Open Source Tools for Leveling Up Operations FOSSET 2014
Open Source Tools for Leveling Up Operations FOSSET 2014Mandi Walls
 
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)IPython: A Modern Vision of Interactive Computing (PyData SV 2013)
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)PyData
 
Re-thinking Performance tuning with HTTP2
Re-thinking Performance tuning with HTTP2Re-thinking Performance tuning with HTTP2
Re-thinking Performance tuning with HTTP2Vinci Rufus
 
Gitgithub101slideshare 150922131830-lva1-app6891
Gitgithub101slideshare 150922131830-lva1-app6891Gitgithub101slideshare 150922131830-lva1-app6891
Gitgithub101slideshare 150922131830-lva1-app6891Brian Okinyi
 
Package Management on Windows with Chocolatey
Package Management on Windows with ChocolateyPackage Management on Windows with Chocolatey
Package Management on Windows with ChocolateyPuppet
 

Was ist angesagt? (20)

PLOTCON NYC: The Architecture of Jupyter: Protocols for Interactive Data Expl...
PLOTCON NYC: The Architecture of Jupyter: Protocols for Interactive Data Expl...PLOTCON NYC: The Architecture of Jupyter: Protocols for Interactive Data Expl...
PLOTCON NYC: The Architecture of Jupyter: Protocols for Interactive Data Expl...
 
Make an Instant Website with Webhooks
Make an Instant Website with WebhooksMake an Instant Website with Webhooks
Make an Instant Website with Webhooks
 
Puppet DSL: back to the basics
Puppet DSL: back to the basicsPuppet DSL: back to the basics
Puppet DSL: back to the basics
 
What is version control software and why do you need it?
What is version control software and why do you need it?What is version control software and why do you need it?
What is version control software and why do you need it?
 
Introduction to IPython & Notebook
Introduction to IPython & NotebookIntroduction to IPython & Notebook
Introduction to IPython & Notebook
 
Inside GitHub with Chris Wanstrath
Inside GitHub with Chris WanstrathInside GitHub with Chris Wanstrath
Inside GitHub with Chris Wanstrath
 
Do you know all of Puppet?
Do you know all of Puppet?Do you know all of Puppet?
Do you know all of Puppet?
 
C# - Raise the bar with functional & immutable constructs (Dutch)
C# - Raise the bar with functional & immutable constructs (Dutch)C# - Raise the bar with functional & immutable constructs (Dutch)
C# - Raise the bar with functional & immutable constructs (Dutch)
 
Git Tutorial
Git TutorialGit Tutorial
Git Tutorial
 
Open Source Tools for Leveling Up Operations FOSSET 2014
Open Source Tools for Leveling Up Operations FOSSET 2014Open Source Tools for Leveling Up Operations FOSSET 2014
Open Source Tools for Leveling Up Operations FOSSET 2014
 
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)IPython: A Modern Vision of Interactive Computing (PyData SV 2013)
IPython: A Modern Vision of Interactive Computing (PyData SV 2013)
 
Re-thinking Performance tuning with HTTP2
Re-thinking Performance tuning with HTTP2Re-thinking Performance tuning with HTTP2
Re-thinking Performance tuning with HTTP2
 
Git 101 for Beginners
Git 101 for Beginners Git 101 for Beginners
Git 101 for Beginners
 
Git tutorial
Git tutorialGit tutorial
Git tutorial
 
Gitgithub101slideshare 150922131830-lva1-app6891
Gitgithub101slideshare 150922131830-lva1-app6891Gitgithub101slideshare 150922131830-lva1-app6891
Gitgithub101slideshare 150922131830-lva1-app6891
 
Intro to Jupyter Notebooks
Intro to Jupyter NotebooksIntro to Jupyter Notebooks
Intro to Jupyter Notebooks
 
Git Introduction
Git IntroductionGit Introduction
Git Introduction
 
Introduction to Git
Introduction to GitIntroduction to Git
Introduction to Git
 
git and github
git and githubgit and github
git and github
 
Package Management on Windows with Chocolatey
Package Management on Windows with ChocolateyPackage Management on Windows with Chocolatey
Package Management on Windows with Chocolatey
 

Andere mochten auch

Visual Analytics in Omics - why, what, how?
Visual Analytics in Omics - why, what, how?Visual Analytics in Omics - why, what, how?
Visual Analytics in Omics - why, what, how?Jan Aerts
 
Python programming for Bioinformatics
Python programming for BioinformaticsPython programming for Bioinformatics
Python programming for BioinformaticsHyungyong Kim
 
Visual Analytics talk at ISMB2013
Visual Analytics talk at ISMB2013Visual Analytics talk at ISMB2013
Visual Analytics talk at ISMB2013Jan Aerts
 
Visual Analytics in Omics: why, what, how?
Visual Analytics in Omics: why, what, how?Visual Analytics in Omics: why, what, how?
Visual Analytics in Omics: why, what, how?Jan Aerts
 
VIZBI 2014 - Visualizing Genomic Variation
VIZBI 2014 - Visualizing Genomic VariationVIZBI 2014 - Visualizing Genomic Variation
VIZBI 2014 - Visualizing Genomic VariationJan Aerts
 
Visualizing the Structural Variome (VMLS-Eurovis 2013)
Visualizing the Structural Variome (VMLS-Eurovis 2013)Visualizing the Structural Variome (VMLS-Eurovis 2013)
Visualizing the Structural Variome (VMLS-Eurovis 2013)Jan Aerts
 

Andere mochten auch (6)

Visual Analytics in Omics - why, what, how?
Visual Analytics in Omics - why, what, how?Visual Analytics in Omics - why, what, how?
Visual Analytics in Omics - why, what, how?
 
Python programming for Bioinformatics
Python programming for BioinformaticsPython programming for Bioinformatics
Python programming for Bioinformatics
 
Visual Analytics talk at ISMB2013
Visual Analytics talk at ISMB2013Visual Analytics talk at ISMB2013
Visual Analytics talk at ISMB2013
 
Visual Analytics in Omics: why, what, how?
Visual Analytics in Omics: why, what, how?Visual Analytics in Omics: why, what, how?
Visual Analytics in Omics: why, what, how?
 
VIZBI 2014 - Visualizing Genomic Variation
VIZBI 2014 - Visualizing Genomic VariationVIZBI 2014 - Visualizing Genomic Variation
VIZBI 2014 - Visualizing Genomic Variation
 
Visualizing the Structural Variome (VMLS-Eurovis 2013)
Visualizing the Structural Variome (VMLS-Eurovis 2013)Visualizing the Structural Variome (VMLS-Eurovis 2013)
Visualizing the Structural Variome (VMLS-Eurovis 2013)
 

Ähnlich wie A Kanterakis - PyPedia: a python crowdsourcing development environment for bioinformatics and computational biology

Python Spyder IDE | Edureka
Python Spyder IDE | EdurekaPython Spyder IDE | Edureka
Python Spyder IDE | EdurekaEdureka!
 
A Jupyter kernel for Scala and Apache Spark.pdf
A Jupyter kernel for Scala and Apache Spark.pdfA Jupyter kernel for Scala and Apache Spark.pdf
A Jupyter kernel for Scala and Apache Spark.pdfLuciano Resende
 
Sonian, Open Source and Sensu
Sonian, Open Source and SensuSonian, Open Source and Sensu
Sonian, Open Source and SensuPete Cheslock
 
Introduction to Python Programming
Introduction to Python ProgrammingIntroduction to Python Programming
Introduction to Python ProgrammingAkhil Kaushik
 
Getting Started With Jenkins And Drupal
Getting Started With Jenkins And DrupalGetting Started With Jenkins And Drupal
Getting Started With Jenkins And DrupalPhilip Norton
 
Continuous Integration with Open Source Tools - PHPUgFfm 2014-11-20
Continuous Integration with Open Source Tools - PHPUgFfm 2014-11-20Continuous Integration with Open Source Tools - PHPUgFfm 2014-11-20
Continuous Integration with Open Source Tools - PHPUgFfm 2014-11-20Michael Lihs
 
The Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter DeploymentThe Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter DeploymentFrederick Reiss
 
Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018Den Delimarsky
 
Resumable File Upload API using GridFS and TUS
Resumable File Upload API using GridFS and TUSResumable File Upload API using GridFS and TUS
Resumable File Upload API using GridFS and TUSkhangtoh
 
On the Edge Systems Administration with Golang
On the Edge Systems Administration with GolangOn the Edge Systems Administration with Golang
On the Edge Systems Administration with GolangChris McEniry
 
Use open source software to develop ideas at work
Use open source software to develop ideas at workUse open source software to develop ideas at work
Use open source software to develop ideas at workSammy Fung
 
SymfonyCon Madrid 2014 - Rock Solid Deployment of Symfony Apps
SymfonyCon Madrid 2014 - Rock Solid Deployment of Symfony AppsSymfonyCon Madrid 2014 - Rock Solid Deployment of Symfony Apps
SymfonyCon Madrid 2014 - Rock Solid Deployment of Symfony AppsPablo Godel
 
Everyone wants (someone else) to do it: writing documentation for open source...
Everyone wants (someone else) to do it: writing documentation for open source...Everyone wants (someone else) to do it: writing documentation for open source...
Everyone wants (someone else) to do it: writing documentation for open source...Jody Garnett
 
Reproducible research: practice
Reproducible research: practiceReproducible research: practice
Reproducible research: practiceC. Tobin Magle
 
Using nu get the way you should svcc
Using nu get the way you should   svccUsing nu get the way you should   svcc
Using nu get the way you should svccMaarten Balliauw
 
Django dev-env-my-way
Django dev-env-my-wayDjango dev-env-my-way
Django dev-env-my-wayRobert Lujo
 
Reproducibility and automation of machine learning process
Reproducibility and automation of machine learning processReproducibility and automation of machine learning process
Reproducibility and automation of machine learning processDenis Dus
 
Reproducibility - The myths and truths of pipeline bioinformatics
Reproducibility - The myths and truths of pipeline bioinformaticsReproducibility - The myths and truths of pipeline bioinformatics
Reproducibility - The myths and truths of pipeline bioinformaticsSimon Cockell
 

Ähnlich wie A Kanterakis - PyPedia: a python crowdsourcing development environment for bioinformatics and computational biology (20)

G3 talk rld_2
G3 talk rld_2G3 talk rld_2
G3 talk rld_2
 
Python Spyder IDE | Edureka
Python Spyder IDE | EdurekaPython Spyder IDE | Edureka
Python Spyder IDE | Edureka
 
A Jupyter kernel for Scala and Apache Spark.pdf
A Jupyter kernel for Scala and Apache Spark.pdfA Jupyter kernel for Scala and Apache Spark.pdf
A Jupyter kernel for Scala and Apache Spark.pdf
 
Sonian, Open Source and Sensu
Sonian, Open Source and SensuSonian, Open Source and Sensu
Sonian, Open Source and Sensu
 
Introduction to Python Programming
Introduction to Python ProgrammingIntroduction to Python Programming
Introduction to Python Programming
 
Getting Started With Jenkins And Drupal
Getting Started With Jenkins And DrupalGetting Started With Jenkins And Drupal
Getting Started With Jenkins And Drupal
 
Continuous Integration with Open Source Tools - PHPUgFfm 2014-11-20
Continuous Integration with Open Source Tools - PHPUgFfm 2014-11-20Continuous Integration with Open Source Tools - PHPUgFfm 2014-11-20
Continuous Integration with Open Source Tools - PHPUgFfm 2014-11-20
 
The Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter DeploymentThe Five Stages of Enterprise Jupyter Deployment
The Five Stages of Enterprise Jupyter Deployment
 
Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018Docs as Part of the Product - Open Source Summit North America 2018
Docs as Part of the Product - Open Source Summit North America 2018
 
Resumable File Upload API using GridFS and TUS
Resumable File Upload API using GridFS and TUSResumable File Upload API using GridFS and TUS
Resumable File Upload API using GridFS and TUS
 
On the Edge Systems Administration with Golang
On the Edge Systems Administration with GolangOn the Edge Systems Administration with Golang
On the Edge Systems Administration with Golang
 
Case study
Case studyCase study
Case study
 
Use open source software to develop ideas at work
Use open source software to develop ideas at workUse open source software to develop ideas at work
Use open source software to develop ideas at work
 
SymfonyCon Madrid 2014 - Rock Solid Deployment of Symfony Apps
SymfonyCon Madrid 2014 - Rock Solid Deployment of Symfony AppsSymfonyCon Madrid 2014 - Rock Solid Deployment of Symfony Apps
SymfonyCon Madrid 2014 - Rock Solid Deployment of Symfony Apps
 
Everyone wants (someone else) to do it: writing documentation for open source...
Everyone wants (someone else) to do it: writing documentation for open source...Everyone wants (someone else) to do it: writing documentation for open source...
Everyone wants (someone else) to do it: writing documentation for open source...
 
Reproducible research: practice
Reproducible research: practiceReproducible research: practice
Reproducible research: practice
 
Using nu get the way you should svcc
Using nu get the way you should   svccUsing nu get the way you should   svcc
Using nu get the way you should svcc
 
Django dev-env-my-way
Django dev-env-my-wayDjango dev-env-my-way
Django dev-env-my-way
 
Reproducibility and automation of machine learning process
Reproducibility and automation of machine learning processReproducibility and automation of machine learning process
Reproducibility and automation of machine learning process
 
Reproducibility - The myths and truths of pipeline bioinformatics
Reproducibility - The myths and truths of pipeline bioinformaticsReproducibility - The myths and truths of pipeline bioinformatics
Reproducibility - The myths and truths of pipeline bioinformatics
 

Mehr von Jan Aerts

Humanizing Data Analysis
Humanizing Data AnalysisHumanizing Data Analysis
Humanizing Data AnalysisJan Aerts
 
Intro to data visualization
Intro to data visualizationIntro to data visualization
Intro to data visualizationJan Aerts
 
L Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformaticsL Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformaticsJan Aerts
 
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...Jan Aerts
 
S Cain - GMOD in the cloud
S Cain - GMOD in the cloudS Cain - GMOD in the cloud
S Cain - GMOD in the cloudJan Aerts
 
B Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing ConsortiumB Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing ConsortiumJan Aerts
 
J Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis FrameworkJ Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis FrameworkJan Aerts
 
S Cain - GMOD in the cloud
S Cain - GMOD in the cloudS Cain - GMOD in the cloud
S Cain - GMOD in the cloudJan Aerts
 
B Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisB Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisJan Aerts
 
P Rocca-Serra - The open source ISA metadata tracking framework: from data cu...
P Rocca-Serra - The open source ISA metadata tracking framework: from data cu...P Rocca-Serra - The open source ISA metadata tracking framework: from data cu...
P Rocca-Serra - The open source ISA metadata tracking framework: from data cu...Jan Aerts
 
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...Jan Aerts
 
S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...Jan Aerts
 
A Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining componentsA Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining componentsJan Aerts
 
E Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesE Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesJan Aerts
 
B Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUnoB Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUnoJan Aerts
 
D Baker - Galaxy Update
D Baker - Galaxy UpdateD Baker - Galaxy Update
D Baker - Galaxy UpdateJan Aerts
 
M Reich - GenomeSpace
M Reich - GenomeSpaceM Reich - GenomeSpace
M Reich - GenomeSpaceJan Aerts
 
CT Brown - Doing next-gen sequencing analysis in the cloud
CT Brown - Doing next-gen sequencing analysis in the cloudCT Brown - Doing next-gen sequencing analysis in the cloud
CT Brown - Doing next-gen sequencing analysis in the cloudJan Aerts
 
L Forer - Cloudgene: an execution platform for MapReduce programs in public a...
L Forer - Cloudgene: an execution platform for MapReduce programs in public a...L Forer - Cloudgene: an execution platform for MapReduce programs in public a...
L Forer - Cloudgene: an execution platform for MapReduce programs in public a...Jan Aerts
 
D Robinson - Using HDF5 to work with large quantities of rich biological data
D Robinson - Using HDF5 to work with large quantities of rich biological dataD Robinson - Using HDF5 to work with large quantities of rich biological data
D Robinson - Using HDF5 to work with large quantities of rich biological dataJan Aerts
 

Mehr von Jan Aerts (20)

Humanizing Data Analysis
Humanizing Data AnalysisHumanizing Data Analysis
Humanizing Data Analysis
 
Intro to data visualization
Intro to data visualizationIntro to data visualization
Intro to data visualization
 
L Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformaticsL Fu - Dao: a novel programming language for bioinformatics
L Fu - Dao: a novel programming language for bioinformatics
 
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
J Wang - bioKepler: a comprehensive bioinformatics scientific workflow module...
 
S Cain - GMOD in the cloud
S Cain - GMOD in the cloudS Cain - GMOD in the cloud
S Cain - GMOD in the cloud
 
B Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing ConsortiumB Temperton - The Bioinformatics Testing Consortium
B Temperton - The Bioinformatics Testing Consortium
 
J Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis FrameworkJ Goecks - The Galaxy Visual Analysis Framework
J Goecks - The Galaxy Visual Analysis Framework
 
S Cain - GMOD in the cloud
S Cain - GMOD in the cloudS Cain - GMOD in the cloud
S Cain - GMOD in the cloud
 
B Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysisB Chapman - Toolkit for variation comparison and analysis
B Chapman - Toolkit for variation comparison and analysis
 
P Rocca-Serra - The open source ISA metadata tracking framework: from data cu...
P Rocca-Serra - The open source ISA metadata tracking framework: from data cu...P Rocca-Serra - The open source ISA metadata tracking framework: from data cu...
P Rocca-Serra - The open source ISA metadata tracking framework: from data cu...
 
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledg...
 
S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...S Cheng - eagle-i: development and expansion of a scientific resource discove...
S Cheng - eagle-i: development and expansion of a scientific resource discove...
 
A Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining componentsA Kalderimis - InterMine: Embeddable datamining components
A Kalderimis - InterMine: Embeddable datamining components
 
E Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutesE Afgan - Zero to a bioinformatics analysis platform in four minutes
E Afgan - Zero to a bioinformatics analysis platform in four minutes
 
B Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUnoB Kinoshita - Creating biology pipelines with BioUno
B Kinoshita - Creating biology pipelines with BioUno
 
D Baker - Galaxy Update
D Baker - Galaxy UpdateD Baker - Galaxy Update
D Baker - Galaxy Update
 
M Reich - GenomeSpace
M Reich - GenomeSpaceM Reich - GenomeSpace
M Reich - GenomeSpace
 
CT Brown - Doing next-gen sequencing analysis in the cloud
CT Brown - Doing next-gen sequencing analysis in the cloudCT Brown - Doing next-gen sequencing analysis in the cloud
CT Brown - Doing next-gen sequencing analysis in the cloud
 
L Forer - Cloudgene: an execution platform for MapReduce programs in public a...
L Forer - Cloudgene: an execution platform for MapReduce programs in public a...L Forer - Cloudgene: an execution platform for MapReduce programs in public a...
L Forer - Cloudgene: an execution platform for MapReduce programs in public a...
 
D Robinson - Using HDF5 to work with large quantities of rich biological data
D Robinson - Using HDF5 to work with large quantities of rich biological dataD Robinson - Using HDF5 to work with large quantities of rich biological data
D Robinson - Using HDF5 to work with large quantities of rich biological data
 

Kürzlich hochgeladen

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 

Kürzlich hochgeladen (20)

Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 

A Kanterakis - PyPedia: a python crowdsourcing development environment for bioinformatics and computational biology

  • 1. PyPedia The free programming environment that anyone can edit! AlexandrosKanterakis Genomics Coordination Center, Department of Genetics, University Medical Center, Groningen, The Netherlands
  • 3. How not to be a bioinformatician • Stay low level at every level • Be open source without being open • Make tools that make no sense to scientists • Do not ever share your results and do not reuse • Never maintain your databases and web services • Be unreachable and isolated
  • 4. So, you think you can be a bioinformatician… • Imagine you only have: A personal computer with a browser and an Internet connection • Answer the following question: - Who is the current prime minister of Latvia?
  • 5. SYTYCBAB • Imagine you only have: A personal computer with a browser and an Internet connection • Answer the following question: Compute the Hardy-Weinberg equilibriums of a set of genotypes Execute Source Documentation Execute Source Documentation Execute Source Documentation
  • 6. Execute Source Documentation But what about… ? Web environment, online execution ? Open Source ? Integrate with other tools ? Edit a method and share it ? Examples and Unit tests ? Deploy in the cloud ? Frequency of new releases
  • 7. Apython sandbox to the rescue From: http://wiki.python.org/moin/SandboxedPython So: Google App Engine + MediaWiki = PyPedia
  • 9.
  • 12. Executing a method in a remote computer • Edit your user page and add an “ssh” section: ==ssh== host=ec2-107-22-59-115.compute-1.amazonaws.com username=JohnDoe path=/home/JohnDoe/runPyPedia • This content is NOT shown to anyone • Install the PyPedia client on remote computer(details on pypedia.com)
  • 13. “Execute on remote computer” Example: Fixed_point_user_JohnDoe The cloud instance contains: numpy, scipy, matplotlib Like SAGE but with custom execution environments (i.eBioPython, PyCogent, …)
  • 14. Cool, but I want to call the function from my local computer.. • Install the PyPedia python library: git clone git://github.com/kantale/pypedia.git • Load the function in python: import pypedia from pypedia importPairwise_linkage_disequilibrium Pairwise_linkage_disequilibrium([(A,A), (A,G), (G,G), (G,A)], [(A,A), (A,G), (G,G), (A,A)]) {'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG', 2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498, 0.3125), ('GG', 0.37499999997393502, 0.1875)], 'R_sq': 0.59999999983318408, 'Dprime': 0.99999999986098675} • You can call the method of any user and your method can be called by anyone. • Edit locally, push changes.
  • 15. • On the top of each article there is a button: • Creates a personalized version of the article that only you can edit. • This is similar to the Github’s “fork” feature.
  • 16. Using PyPedia for open science • A complete analysis can be hosted in PyPedia • Any finding generated or published should be easily shared and reproduced. • The reproduction of a finding takes time even when the source code is released.
  • 17. Reproducible science • PyPedia offers a REST interface: • www.pypedia.com/index.php? b_timestamp=YYYYMMDDHHMMSS get_code=python code • Get the most recent version of the python code that is edited before the timestamp. • Reproduce the analysis by sharing a single URL: http://www.pypedia.com/index.php?b_timestamp=20120102101010get_code=print Pairwise_linkage_disequilibrium([(A,A), (A,G), (G,G), (G,A)], [(A,A), (A,G), (G,G), (A,A)])
  • 18. Reproducing an experiment # curl --data-urlencode 'b_timestamp=20120501010101' --data-urlencode 'get_code=print Pairwise_linkage_disequilibrium([(A,A), (A,G), (G,G), (G,A)], [(A,A), (A,G), (G,G), (A,A)])' http://www.pypedia.com/index.php --output code.py # python code.py {'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG', 2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498, 0.3125), ('GG', 0.37499999997393502, 0.1875)], 'R_sq': 0.59999999983318408, 'Dprime': 0.99999999986098675}
  • 19. Meta-webserver • HTML injection is allowed and encouraged! http://www.pypedia.com/index.php/Draw_face_user_Kantale • Example run an HTML code posted on gist: http://www.pypedia.com/index.php? run_code= import urllib2 print urllib2.urlopen( ‘https://raw.github.com/gist/2689822/bbea0c43b278d7c4c04 b3f7a23ba43f558fba98b/index_full.html’).read() Click me!
  • 20. • All content is under the Simplified BSD License • Two namespaces: – Validated articles. i.e: Minor_allele_frequency • Safe, only admins can edit – User articles. i.e: Minor_allele_frequency_user_John • Unsafe, edited by individual user – Qualitative articles from User namespace is promoted to the Validated namespace – Validated articles cannot call User articles (duh..)
  • 21.
  • 22. Some thoughts (in the embarrassing occasion I have some minutes left) Code as wiki, program as wiki concept • Multidimensional expansion • As Mao said: Let a thousand flowers scripts bloom (and some of them rot in hell) • Minimize the distance: Dsanity(SCRIPTmade_by_IT_guy, SCRIPTuseful_to_biologists) • Encyclopedialize™ your scripts because open source isn’t enough! Future steps: • Attract editors, make communities! • If it can be done in python, why not Ruby, …?
  • 23. • Contact: admin@pypedia.com • Source code license: GPL v3 • Content license: Simplified BSD license • Join us in google groups: http://groups.google.com/group/pypedia • Twitter: @PyPedia • PyPedia’s source code: – Mediawiki extension: https://github.com/kantale/PyPedia_server – Python library: https://github.com/kantale/pypedia • Acknowledgements: – Despoina Antonakaki – Kostas Tselios Posters: – Morris A. Swertz BOSC: 11 ISMB: E12