PyPedia
The free programming environment
       that anyone can edit!
       Alexandros Kanterakis



        Genomics Coordination Center, Department of Genetics,
        University Medical Center, Groningen, The Netherlands
Introduction
How not to be a bioinformatician
•   Stay low level at every level
•   Be open source without being open
•   Make tools that make no sense to scientists
•   Do not ever share your results and do not reuse
•   Never maintain your databases and web services
•   Be unreachable and isolated
So, you think you can be a
             bioinformatician…
• Imagine you only have: A personal computer
  with a browser and an Internet connection
• Answer the following question:
     - Who is the current prime minister of Latvia?
SYTYCBAB
• Imagine you only have: A personal computer with
  a browser and an Internet connection
• Answer the following question:
 Compute the Hardy-Weinberg equilibrium of a set of
 genotypes
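To make the question concrete, here is a minimal sketch of a Hardy-Weinberg check for a single biallelic marker. It is not taken from PyPedia: the function name `hardy_weinberg`, its return format, and the choice of a chi-square statistic are assumptions made for illustration (written for Python 3).

```python
from collections import Counter


def hardy_weinberg(genotypes):
    """Compare observed genotype counts with Hardy-Weinberg expectations.

    `genotypes` is a list of 2-tuples of alleles for one biallelic marker.
    Hypothetical helper, not PyPedia's own method.
    """
    n = len(genotypes)
    alleles = sorted({allele for g in genotypes for allele in g})
    if len(alleles) != 2:
        raise ValueError("expected exactly two distinct alleles")
    a, b = alleles

    # Treat ('A', 'G') and ('G', 'A') as the same heterozygous genotype.
    counts = Counter("".join(sorted(g)) for g in genotypes)
    observed = [counts[a + a], counts[a + b], counts[b + b]]

    # Allele frequencies from genotype counts.
    p = (2 * observed[0] + observed[1]) / (2.0 * n)
    q = 1.0 - p

    # Expected counts under Hardy-Weinberg: p^2, 2pq, q^2.
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return {"p": p, "q": q, "observed": observed,
            "expected": expected, "chi2": chi2}


result = hardy_weinberg([("A", "A"), ("A", "G"), ("G", "G"), ("G", "A")])
print(result["observed"], result["expected"], result["chi2"])
```

For these four genotypes the observed counts match the expected ones exactly, so the statistic is zero.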
Execute / Source / Documentation
But what about…
? Web environment, online execution
? Open Source
? Integrate with other tools
? Edit a method and share it
? Examples and Unit tests
? Deploy in the cloud
? Frequency of new releases
A Python sandbox to the rescue
From:
http://wiki.python.org/moin/SandboxedPython




So:
Google App Engine + MediaWiki = PyPedia
www.pypedia.com
Code as wiki
HTML input as wiki
Executing a method on a remote computer

• Edit your user page and add an “ssh” section:

                      ==ssh==
                      host=ec2-107-22-59-115.compute-1.amazonaws.com
                      username=JohnDoe
                      path=/home/JohnDoe/runPyPedia




• This content is NOT shown to anyone
• Install the PyPedia client on the remote
  computer (details on pypedia.com)
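The “ssh” section is a small key=value block inside a wiki page. As a rough sketch of how such a block could be read, the following parser collects the settings under an `==ssh==` heading. The function name `parse_ssh_section` is hypothetical, and how PyPedia itself reads the section is not shown in these slides.

```python
def parse_ssh_section(wikitext):
    """Collect key=value pairs listed under an '==ssh==' wiki heading.

    Hypothetical helper for illustration; PyPedia's own parsing may differ.
    """
    settings = {}
    in_section = False
    for raw_line in wikitext.splitlines():
        line = raw_line.strip()
        # A wiki heading such as '==ssh==' starts or ends the section.
        if len(line) > 4 and line.startswith("==") and line.endswith("=="):
            in_section = line.strip("=").strip().lower() == "ssh"
            continue
        if in_section and "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


user_page = """
==ssh==
host=ec2-107-22-59-115.compute-1.amazonaws.com
username=JohnDoe
path=/home/JohnDoe/runPyPedia
"""

print(parse_ssh_section(user_page))
```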
“Execute on remote computer”

Example:
Fixed_point_user_JohnDoe


The cloud instance contains:
numpy, scipy, matplotlib


Like SAGE but with custom
execution environments
(e.g. BioPython, PyCogent, …)
Cool, but I want to call the function from my local computer…

• Install the PyPedia Python library:
git clone git://github.com/kantale/pypedia.git

• Load the function in Python:
 import pypedia
 from pypedia import Pairwise_linkage_disequilibrium
Pairwise_linkage_disequilibrium([('A','A'), ('A','G'), ('G','G'),
   ('G','A')], [('A','A'), ('A','G'), ('G','G'), ('A','A')])

   {'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG',
   2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498,
   0.3125), ('GG', 0.37499999997393502, 0.1875)], 'R_sq':
   0.59999999983318408, 'Dprime': 0.99999999986098675}


• You can call the method of any user and your method can be
  called by anyone.
• Edit locally, push changes.
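To help read the output above, the sketch below recomputes R_sq and Dprime from the four haplotype frequencies, rounded from the printed result. It uses the textbook definitions D = f(AB) - pA·pB, r² = D² / (pA(1-pA)pB(1-pB)) and D' = |D| / Dmax. The function name `ld_from_haplotypes` is hypothetical, and this is not PyPedia's implementation, which also has to estimate the haplotype frequencies from the genotypes first.

```python
def ld_from_haplotypes(freqs):
    """Compute D, r^2 and D' from two-locus haplotype frequencies.

    `freqs` maps two-letter haplotypes (e.g. 'AG') to frequencies that
    sum to 1. Both loci are assumed to be biallelic and polymorphic.
    Hypothetical helper for illustration.
    """
    # Pick one reference allele per locus (alphabetically first).
    a = min(h[0] for h in freqs)
    b = min(h[1] for h in freqs)

    p_a = sum(f for h, f in freqs.items() if h[0] == a)
    p_b = sum(f for h, f in freqs.items() if h[1] == b)

    # Deviation of the observed haplotype frequency from independence.
    d = freqs.get(a + b, 0.0) - p_a * p_b

    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    r_sq = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    d_prime = abs(d) / d_max
    return {"D": d, "R_sq": r_sq, "Dprime": d_prime}


# Haplotype frequencies rounded from the output shown on the slide.
ld = ld_from_haplotypes({"AA": 0.5, "AG": 0.0, "GA": 0.125, "GG": 0.375})
print(ld)  # R_sq is approximately 0.6 and Dprime approximately 1.0
```

With these inputs the allele frequencies are 0.5 and 0.625, whose product 0.3125 matches the third value of the 'AA' tuple in the output, so that column is consistent with the frequency expected under independence.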
• At the top of each article there is a button:

• It creates a personalized version of the article that only
  you can edit.

• This is similar to GitHub’s “fork” feature.
Using PyPedia for open science
• A complete analysis can be hosted in PyPedia

• Any finding generated or published should be
  easily shared and reproduced.

• The reproduction of a finding takes time even
  when the source code is released.
Reproducible science
• PyPedia offers a REST interface:
• www.pypedia.com/index.php?
     b_timestamp=YYYYMMDDHHMMSS
     &get_code=python code
• Returns the most recent version of the Python
  code that was edited before the timestamp.

• Reproduce the analysis by sharing a single URL:
  http://www.pypedia.com/index.php?b_timestamp=20120102101010&get_code=print
  Pairwise_linkage_disequilibrium([('A','A'), ('A','G'), ('G','G'), ('G','A')], [('A','A'), ('A','G'), ('G','G'),
  ('A','A')])
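A shareable URL of this kind needs its code snippet percent-encoded. The following sketch builds one with the standard library (Python 3). Only the `b_timestamp` and `get_code` parameters and the base URL come from the slides; the helper name `reproducible_url` and the timestamp validation are assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "http://www.pypedia.com/index.php"


def reproducible_url(code, timestamp):
    """Build a URL that pins `code` to a YYYYMMDDHHMMSS timestamp.

    Hypothetical helper; it only assembles the query string and does not
    contact the server.
    """
    if len(timestamp) != 14 or not timestamp.isdigit():
        raise ValueError("timestamp must look like YYYYMMDDHHMMSS")
    query = urlencode({"b_timestamp": timestamp, "get_code": code})
    return BASE_URL + "?" + query


code = ("print Pairwise_linkage_disequilibrium("
        "[('A','A'), ('A','G'), ('G','G'), ('G','A')], "
        "[('A','A'), ('A','G'), ('G','G'), ('A','A')])")
url = reproducible_url(code, "20120102101010")
print(url)

# Decoding the query string recovers the original snippet unchanged.
print(parse_qs(urlsplit(url).query)["get_code"][0] == code)
```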
Reproducing an experiment
# curl \
  --data-urlencode 'b_timestamp=20120501010101' \
  --data-urlencode 'get_code=print Pairwise_linkage_disequilibrium([("A","A"), ("A","G"), ("G","G"), ("G","A")], [("A","A"), ("A","G"), ("G","G"), ("A","A")])' \
  http://www.pypedia.com/index.php \
  --output code.py

# python code.py
{'haplotypes': [('AA', 0.49999999997393502, 0.3125), ('AG',
    2.606498430642265e-11, 0.1875), ('GA', 0.12500000002606498, 0.3125),
    ('GG', 0.37499999997393502, 0.1875)], 'R_sq': 0.59999999983318408,
    'Dprime': 0.99999999986098675}
Meta-webserver
• HTML injection is allowed
  and encouraged!
http://www.pypedia.com/index.php/Draw_face_user_Kantale



• Example: run HTML code
  posted as a gist:
    http://www.pypedia.com/index.php?
      run_code=
            import urllib2
            print urllib2.urlopen(
                'https://raw.github.com/gist/2689822/bbea0c43b278d7c4c04b3f7a23ba43f558fba98b/index_full.html').read()
      Click me!
• All content is under the Simplified BSD License
• Two namespaces:
  – Validated articles, e.g. Minor_allele_frequency
     • Safe, only admins can edit
  – User articles, e.g. Minor_allele_frequency_user_John
     • Unsafe, edited by the individual user
  – High-quality articles from the User namespace are
    promoted to the Validated namespace
  – Validated articles cannot call User articles (duh..)
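The calling rule between the two namespaces can be sketched as a simple name check. This assumes that a User article is recognizable by a `_user_<name>` suffix, as in the examples above; the helper names `is_user_article` and `may_call` are hypothetical and do not describe how PyPedia enforces the rule.

```python
def is_user_article(name):
    """Return True if `name` looks like '<Article>_user_<Username>'."""
    base, sep, user = name.rpartition("_user_")
    return bool(base) and bool(sep) and bool(user)


def may_call(caller, callee):
    """Validated articles may only call other Validated articles.

    User articles may call anything, since they are already marked unsafe.
    """
    return is_user_article(caller) or not is_user_article(callee)


print(may_call("Minor_allele_frequency", "Minor_allele_frequency_user_John"))
print(may_call("Minor_allele_frequency_user_John", "Minor_allele_frequency"))
```

The first call is rejected (a Validated article calling a User article) and the second is allowed.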
Some thoughts
    (on the embarrassing occasion that I have some minutes left)

Code as wiki, program as wiki concept
• Multidimensional expansion
• As Mao said: Let a thousand flowers scripts bloom (and
   some of them rot in hell)
• Minimize the distance:
D_sanity(SCRIPT_made_by_IT_guy, SCRIPT_useful_to_biologists)
• Encyclopedialize™ your scripts because open source isn’t
   enough!

Future steps:
• Attract editors, make communities!
• If it can be done in Python, why not Ruby, …?
• Contact: admin@pypedia.com
• Source code license: GPL v3
• Content license: Simplified BSD license
• Join us on Google Groups:
  http://groups.google.com/group/pypedia
• Twitter: @PyPedia

• PyPedia’s source code:
    – MediaWiki extension:
       https://github.com/kantale/PyPedia_server
    – Python library:
    https://github.com/kantale/pypedia

• Acknowledgements:
    – Despoina Antonakaki
    – Kostas Tselios
    – Morris A. Swertz

• Posters:
    – BOSC: 11
    – ISMB: E12

A Kanterakis - PyPedia: a python crowdsourcing development environment for bioinformatics and computational biology
