SoS
Script of Scripts
Bo Peng, PhD
Department of Bioinformatics and Computational Biology
The University of Texas MD Anderson Cancer Center
Polyglot Notebook and Workflow System for both Interactive
Multi-language Data Analysis and Batch Data Processing
SoS
A quick survey
Introduction
• Have you used more than one Jupyter kernel?
• Have you used more than one Jupyter kernel for a single project?
• Have you used Jupyter to analyze large data?
• Have you used any workflow system for your work?
SoS
Who we are and what we do
Introduction
SoS
Our computational environment
Introduction
SoS
Write and manage scripts written in different languages for different environments
Understand and reproduce others’ (and sometimes my own) projects
workflow
Manage data and workflows on different environments for batch data processing
SoS
The promises of Jupyter ecosystem
Introduction
• Supports virtually all scripting languages
• Unified notebook format and interface
• Flexible client/server architecture
• JupyterHub for enterprise
• JupyterLab was around the corner (now ready for users)
• Binder for reproducible data analysis
SoS
What was missing for our work?
Introduction
More IDE features for interactive data analysis
Multi-language support
Integrated workflow system for batch data processing
snakemake
SoS
SoS Polyglot Notebook
Introduction
[Diagram: Notebook, Server, Kernel, shown three times]
SoS
SoS Polyglot Notebook
Introduction
[Diagram: one Notebook and Server with four Kernel boxes]
SoS
SoS Workflow System
Introduction
[Diagram: one Notebook and Server with four Kernel boxes and a Workflow System]
SoS
Polyglot Notebook
Polyglot Notebook + Workflow System = Working Environment
SoS
A super kernel to all Jupyter kernels
Polyglot Notebook
Kernel
Subkernel
• Starts and shuts down subkernels
• Receives input from the frontend, (optionally) processes it, and sends it to subkernels
• Receives output from subkernels, (optionally) processes it, and sends it to the frontend
%expand %capture
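The three responsibilities listed above can be illustrated with a toy dispatcher. This is only a sketch under stated assumptions: `ToySuperKernel` and `fake_kernel` are hypothetical names, the "subkernels" are plain Python callables that echo their input, and nothing here reflects how SoS actually talks to Jupyter kernels.

```python
from typing import Callable, Dict


class ToySuperKernel:
    """Toy stand-in for a super kernel (hypothetical, not the SoS implementation)."""

    def __init__(self, launchers: Dict[str, Callable[[], Callable[[str], str]]]):
        self.launchers = launchers  # kernel name -> factory that "starts" it
        self.running: Dict[str, Callable[[str], str]] = {}

    def start(self, name: str) -> None:
        # Start a subkernel lazily, the first time a cell asks for it.
        if name not in self.running:
            self.running[name] = self.launchers[name]()

    def shutdown(self, name: str) -> None:
        # Forget the subkernel; a real kernel would also stop its process.
        self.running.pop(name, None)

    def run_cell(self, name: str, code: str) -> str:
        self.start(name)
        processed = code.strip()                # optional processing of input
        output = self.running[name](processed)  # send to the subkernel
        return f"[{name}] {output}"             # optional processing of output


def fake_kernel(language: str) -> Callable[[], Callable[[str], str]]:
    # Factory for a fake subkernel that only echoes the code it receives.
    def launch() -> Callable[[str], str]:
        return lambda code: f"{language} ran: {code}"
    return launch


kernel = ToySuperKernel({"R": fake_kernel("R"), "Python3": fake_kernel("Python")})
result = kernel.run_cell("R", "  mean(c(1, 2, 3))\n")
```

The point of the sketch is the position of the super kernel: the frontend only ever talks to it, so it is free to rewrite input and output on the way through.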
SoS
Prepare input and capture output of subkernels
Polyglot Notebook
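As a rough illustration of preparing input and capturing output, the sketch below interpolates values from a Python namespace into code destined for a subkernel and stores returned text in a variable. The functions `expand` and `capture` are hypothetical helpers written for this example; they only mimic the idea behind the `%expand` and `%capture` magics and are not their implementation.

```python
import re


def expand(template: str, namespace: dict) -> str:
    """Replace each {expression} in template with its value in namespace."""
    def substitute(match):
        # eval is acceptable in a sketch; real code should restrict what it evaluates.
        return str(eval(match.group(1), {}, namespace))
    return re.sub(r"\{([^{}]+)\}", substitute, template)


def capture(output: str, namespace: dict, name: str) -> None:
    """Store the text output of a subkernel in a variable of the namespace."""
    namespace[name] = output


sos_vars = {"filename": "data.csv", "n": 5}
# The R code is completed with Python values before it would be sent to R.
r_code = expand('head(read.csv("{filename}"), {n * 2})', sos_vars)
# The text returned by the subkernel becomes a Python variable.
capture("10 rows shown", sos_vars, "r_output")
```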
SoS
Data Exchange (magics %get, %put, and %with)
Polyglot Notebook
SoS
How data exchange works
Polyglot Notebook
SoS kernel: arr: [1, 2, 3]; df: pandas.DataFrame(…)
R kernel: arr: c(1, 2, 3); df: data.frame(…)
%put arr --to R: the R kernel runs arr <- c(1, 2, 3)
%put df: the R kernel runs write_feather(df, tmpfile), then SoS runs df = feather.read_dataframe(tmpfile)
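The slide suggests that simple values travel as source text for the receiving language (`arr <- c(1, 2, 3)`), while data frames go through a temporary feather file. A minimal sketch of the first path, assuming a hypothetical helper `to_r_literal` that handles only a few Python types:

```python
def to_r_literal(value) -> str:
    """Return R source text that recreates a simple Python value."""
    if isinstance(value, bool):  # check bool before int: True is also an int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(value, (list, tuple)):
        return "c(" + ", ".join(to_r_literal(v) for v in value) + ")"
    raise TypeError(f"no R conversion sketched for {type(value).__name__}")


def put_statement(name: str, value) -> str:
    """Build the assignment a receiving R kernel would be asked to run."""
    return f"{name} <- {to_r_literal(value)}"


statement = put_statement("arr", [1, 2, 3])
```

Larger objects such as data frames would not be serialized as text this way; the slide's feather calls indicate a file-based route for them.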
SoS
Data exchange between SoS and supported subkernels
Polyglot Notebook
• Create independent variables in another kernel
• Direct data exchange between subkernels, or by way of SoS
• Create variables of similar types
• One to many (e.g. 1, c(1,2) in R)
• Many to one (e.g. Char and str in Julia)
• Intended to support a majority of datatypes, but with no guarantee of lossless data exchange
• Supports kernels for 11 languages now
R: a=1, b=c(1,2); SoS: a=1, b=[1,2]
SoS: c='x', d='Hello'; Julia: c='x', d="Hello"
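The one-to-many and many-to-one cases explain why lossless exchange cannot be guaranteed. The sketch below models them with hypothetical functions; the conversion rules are assumptions chosen to match the slide's examples (an R numeric vector becomes a Python scalar or list depending on its length, and Julia `Char` and `String` both become Python `str`), not a description of the actual SoS converters.

```python
def python_to_r_numeric(value):
    """Model an R numeric vector as a list; a Python scalar becomes length 1."""
    return list(value) if isinstance(value, (list, tuple)) else [value]


def r_numeric_to_python(vector):
    """One to many: a length-1 vector becomes a scalar, longer ones a list."""
    return vector[0] if len(vector) == 1 else list(vector)


def julia_text_to_python(julia_type: str, text: str) -> str:
    """Many to one: Julia Char and String both become Python str."""
    if julia_type not in ("Char", "String"):
        raise TypeError(f"not a text type: {julia_type}")
    return str(text)


scalar = r_numeric_to_python(python_to_r_numeric(1))       # survives the round trip
vector = r_numeric_to_python(python_to_r_numeric([1, 2]))  # survives the round trip
lossy = r_numeric_to_python(python_to_r_numeric([1]))      # list comes back as scalar
```

Under these assumed rules, a one-element Python list does not survive a round trip through R, which is the kind of loss the slide warns about.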
SoS
Line-by-line execution in side panel (Ctrl-Shift-Enter)
Polyglot Notebook
The command notebook:run-in-console is available in JupyterLab to execute code in a console panel; a default shortcut is not yet assigned.
SoS
Preview of expressions and files
Polyglot Notebook
JupyterLab PR #4879 for displaying transient information from kernels is pending.
SoS
%revisions, %sessioninfo, and %sossave
Polyglot Notebook
%sossave is equivalent to sos convert from command line. Multiple templates are available.
SoS
Workflow System
Polyglot Notebook + Workflow System = Working Environment
SoS
Overview of SoS Workflow Syntax
Workflow System
Script format of function calls
• Indentation is recommended but not required
• Alternative sigil is allowed (e.g. expand='${ }')
Function format
Script format
3.6+
Step header and statements
• Headers define “steps” of workflows
• input, output, and depends specify the input, output, and dependent targets of the step
• task defines the rest of the step as external tasks
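To make the roles of step headers and the `input`, `output`, `depends`, and `task` statements concrete, here is a toy parser that splits a script into steps. It is a sketch only: `parse_steps` is a hypothetical function, the bracketed header name `align_10` is an illustrative assumption, and the real SoS parser handles far more syntax than this.

```python
import re

HEADER = re.compile(r"^\[(?P<name>[A-Za-z0-9_]+)\]\s*$")
DIRECTIVE = re.compile(r"^(input|output|depends|task)\s*:\s*(?P<value>.*)$")


def parse_steps(script: str) -> dict:
    """Split a script into steps keyed by header name (toy parser)."""
    steps, current = {}, None
    for line in script.splitlines():
        header = HEADER.match(line)
        if header:
            # A header line opens a new step.
            current = {"directives": {}, "body": []}
            steps[header.group("name")] = current
        elif current is not None and line.strip():
            directive = DIRECTIVE.match(line)
            if directive:
                current["directives"][directive.group(1)] = directive.group("value")
            else:
                # Everything else is kept as the step's script body.
                current["body"].append(line)
    return steps


example = """\
[align_10]
input: 'f1.fastq'
output: 'f1.bam'
sh: expand=True
    some_command_to_process {_input}
"""
steps = parse_steps(example)
```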
SoS
From subkernels to SoS kernel
Workflow System
Subkernels
(possibly incomplete scripts)
Kernel
(complete scripts)
SoS
Embedded workflows in notebook
Workflow System
Kernel
(shared kernel namespace)
Workflow
(independent workflow namespace)
SoS
Parameters and runtime signatures
Workflow System
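The slide does not spell out how runtime signatures are computed. One common approach, sketched here as an assumption rather than as SoS's method, is to hash a step's script together with the content of its inputs and skip the step when the same hash has been seen before. `signature` and `SignatureStore` are hypothetical names.

```python
import hashlib


def signature(script: str, inputs: dict) -> str:
    """Hash the step script together with the content of its inputs."""
    digest = hashlib.sha256(script.encode())
    for name in sorted(inputs):  # sorted so the result does not depend on order
        digest.update(name.encode())
        digest.update(inputs[name])
    return digest.hexdigest()


class SignatureStore:
    """Remember signatures of executed steps (in memory only, for the sketch)."""

    def __init__(self):
        self.seen = set()

    def run(self, script: str, inputs: dict) -> str:
        sig = signature(script, inputs)
        if sig in self.seen:
            return "skipped"   # same script and same inputs: reuse the old result
        self.seen.add(sig)
        return "executed"


store = SignatureStore()
first = store.run("sort {_input}", {"a.txt": b"3\n1\n2\n"})
second = store.run("sort {_input}", {"a.txt": b"3\n1\n2\n"})
third = store.run("sort {_input}", {"a.txt": b"9\n"})  # input content changed
```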
SoS
Process-oriented vs outcome-oriented workflows
Workflow System
• Numerically numbered steps of a “process”
• Execute sequentially (logically)
• Steps can provide targets for others
• Workflow constructed to generate specified targets (option -t)
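The two styles can be contrasted in a few lines. In this sketch, process-oriented execution simply sorts steps by their numeric suffix, while outcome-oriented execution works backwards from a requested target, in the style of make. The step names, file names, and the functions `process_order` and `plan` are all hypothetical, and cyclic dependencies are not handled.

```python
def process_order(step_names: list) -> list:
    """Process-oriented: run numbered steps in numeric order."""
    return sorted(step_names, key=lambda name: int(name.rsplit("_", 1)[1]))


def plan(target: str, rules: dict, existing: set, order=None) -> list:
    """Outcome-oriented: list the steps needed to produce target."""
    order = [] if order is None else order
    if target in existing:  # already on disk, nothing to run
        return order
    step = next((name for name, rule in rules.items()
                 if target in rule["provides"]), None)
    if step is None:
        raise KeyError(f"no step provides {target!r}")
    if step not in order:
        for dependency in rules[step]["depends"]:
            plan(dependency, rules, existing, order)  # dependencies first
        order.append(step)
    return order


rules = {
    "index": {"provides": ["ref.idx"], "depends": ["ref.fa"]},
    "align": {"provides": ["f1.bam"], "depends": ["ref.idx", "f1.fastq"]},
    "report": {"provides": ["report.html"], "depends": ["f1.bam"]},
}
sequential = process_order(["align_20", "index_10", "report_30"])
needed = plan("f1.bam", rules, existing={"ref.fa", "f1.fastq"})
```

Requesting `f1.bam` leaves the `report` step out entirely, because nothing the target needs depends on it.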
SoS
Concurrent execution and external tasks
Workflow System
SoS
hosts.yml
SoS task model
Workflow System
input: "c:\Project\f1.fastq"
output: "c:\Project\f1.bam"
sh: expand=True
some_command_to_process {_input}
77e3c2ef7079a236.task
input: "/home/bpeng/Project/f1.fastq"
output: "/home/bpeng/Project/f1.bam"
sh: expand=True
some_command_to_process {_input}
77e3c2ef7079a236.task
c:\Project\f1.fastq
/Project/f1.fastq
#PBS -N 77e3c2ef7079a236
#PBS -l nodes=1:ppn=1:mem=10G
#PBS -l walltime=24:00:00
cd /home/bpeng1/Project
sos execute 77e3c2ef7079a236
77e3c2ef7079a236.sh
/Project/f1.bam
c:\Project\f1.bam
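The task model above rewrites local paths into their remote equivalents before a task is submitted. A minimal sketch of that translation, assuming the local paths are Windows paths under `c:\Project` (the backslashes appear to have been lost in the slide text) and that the mapping between roots would come from a configuration such as hosts.yml; `map_path` is a hypothetical helper.

```python
from pathlib import PurePosixPath, PureWindowsPath


def map_path(local: str, local_root: str, remote_root: str) -> str:
    """Translate a local Windows path into the matching remote POSIX path."""
    # Strip the local root, then re-attach the remaining parts to the remote root.
    relative = PureWindowsPath(local).relative_to(PureWindowsPath(local_root))
    return str(PurePosixPath(remote_root).joinpath(*relative.parts))


# Assumed mapping: the local drive root c:\ corresponds to /home/bpeng remotely.
remote_input = map_path(r"c:\Project\f1.fastq", "c:\\", "/home/bpeng")
remote_output = map_path(r"c:\Project\f1.bam", "c:\\", "/home/bpeng")
```

Pure path classes are used so the sketch behaves the same on any operating system, since no file system access is involved.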
SoS
Execute scripts in docker containers
Workflow System
SoS
DAG and workflow reports
Workflow System
SoS
Summary
Polyglot Notebook + Workflow System = Working Environment
SoS
Our previous computational environment
Summary
SoS
Our new computational environment
Summary
SoS
SoS notebooks for reproducible data analysis
Summary
Polyglot Notebook + Workflow System = Working Environment
Polyglot Notebook
• Multi-language data analysis with data exchange
• Side panel and magics for interactive data analysis
Workflow System
• Powerful Python-based multi-style workflow system
• Remote execution of external tasks
Working Environment
• Environment for both interactive data analysis and batch data analysis
• Reproducible notebooks
SoS
SoS Status
Summary
https://vatlab.github.io/SoS
https://github.com/vatlab
https://vatlab.github.io/blog
bpeng@mdanderson.org
ScriptOfScripts
Browser:
Languages:
OS:
Jupyter:
Container:
Task queue:
License:
sos 0.16.9
sos-notebook 0.16.10
jupyterlab-sos 0.2.4
SoS
Acknowledgements
Summary
• Gao Wang (U Chicago)
• Jun Ma
• Man Chong Leong
• Chris Wakefield
• James Melott
• Yulun Chiu
• Di Du
• Dr. John Weinstein
• Dr. Christopher Amos (BCM)
• Dr. Paul Scheet
• Dr. Suzanne Leal (BCM)
• Grant R01HG008972
• Grant 1R01HG005859 (Dr. Paul Scheet)
• CPRIT RP130397
• Gordon and Betty Moore Foundation (#4559)
• The Michael and Susan Dell Foundation
• The Chapman Foundation
SoS Summary
https://vatlab.github.io/sos/live

Editor's notes

  1. My answers to all these questions are yes.
  2. We are MD Anderson Cancer Center, one of the largest and best cancer hospitals in the world, with one of the largest bioinformatics departments in the nation. We have 15 faculty members who have made major contributions to many national and international projects such as TCGA and ICGC. We have a large team of statistical analysts with 20 PhDs (or double MS) who have worked on almost 400 projects for more than 100 Principal Investigators at MD Anderson. Basically, we deal with a lot of data.
  3. Data usually come from our labs. Bioinformaticians need to use many different tools written in many languages.
  4. This is JupyterCon, so I will save the time.
  5. Compared to RStudio, Jupyter lacks line-by-line execution in a console window, a variable inspector, and previews of variables, figures, etc. Jupyter supports only one kernel in a notebook, so multiple notebooks are needed. BeakerX does not support MATLAB and SAS. A workflow system is needed for batch data processing. It is like Usain Bolt competing with Michael Phelps at swimming. Working in different environments is counterproductive.
  6. Start at 8
  7. Three ways but all based on the first magic
  8. Start at 18
  9. Explain what this workflow does
  10. Start at 36
  11. SoS has really changed the way we work, and it should work wonders for you! Please test it and let us know what you think.