Future: Parallel & Distributed Processing in R for Everyone
Henrik Bengtsson
University of California
@HenrikBengtsson
HenrikBengtsson/future
jottr.org
Acknowledgments
- eRum 2018
- R Consortium
- R Core, CRAN, devels & users!
A 20-minute presentation, eRum 2018, Budapest, 2018-05-16
Why do we parallelize software?
Parallel & distributed processing can be used to:
1. speed up processing (wall time)
2. decrease memory footprint (per machine)
3. avoid data transfers
Comment: I'll focus on the first two in this talk.
2 / 22
Definition: Future
A future is an abstraction for a value that will be available later
The value is the result of an evaluated expression
The state of a future is unevaluated or evaluated
Friedman & Wise (1976, 1977), Hibbard (1976), Baker & Hewitt (1977)
3 / 22
Standard R:
v <- expr
Future API:
f <- future(expr)
v <- value(f)
Definition: Future
A future is an abstraction for a value that will be available later
The value is the result of an evaluated expression
The state of a future is unevaluated or evaluated
3 / 22
Example: Sum of 1:50 and 51:100 in parallel
> library(future)
> plan(multiprocess)
> fa <- future( slow_sum( 1:50 ) )
> fb <- future( slow_sum(51:100) )
> 1:3
[1] 1 2 3
> value(fa)
[1] 1275
> value(fb)
[1] 3775
> value(fa) + value(fb)
[1] 5050
3 / 22
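Comment: slow_sum() is never defined on the slides; a minimal sketch of such a helper (purely illustrative) could be:
slow_sum <- function(x) {
  total <- 0
  for (xi in x) {
    Sys.sleep(0.1)  ## simulate an expensive step
    total <- total + xi
  }
  total
}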
Standard R:
v <- expr
Future API (implicit):
v %<-% expr
Definition: Future
4 / 22
Example (implicit API): Sum of 1:50 and 51:100 in parallel
> library(future)
> plan(multiprocess)
> a %<-% slow_sum( 1:50 )
> b %<-% slow_sum(51:100)
> 1:3
[1] 1 2 3
> a + b
[1] 5050
5 / 22
Many ways to resolve futures
plan(sequential)
plan(multiprocess)
plan(cluster, workers = c("n1", "n2", "n3"))
plan(cluster, workers = c("remote1.org", "remote2.org"))
...
> a %<-% slow_sum( 1:50 )
> b %<-% slow_sum(51:100)
> a + b
[1] 5050
6 / 22
CRAN 1.8.1 | codecov 90%
R package: future
"Write once, run anywhere"
A simple unified API ("interface of interfaces")
100% cross platform
Easy to install (~0.4 MiB total)
Very well tested, lots of CPU mileage, production ready
╔════════════════════════════════════════════════════════╗
║ < Future API > ║
║ ║
║ future(), value(), %<-%, ... ║
╠════════════════════════════════════════════════════════╣
║ future ║
╠════════════════════════════════╦═══════════╦═══════════╣
║ parallel ║ globals ║ (listenv) ║
╠══════════╦══════════╦══════════╬═══════════╬═══════════╝
║ snow ║ Rmpi ║ nws ║ codetools ║
╚══════════╩══════════╩══════════╩═══════════╝
7 / 22
Why a Future API? Problem: heterogeneity
Different parallel backends ⇔ different APIs
Choosing an API/backend limits the user's options
x <- list(a = 1:50, b = 51:100)
y <- lapply(x, FUN = slow_sum)
y <- parallel::mclapply(x, FUN = slow_sum)
library(parallel)
cluster <- makeCluster(4)
y <- parLapply(cluster, x, fun = slow_sum)
stopCluster(cluster)
8 / 22
Why a Future API? Solution: "interface of interfaces"
The Future API encapsulates heterogeneity
fewer decisions for the developer to make
more power to the end user
Philosophy:
developer decides what to parallelize - user decides how to
Provides atomic building blocks for richer parallel constructs, e.g. 'foreach'
and 'future.apply'
Easy to implement new backends, e.g. 'future.batchtools' and 'future.callr'
9 / 22
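Comment: as a hedged illustration of the backend-agnostic idea, the same code runs unchanged on, say, a callr backend (assuming the 'future.callr' package is installed):
library(future)
plan(future.callr::callr)  ## each future runs in a fresh background R session
f <- future(slow_sum(1:50))
value(f)
## [1] 1275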
Why a Future API? 99% Worry Free
Globals: automatically identified & exported
Packages: automatically identified & exported
Static-code inspection by walking the AST
x <- rnorm(n = 100)
y <- future({ slow_sum(x) })
Globals identified and exported:
1. slow_sum() - a function (also searched recursively)
2. x - a numeric vector of length 100
Globals & packages can be specified manually too
10 / 22
High Performance Compute (HPC) clusters
10 / 22
CRAN 0.7.0 | codecov 89%
Backend: future.batchtools
batchtools: Map-Reduce API for HPC schedulers,
e.g. LSF, OpenLava, SGE, Slurm, and TORQUE / PBS
future.batchtools: Future API on top of batchtools
╔═══════════════════════════════════════════════════╗
║ < Future API > ║
╠═══════════════════════════════════════════════════╣
║ future <-> future.batchtools ║
╠═════════════════════════╦═════════════════════════╣
║ parallel ║ batchtools ║
╚═════════════════════════╬═════════════════════════╣
║ SGE, Slurm, TORQUE, ... ║
╚═════════════════════════╝
CRAN 0.7.0 | codecov 89%
Backend: future.batchtools
> library(future.batchtools)
> plan(batchtools_sge)
> a %<-% slow_sum(1:50)
> b %<-% slow_sum(51:100)
> a + b
[1] 5050
12 / 22
Real Example: DNA Sequence Analysis
DNA sequences from 100 cancer patients
200 GiB of data per patient (~10 hours of processing)
raw <- dir(pattern = "[.]fq$")
aligned <- listenv()
for (i in seq_along(raw)) {
aligned[[i]] %<-% DNAseq::align(raw[i])
}
aligned <- as.list(aligned)
plan(multiprocess)
plan(cluster, workers = c("n1", "n2", "n3"))
plan(batchtools_sge)
Comment: The use of `listenv` is non-critical and only needed for implicit futures when assigning them by index
(instead of by name).
13 / 22
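Comment: a hedged sketch of the same loop with explicit futures, where a plain list suffices (DNAseq::align() is the hypothetical aligner from the slide):
raw <- dir(pattern = "[.]fq$")
futures <- lapply(raw, function(file) future(DNAseq::align(file)))
aligned <- lapply(futures, value)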
Building on top of Future API
13 / 22
CRAN 0.2.0 | codecov 95%
Frontend: future.apply
Futurized version of base R's lapply(), vapply(), replicate(), ...
... on all future-compatible backends
╔═══════════════════════════════════════════════════════════╗
║ future_lapply(), future_vapply(), future_replicate(), ... ║
╠═══════════════════════════════════════════════════════════╣
║ < Future API > ║
╠═══════════════════════════════════════════════════════════╣
║ "wherever" ║
╚═══════════════════════════════════════════════════════════╝
aligned <- lapply(raw, DNAseq::align)
CRAN 0.2.0 | codecov 95%
Frontend: future.apply
Futurized version of base R's lapply(), vapply(), replicate(), ...
... on all future-compatible backends
╔═══════════════════════════════════════════════════════════╗
║ future_lapply(), future_vapply(), future_replicate(), ... ║
╠═══════════════════════════════════════════════════════════╣
║ < Future API > ║
╠═══════════════════════════════════════════════════════════╣
║ "wherever" ║
╚═══════════════════════════════════════════════════════════╝
aligned <- future_lapply(raw, DNAseq::align)
plan(multiprocess)
plan(cluster, workers = c("n1", "n2", "n3"))
plan(batchtools_sge)
15 / 22
CRAN 0.6.0 | codecov 83%
Frontend: doFuture
A foreach adapter on top of the Future API
Foreach on all future-compatible backends
╔═══════════════════════════════════════════════════════╗
║ foreach API ║
╠════════════╦══════╦════════╦═══════╦══════════════════╣
║ doParallel ║ doMC ║ doSNOW ║ doMPI ║ doFuture ║
╠════════════╩══╦═══╩════════╬═══════╬══════════════════╣
║ parallel ║ snow ║ Rmpi ║ < Future API > ║
╚═══════════════╩════════════╩═══════╬══════════════════╣
║ "wherever" ║
╚══════════════════╝
doFuture::registerDoFuture()
plan(batchtools_sge)
aligned <- foreach(x = raw) %dopar% {
DNAseq::align(x)
}
16 / 22
1,200+ packages can now parallelize on HPC
┌───────────────────────────────────────────────────────┐
│ │
│ caret, gam, glmnet, plyr, ... (1,200 pkgs) │
│ │
╠═══════════════════════════════════════════════════════╣
║ foreach API ║
╠══════╦════════════╦════════╦═══════╦══════════════════╣
║ doMC ║ doParallel ║ doSNOW ║ doMPI ║ doFuture ║
╠══════╩════════╦═══╩════════╬═══════╬══════════════════╣
║ parallel ║ snow ║ Rmpi ║ < Future API > ║
╚═══════════════╩════════════╩═══════╬══════════════════╣
║ "wherever" ║
╚══════════════════╝
doFuture::registerDoFuture()
plan(future.batchtools::batchtools_sge)
library(caret)
model <- train(y ~ ., data = training)
17 / 22
Futures in the Wild
17 / 22
Frontend: drake - A Workflow Manager
tasks <- drake_plan(
raw_data = readxl::read_xlsx(file_in("raw-data.xlsx")),
data = raw_data %>% mutate(Species =
forcats::fct_inorder(Species)) %>% select(-X__1),
hist = ggplot(data, aes(x = Petal.Width, fill = Species))
+ geom_histogram(),
fit = lm(Sepal.Width ~ Petal.Width + Species, data),
rmarkdown::render(knitr_in("report.Rmd"),
output_file = file_out("report.pdf"))
)
future::plan("multiprocess")
make(tasks, parallelism = "future")
19 / 22
shiny - Now with Asynchronous UI
library(shiny)
future::plan("multiprocess")
...
20 / 22
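Comment: the slide elides the server code; a minimal sketch of an asynchronous Shiny output, assuming the 'promises' package, might look like:
library(shiny)
library(promises)
future::plan("multiprocess")
ui <- fluidPage(plotOutput("hist"))
server <- function(input, output, session) {
  output$hist <- renderPlot({
    future({
      Sys.sleep(5)       ## slow computation in a background R process
      rnorm(1000)
    }) %...>% hist(main = "Computed asynchronously")
  })
}
## shinyApp(ui, server)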
Summary of features
Unified API
Portable code
Worry-free
Developer decides what to parallelize - user decides how to
For beginners as well as advanced users
Nested parallelism on nested heterogeneous backends
Protects against recursive parallelism
Easy to add new backends
Easy to build new frontends
21 / 22
In the near future ...
Capturing standard output
Benchmarking (time and memory)
Killing futures
Building a better future
I 💜 feedback,
bug reports,
and suggestions
@HenrikBengtsson
HenrikBengtsson/future
jottr.org
Thank you!
Appendix (Random Slides)
A1. Features - more details
A1.1 Well Tested
Large number of unit tests
System tests
High code coverage (union across all platforms near 100%)
Cross platform testing
CI testing
Testing several R versions (many generations back)
Reverse package dependency tests
All backends highly tested
Large number of tests via doFuture across backends on example():s from foreach, NMF, TSP, glmnet,
plyr, caret, etc. (example link)
R Consortium Infrastructure Steering Committee (ISC) Support Project
Backend Conformance Test Suite - an effort to formalize and standardize the above tests into
a unified go-to test environment.
A1.2 Nested futures
raw <- dir(pattern = "[.]fq$")
aligned <- listenv()
for (i in seq_along(raw)) {
aligned[[i]] %<-% {
chrs <- listenv()
for (j in 1:24) {
chrs[[j]] %<-% DNAseq::align(raw[i], chr = j)
}
merge_chromosomes(chrs)
}
}
plan(batchtools_sge)
plan(list(batchtools_sge, sequential))
plan(list(batchtools_sge, multiprocess))
A1.3 Lazy evaluation
By default all futures are resolved using eager evaluation, but the
developer has the option to use lazy evaluation.
Explicit API:
f <- future(..., lazy = TRUE)
v <- value(f)
Implicit API:
v %<-% { ... } %lazy% TRUE
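A small sketch of the difference (assuming a sequential plan):
library(future)
plan(sequential)
f <- future({ cat("evaluating now\n"); 42 }, lazy = TRUE)
## nothing has been evaluated yet ...
value(f)  ## triggers evaluation: prints "evaluating now", returns 42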
A1.4 False-negative & false-positive globals
Identification of globals from static-code inspection has limitations (but defaults cover a
large number of use cases):
False negatives, e.g. my_fcn is not found in do.call("my_fcn", x). Avoid by using
do.call(my_fcn, x).
False positives - non-existing variables, e.g. NSE and variables in formulas. Ignore and
leave it to run-time.
x <- "this FP will be exported"
data <- data.frame(x = rnorm(1000), y = rnorm(1000))
fit %<-% lm(x ~ y, data = data)
Comment: ... so, the above works.
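Comment: a hedged sketch of the false-negative case, using a hypothetical my_fcn():
my_fcn <- function(x) sum(x)
args <- list(1:10)
f1 <- future(do.call("my_fcn", args))  ## may fail: 'my_fcn' not detected as a global
f2 <- future(do.call(my_fcn, args))    ## works: my_fcn is identified and exported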
A1.5 Full control of globals (explicit API)
Automatic (default):
x <- rnorm(n = 100)
y <- future({ slow_sum(x) }, globals = TRUE)
By names:
y <- future({ slow_sum(x) }, globals = c("slow_sum", "x"))
As name-value pairs:
y <- future({ slow_sum(x) }, globals =
list(slow_sum = slow_sum, x = rnorm(n = 100)))
Disable:
y <- future({ slow_sum(x) }, globals = FALSE)
A1.5 Full control of globals (implicit API)
Automatic (default):
x <- rnorm(n = 100)
y %<-% { slow_sum(x) } %globals% TRUE
By names:
y %<-% { slow_sum(x) } %globals% c("slow_sum", "x")
As name-value pairs:
y %<-% { slow_sum(x) } %globals% list(slow_sum = slow_sum, x = rnorm(n = 100))
Disable:
y %<-% { slow_sum(x) } %globals% FALSE
A1.6 Protection: Exporting too large objects
x <- lapply(1:100, FUN = function(i) rnorm(1024 ^ 2))
y <- list()
for (i in seq_along(x)) {
y[[i]] <- future( mean(x[[i]]) )
}
gives error: "The total size of the 2 globals that need to be exported for the future expression
('mean(x[[i]])') is 800.00 MiB. This exceeds the maximum allowed size of 500.00 MiB (option
'future.globals.maxSize'). There are two globals: 'x' (800.00 MiB of class 'list') and 'i' (48 bytes of
class 'numeric')."
for (i in seq_along(x)) {
x_i <- x[[i]] ## Fix: subset before creating future
y[[i]] <- future( mean(x_i) )
}
Comment: Interesting research project to automate via code inspection.
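Comment: if exporting large objects really is intended, the limit mentioned in the error can also be raised, e.g. to 1 GiB:
options(future.globals.maxSize = 1024^3)  ## in bytes; default is 500 MiB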
A1.7 Free futures are resolved
Implicit futures are always resolved:
a %<-% sum(1:10)
b %<-% { 2 * a }
print(b)
## [1] 110
Explicit futures require care by developer:
fa <- future( sum(1:10) )
a <- value(fa)
fb <- future( 2 * a )
For the lazy developer - not recommended (may be expensive):
options(future.globals.resolve = TRUE)
fa <- future( sum(1:10) )
fb <- future( 2 * value(fa) )
A1.8 What's under the hood?
Future class and corresponding methods:
abstract S3 class with common parts implemented,
e.g. globals and protection
new backends extend this class and implement core methods,
e.g. value() and resolved()
built-in classes implement backends on top of the parallel package
A1.9 Universal union of parallel frameworks
future | parallel | foreach | batchtools | BiocParallel
Synchronous ✓ ✓ ✓ ✓ ✓
Asynchronous ✓ ✓ ✓ ✓ ✓
Uniform API ✓ ✓ ✓ ✓
Extendable API ✓ ✓ ✓ ✓
Globals ✓ (✓)
Packages ✓
Map-reduce ("lapply") ✓ ✓ foreach() ✓ ✓
Load balancing ✓ ✓ ✓ ✓ ✓
For loops ✓
While loops ✓
Nested config ✓
Recursive protection ✓ mc mc mc mc
RNG stream ✓+ ✓ doRNG (soon) SNOW
Early stopping ✓ ✓
Traceback ✓ ✓
A2 Bells & whistles
A2.1 availableCores() & availableWorkers()
availableCores() is a "nicer" version of parallel::detectCores() that returns the number of
cores allotted to the process by acknowledging known settings, e.g.
getOption("mc.cores")
HPC environment variables, e.g. NSLOTS, PBS_NUM_PPN, SLURM_CPUS_PER_TASK, ...
_R_CHECK_LIMIT_CORES_
availableWorkers() returns a vector of hostnames based on:
HPC environment information, e.g. PE_HOSTFILE, PBS_NODEFILE, ...
Fallback to rep("localhost", availableCores())
These provide safe defaults for, for example:
plan(multiprocess)
plan(cluster)
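A quick, hedged illustration:
library(future)
availableCores()    ## respects mc.cores, SLURM_CPUS_PER_TASK, NSLOTS, ...
availableWorkers()  ## hostnames, e.g. from PE_HOSTFILE / PBS_NODEFILE
plan(multiprocess, workers = availableCores())  ## the default, made explicit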
A2.2: makeClusterPSOCK()
makeClusterPSOCK():
Improves upon parallel::makePSOCKcluster()
Simplifies cluster setup, especially remote ones
Avoids common issues when workers connect back to master:
uses SSH reverse tunneling
no need for port-forwarding / firewall configuration
no need for DNS lookup
Makes option -l <user> optional (such that ~/.ssh/config is respected)
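A hedged sketch (hostnames and user are placeholders):
library(future)
cl <- makeClusterPSOCK(c("n1.example.org", "n2.example.org"), user = "alice")
plan(cluster, workers = cl)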
A2.3 HPC resource parameters
With 'future.batchtools' one can also specify computational resources, e.g.
cores per node and memory needs.
plan(batchtools_sge, resources = list(mem = "128gb"))
y %<-% { large_memory_method(x) }
Specific to scheduler: resources is passed to the job-script template where
the parameters are interpreted and passed to the scheduler.
Each future needs one node with 24 cores and 128 GiB of RAM:
resources = list(l = "nodes=1:ppn=24", mem = "128gb")
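For example (hedged; the exact fields depend on the job-script template), the same request on a TORQUE/PBS cluster:
plan(batchtools_torque, resources = list(l = "nodes=1:ppn=24", mem = "128gb"))
y %<-% { large_memory_method(x) }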
A3. More Examples
A3.1 Plot remotely - display locally
> library(future)
> plan(cluster, workers = "remote.org")
## Plot remotely
> g %<-% R.devices::capturePlot({
filled.contour(volcano, color.palette = terrain.colors)
title("volcano data: filled contour map")
})
## Display locally
> g
A3.2 Profile code remotely - display locally
> plan(cluster, workers = "remote.org")
> dat <- data.frame(
+ x = rnorm(50e3),
+ y = rnorm(50e3)
+ )
## Profile remotely
> p %<-% profvis::profvis({
+ plot(x ~ y, data = dat)
+ m <- lm(x ~ y, data = dat)
+ abline(m, col = "red")
+ })
## Browse locally
> p
A3.3 fiery - flexible lightweight web server
"... framework for building web servers in R. ... from serving static
content to full-blown dynamic web-apps"
A3.4 "It kinda just works" (furrr = future + purrr)
plan(multisession)
mtcars %>%
split(.$cyl) %>%
map(~ future(lm(mpg ~ wt, data = .x))) %>% values %>%
map(summary) %>%
map_dbl("r.squared")
## 4 6 8
## 0.5086326 0.4645102 0.4229655
Comment: This approach does not do load balancing. I have a few ideas for how support for this may be
implemented in the future framework, which would be beneficial here and elsewhere.
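Comment: for comparison, a hedged sketch using furrr::future_map(), which maps over the whole list as futures (with chunking) instead of creating one explicit future per element:
library(magrittr)
library(purrr)
library(furrr)  ## attaches 'future'
plan(multisession)
mtcars %>%
  split(.$cyl) %>%
  future_map(~ lm(mpg ~ wt, data = .x)) %>%
  map(summary) %>%
  map_dbl("r.squared")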
A3.5 Backend: Google Compute Engine Cluster
library(googleComputeEngineR)
vms <- lapply(paste0("node", 1:10),
FUN = gce_vm, template = "r-base")
cl <- as.cluster(lapply(vms, FUN = gce_ssh_setup),
docker_image = "henrikbengtsson/r-base-future")
plan(cluster, workers = cl)
data <- future_lapply(1:100, montecarlo_pi, B = 10e3)
pi_hat <- Reduce(calculate_pi, data)
print(pi_hat)
## 3.1415
A4. Future Work
A4.1 Standard resource types(?)
For any type of future, the developer may wish to control:
memory requirements, e.g. future(..., memory = 8e9)
local machine only, e.g. remote = FALSE
dependencies, e.g. dependencies = c("R (>= 3.5.0)", "rio")
file-system availability, e.g. mounts = "/share/lab/files"
data locality, e.g. vars = c("gene_db", "mtcars")
containers, e.g. container = "docker://rocker/r-base"
generic resources, e.g. tokens = c("a", "b")
...?
Risk of bloating the Future API: which of these need to be included? We don't want to reinvent the HPC
scheduler or Spark.