Title: | Unified Parallel and Distributed Processing in R for Everyone |
---|---|
Description: | The purpose of this package is to provide a lightweight and unified Future API for sequential and parallel processing of R expressions via futures. The simplest way to evaluate an expression in parallel is to use `x %<-% { expression }` with `plan(multisession)`. This package implements sequential, multicore, multisession, and cluster futures. With these, R expressions can be evaluated on the local machine, in parallel on a set of local machines, or distributed on a mix of local and remote machines. Extensions to this package implement additional backends for processing futures via compute cluster schedulers, etc. Because of its unified API, there is no need to modify any code in order to switch from sequential on the local machine to, say, distributed processing on a remote compute cluster. Another strength of this package is that global variables and functions are automatically identified and exported as needed, making it straightforward to tweak existing code to make use of futures. |
Authors: | Henrik Bengtsson [aut, cre, cph] |
Maintainer: | Henrik Bengtsson <[email protected]> |
License: | LGPL (>= 2.1) |
Version: | 1.34.0-9400 |
Built: | 2025-04-01 19:51:00 UTC |
Source: | https://github.com/futureverse/future |
Back trace the expressions evaluated when an error was caught
backtrace(future, envir = parent.frame(), ...)
future |
A future with a caught error. |
envir |
the environment where to locate the future. |
... |
Not used. |
A list with the future's call stack that led up to the error.
my_log <- function(x) log(x)
foo <- function(...) my_log(...)

f <- future({ foo("a") })
res <- tryCatch({
  v <- value(f)
}, error = function(ex) {
  t <- backtrace(f)
  print(t)
})
A cluster future is a future that uses cluster evaluation, which means that its value is computed and resolved in parallel in another process.
cluster( ..., workers = availableWorkers(), gc = FALSE, earlySignal = FALSE, passthrough = FALSE, persistent = FALSE, envir = parent.frame() )
workers |
A |
gc |
If TRUE, the garbage collector is run (in the process that
evaluated the future) only after the value of the future is collected.
Exactly when the values are collected may depend on various factors such
as number of free workers and whether |
earlySignal |
Specifies whether conditions should be signaled as soon as possible or not. |
passthrough |
If TRUE, futures are resolved on the next, nested future backend. |
persistent |
If FALSE, the evaluation environment is cleared from objects prior to the evaluation of the future. |
envir |
The environment from where global objects should be identified. |
... |
Additional named elements passed to |
This function is not meant to be called directly. Instead, the typical usages are:
# Evaluate futures via a single background R process on the local machine
plan(cluster, workers = 1)

# Evaluate futures via two background R processes on the local machine
plan(cluster, workers = 2)

# Evaluate futures via a single R process on another machine on the
# local area network (LAN)
plan(cluster, workers = "raspberry-pi")

# Evaluate futures via a single R process running on a remote machine
plan(cluster, workers = "pi.example.org")

# Evaluate futures via four R processes, one running on the local machine,
# two running on LAN machine 'n1' and one on a remote machine
plan(cluster, workers = c("localhost", "n1", "n1", "pi.example.org"))
A ClusterFuture.
## Use cluster futures
cl <- parallel::makeCluster(2, timeout = 60)
plan(cluster, workers = cl)

## A global variable
a <- 0

## Create future (explicitly)
f <- future({
  b <- 3
  c <- 2
  a * b * c
})

## A cluster future is evaluated in a separate process.
## Regardless, changing the value of a global variable will
## not affect the result of the future.
a <- 7
print(a)

v <- value(f)
print(v)
stopifnot(v == 0)

## CLEANUP
parallel::stopCluster(cl)
Creates a future that evaluates an R expression or
a future that calls an R function with a set of arguments.
How, when, and where these futures are evaluated can be configured
using `plan()` such that they are evaluated in parallel on,
for instance, the current machine, a remote machine, or via a
job queue on a compute cluster.
Importantly, any R code using futures remains the same regardless
of these settings, and there is no need to modify the code when
switching from, say, sequential to parallel processing.
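As a minimal sketch of this backend independence, the following runs the same function under two different plans. The helper name `sum_via_future()` is hypothetical and only serves the illustration; it assumes the future package is installed and that two local background sessions can be started.

```r
library(future)

# Hypothetical helper: the code that uses futures never mentions the backend.
sum_via_future <- function(x) {
  f <- future(sum(x))
  value(f)
}

# Resolve the future sequentially in the current R session.
plan(sequential)
res_seq <- sum_via_future(1:10)

# Resolve the same code in a background R session.
plan(multisession, workers = 2)
res_par <- sum_via_future(1:10)

# Switch back, which also shuts down the background workers.
plan(sequential)

stopifnot(identical(res_seq, res_par))
```

Only the `plan()` calls differ between the two runs; the function body is untouched.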
future(
  expr,
  envir = parent.frame(),
  substitute = TRUE,
  lazy = FALSE,
  seed = FALSE,
  globals = TRUE,
  packages = NULL,
  stdout = TRUE,
  conditions = "condition",
  label = NULL,
  gc = FALSE,
  earlySignal = FALSE,
  ...
)

futureCall(
  FUN,
  args = list(),
  envir = parent.frame(),
  lazy = FALSE,
  seed = FALSE,
  globals = TRUE,
  packages = NULL,
  stdout = TRUE,
  conditions = "condition",
  earlySignal = FALSE,
  label = NULL,
  gc = FALSE,
  ...
)

minifuture(
  expr,
  substitute = TRUE,
  globals = NULL,
  packages = NULL,
  stdout = NA,
  conditions = NULL,
  seed = NULL,
  ...,
  envir = parent.frame()
)
expr |
An R expression. |
envir |
The environment from where global objects should be identified. |
substitute |
If TRUE, argument |
lazy |
If FALSE (default), the future is resolved eagerly (starting immediately), otherwise not. |
seed |
(optional) If TRUE, the random seed, that is, the state of the
random number generator (RNG) will be set such that statistically sound
random numbers are produced (also during parallelization).
If FALSE (default), it is assumed that the future expression neither
needs nor uses random number generation.
To use a fixed random seed, specify a L'Ecuyer-CMRG seed (seven integers)
or a regular RNG seed (a single integer). If the latter, then a
L'Ecuyer-CMRG seed will be automatically created based on the given seed.
Furthermore, if FALSE, then the future will be monitored to make sure it
does not use random numbers. If it does, then depending on the value of
option future.rng.onMisuse, the check is ignored, or an informative
warning or error is produced.
If |
globals |
(optional) a logical, a character vector, or a named list
to control how globals are handled.
For details, see section 'Globals used by future expressions'
in the help for |
packages |
(optional) a character vector specifying packages to be attached in the R environment evaluating the future. |
stdout |
If TRUE (default), then the standard output is captured,
and re-outputted when |
conditions |
A character string of conditions classes to be captured
and relayed. The default is to relay all conditions, including messages
and warnings. To drop all conditions, use |
label |
A character string label attached to the future. |
gc |
If TRUE, the garbage collector is run (in the process that
evaluated the future) only after the value of the future is collected.
Exactly when the values are collected may depend on various factors such
as number of free workers and whether |
earlySignal |
Specifies whether conditions should be signaled as soon as possible or not. |
FUN |
A function to be evaluated. |
args |
A list of arguments passed to function |
... |
Additional arguments passed to |
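The `seed` argument described above can be sketched as follows. This is an illustration under the assumption that a single-integer seed is accepted as documented; the value `42L` is arbitrary.

```r
library(future)
plan(sequential)

# Two futures given the same fixed seed should draw the same random numbers.
f1 <- future(rnorm(3), seed = 42L)
f2 <- future(rnorm(3), seed = 42L)
v1 <- value(f1)
v2 <- value(f2)
stopifnot(identical(v1, v2))

# seed = TRUE declares that the future uses random numbers and asks for a
# statistically sound RNG stream; the draws are not fixed in advance.
f3 <- future(rnorm(3), seed = TRUE)
v3 <- value(f3)
```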
The state of a future is either unresolved or resolved.
The value of a future can be retrieved using `v <- value(f)`.
Querying the value of a non-resolved future will block the call
until the future is resolved.
It is possible to check whether a future is resolved or not
without blocking by using `resolved(f)`.
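The difference between blocking and non-blocking queries can be sketched with a polling loop. This is an illustration only, assuming two local background sessions can be started; the sleep durations are arbitrary.

```r
library(future)
plan(multisession, workers = 2)

# A future that takes about one second to resolve.
f <- future({ Sys.sleep(1); 42 })

# Poll without blocking; other work could be done inside this loop.
while (!resolved(f)) {
  Sys.sleep(0.1)
}

# The future is resolved at this point, so value() returns promptly.
v <- value(f)

plan(sequential)
```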
The `futureCall()` function works analogously to `do.call()`, which
calls a function with a set of arguments. The difference is that
`do.call()` returns the value of the call whereas `futureCall()`
returns a future.

`future()` returns a Future that evaluates expression `expr`.

`futureCall()` returns a Future that calls function `FUN` with
arguments `args`.
`minifuture(expr)` creates a future with minimal overhead, by disabling
user-friendly behaviors, e.g. automatic identification of global
variables and packages needed, and relaying of output.
By default, a future is resolved using eager evaluation
(`lazy = FALSE`). This means that the expression starts to
be evaluated as soon as the future is created.

As an alternative, the future can be resolved using lazy
evaluation (`lazy = TRUE`). This means that the expression
will only be evaluated when the value of the future is requested.
Note that this means that the expression may not be evaluated
at all - it is guaranteed to be evaluated if the value is requested.
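A small timing sketch of lazy evaluation, assuming the sequential backend so that all work happens in the current session: creating the lazy future should return almost immediately, and the one-second sleep is only paid when the value is requested.

```r
library(future)
plan(sequential)

# Creating a lazy future does not start evaluating the expression.
t_create <- system.time(
  f <- future({ Sys.sleep(1); 42 }, lazy = TRUE)
)[["elapsed"]]

# Requesting the value triggers the evaluation, including the sleep.
t_value <- system.time(
  v <- value(f)
)[["elapsed"]]

stopifnot(v == 42, t_value > t_create)
```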
Global objects (short globals) are objects (e.g. variables and functions) that are needed in order for the future expression to be evaluated while not being local objects that are defined by the future expression. For example, in

a <- 42
f <- future({ b <- 2; a * b })

variable `a` is a global of future assignment `f` whereas `b` is a local variable.
In order for the future to be resolved successfully (and correctly),
all globals need to be gathered when the future is created such that
they are available whenever and wherever the future is resolved.

The default behavior (`globals = TRUE`) is that globals are
automatically identified and gathered.
More precisely, globals are identified via code inspection of the
future expression `expr` and their values are retrieved with
environment `envir` as the starting point (basically via
`get(global, envir = envir, inherits = TRUE)`).
In most cases, such automatic collection of globals is sufficient
and less tedious and error prone than if they are manually specified.
However, for full control, it is also possible to explicitly specify exactly which the globals are by providing their names as a character vector. In the above example, we could use

a <- 42
f <- future({ b <- 2; a * b }, globals = "a")

Yet another alternative is to explicitly specify also their values using a named list as in

a <- 42
f <- future({ b <- 2; a * b }, globals = list(a = a))
or
f <- future({ b <- 2; a * b }, globals = list(a = 42))
Specifying globals explicitly avoids the overhead added from automatically identifying the globals and gathering their values. Furthermore, if we know that the future expression does not make use of any global variables, we can disable the automatic search for globals by using
f <- future({ a <- 42; b <- 2; a * b }, globals = FALSE)
Future expressions often make use of functions from one or more packages. As long as these functions are part of the set of globals, the future package will make sure that those packages are attached when the future is resolved. Because there is no need for such globals to be frozen or exported, the future package will not export them, which reduces the amount of transferred objects. For example, in

x <- rnorm(1000)
f <- future({ median(x) })

variable `x` and `median()` are globals, but only `x` is exported whereas `median()`, which is part of the stats package, is not exported. Instead it is made sure that the stats package is on the search path when the future expression is evaluated. Effectively, the above becomes

x <- rnorm(1000)
f <- future({
  library(stats)
  median(x)
})

To manually specify this, one can either do

x <- rnorm(1000)
f <- future({
  median(x)
}, globals = list(x = x, median = stats::median))

or

x <- rnorm(1000)
f <- future({
  library(stats)
  median(x)
}, globals = list(x = x))
Both are effectively the same.
Although rarely needed, a combination of automatic identification and manual
specification of globals is supported via attributes `add` (to add
false negatives) and `ignore` (to ignore false positives) on value
`TRUE`. For example, with
`globals = structure(TRUE, ignore = "b", add = "a")` any globals
automatically identified except `b` will be used in addition to
global `a`.
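A sketch of a false negative that the `add` attribute can repair: a variable accessed by name via `get()` is referenced only through a string, so it is assumed here that code inspection does not pick it up. The example assumes two local background sessions can be started.

```r
library(future)
plan(multisession, workers = 2)

a <- 42

# 'a' appears only as a string, so it is added to the globals manually
# while all other globals are still identified automatically.
f <- future({
  get("a") * 2
}, globals = structure(TRUE, add = "a"))
v <- value(f)

plan(sequential)

stopifnot(v == 84)
```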
The future logo was designed by Dan LaBar and tweaked by Henrik Bengtsson.
How, when and where futures are resolved is given by the
future strategy, which can be set by the end user using the
`plan()` function. The future strategy must not be
set by the developer, e.g. `plan()` must not be called within a package.
## Evaluate futures in parallel
plan(multisession)

## Data
x <- rnorm(100)
y <- 2 * x + 0.2 + rnorm(100)
w <- 1 + x ^ 2

## EXAMPLE: Regular assignments (evaluated sequentially)
fitA <- lm(y ~ x, weights = w)      ## with offset
fitB <- lm(y ~ x - 1, weights = w)  ## without offset
fitC <- {
  w <- 1 + abs(x)  ## Different weights
  lm(y ~ x, weights = w)
}
print(fitA)
print(fitB)
print(fitC)

## EXAMPLE: Future assignments (evaluated in parallel)
fitA %<-% lm(y ~ x, weights = w)      ## with offset
fitB %<-% lm(y ~ x - 1, weights = w)  ## without offset
fitC %<-% {
  w <- 1 + abs(x)
  lm(y ~ x, weights = w)
}
print(fitA)
print(fitB)
print(fitC)

## EXAMPLE: Explicitly create futures (evaluated in parallel)
## and retrieve their values
fA <- future( lm(y ~ x, weights = w) )
fB <- future( lm(y ~ x - 1, weights = w) )
fC <- future({
  w <- 1 + abs(x)
  lm(y ~ x, weights = w)
})
fitA <- value(fA)
fitB <- value(fB)
fitC <- value(fC)
print(fitA)
print(fitB)
print(fitC)

## EXAMPLE: futureCall() and do.call()
x <- 1:100
y0 <- do.call(sum, args = list(x))
print(y0)

f1 <- futureCall(sum, args = list(x))
y1 <- value(f1)
print(y1)
`x %<-% value` (also known as a "future assignment") and
`futureAssign("x", value)` create a Future that evaluates the expression
(`value`) and binds it to variable `x` (as a promise). The expression is
evaluated in parallel in the background. Later on, when `x` is first
queried, the value of the future is automatically retrieved as if it were
a regular variable and `x` is materialized as a regular value.
futureAssign(
  x,
  value,
  envir = parent.frame(),
  substitute = TRUE,
  lazy = FALSE,
  seed = FALSE,
  globals = TRUE,
  packages = NULL,
  stdout = TRUE,
  conditions = "condition",
  earlySignal = FALSE,
  label = NULL,
  gc = FALSE,
  ...,
  assign.env = envir
)

x %<-% value

fassignment %globals% globals
fassignment %packages% packages
fassignment %seed% seed
fassignment %stdout% capture
fassignment %conditions% capture
fassignment %lazy% lazy
fassignment %label% label
fassignment %plan% strategy
fassignment %tweak% tweaks
x |
the name of a future variable, which will hold the value of the future expression (as a promise). |
value |
An R expression. |
envir |
The environment from where global objects should be identified. |
substitute |
If TRUE, argument |
lazy |
If FALSE (default), the future is resolved eagerly (starting immediately), otherwise not. |
seed |
(optional) If TRUE, the random seed, that is, the state of the
random number generator (RNG) will be set such that statistically sound
random numbers are produced (also during parallelization).
If FALSE (default), it is assumed that the future expression neither
needs nor uses random number generation.
To use a fixed random seed, specify a L'Ecuyer-CMRG seed (seven integers)
or a regular RNG seed (a single integer). If the latter, then a
L'Ecuyer-CMRG seed will be automatically created based on the given seed.
Furthermore, if FALSE, then the future will be monitored to make sure it
does not use random numbers. If it does, then depending on the value of
option future.rng.onMisuse, the check is ignored, or an informative
warning or error is produced.
If |
globals |
(optional) a logical, a character vector, or a named list
to control how globals are handled.
For details, see section 'Globals used by future expressions'
in the help for |
packages |
(optional) a character vector specifying packages to be attached in the R environment evaluating the future. |
stdout |
If TRUE (default), then the standard output is captured,
and re-outputted when |
conditions |
A character string of conditions classes to be captured
and relayed. The default is to relay all conditions, including messages
and warnings. To drop all conditions, use |
earlySignal |
Specifies whether conditions should be signaled as soon as possible or not. |
label |
A character string label attached to the future. |
gc |
If TRUE, the garbage collector is run (in the process that
evaluated the future) only after the value of the future is collected.
Exactly when the values are collected may depend on various factors such
as number of free workers and whether |
assign.env |
The environment to which the variable should be assigned. |
fassignment |
The future assignment, e.g.
|
capture |
If TRUE, the standard output will be captured, otherwise not. |
strategy |
The mechanism for how the future should be
resolved. See |
tweaks |
A named list (or vector) with arguments that should be changed relative to the current strategy. |
... |
Additional arguments passed to |
For a future created via a future assignment, `x %<-% value` or
`futureAssign("x", value)`, the value is bound to a promise, which when
queried will internally call `value()` on the future and which will then
be resolved into a regular variable bound to that value. For example, with
future assignment `x %<-% value`, the first time variable `x` is queried
the call blocks if, and only if, the future is not yet resolved. As soon
as it is resolved, and for any succeeding queries, querying `x` will
immediately give the value.

The future assignment construct `x %<-% value` is not a formal assignment
per se, but a binary infix operator on objects `x` and expression `value`.
However, by using non-standard evaluation, this construct can emulate an
assignment operator similar to `x <- value`. Due to R's precedence rules
of operators, future expressions often need to be explicitly bracketed,
e.g. `x %<-% { a + b }`.

`futureAssign()` and `x %<-% expr` return the Future invisibly,
e.g. `f <- futureAssign("x", expr)` and `f <- (x %<-% expr)`.
`future()` and `futureAssign()` take several arguments that can be used
to explicitly specify what global variables and packages the future should
use. They can also be used to override default behaviors of the future,
e.g. whether output should be relayed or not. When using a future
assignment, these arguments can be specified via the corresponding
assignment expression. For example, `x %<-% { rnorm(10) } %seed% TRUE`
corresponds to `futureAssign("x", { rnorm(10) }, seed = TRUE)`. Here are
several examples.
To explicitly specify variables and functions that a future assignment
should use, use `%globals%`. To explicitly specify which packages need
to be attached for the evaluation to succeed, use `%packages%`. For
example,

> x <- rnorm(1000)
> y %<-% { median(x) } %globals% list(x = x) %packages% "stats"
> y
[1] -0.03956372
The `median()` function is part of the 'stats' package.

To declare that you will generate random numbers, use `%seed%`, e.g.

> x %<-% { rnorm(3) } %seed% TRUE
> x
[1] -0.2590562 -1.2262495  0.8858702
To disable relaying of standard output (e.g. `print()`, `cat()`, and
`str()`), while keeping relaying of conditions (e.g. `message()` and
`warning()`), use `%stdout%`, e.g.

> x %<-% { cat("Hello\n"); message("Hi there"); 42 } %stdout% FALSE
> y <- 13
> z <- x + y
Hi there
> z
[1] 55
To disable relaying of conditions, use `%conditions%`, e.g.

> x %<-% { cat("Hello\n"); message("Hi there"); 42 } %conditions% character(0)
> y <- 13
> z <- x + y
Hello
> z
[1] 55

> x %<-% { print(1:10); message("Hello"); 42 } %stdout% FALSE
> y <- 13
> z <- x + y
Hello
> z
[1] 55
To create a future without launching it, such that it will only be
processed if the value is really needed, use `%lazy%`, e.g.

> x %<-% { Sys.sleep(5); 42 } %lazy% TRUE
> y <- sum(1:10)
> system.time(z <- x + y)
   user  system elapsed
  0.004   0.000   5.008
> z
[1] 97
Because future assignments are promises, errors produced by the future expression will not be signaled until the value of the future is requested. For example, if you create a future assignment that produces an error, you will not be affected by the error until you "touch" the future-assignment variable. For example,

> x %<-% { stop("boom") }
> y <- sum(1:10)
> z <- x + y
Error in eval(quote({ : boom
Futures are evaluated on the future backend that the user has specified
by `plan()`. With regular futures, we can temporarily use another future
backend by wrapping our code in `withPlan()`, or temporarily inside a
function using `localPlan()`. To achieve the same for a specific
future assignment, use `%plan%`, e.g.

> plan(multisession)
> x %<-% { 42 }
> y %<-% { 13 } %plan% sequential
> z <- x + y
> z
[1] 55
Here `x` is resolved in the background via the multisession backend,
whereas `y` is resolved sequentially in the main R session.
The underlying Future of a future variable `x` can be retrieved without
blocking using `f <- futureOf(x)`, e.g.

> x %<-% { stop("boom") }
> f_x <- futureOf(x)
> resolved(f_x)
[1] TRUE
> x
Error in eval(quote({ : boom
> value(f_x)
Error in eval(quote({ : boom
Technically, both the future and the variable (promise) are assigned at
the same time to environment `assign.env` where the name of the future is
`.future_<name>`.
Get the future of a future variable that has been created directly
or indirectly via `future()`.
futureOf( var = NULL, envir = parent.frame(), mustExist = TRUE, default = NA, drop = FALSE )
var |
the variable. If NULL, all futures in the environment are returned. |
envir |
the environment where to search from. |
mustExist |
If TRUE and the variable does not exist, then an informative error is thrown, otherwise NA is returned. |
default |
the default value if future was not found. |
drop |
if TRUE and |
A Future (or `default`).
If `var` is NULL, then a named list of Futures is returned.
a %<-% { 1 }

f <- futureOf(a)
print(f)

b %<-% { 2 }

f <- futureOf(b)
print(f)

## All futures
fs <- futureOf()
print(fs)

## Futures part of environment
env <- new.env()
env$c %<-% { 3 }

f <- futureOf(env$c)
print(f)

f2 <- futureOf(c, envir = env)
print(f2)

f3 <- futureOf("c", envir = env)
print(f3)

fs <- futureOf(envir = env)
print(fs)
Gets all futures in an environment, a list, or a list environment and returns an object of the same class (and dimensions). Non-future elements are returned as is.
futures(x, ...)
x |
An environment, a list, or a list environment. |
... |
Not used. |
This function is useful for retrieving futures that were created via
future assignments (`%<-%`) and therefore stored as promises.
This function turns such promises into standard Future objects.

An object of the same type as `x` and with the same names
and/or dimensions, if set.
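As a rough sketch of the "non-future elements are returned as is" behavior, the following applies `futures()` to a plain list that mixes futures with a regular value. It covers the list case only, not promises created by `%<-%`, and it assumes that `value()` accepts a list of futures in the same way.

```r
library(future)
plan(sequential)

# A list holding two futures and one regular (non-future) element.
xs <- list(a = future(1), b = future(2), c = 3)

fs <- futures(xs)

# Futures stay futures; the non-future element is passed through unchanged.
stopifnot(
  inherits(fs$a, "Future"),
  inherits(fs$b, "Future"),
  identical(fs$c, 3)
)

# Collect the values of all elements, keeping the names.
vs <- value(fs)
```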
Get future-specific session information and validate current backend
futureSessionInfo(test = TRUE, anonymize = TRUE)
test |
If TRUE, one or more futures are created to query workers and validate their information. |
anonymize |
If TRUE, user names and host names are anonymized. |
Nothing.
plan(multisession, workers = 2)
futureSessionInfo()
plan(sequential)
Attempts to interrupt a running future. If the backend does not support interrupting futures, nothing is done.
interrupt(x, ...)
x |
A Future. |
... |
All arguments used by the S3 methods. |
`interrupt()` returns the Future flagged as "interrupted",
if the backend supports interrupting futures.
A multicore future is a future that uses multicore evaluation, which means that its value is computed and resolved in parallel in another process.
multicore( ..., workers = availableCores(constraints = "multicore"), gc = FALSE, earlySignal = FALSE, envir = parent.frame() )
workers |
The number of parallel processes to use. If a function, it is called without arguments when the future is created and its value is used to configure the workers. |
gc |
If TRUE, the garbage collector is run (in the process that
evaluated the future) only after the value of the future is collected.
Exactly when the values are collected may depend on various factors such
as number of free workers and whether |
earlySignal |
Specifies whether conditions should be signaled as soon as possible or not. |
envir |
The environment from where global objects should be identified. |
... |
Additional named elements to |
This function is not meant to be called directly. Instead, the typical usages are:
# Evaluate futures in parallel on the local machine via as many forked
# processes as available to the current R process
plan(multicore)

# Evaluate futures in parallel on the local machine via two forked processes
plan(multicore, workers = 2)
A Future.
If `workers == 1`, then all processing is done in the
current/main R session and we therefore fall back to using a
sequential future. To override this fallback, use `workers = I(1)`.
The same fallback applies whenever multicore processing is not supported,
e.g. on Windows.
Not all operating systems support process forking, and those that do not
cannot run multicore futures. For instance, forking is not supported on
Microsoft Windows.
Moreover, process forking may break some R environments such as RStudio.
Because of this, the future package disables process forking also in
such cases. See `parallelly::supportsMulticore()` for details.
Trying to create multicore futures on non-supported systems or when
forking is disabled will result in multicore futures falling back to
becoming sequential futures. If used in RStudio, there will be an
informative warning:

> plan(multicore)
Warning message:
In supportsMulticoreAndRStudio(...) :
  [ONE-TIME WARNING] Forked processing ('multicore') is not supported when
running R from RStudio because it is considered unstable. For more details,
how to control forked processing or not, and how to silence this warning in
future R sessions, see ?parallelly::supportsMulticore
For processing in multiple background R sessions, see multisession futures.
Use `parallelly::availableCores()` to see the total number of
cores that are available for the current R session.
Use `availableCores("multicore") > 1L` to check
whether multicore futures are supported or not on the current
system.
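In an end-user script, one way to act on this is to pick the backend based on `parallelly::supportsMulticore()`. This is a sketch of one possible convention, not a recommendation from the package itself; the choice of two workers is arbitrary.

```r
library(future)

# Prefer forked processing where it is supported and enabled; otherwise
# fall back to background R sessions, which work on all platforms.
if (parallelly::supportsMulticore()) {
  plan(multicore, workers = 2)
} else {
  plan(multisession, workers = 2)
}

# Ask a worker for its process ID to confirm that a future can be resolved.
f <- future(Sys.getpid())
worker_pid <- value(f)

plan(sequential)
```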
## Use multicore futures
plan(multicore)

## A global variable
a <- 0

## Create future (explicitly)
f <- future({
  b <- 3
  c <- 2
  a * b * c
})

## A multicore future is evaluated in a separate forked
## process.  Changing the value of a global variable
## will not affect the result of the future.
a <- 7
print(a)

v <- value(f)
print(v)
stopifnot(v == 0)
A multisession future is a future that uses multisession evaluation, which means that its value is computed and resolved in parallel in another R session.
multisession( ..., workers = availableCores(), lazy = FALSE, rscript_libs = .libPaths(), gc = FALSE, earlySignal = FALSE, envir = parent.frame() )
workers |
The number of parallel processes to use. If a function, it is called without arguments when the future is created and its value is used to configure the workers. |
lazy |
If FALSE (default), the future is resolved eagerly (starting immediately), otherwise not. |
rscript_libs |
A character vector of R package library folders that
the workers should use. The default is |
gc |
If TRUE, the garbage collector is run (in the process that
evaluated the future) only after the value of the future is collected.
Exactly when the values are collected may depend on various factors such
as number of free workers and whether |
earlySignal |
Specifies whether conditions should be signaled as soon as possible or not. |
envir |
The environment from where global objects should be identified. |
... |
Additional arguments passed to |
This function is not meant to be called directly. Instead, the typical usages are:
# Evaluate futures in parallel on the local machine via as many background
# processes as available to the current R process
plan(multisession)

# Evaluate futures in parallel on the local machine via two background
# processes
plan(multisession, workers = 2)
The background R sessions (the "workers") are created using
`makeClusterPSOCK()`.

For the total number of
R sessions available including the current/main R process, see
`parallelly::availableCores()`.
A multisession future is a special type of cluster future.
A MultisessionFuture.
If `workers == 1`, then all processing is done in the
current/main R session and we therefore fall back to using a
lazy future. To override this fallback, use `workers = I(1)`.
For processing in multiple forked R sessions, see multicore futures.
Use parallelly::availableCores()
to see the total number of
cores that are available for the current R session.
## Use multisession futures
plan(multisession)

## A global variable
a <- 0

## Create future (explicitly)
f <- future({
  b <- 3
  c <- 2
  a * b * c
})

## A multisession future is evaluated in a separate R session.
## Changing the value of a global variable will not affect
## the result of the future.
a <- 7
print(a)

v <- value(f)
print(v)
stopifnot(v == 0)

## Explicitly close multisession workers by switching plan
plan(sequential)
Get the number of workers available
nbrOfWorkers(evaluator = NULL)

nbrOfFreeWorkers(evaluator = NULL, background = FALSE, ...)
evaluator |
A future evaluator function.
If NULL (default), the current evaluator as returned
by |
background |
If TRUE, only workers that can process a future in the background are considered. If FALSE, also workers running in the main R process are considered, e.g. when using the 'sequential' backend. |
... |
Not used; reserved for future use. |
`nbrOfWorkers()` returns a positive number in {1, 2, 3, ...}, which for some future backends may also be `+Inf`.
`nbrOfFreeWorkers()` returns a non-negative number in {0, 1, 2, 3, ...} which is less than or equal to `nbrOfWorkers()`.
plan(multisession)
nbrOfWorkers()  ## == availableCores()

plan(sequential)
nbrOfWorkers()  ## == 1
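The effect of the `background` argument can be seen with the sequential backend, where the only "worker" is the main R process itself. The following is a minimal sketch, assuming the future package is installed; the variable names are illustrative only.

```r
library(future)

## With the sequential backend, all work happens in the main R process
plan(sequential)

## Total number of workers
n_total <- nbrOfWorkers()

## Free workers, counting the main R process as well
n_free <- nbrOfFreeWorkers()

## Free workers that can process a future in the background
n_background <- nbrOfFreeWorkers(background = TRUE)

print(c(total = n_total, free = n_free, background = n_background))
```

Because no background processes exist under `plan(sequential)`, the last call should not report any free background workers, while the number of free workers never exceeds the total.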
This function allows the user to plan the future, more specifically, it specifies how futures created by `future()` are resolved, e.g. sequentially or in parallel.
plan(
  strategy = NULL,
  ...,
  substitute = TRUE,
  .skip = FALSE,
  .call = TRUE,
  .cleanup = NA,
  .init = TRUE
)

tweak(strategy, ..., penvir = parent.frame())

withPlan(
  strategy = NULL,
  expr,
  envir = parent.frame(),
  .cleanup = NA,
  substitute = TRUE,
  ...
)

localPlan(
  strategy = NULL,
  .cleanup = NA,
  envir = parent.frame(),
  substitute = TRUE,
  ...
)
strategy |
An existing future function or the name of one. |
substitute |
If |
.skip |
(internal) If |
.call |
(internal) Used for recording the call to this function. |
.cleanup |
(internal) Used to stop implicitly started clusters. |
.init |
(internal) Used to initiate workers. |
penvir |
The environment used when searching for a future function by its name. |
expr |
An R expression to be evaluated. |
envir |
The environment where the future plan should be set and the expression evaluated. |
... |
Named arguments to replace the defaults of existing arguments. |
The default strategy is `sequential`, but the default can be configured by option future.plan and, if that is not set, system environment variable R_FUTURE_PLAN. To reset the strategy back to the default, use `plan("default")`.
`plan()` returns the previous plan invisibly if a new strategy is chosen, otherwise it returns the current one visibly.
`tweak()` returns a future function.
`withPlan()` invisibly returns the value of the evaluated expression.
`localPlan()` returns the current future plan before applying the temporary one.
The future package provides the following built-in backends:
`sequential`: Resolves futures sequentially in the current R process, e.g. `plan(sequential)`.
`multisession`: Resolves futures asynchronously (in parallel) in separate R sessions running in the background on the same machine, e.g. `plan(multisession)` and `plan(multisession, workers = 2)`.
`multicore`: Resolves futures asynchronously (in parallel) in separate forked R processes running in the background on the same machine, e.g. `plan(multicore)` and `plan(multicore, workers = 2)`. This backend is not supported on Windows.
`cluster`: Resolves futures asynchronously (in parallel) in separate R sessions running typically on one or more machines, e.g. `plan(cluster)`, `plan(cluster, workers = 2)`, and `plan(cluster, workers = c("n1", "n1", "n2", "server.remote.org"))`.
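Since the backends share one API, the same future-based code can be run unchanged under different plans. The sketch below illustrates this with a hypothetical helper, `square_all()`, which is not part of the package; it assumes the future package is installed and that two local background R sessions can be started.

```r
library(future)

## Hypothetical helper: squares each element using one future per element
square_all <- function(xs) {
  fs <- lapply(xs, function(x) future(x^2))
  unlist(value(fs))
}

## Resolve the futures sequentially in the current R process
plan(sequential)
res_seq <- square_all(1:4)

## Resolve the same futures in two background R sessions
plan(multisession, workers = 2)
res_par <- square_all(1:4)

## Close the background workers again
plan(sequential)

## The backend affects where the work is done, not the result
stopifnot(identical(res_seq, res_par))
```

Only the `plan()` calls differ between the two runs; `square_all()` itself makes no assumption about the backend.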
Other packages provide additional evaluation strategies. For example, the future.callr package implements an alternative to the `multisession` backend on top of the callr package, e.g. `plan(future.callr::callr, workers = 2)`. Another example is the future.batchtools package, which implements backends on top of the batchtools package, e.g. `plan(future.batchtools::batchtools_slurm)`. These types of futures are resolved via job schedulers, which typically are available on high-performance compute (HPC) clusters, e.g. LSF, Slurm, TORQUE/PBS, Sun Grid Engine, and OpenLava.
To "close" any background workers (e.g. `multisession`), change the plan to something different; `plan(sequential)` is recommended for this.
Please refrain from modifying the future strategy inside your packages / functions, i.e. do not call `plan()` in your code. Instead, leave the control on what backend to use to the end user. This idea is part of the core philosophy of the future framework: as a developer you can never know what future backends the user has access to. Moreover, by not making any assumptions about what backends are available, your code will also work automatically with any new backends developed after you wrote your code.
If you think it is necessary to modify the future strategy within a function, then make sure to undo the changes when exiting the function. This can be achieved by using `localPlan()`, e.g.
my_fcn <- function(x) {
  localPlan(multisession)
  y <- analyze(x)
  summarize(y)
}
This is important because the end-user might have already set the future strategy elsewhere for other purposes and will most likely not know that calling your function will break their setup. Remember, your package and its functions might be used in a greater context where multiple packages and functions are involved and those might also rely on the future framework, so it is important to avoid stepping on others' toes.
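The same undo-on-exit idea can also be written by hand, relying on the fact that `plan()` returns the previous plan when a new strategy is set. The following is a sketch only: `sum_sequentially()` is a hypothetical function, and the example assumes that two local background R sessions can be started for the end user's plan.

```r
library(future)

## Hypothetical function that temporarily forces sequential processing
sum_sequentially <- function(x) {
  oplan <- plan(sequential)         # plan() returns the previous plan
  on.exit(plan(oplan), add = TRUE)  # restore the caller's plan on exit
  f <- future(sum(x))
  value(f)
}

## The end user's choice of backend
plan(multisession, workers = 2)

total <- sum_sequentially(1:10)

## The end user's plan is back in place after the call
restored <- inherits(plan(), "multisession")
print(restored)

## Close the background workers
plan(sequential)
```

Using `on.exit()` ensures the caller's plan is restored even if the function fails with an error.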
When writing scripts or vignettes that use futures, try to place any call to `plan()` as far up (i.e. as early on) in the code as possible. This will help users to quickly identify where the future plan is set up and allow them to modify it to their computational resources. Even better is to leave it to the user to set the `plan()` prior to running the script with `source()` or running the vignette.
If a ‘.future.R’ file exists in the current directory and / or in the user's home directory, it is sourced when the future package is loaded. Because of this, the ‘.future.R’ file provides a convenient place for users to set the `plan()`. This behavior can be controlled via an R option; see future options for more details.
Use `plan()` to set a future to become the new default strategy.
a <- b <- c <- NA_real_

# A sequential future
plan(sequential)
f <- future({
  a <- 7
  b <- 3
  c <- 2
  a * b * c
})
y <- value(f)
print(y)
str(list(a = a, b = b, c = c)) ## All NAs

# A sequential future with lazy evaluation
plan(sequential)
f <- future({
  a <- 7
  b <- 3
  c <- 2
  a * b * c
}, lazy = TRUE)
y <- value(f)
print(y)
str(list(a = a, b = b, c = c)) ## All NAs

# A multicore future (specified as a string)
plan("multicore")
f <- future({
  a <- 7
  b <- 3
  c <- 2
  a * b * c
})
y <- value(f)
print(y)
str(list(a = a, b = b, c = c)) ## All NAs

## Multisession futures give an error on R CMD check on
## Windows (but not Linux or macOS) for unknown reasons.
## The same code works in package tests.

# A multisession future (specified via a string variable)
plan("future::multisession")
f <- future({
  a <- 7
  b <- 3
  c <- 2
  a * b * c
})
y <- value(f)
print(y)
str(list(a = a, b = b, c = c)) ## All NAs

## Explicitly specifying number of workers
## (default is parallelly::availableCores())
plan(multicore, workers = 2)
message("Number of parallel workers: ", nbrOfWorkers())

## Explicitly close multisession workers by switching plan
plan(sequential)
A future that has successfully completed, has been interrupted, or failed due to an error, can be relaunched after resetting it.
reset(x, ...)
x |
A Future. |
... |
Not used. |
A lazy, vanilla Future can be reused in another R session. For instance, if we do:
library(future)
a <- 2
f <- future(42 * a, lazy = TRUE)
saveRDS(f, "myfuture.rds")
Then we can read and evaluate the future in another R session using:
library(future)
f <- readRDS("myfuture.rds")
v <- value(f)
print(v)
#> [1] 84
`reset()` returns a lazy, vanilla Future that can be relaunched. Resetting a running future results in a FutureError.
## Like mean(), but fails 90% of the time
shaky_mean <- function(x) {
  if (as.double(Sys.time()) %% 1 < 0.90) stop("boom")
  mean(x)
}

x <- rnorm(100)

## Calculate the mean of 'x' with a risk of failing randomly
f <- future({ shaky_mean(x) })

## Relaunch until success
repeat({
  v <- tryCatch(value(f), error = identity)
  if (!inherits(v, "error")) break
  message("Resetting failed future, and retry in 0.1 seconds")
  f <- reset(f)
  Sys.sleep(0.1)
})
cat("mean:", v, "\n")
This function provides an efficient mechanism for waiting for multiple futures in a container (e.g. list or environment) to be resolved while, in the meantime, retrieving values of already resolved futures.
resolve( x, idxs = NULL, recursive = 0, result = FALSE, stdout = FALSE, signal = FALSE, force = FALSE, sleep = getOption("future.wait.interval", 0.01), ... )
x |
A Future to be resolved, or a list, an environment, or a list environment of futures to be resolved. |
idxs |
(optional) integer or logical index specifying the subset of elements to check. |
recursive |
A non-negative number specifying how deep of a recursion should be done. If TRUE, an infinite recursion is used. If FALSE or zero, no recursion is performed. |
result |
(internal) If TRUE, the results are retrieved, otherwise not. Note that this only collects the results from the parallel worker, which can help lower the overall latency if there are multiple concurrent futures. This does not return the collected results. |
stdout |
(internal) If TRUE, captured standard output is relayed, otherwise not. |
signal |
(internal) If TRUE, captured conditions are relayed, otherwise not. |
force |
(internal) If TRUE, captured standard output and captured conditions already relayed are relayed again, otherwise not. |
sleep |
Number of seconds to wait before checking if futures have been resolved since last time. |
... |
Not used. |
This function resolves synchronously, i.e. it blocks until `x` and any futures it contains are resolved.
Returns `x` (regardless of subsetting or not). If `signal` is TRUE and one of the futures produces an error, then that error is signaled.
To resolve a future variable, first retrieve its Future object using `futureOf()`, e.g. `resolve(futureOf(x))`.
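The following sketch shows `resolve()` applied to a list of futures and to a future variable. It assumes the future package is installed and uses the sequential backend so that no background workers are needed; the variable names are illustrative.

```r
library(future)
plan(sequential)

## A list of three futures
fs <- lapply(1:3, function(i) future(i^2))

## Block until all futures in the list are resolved; the list itself is returned
fs <- resolve(fs)
all_done <- all(resolved(fs))

## Collecting the values no longer needs to wait
vs <- unlist(value(fs))
print(vs)

## A future variable: retrieve the underlying Future object first
y %<-% { 6 * 7 }
f <- futureOf(y)
resolve(f)
y_done <- resolved(f)
print(y)
```

With a parallel backend, calling `resolve()` before `value()` gives the same values; the difference is only in when the waiting takes place.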
Check whether a future is resolved or not
resolved(x, ...)
x |
A Future, a list, or an environment (which also includes list environment). |
... |
Not used. |
This method needs to be implemented by each class that implements the Future API. The implementation should return either TRUE or FALSE and must never throw an error (except for FutureError conditions, which indicate significant, often unrecoverable infrastructure problems). It should also be possible to use the method for polling the future until it is resolved (without having to wait infinitely long), e.g. `while (!resolved(future)) Sys.sleep(5)`.
A logical of the same length and dimensions as `x`. Each element is TRUE unless the corresponding element is a non-resolved future, in which case it is FALSE.
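As a small illustration of polling and of how containers are handled, the sketch below polls a single future and then queries a list that mixes a future with ordinary values. It assumes the future package is installed; the sleep times are arbitrary.

```r
library(future)
plan(sequential)

## Poll a single future until it is resolved
f <- future({
  Sys.sleep(0.1)
  "done"
})
while (!resolved(f)) Sys.sleep(0.05)
v <- value(f)
print(v)

## A container with one future and two non-future elements
status <- resolved(list(f, 42, "a"))
print(status)
```

Elements that are not futures count as resolved, so `status` only contains FALSE for futures that are still running.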
A sequential future is a future that is evaluated sequentially in the current R session similarly to how R expressions are evaluated in R. The only difference to R itself is that globals are validated by default just as for all other types of futures in this package.
sequential(..., gc = FALSE, earlySignal = FALSE, envir = parent.frame())
gc |
If TRUE, the garbage collector is run (in the process that
evaluated the future) only after the value of the future is collected.
Exactly when the values are collected may depend on various factors such
as number of free workers and whether |
earlySignal |
Specifies whether conditions should be signaled as soon as possible or not. |
envir |
The environment from where global objects should be identified. |
... |
Additional named elements to |
This function is not meant to be called directly. Instead, the typical usages are:
# Evaluate futures sequentially in the current R process
plan(sequential)
A Future.
## Use sequential futures
plan(sequential)

## A global variable
a <- 0

## Create a sequential future
f <- future({
  b <- 3
  c <- 2
  a * b * c
})

## Since 'a' is a global variable in future 'f' which
## is eagerly resolved (default), this global has already
## been resolved / incorporated, and any changes to 'a'
## at this point will _not_ affect the value of 'f'.
a <- 7
print(a)

v <- value(f)
print(v)
stopifnot(v == 0)
Gets the value of a future or the values of all elements (including futures) in a container such as a list, an environment, or a list environment. If one or more futures are unresolved, then this function blocks until all queried futures are resolved.
value(...)

## S3 method for class 'Future'
value(future, stdout = TRUE, signal = TRUE, drop = FALSE, ...)

## S3 method for class 'list'
value(
  x,
  idxs = NULL,
  recursive = 0,
  reduce = NULL,
  stdout = TRUE,
  signal = TRUE,
  interrupt = TRUE,
  inorder = TRUE,
  drop = FALSE,
  force = TRUE,
  sleep = getOption("future.wait.interval", 0.01),
  ...
)

## S3 method for class 'listenv'
value(
  x,
  idxs = NULL,
  recursive = 0,
  reduce = NULL,
  stdout = TRUE,
  signal = TRUE,
  interrupt = TRUE,
  inorder = TRUE,
  drop = FALSE,
  force = TRUE,
  sleep = getOption("future.wait.interval", 0.01),
  ...
)

## S3 method for class 'environment'
value(x, ...)
future, x |
A Future, an environment, a list, or a list environment. |
stdout |
If TRUE, standard output captured while resolving futures is relayed, otherwise not. |
signal |
If TRUE, conditions captured while resolving futures are relayed, otherwise not. |
drop |
If TRUE, resolved futures are minimized in size and invalidated
as soon as their values have been collected and any output and
conditions have been relayed.
Combining |
idxs |
(optional) integer or logical index specifying the subset of elements to check. |
recursive |
A non-negative number specifying how deep of a recursion should be done. If TRUE, an infinite recursion is used. If FALSE or zero, no recursion is performed. |
reduce |
An optional function for reducing all the values.
Optional attribute |
interrupt |
If TRUE and |
inorder |
If TRUE, then standard output and conditions are relayed,
and value reduction, is done in the order the futures occur in |
force |
(internal) If TRUE, captured standard output and captured conditions already relayed are relayed again, otherwise not. |
sleep |
Number of seconds to wait before checking if futures have been resolved since last time. |
... |
All arguments used by the S3 methods. |
`value()` of a Future object returns the value of the future, which can be any type of R object.
`value()` of a list, an environment, or a list environment returns an object with the same number of elements and of the same class. Names and dimension attributes are preserved, if available. All future elements are replaced by their corresponding `value()` values. For all other elements, the existing object is kept as-is.
If `signal` is TRUE and one of the futures produces an error, then that error is relayed. Any remaining, non-resolved futures in `x` are interrupted prior to signaling such an error.
## ------------------------------------------------------
## A single future
## ------------------------------------------------------
x <- sample(100, size = 50)
f <- future(mean(x))
v <- value(f)
message("The average of 50 random numbers in [1,100] is: ", v)

## ------------------------------------------------------
## Ten futures
## ------------------------------------------------------
xs <- replicate(10, { list(sample(100, size = 50)) })
fs <- lapply(xs, function(x) { future(mean(x)) })

## The 10 values as a list (because 'fs' is a list)
vs <- value(fs)
message("The ten averages are:")
str(vs)

## The 10 values as a vector (by manually unlisting)
vs <- value(fs)
vs <- unlist(vs)
message("The ten averages are: ", paste(vs, collapse = ", "))

## The values as a vector (by reducing)
vs <- value(fs, reduce = `c`)
message("The ten averages are: ", paste(vs, collapse = ", "))

## Calculate the sum of the averages (by reducing)
total <- value(fs, reduce = `sum`)
message("The sum of the ten averages is: ", total)
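To complement the examples above, the following sketch shows how `value()` treats a named list that mixes futures with ordinary elements. It assumes the future package is installed and uses the sequential backend; the element names are arbitrary.

```r
library(future)
plan(sequential)

## A named list with two futures and one ordinary element
xs <- list(
  a = future(1 + 1),
  b = "not a future",
  c = future(2 * 3)
)

## Futures are replaced by their values; other elements are kept as-is
vs <- value(xs)
str(vs)
```

The result is again a list with the same names, so it can be used as a drop-in replacement for the original container.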
Below are the R options and environment variables that are used by the future package and packages enhancing it.
WARNING: Note that the names and the default values of these options may change in future versions of the package. Please use with care until further notice.
Just like for other R options, as a package developer you must not change any of the below `future.*` options. Only the end-user should set these. If you find yourself having to tweak one of the options, make sure to undo your changes immediately afterward. For example, if you want to bump up the `future.globals.maxSize` limit when creating a future, use something like the following inside your function:
oopts <- options(future.globals.maxSize = 1.0 * 1e9)  ## 1.0 GB
on.exit(options(oopts))
f <- future({ expr })  ## Launch a future with large objects
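A complete, runnable variant of this pattern might look as follows. This is a sketch only: `sum_large()` is a hypothetical function, the future package is assumed to be installed, and the sequential backend is used to keep the example light.

```r
library(future)
plan(sequential)

## Hypothetical function that temporarily raises the size limit for globals
sum_large <- function(x) {
  oopts <- options(future.globals.maxSize = 1.0 * 1e9)  ## 1.0 GB
  on.exit(options(oopts), add = TRUE)  ## undo the change on exit
  f <- future(sum(x))
  value(f)
}

before <- getOption("future.globals.maxSize")
total <- sum_large(1:100)
after <- getOption("future.globals.maxSize")

## The end user's setting is untouched after the call
stopifnot(identical(before, after))
print(total)
```

Because `options()` returns the old values, passing them back to `options()` in `on.exit()` restores whatever the end user had set, including an unset option.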
Several functions have been moved to the parallelly package:
The options and environment variables controlling those have been adjusted accordingly to have different prefixes. For example, option future.fork.enable has been renamed to parallelly.fork.enable and the corresponding environment variable R_FUTURE_FORK_ENABLE has been renamed to R_PARALLELLY_FORK_ENABLE. For backward compatibility reasons, the parallelly package will support both versions for the foreseeable future. See the parallelly::parallelly.options page for the settings.
(character string or future function) Default future strategy plan used unless otherwise specified via `plan()`. This will also be the future plan set when calling `plan("default")`. If not specified, this option may be set when the future package is loaded if command-line option `--parallel=ncores` (short `-p ncores`) is specified; if `ncores > 1`, then option future.plan is set to `multisession`, otherwise `sequential` (in addition to option mc.cores being set to `ncores`, if `ncores >= 1`). (Default: `sequential`)
(numeric) Maximum allowed total size (in bytes) of global variables identified. This is used to protect against exporting too large objects to parallel workers by mistake. Transferring large objects over a network, or over the internet, can be slow and therefore introduce a large bottleneck that increases the overall processing time. It can also result in large egress or ingress costs, which may exist on some systems. If set to `+Inf`, then the check for large globals is skipped. (Default: `500 * 1024 ^ 2` = 500 MiB)
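To get a feeling for the default limit, the following base R sketch converts it to bytes and compares it with the approximate size of an object. Note that `object.size()` is used here only as a rough estimate; it is an assumption that it is close to how the package itself measures globals.

```r
## Default limit: 500 MiB expressed in bytes
max_size <- 500 * 1024^2
print(max_size)

## A numeric vector with one million elements (roughly 8 MB)
x <- numeric(1e6)
x_bytes <- as.numeric(object.size(x))
print(x_bytes)

## TRUE if 'x' on its own is below the default limit
fits <- x_bytes <= max_size
print(fits)
```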
(character string) Controls whether the identified globals should be scanned for so-called references (e.g. external pointers and connections) or not. It is unlikely that another R process ("worker") can use a global that uses an internal reference of the master R process; we call such objects non-exportable globals. If this option is `"error"`, an informative error message is produced if a non-exportable global is detected. If `"warning"`, a warning is produced, but the processing will continue; it is likely that the future will be resolved with a run-time error unless processed in the master R process (e.g. `plan(sequential)` and `plan(multicore)`). If `"ignore"`, no scan is performed. (Default: `"ignore"` but may change)
(integer) An integer specifying the maximum recursive depth to which futures should be resolved. If negative, nothing is resolved. If `0`, only the future itself is resolved. If `1`, the future and any of its elements that are futures are resolved, and so on. If `+Inf`, infinite search depth is used. (Default: `0`)
(character string) If random numbers are used in futures, then parallel (L'Ecuyer-CMRG) RNG should be used in order to get statistically sound random numbers. The defaults in the future framework assume that no random number generation (RNG) takes place in the future expression because L'Ecuyer-CMRG RNGs come with an unnecessary overhead if not needed. To protect against mistakes, the future framework attempts to detect when random numbers are used even though L'Ecuyer-CMRG RNGs are not in place. If this is detected, and `future.rng.onMisuse = "error"`, then an informative error message is produced. If `"warning"`, then a warning message is produced. If `"ignore"`, no check is performed. (Default: `"warning"`)
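One way to avoid triggering this check is to request parallel RNG explicitly when the future is created. The sketch below uses the `seed` argument of `future()`, which is not described in this section; it assumes the future package is installed and uses the sequential backend.

```r
library(future)
plan(sequential)

## Request parallel-safe random numbers with a fixed seed
f1 <- future(rnorm(3), seed = 42)
f2 <- future(rnorm(3), seed = 42)

v1 <- value(f1)
v2 <- value(f2)

## The same seed gives the same random numbers
stopifnot(identical(v1, v2))
print(v1)
```

Leaving out `seed` in a future expression that draws random numbers is the kind of mistake the check is meant to catch.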
(character string) A future must close any connections it opens and must not close connections it did not open. If such misuse is detected and this option is set to `"error"`, `value()` will produce an error with details. If it is set to `"warning"`, a warning is produced. If `"ignore"`, no check is performed. (Default: `"warning"`)
(character string) Assigning variables to the global environment for the purpose of using the variable at a later time makes no sense with futures, because the next future may be evaluated in a different R process. To protect against mistakes, the future framework attempts to detect when variables are added to the global environment. If this is detected, and `future.globalenv.onMisuse = "error"`, then an informative error message is produced. If `"warning"`, then a warning message is produced. If `"ignore"`, no check is performed. (Default: `"ignore"`)
(logical) If `TRUE`, a `FutureCondition` keeps a copy of the `Future` object that triggered the condition. If `FALSE`, it is dropped. (Default: `TRUE`)
(numeric) Maximum waiting time (in seconds) for a future to resolve or for a free worker to become available before a timeout error is generated. (Default: `30 * 24 * 60 * 60` (= 30 days))
(numeric) Initial interval (in seconds) between polls. This controls the polling frequency for finding an available worker when all workers are currently busy. It also controls the polling frequency of `resolve()`. (Default: `0.01` = 10 ms)
(numeric) Positive scale factor used to increase the interval after each poll. (Default: `1.01`)
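As a rough illustration of how the initial interval and the scale factor interact, the base R sketch below computes the first few waiting times. It assumes that the interval is simply multiplied by the scale factor after every poll, which is an assumption about the update rule rather than something stated here.

```r
## Assumed model: the wait before poll k is interval * alpha^k
interval <- 0.01  # default initial interval (in seconds)
alpha <- 1.01     # default scale factor

k <- 0:5
waits <- interval * alpha^k
print(round(waits, 6))

## Total time spent waiting during these six polls
print(sum(waits))
```

Under this model, the defaults make the polling interval grow by one percent per poll, so short-lived futures are picked up quickly while long waits gradually poll less often.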
(logical) If `TRUE`, extensive debug messages are generated. (Default: `FALSE`)
(character vector or a logical) Specifies zero or more future startup scripts to be sourced when the future package is attached. It is only the first existing script that is sourced. If none of the specified files exist, nothing is sourced; there will be neither a warning nor an error.
If this option is not specified, environment variable R_FUTURE_STARTUP_SCRIPT is considered, where multiple scripts may be separated by either a colon (`:`) or a semicolon (`;`). If neither is set, or either is set to `TRUE`, the default is to look for a ‘.future.R’ script in the current directory and then in the user's home directory. To disable future startup scripts, set the option or the environment variable to `FALSE`. Importantly, this option is always set to `FALSE` if the future package is loaded as part of a future expression being evaluated, e.g. in a background process. In other words, they are sourced in the main R process but not in future processes. (Default: `TRUE` in main R process and `FALSE` in future processes / during future evaluation)
(character vector) Overrides `commandArgs()` when the future package is loaded.
(logical) Enable or disable multi-threading while using forked parallel processing. If `FALSE`, different multi-thread library settings are overridden such that they run in single-thread mode. Specifically, multi-threading will be disabled for OpenMP (which requires the RhpcBLASctl package) and for RcppParallel. If `TRUE`, or not set (the default), multi-threading is allowed. Parallelization via multi-threaded processing (done in native code by some packages and external libraries) while at the same time using forked (aka "multicore") parallel processing is known to be unstable. Note that this is not only true when using `plan(multicore)` but also when using, for instance, `mclapply()` of the parallel package. (Default: not set)
(logical) Enable or disable re-encoding of UTF-8 symbols that were incorrectly encoded while captured. In R (< 4.2.0) and on older versions of MS Windows, R cannot capture UTF-8 symbols as-is when they are captured from the standard output. For example, a UTF-8 check mark symbol (`"\u2713"`) would be relayed as `"<U+2713>"` (a string with eight ASCII characters). Setting this option to `TRUE` will cause `value()` to attempt to recover the intended UTF-8 symbols from `<U+nnnn>` string components, if, and only if, the string was captured by a future resolved on MS Windows. (Default: `TRUE`)
See also parallelly::parallelly.options.
(integer) Either a named list of `mandelbrot()` arguments or an integer in {1, 2, 3} specifying a predefined Mandelbrot region. (Default: `1L`)
(integer) Number of rows and columns of tiles. (Default: `3L`)
The following options exist only for troubleshooting purposes and must not be used in production. If used, there is a risk that the results are non-reproducible if processed elsewhere. To lower the risk of them being used by mistake, they are marked as deprecated and will produce warnings if set.
(character string) Action to take when non-existing global variables ("globals" or "unknowns") are identified when the future is created. If `"error"`, an error is generated immediately. If `"ignore"`, no action is taken and an attempt to evaluate the future expression will be made. The latter is useful when there is a risk for false-positive globals being identified, e.g. when the future expression contains non-standard evaluation (NSE). (Default: `"ignore"`)
(character string) Method used to identify globals. For details, see `globalsOf()`. (Default: `"ordered"`)
(logical) If `TRUE`, globals that are `Future` objects (typically created as explicit futures) will be resolved and have their values (using `value()`) collected. Because searching for unresolved futures among globals (including their content) can be expensive, the default is not to do it and instead leave it to the run-time checks that assert proper ownership when resolving futures and collecting their values. (Default: `FALSE`)
All of the above R future.* options can be set by the corresponding environment variables R_FUTURE_* when the future package is loaded. This means that those environment variables must be set before the future package is loaded in order to have an effect. For example, if `R_FUTURE_RNG_ONMISUSE="ignore"`, then option future.rng.onMisuse is set to `"ignore"` (character string). Similarly, if `R_FUTURE_GLOBALS_MAXSIZE="50000000"`, then option future.globals.maxSize is set to `50000000` (numeric).
To set R options or environment variables when R starts (even before the future package is loaded), see the Startup help page. The startup package provides a friendly mechanism for configuring R's startup process.
# Allow at most 5 MB globals per future
options(future.globals.maxSize = 5e6)

# Be strict; catch all RNG mistakes
options(future.rng.onMisuse = "error")