Start parallel clusters using parallel package

Usage

parallel_start(
  ...,
  .method = c("parallel", "spark"),
  .export_vars = NULL,
  .packages = NULL
)

parallel_stop()

Arguments

...

Parameters passed to the underlying functions (see the Parallel and Spark sections below)

.method

The method used to create the parallel backend. Supported values:

  • "parallel" - Uses the parallel and doParallel packages

  • "spark" - Uses the sparklyr package

.export_vars

Variables from the calling environment to send to the workers

.packages

Packages to make available on the workers
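
As a rough illustration of what sending variables and packages to workers means, the sketch below uses the parallel package directly. Whether parallel_start() uses exactly these calls internally is an assumption, and my_offset is a hypothetical variable name.

```r
library(parallel)

# Create a small cluster with 2 workers
cl <- makeCluster(2)

# A variable defined in the main session (hypothetical name)
my_offset <- 10

# Send the variable to every worker, identified by its name
clusterExport(cl, varlist = "my_offset", envir = environment())

# Load a package on every worker
invisible(clusterEvalQ(cl, library(stats)))

# The workers can now see my_offset
res <- parLapply(cl, 1:3, function(i) i + my_offset)

# Shut the workers down
stopCluster(cl)
```

Without the export step, a worker would typically fail with an "object not found" error, because each worker starts as a fresh R session.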

Parallel (.method = "parallel")

Performs three steps:

  1. Makes a cluster using parallel::makeCluster(). The ... arguments of parallel_start() are passed to parallel::makeCluster().

  2. Registers the cluster using doParallel::registerDoParallel().

  3. Sets .libPaths() on the workers using parallel::clusterCall().
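
The three steps above can be sketched with the underlying packages. This is a minimal approximation rather than the function's actual source. The registration step is skipped if doParallel is not installed, and the names main_paths and worker_paths are illustrative.

```r
library(parallel)

# Step 1: create a cluster with 2 workers (arguments go to makeCluster())
cl <- makeCluster(2)

# Step 2: register the cluster as the foreach backend
# (needs the doParallel package; skipped here if it is not installed)
if (requireNamespace("doParallel", quietly = TRUE)) {
  doParallel::registerDoParallel(cl)
}

# Step 3: pass the main session's library paths to every worker
main_paths <- .libPaths()
invisible(clusterCall(cl, function(x) .libPaths(x), main_paths))

# Ask each worker which library paths it now uses
worker_paths <- clusterCall(cl, function() .libPaths())

# Shut the workers down
stopCluster(cl)
```

Step 3 matters when packages are installed in a non-default library: the workers would otherwise not find them.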

Spark (.method = "spark")

  • Important: first create a Spark connection using sparklyr::spark_connect().

  • Pass the connection object as the first argument. For example, parallel_start(sc, .method = "spark").

  • The ... arguments of parallel_start() are passed to sparklyr::registerDoSpark().
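
The Spark workflow above might look like the following sketch. The helper start_spark_backend() is a hypothetical name, master = "local" assumes a local Spark installation, and the package that provides parallel_start() is assumed to be attached.

```r
# Hypothetical helper: connect to Spark, then register it as the backend.
# Assumes sparklyr is installed and parallel_start() is available.
start_spark_backend <- function(master = "local") {
  # Create the Spark connection first
  sc <- sparklyr::spark_connect(master = master)

  # Pass the connection object as the first argument
  parallel_start(sc, .method = "spark")

  # Return the connection so the caller can disconnect later
  invisible(sc)
}

# Usage (not run here, since it needs a working Spark installation):
# sc <- start_spark_backend()
# ... parallel work ...
# parallel_stop()
# sparklyr::spark_disconnect(sc)
```

Returning the connection keeps it available for sparklyr::spark_disconnect() once the parallel work is done.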

Examples


# Starts a cluster with 2 workers
parallel_start(2)

# Returns to sequential processing
parallel_stop()