Overview

RcppParallel provides the parallelFor and parallelReduce functions; however, the TBB library includes a wealth of other tools for parallelization. The motivation for parallelFor and parallelReduce is portability: you can write a single algorithm that uses TBB on Windows, OS X, Linux, and Solaris x86 but falls back to a lower-performance implementation based on TinyThread on other platforms.

If, however, you are comfortable targeting only the supported platforms, you can use TBB directly and bypass parallelFor and parallelReduce. Note that if you do this within an R package you plan to submit to CRAN, you should also provide a fallback serial implementation so the package still compiles on platforms that don’t currently support TBB (e.g. Solaris Sparc). Details on doing this are in the Portability section below.

TBB APIs

Algorithms

TBB includes a wide variety of tools for parallel programming, including:

  • Advanced algorithms: parallel_scan, parallel_while, parallel_do, parallel_pipeline, parallel_sort
  • Containers: concurrent_queue, concurrent_priority_queue, concurrent_vector, concurrent_hash_map
  • Atomic operations: fetch_and_add, fetch_and_increment, fetch_and_decrement, compare_and_swap, fetch_and_store
  • Timing: portable, fine-grained global time stamp
  • Task Scheduler: direct access to control the creation and activation of tasks

See the Intel TBB User Guide for documentation on using these features.
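As a small illustration of calling one of these algorithms directly, the sketch below sorts a vector with tbb::parallel_sort when TBB is enabled and with std::sort otherwise. The helper name sortedCopy is hypothetical, and the block defaults RCPP_PARALLEL_USE_TBB to 0 only so that it can be compiled outside a package; in package code the macro comes from RcppParallel.h.

```cpp
#include <algorithm>
#include <vector>

// Assumption for this standalone sketch: in a package, RcppParallel.h defines
// RCPP_PARALLEL_USE_TBB. Default it to 0 here so the file builds without TBB.
#ifndef RCPP_PARALLEL_USE_TBB
#define RCPP_PARALLEL_USE_TBB 0
#endif

#if RCPP_PARALLEL_USE_TBB
#include <tbb/parallel_sort.h>
#endif

// Hypothetical helper: returns a sorted copy of `values`.
std::vector<int> sortedCopy(std::vector<int> values) {
#if RCPP_PARALLEL_USE_TBB
  // Parallel comparison sort over the random-access range.
  tbb::parallel_sort(values.begin(), values.end());
#else
  // Serial path with the same result.
  std::sort(values.begin(), values.end());
#endif
  return values;
}
```

Both paths produce the same ordering, so callers do not need to know which one was compiled.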

Synchronization

When using TBB directly you can also take advantage of TBB’s built-in concurrency and synchronization classes, including:

  1. TBB concurrent container classes (see: https://www.threadingbuildingblocks.org/docs/help/tbb_userguide/Containers.htm).

  2. TBB mutual exclusion classes (see: https://www.threadingbuildingblocks.org/docs/help/tbb_userguide/Mutual_Exclusion.htm).

  3. TBB atomic operations (see: https://www.threadingbuildingblocks.org/docs/help/tbb_userguide/Atomic_Operations.htm).
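To make the role of the concurrent containers concrete, here is a minimal sketch in which several threads append to a shared tbb::concurrent_vector from inside tbb::parallel_for. The helper name negativeIndices is hypothetical, and the macro default of 0 is an assumption made so the block compiles without TBB; in package code RcppParallel.h supplies the macro.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Assumption for this standalone sketch: default the macro to 0 so the file
// builds without TBB. In a package, RcppParallel.h defines it.
#ifndef RCPP_PARALLEL_USE_TBB
#define RCPP_PARALLEL_USE_TBB 0
#endif

#if RCPP_PARALLEL_USE_TBB
#include <tbb/blocked_range.h>
#include <tbb/concurrent_vector.h>
#include <tbb/parallel_for.h>
#endif

// Hypothetical helper: indices of the negative entries, in ascending order.
std::vector<std::size_t> negativeIndices(const std::vector<double>& x) {
#if RCPP_PARALLEL_USE_TBB
  // push_back on concurrent_vector may be called from many threads at once.
  tbb::concurrent_vector<std::size_t> found;
  tbb::parallel_for(
    tbb::blocked_range<std::size_t>(0, x.size()),
    [&](const tbb::blocked_range<std::size_t>& range) {
      for (std::size_t i = range.begin(); i != range.end(); ++i) {
        if (x[i] < 0) found.push_back(i);
      }
    });
  // Arrival order depends on scheduling, so sort before returning.
  std::vector<std::size_t> result(found.begin(), found.end());
  std::sort(result.begin(), result.end());
  return result;
#else
  // Serial path: a plain vector is enough.
  std::vector<std::size_t> result;
  for (std::size_t i = 0; i < x.size(); ++i) {
    if (x[i] < 0) result.push_back(i);
  }
  return result;
#endif
}
```

A std::vector guarded by one of the TBB mutex classes would also work here; the concurrent container simply avoids writing the locking by hand.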

Portability

When using TBB directly in a CRAN package you should check the value of the RCPP_PARALLEL_USE_TBB macro and conditionally compile a serial implementation of your algorithm when it is not true (nonzero). Note that this macro is defined in RcppParallel.h, so you should include that header in all cases (it will in turn automatically include <tbb/tbb.h> on platforms where it’s supported). For example, your source file might look like this:

#include <Rcpp.h>
#include <RcppParallel.h>

using namespace Rcpp;

#if RCPP_PARALLEL_USE_TBB

IntegerVector transformDataImpl(IntegerVector x) {

  // Implement by calling TBB APIs directly 

}

#else

IntegerVector transformDataImpl(IntegerVector x) {

  // Implement serially

}

#endif

// [[Rcpp::export]]
IntegerVector transformData(IntegerVector x) {
  return transformDataImpl(x);
}

Note that the two implementations share the name transformDataImpl: only one of them is compiled and linked, depending on whether the target platform supports TBB.
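For a sense of what the two bodies might contain, the following sketch fills them in with an arbitrary transformation (squaring each element). It is illustrative only: it uses std::vector<int> rather than IntegerVector so that it can be compiled and tried outside R, and it defaults RCPP_PARALLEL_USE_TBB to 0 when the macro is not already defined. In a real package you would include RcppParallel.h instead of defining the macro yourself.

```cpp
#include <cstddef>
#include <vector>

// Assumption for this standalone sketch: default the macro to 0 so the file
// builds without TBB. In a package, RcppParallel.h defines it.
#ifndef RCPP_PARALLEL_USE_TBB
#define RCPP_PARALLEL_USE_TBB 0
#endif

#if RCPP_PARALLEL_USE_TBB

#include <tbb/blocked_range.h>
#include <tbb/parallel_for.h>

// TBB version: each task squares its own slice of the input.
std::vector<int> transformDataImpl(const std::vector<int>& x) {
  std::vector<int> out(x.size());
  tbb::parallel_for(
    tbb::blocked_range<std::size_t>(0, x.size()),
    [&](const tbb::blocked_range<std::size_t>& range) {
      // Tasks write to disjoint elements, so no locking is needed.
      for (std::size_t i = range.begin(); i != range.end(); ++i) {
        out[i] = x[i] * x[i];
      }
    });
  return out;
}

#else

// Serial fallback with the same signature and result.
std::vector<int> transformDataImpl(const std::vector<int>& x) {
  std::vector<int> out(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    out[i] = x[i] * x[i];
  }
  return out;
}

#endif

// Public entry point: identical regardless of which branch was compiled.
std::vector<int> transformData(const std::vector<int>& x) {
  return transformDataImpl(x);
}
```

When adapting this to Rcpp types, keep in mind that code running on worker threads should not call the R or Rcpp API; reading from and writing to preallocated buffers, as the loop above does, avoids that.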