
Use doRNG and foreach for reproducible parallel bootstrapping #46

@xrobin


The plyr package is old, and newer, better options exist for parallel execution. The foreach package seems to be the way to go: it supports several parallel backends, and the doRNG package makes parallel calculations reproducible.

From the user's perspective, the interface would look like:

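library(doParallel) # also attaches foreach and parallel
library(doRNG)
cl <- makeCluster(2) # 2 worker processes
registerDoParallel(cl)
registerDoRNG(1234) # seed that makes %dopar% loops reproducible
ci(...)
stopCluster(cl)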

Internally we would simply have:

resampled.values <- foreach(i=1:boot.n) %dopar% { stratified.bootstrap.test(...) }

instead of

resampled.values <- laply(1:boot.n, stratified.bootstrap.test, ...)
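
To illustrate the reproducibility that doRNG would give us, here is a minimal sketch. It is not pROC code: the data `x` and the loop body `mean(sample(x, replace = TRUE))` are made-up stand-ins for the real stratified bootstrap, and `.combine = c` is just one possible way to get a vector back instead of foreach's default list.

```r
library(doParallel)  # also attaches foreach and parallel
library(doRNG)

x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.3)  # made-up data, for illustration only
boot.n <- 20

cl <- makeCluster(2)
registerDoParallel(cl)

# First run with a fixed seed
registerDoRNG(1234)
run1 <- foreach(i = seq_len(boot.n), .combine = c) %dopar% {
  mean(sample(x, replace = TRUE))  # toy stand-in for the bootstrap statistic
}

# Second run with the same seed
registerDoRNG(1234)
run2 <- foreach(i = seq_len(boot.n), .combine = c) %dopar% {
  mean(sample(x, replace = TRUE))
}

stopCluster(cl)

# Both runs should draw the same resamples
same <- identical(run1, run2)
```

If this behaves as the doRNG documentation describes, `same` is `TRUE` regardless of how the iterations are scheduled across the workers.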

Things to consider:

  • Code should be able to run without any extra line of code from the user (but then not in parallel)
  • Progress bars?
  • What if some of the bootstrapping gets implemented in C++ in the future?
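
For the first point, one possible approach is sketched below. The helper `run.bootstrap` and its toy loop body are hypothetical, not existing pROC functions. It picks `%dopar%` only when the user has registered a backend, and falls back to `%do%` otherwise. Note that `%dopar%` alone would also work without a backend: foreach then runs sequentially and emits a warning, which this sketch avoids.

```r
library(foreach)

# Hypothetical helper: runs in parallel only if the user registered a
# backend; otherwise evaluates the loop sequentially with %do%.
run.bootstrap <- function(boot.n, x) {
  `%op%` <- if (getDoParRegistered()) `%dopar%` else `%do%`
  foreach(i = seq_len(boot.n), .combine = c) %op% {
    mean(sample(x, replace = TRUE))  # toy stand-in for the bootstrap statistic
  }
}

# No cluster and no registration needed from the user
set.seed(42)
res <- run.bootstrap(10, c(2.1, 3.4, 1.9, 5.6, 4.2, 3.3))
length(res)
```

In the sequential case, reproducibility comes from the usual `set.seed()`. In the parallel case the user would call `registerDoRNG()` as in the interface above.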

Metadata

Labels: api-change (the issue describes a change in the API visible to the user), feature-request
