Secure data environments and R

First, create a complete repository of R packages:

(For Windows, these need to be Windows binary files.)

  • Outside the Secure Data Environment, open R on your computer
    • Use the command:
      • pkg.list = available.packages()
    • Select the CRAN server to use
    • Use the command (substitute your own directory path for C:/MyRPackages):
      • download.packages(pkgs = pkg.list[, "Package"], destdir = "C:/MyRPackages", type = "win.binary")
  • The download command above fetches every package available on CRAN and should take about 2 hours (there are just over 12,000 packages in total)
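The download steps above can be collected into a single function. This is a minimal sketch, not a definitive script: the mirror URL, the destination folder and the function name download_all_binaries are assumptions chosen for illustration, so substitute your own values.

```r
# Sketch: download all CRAN Windows binaries into a local folder.
# Assumptions: cran_mirror and dest_dir are placeholders, not required values.
cran_mirror <- "https://cloud.r-project.org"
dest_dir <- "C:/MyRPackages"

download_all_binaries <- function(dest_dir, repos = cran_mirror,
                                  type = "win.binary") {
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  # available.packages() returns a matrix with one row per package
  pkg_list <- available.packages(repos = repos, type = type)
  # download.packages() expects a character vector of package names
  download.packages(pkgs = pkg_list[, "Package"], destdir = dest_dir,
                    available = pkg_list, repos = repos, type = type)
}

# Not run here, because the full download is large and slow:
# downloaded <- download_all_binaries(dest_dir)
```

Passing the mirror explicitly through repos avoids the interactive mirror prompt, which can be convenient if the download is left to run unattended.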

Locate the downloaded packages, virus-check them, and copy them into the R Package location in your secure data environment.

Ensure that users have read-write access to the R Package location (otherwise they won’t be able to run the package installation commands).
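One way to check access from inside R is to try creating a small temporary file in the Package location. The sketch below assumes a hypothetical helper name, can_write_to, and uses the example path from this guide; neither is part of R itself.

```r
# Sketch: test write access by creating and removing a probe file.
# can_write_to() is a hypothetical helper; pkg_dir is a placeholder path.
can_write_to <- function(dir) {
  if (!dir.exists(dir)) return(FALSE)
  probe <- tempfile(pattern = "write_test_", tmpdir = dir)
  # file.create() returns TRUE on success and FALSE (with a warning) on failure
  ok <- suppressWarnings(file.create(probe))
  if (ok) unlink(probe)
  ok
}

pkg_dir <- "C:/Software extensions/Packages"
if (!can_write_to(pkg_dir)) {
  message("No write access (or directory missing): ", pkg_dir)
}
```

Writing a probe file tests the permissions that actually apply to the current user, rather than relying on file attributes alone.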

Second: create the R index file

Before installing an R package and its dependencies, create an ‘index’ file in the secure data environment directory where all the packages are saved. The index tells R which packages are available in that directory, so it can locate and install dependent packages. Without the index file, R cannot find the dependencies.

The index file only needs to be created once (although if additional packages are added to the directory, then the following command to create the index file will need to be repeated so that the index file is updated).
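To decide whether the index needs rebuilding, one option is to compare the .zip files in the directory with the entries listed in the index. The following is a sketch under stated assumptions: the helper name index_is_stale is hypothetical, and it assumes binaries follow the usual Package_Version.zip naming and that the plain-text index file is called PACKAGES.

```r
# Sketch: report TRUE when the directory contents and the index disagree.
# index_is_stale() is a hypothetical helper, not part of base R.
index_is_stale <- function(dir) {
  index_file <- file.path(dir, "PACKAGES")
  if (!file.exists(index_file)) return(TRUE)
  zips <- list.files(dir, pattern = "\\.zip$")
  # An empty index is only valid when there are no package files
  if (file.size(index_file) == 0) return(length(zips) > 0)
  idx <- read.dcf(index_file, fields = c("Package", "Version"))
  indexed <- paste0(idx[, "Package"], "_", idx[, "Version"], ".zip")
  !setequal(zips, indexed)
}
```

If the function returns TRUE, rerun the index command described in this section.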

Open R in the secure data environment and run the commands:

  • library(tools)
  • write_PACKAGES("path to packages")

This will create two index files in the Packages directory, PACKAGES and PACKAGES.gz. Open the Packages directory to check that they exist, then delete the file with the .gz extension.
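The index step and the clean-up can be combined as below. This is a sketch only: build_index is a hypothetical wrapper name, and the path is the example location used in this guide.

```r
# Sketch: build the index, then remove the compressed copy.
# build_index() is a hypothetical wrapper; pkg_dir is a placeholder path.
library(tools)

build_index <- function(dir, type = "win.binary") {
  # write_PACKAGES() returns the number of packages described in the index
  n <- write_PACKAGES(dir, type = type)
  gz <- file.path(dir, "PACKAGES.gz")
  # Delete the .gz file, as recommended in the text above
  if (file.exists(gz)) file.remove(gz)
  n
}

# Example call (not run here):
# pkg_dir <- "C:/Software extensions/Packages"
# build_index(pkg_dir)
```

A return value of 0 would suggest that no package files were found in the directory, which is worth checking before analysts try to install anything.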

Third: Install packages in R

To install an R package and its dependencies, open a new R or RStudio session in the secure data environment and run the following command:

  • install.packages("tidyverse", contriburl = "file:C:/Software extensions/Packages/")

where the contriburl path (here C:/Software extensions/Packages/) refers to the Package location, and "tidyverse" is the name of the package that the analyst wishes to install and use. Note that the file: prefix is required so that R knows that the packages are saved locally, not on the internet.

The analyst should only need to do this once. They do not need to rerun the command every time they log in, unless R has been updated (e.g. from 3.4 to 3.5).
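Because installation is only needed once, an analyst's script can check whether a package is already present before installing it. The sketch below assumes a hypothetical helper name, install_if_missing, and reuses the example Package location from this guide.

```r
# Sketch: install from the local Package location only when needed.
# install_if_missing() is a hypothetical helper; pkg_url is a placeholder.
pkg_url <- "file:C:/Software extensions/Packages/"

install_if_missing <- function(pkg, contriburl = pkg_url) {
  # requireNamespace() returns FALSE if the package is not installed
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, contriburl = contriburl, type = "win.binary")
  }
  # Report whether the package is available after the attempt
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Example call (not run here):
# install_if_missing("tidyverse")
```

After an R version upgrade the check will typically fail for packages in the old library, so the same call reinstalls them from the local directory.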