R and RStudio

 

Contents
R
RStudio
RStudio Connect Server

R

R is widely used open-source software for statistical computing and graphics, supported by the R Core Team and the R Foundation for Statistical Computing. Many versions of R are available on Minerva. To list the installed versions, type:

$ module -r spider '^R$'

where ^R$ is a regular expression that matches only the module named exactly R: ^ anchors the match at the start of the module name and $ anchors it at the end.
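To see what the two anchors do, here is a small sketch that uses grep as a stand-in for the module search. The module names in the list are made up for illustration, and the module system's own pattern matching may differ from grep in details beyond these two anchors.

```shell
# Hypothetical module names, for illustration only.
names='R
Rstudio
openmpi-R
R-devel'

# Anchored pattern: only the name that is exactly "R" matches.
anchored=$(printf '%s\n' "$names" | grep -E '^R$')
echo "anchored: $anchored"

# Unanchored pattern: every name containing an "R" matches.
unanchored_count=$(printf '%s\n' "$names" | grep -c 'R')
echo "unanchored matches: $unanchored_count"
```

Without the anchors, the search would also return every module whose name merely contains the letter R.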

Note that new versions of R are added periodically and the default version changes accordingly. To avoid picking up a new version unexpectedly, specify the version explicitly in your module load command (e.g. module load R/3.5.3). We recommend using the most recent installed version. Our current R installation comes with a large number of popular scientific and high-performance computing packages preinstalled, including Bioconductor packages.
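If a workflow depends on one specific version, a script can check the loaded version before doing any work. The following is a minimal sketch, not a Minerva-provided tool: the function names r_version_from_banner and check_r_version are hypothetical, and the parsing assumes the first line of R --version has the form "R version X.Y.Z (...)".

```shell
# Extract "X.Y.Z" from the first line of `R --version` output read on stdin.
r_version_from_banner() {
  sed -n '1s/^R version \([0-9][0-9.]*\).*/\1/p'
}

# Return success only if the banner on stdin reports the expected version.
check_r_version() {
  expected=$1
  actual=$(r_version_from_banner)
  if [ "$actual" != "$expected" ]; then
    echo "expected R $expected but found '$actual'" >&2
    return 1
  fi
}

# Usage on a node where the module is loaded (not run here):
#   module load R/4.2.0
#   R --version | check_r_version 4.2.0 || exit 1
```

Reading the banner from standard input keeps the check easy to try out without R installed.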

Usage

R may be run either interactively via the R terminal, or as a batch process reading commands from a script file. To launch the R terminal, open an interactive session by submitting a job to the interactive LSF queue. Then load an R module and run the R command. Once the terminal has started, you can run R commands at the prompt:

[choh07@li03c04 ~]$ bsub -q interactive -P acc_hpcstaff -n 1 -W 1:00 -R rusage[mem=8000] -XF -Is /bin/bash
Job <65508400> is submitted to queue <interactive>.
<>
<>
<>

[choh07@lc02a29 ~]$ ml R/4.2.0
[choh07@lc02a29 ~]$ R

R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> print("Hello World!", quote = FALSE)
[1] Hello World!
>

Rscript, on the other hand, can be used to run an R script in which a sequence of R commands has been saved. The syntax is:

$ Rscript <options> <your R script>

For example,

$ cat hello.R
print("Hello World!", quote = FALSE)

$ Rscript hello.R
[1] Hello World!
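For the batch case, the same Rscript call can go into an LSF job script. The sketch below writes such a script; the resource values mirror the interactive example above, while the file name hello.lsf, the job name hello_r, and the project acc_YOURPROJECT are placeholders to replace with your own.

```shell
# Write a minimal LSF job script that runs hello.R with Rscript.
# acc_YOURPROJECT, hello_r and hello.lsf are placeholders, not real Minerva names.
write_r_job() {
  job_file=$1
  cat > "$job_file" <<'EOF'
#!/bin/bash
#BSUB -J hello_r
#BSUB -P acc_YOURPROJECT
#BSUB -q premium
#BSUB -n 1
#BSUB -W 1:00
#BSUB -R rusage[mem=8000]
#BSUB -o hello_r.%J.out

module load R/4.2.0
Rscript hello.R
EOF
}

write_r_job hello.lsf
# Submit from a login node (not run here):
#   bsub < hello.lsf
```

The %J in the output file name is replaced by LSF with the job ID, so repeated runs do not overwrite each other's output.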

Installing New Packages Locally

If a package you need is missing from the R version you use, you may open a ticket at hpchelp@hpc.mssm.edu to have it installed in the system area, or install it yourself into your local space. In some cases you will need to build and maintain your own set of R packages locally. There are multiple ways to install R packages; we present one of them here as an example. To begin, create a directory in your local space to install R packages into. We use “~/.Rlib” in this example, which is the default location predefined in the environment variable “R_LIBS_USER”, but you may choose another directory.

$ mkdir ~/.Rlib

To have R install packages into this local directory, create a “.Renviron” file in your home directory that redefines the environment variable “R_LIBS”, which holds the search path for R packages:

$ cat ~/.Renviron
R_LIBS=${R_LIBS_USER}:${R_LIBS}

Note that the path to your local R packages directory is prepended to the system one, so that R searches your local packages first. Load R and start an R session from the terminal. You should now be able to install packages interactively with the standard R command.
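The directory and .Renviron steps above can also be scripted. This is a minimal sketch under the assumptions of this example (the ~/.Rlib location and the R_LIBS line shown above); the function name setup_local_rlib is hypothetical. It appends the line only when it is not already present, so running it twice is harmless.

```shell
# Create the local library directory and add the R_LIBS line to .Renviron once.
# The home directory is a parameter so the function can be tried on a scratch
# directory first.
setup_local_rlib() {
  home_dir=$1
  # Single quotes keep ${...} literal: R expands these, not the shell.
  line='R_LIBS=${R_LIBS_USER}:${R_LIBS}'
  mkdir -p "$home_dir/.Rlib"
  touch "$home_dir/.Renviron"
  # -F: fixed string, -x: whole-line match, so the line is never duplicated.
  if ! grep -qxF "$line" "$home_dir/.Renviron"; then
    printf '%s\n' "$line" >> "$home_dir/.Renviron"
  fi
}

# Typical call: setup_local_rlib "$HOME"
```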

$ R

R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
.
.
.

> .libPaths()
[1] "/hpc/users/choh07/.Rlib"
[2] "/hpc/packages/minerva-centos7/rpackages/4.2.0/site-library"
[3] "/hpc/packages/minerva-centos7/rpackages/bioconductor/3.15"
[4] "/hpc/packages/minerva-centos7/R/4.2.0/lib64/R/library"

> install.packages("zoo")
Installing package into ‘/hpc/users/choh07/.Rlib’
(as ‘lib’ is unspecified)
.
.
.
** R
** demo
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (zoo)
>

 

RStudio

RStudio is an integrated development environment (IDE) for R. It includes a console, syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging and workspace management. It is available in two formats: RStudio Desktop is a regular desktop application while RStudio Server runs on a remote server and allows accessing RStudio using a web browser. RStudio Desktop and RStudio Server are both available in free and fee-based (commercial) editions.

On Minerva, there are two ways to access the RStudio interface: running the RStudio Desktop application over a GUI (graphical user interface), or using one simple command to start on-the-fly RStudio over the web in a Minerva job.

 

Using the RStudio application over GUI

RStudio Desktop is available as a module on Minerva. To use it:

  • Enable X11 forwarding when you log in to Minerva (see login page for details)
  • ml rstudio/2022.02.1-461
  • rstudio

This is the RStudio Desktop application, and it may be slow to use over the GUI depending on your network.

 

Using On-the-fly RStudio over Web

We have developed simple scripts that launch RStudio Server with an interactive web interface inside an LSF job with dedicated resources. With one simple command, you can run the free RStudio Server remotely on Minerva and access the RStudio interface using your local web browser.

The scripts are located on the login nodes under

/usr/local/bin/{minerva-rstudio-web.sh, minerva-rstudio-web-r4.sh}

minerva-rstudio-web.sh starts RStudio web with R 3, and minerva-rstudio-web-r4.sh starts RStudio web with R 4.

Usage:
To start an RStudio web session, run minerva-rstudio-web-r4.sh (R 4.0.3) or minerva-rstudio-web.sh (R 3.6.2) on a login node. The script submits the job with the default resource configuration and prints the URL for accessing the session. See the -h option for help messages covering resource requests and package installation.

Example:

$ sh minerva-rstudio-web-r4.sh 
[INFO] Image not specified, check if previously used
[INFO] Found previously used image /hpc/users/guow03/minerva_jobs/rstudio_jobs/singularity-rstudio.simg. Using it.
[INFO] Project is not specified, or is acc_null, using 1st avail project.
[INFO] Project to use is acc_hpcstaff
[INFO] Parameters used are: 
[INFO] -n    4
[INFO] -M    3000
[INFO] -W    6:00
[INFO] -P    acc_hpcstaff
[INFO] -J    rstudio
[INFO] -q    premium
[INFO] -R    null
[INFO] -i    /hpc/users/guow03/minerva_jobs/rstudio_jobs/singularity-rstudio.simg
[INFO] Submitting rstudio job...
Job <25962439> is submitted to queue <premium>.
[INFO] See below for web access when job starts.
Job <25962439> : Not yet started.
[INFO] Job is pending
Job <25962439> : Not yet started.
[INFO] Job is pending
Job <25962439> : Not yet started.
[INFO] Job is pending
Job <25962439> : Not yet started.
[INFO] Job is pending
[INFO] Job is running, wait for link
[INFO] Job is running, wait for link
[INFO] Job is running, wait for link
<< output from stdout >>
Using local available port 8787
Using password in /hpc/users/guow03/minerva_jobs/rstudio_jobs/.rstudio_onthefly_password
RStudio started in the singularity container with PID 306586.
Making sure it is alive
Checking 3, next check in 5 seconds.
   PID TTY          TIME CMD
306586 ?        00:00:00 singularity
Checking 2, next check in 5 seconds.
   PID TTY          TIME CMD
306586 ?        00:00:00 starter-suid
Checking 1, next check in 5 seconds.
   PID TTY          TIME CMD
306586 ?        00:00:00 starter-suid

SSH port forwarding to 10.95.46.103 with PID 306725.

Rstudio is started on compute node lc03e29, port 8787

Access the RStudio Web using your web browser:  
http://10.95.46.103:52439 

<< output from stderr >>

The RStudio web interface will be available at the URL http://10.95.46.103:52439. Use the browser on your laptop to access it. Here 10.95.46.103 is the login node IP on the campus network and 52439 is the port; its last four digits, 2439, are the last four digits of the ID of the job (25962439) running the instance.
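The relationship between the job ID and the port can be reproduced by hand. The sketch below assumes the port is the digit 5 followed by the last four digits of the job ID; this rule is inferred from the single example above and is not confirmed by the script's documentation, so treat the URL printed by the script as authoritative.

```shell
# ASSUMPTION: the port is "5" followed by the last four digits of the job ID.
# This matches the one example above; the script may use a different rule.
job_id=25962439
last4=$(printf '%s' "$job_id" | tail -c 4)
port="5$last4"
echo "http://10.95.46.103:$port"
```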

What happens behind the scenes? This tool wraps the following tasks in one command:

  1. downloads a custom-built Singularity container image of RStudio into your home directory,
  2. prompts for and creates a password for the RStudio interface,
  3. writes and submits an LSF job script to launch RStudio within the image,
  4. provides the URL to access the instance.

Currently, the minerva-rstudio-web-r4.sh script is the preferred version, as it is built with common libraries including "DBI", "odbc", "shiny", "devtools", "ggplot2", "tidyverse", "tidymodels", "car", "dplyr", "tidyr". To switch from one script to the other, first move or rename the previously downloaded image singularity-rstudio.simg in $HOME/minerva_jobs/rstudio_jobs.
See the Package Installations section in the help message for packages not installed.
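Moving the old image aside can be done with a small helper like the one below. This is a sketch only: the function name archive_rstudio_image and the ".bak.<timestamp>" suffix are arbitrary choices for illustration, and it assumes the launch script fetches a new image when none is found in the directory.

```shell
# Move an existing image aside so it is no longer picked up on the next launch.
# The ".bak.<timestamp>" suffix is an arbitrary choice for this sketch.
archive_rstudio_image() {
  image_dir=$1
  image="$image_dir/singularity-rstudio.simg"
  if [ -e "$image" ]; then
    mv "$image" "$image.bak.$(date +%Y%m%d%H%M%S)"
  else
    echo "no image found in $image_dir" >&2
    return 1
  fi
}

# Typical call: archive_rstudio_image "$HOME/minerva_jobs/rstudio_jobs"
```

Renaming rather than deleting keeps the old image available in case you want to switch back.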

Common problems:

  1. Plotting in RStudio fails because the X11 display does not work when using the v4 version. This problem has been fixed with an updated image. Please remove your old image and run the script again.
  2. This is a containerized application for workflow reproducibility, and R packages are installed in $HOME/R/x86_64-pc-linux-gnu-library/R_VERSION by default. Because this is a container environment, no module system is set up, and you need to install and maintain your own R packages. To install the packages you need, please read the help message carefully (minerva-rstudio-web-r4.sh -h). Do the installation in the RStudio web “Shell terminal” tab, not the R Console tab, as shown below.
[INFO] === Package Installations ===
[INFO] To install R packages, do the following in the RStudio web *Shell terminal console*
[INFO] $ export http_proxy=http://172.28.7.1:3128
[INFO] $ export https_proxy=http://172.28.7.1:3128
[INFO] $ export all_proxy=http://172.28.7.1:3128
[INFO] $ export no_proxy=localhost,*.hpc.mssm.edu,*.chimera.hpc.mssm.edu,172.28.0.0/16
[INFO] $ R
[INFO] >>> install.packages(name_of_package)
[INFO] The packages will be installed in your /hpc/users/gail01/R/x86_64-pc-linux-gnu-library/R_VERSION
[INFO] If the package is not available in your RStudio Web interface by R library('name_of_package')
[INFO] You can restart the RStudio job

 

Installing R Modules That Require External Libraries From Minerva Packages for on-the-fly RStudio Web

Occasionally, you may need to install an R module that requires one or more dynamic libraries from Minerva packages. In these cases, install the R module from outside RStudio, using hard-coded paths to those libraries, into the private library that RStudio uses. The library that is set up automatically is $HOME/R/x86_64-pc-linux-gnu-library/[3.0 or 4.0].

A convenient way to signal the linker that it should encode the full path to dynamic libraries is to set LD_RUN_PATH to LD_LIBRARY_PATH after you load all the Minerva packages needed to build the R module but before you actually build the module. When you build the module, you will need to specify the destination library as the default private library for rstudio.
Example:

ml R/4.0.2
ml geos
export LD_RUN_PATH=$LD_LIBRARY_PATH
R
> install.packages("Seurat", lib="/hpc/users/fludee01/R/x86_64-pc-linux-gnu-library/4.0")

Now that the Seurat R module has been built and installed in rstudio’s user library, you can call it from within on-the-fly rstudio.
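Because LD_RUN_PATH is copied from LD_LIBRARY_PATH, the copy is only useful after the required Minerva packages have been loaded. A minimal sketch of a guard for this step follows; the function name set_run_path_from_library_path is hypothetical.

```shell
# Copy LD_LIBRARY_PATH into LD_RUN_PATH, refusing to proceed when it is empty
# (an empty value usually means the needed modules were not loaded yet).
set_run_path_from_library_path() {
  if [ -z "${LD_LIBRARY_PATH:-}" ]; then
    echo "LD_LIBRARY_PATH is empty; load the required modules first" >&2
    return 1
  fi
  export LD_RUN_PATH="$LD_LIBRARY_PATH"
}

# Usage after the module loads shown above (not run here):
#   ml R/4.0.2
#   ml geos
#   set_run_path_from_library_path || exit 1
```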

 

RStudio Connect Server

What is RStudio Connect?

The RStudio Connect server has been available on Minerva since August 2020. You can use it to publish Shiny apps, R Markdown documents and Jupyter notebooks for collaborators or others. See more details about RStudio Connect here.

 

How many users are allowed to log in to RStudio Connect?

The subscription currently allows 45 named users on the server.

 

How to publish?

The RStudio Connect server is at https://rstudio-connect.hpc.mssm.edu

To publish your R product with RStudio Connect:

1. Log in to Minerva with ssh -X userid@minerva.hpc.mssm.edu (X11 forwarding is needed for the GUI)

2. After login,

$ ml rstudio

$ rstudio

Connect the RStudio IDE to RStudio Connect (see RStudio Connect here)

The address of the RStudio Connect server is https://rstudio-connect.hpc.mssm.edu. You will be asked to log in. Use your Sinai ID and password (NO VIP token needed).

3. To publish, open your app file, such as app.R, in the RStudio IDE and click the publish button in the right corner (see RStudio Publishing here)

NOTE: please uncheck “Launch browser” when publishing, since launching the browser over GUI is slow.

4. After your application is successfully deployed, as shown in the deploy log, you can copy the URL and open it in any browser you like.

5. You can manage access and other metrics in the dashboard. Tip: you can modify your Content URL as shown in the screenshot (bottom right corner), and copy the URL for sharing. For more on how to manage your application in the dashboard, please follow the RStudio Connect user guide located here.