12  Running an R script on the command line

12.1 The Basic Process

  1. Specify named arguments for your R script.
  2. Wrap the Rscript call in a bash script that passes its arguments on to the R script.
  3. Use bash to execute your bash script, with appropriate arguments.

For FH Users

In your bash script you’ll need to load the appropriate environment module (usually at least fhR):

module load fhR/4.3.3-foss-2023b

You can also run things from the Bioconductor / Rocker containers using Apptainer (?sec-fh-apptainer).
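
Putting the steps above together, a wrapper script might look roughly like the sketch below. It is only an illustration: run_process_data is a hypothetical helper name, and the check for the module command is an assumption added so the same script also works on machines without environment modules.

```shell
#!/bin/bash
# Sketch of a wrapper: load the R environment, then run the R script.
# run_process_data is a hypothetical helper name used only for illustration.
run_process_data() {
  # Load the environment module only where the `module` command exists
  # (for example, on an HPC cluster); elsewhere, rely on the R already on PATH.
  if command -v module >/dev/null 2>&1; then
    module load fhR/4.3.3-foss-2023b
  fi

  # Pass the first positional argument on to R as a named argument
  Rscript process_data.R input_file="${1}"
}
```

Calling run_process_data my_csvfile.csv at the end of the script would then load the module (where available) and run process_data.R on that file.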

12.2 Using Rscript on the command-line

Let’s talk about wrapping R scripts in a Bash script. This might seem like an extra layer of redundancy, but remember that we need to specify our software environment before we run something, so our bash script lets us do that.

Our main executable for running R on the command-line is Rscript.

When we run R on the command line, it will look something like this:

Rscript process_data.R input_file=my_genome_file.vcf

Note that you can pass named inputs, such as input_file, when you run a script on the command line.

12.3 Wrapping it up in a bash script

Say you have an R Script you need to run on the command line. In our bash script, we can do the following:

scripting-basics/wrap_r_script.sh
#!/bin/bash
Rscript process_data.R input_file="${1}"

This calls Rscript, the command-line executable, to run our R script. Note that input_file is passed as a named argument in name=value form, which is different from how arguments are handled in Bash. How do we use it in our R script?

12.3.1 Using Named Arguments in an R script

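We can pass arguments from our bash script to our R script by using commandArgs(). Called with trailingOnly = TRUE, it returns a character vector of the arguments that follow the script name, such as input_file=my_csvfile.csv. It does not build a named list for us, so we split each name=value pair ourselves and store the result in the args object.

We can then refer to our input_file argument as args$input_file in our script.

scripting-basics/r_script.R
library(tidyverse)

# commandArgs() returns a character vector; trailingOnly = TRUE keeps only
# the arguments supplied after the script name, e.g. "input_file=my_csvfile.csv"
raw_args <- commandArgs(trailingOnly = TRUE)

# Split each name=value pair into a named list so we can use args$input_file
arg_names <- sub("=.*$", "", raw_args)
arg_values <- sub("^[^=]*=", "", raw_args)
args <- as.list(setNames(arg_values, arg_names))

# Use args$input_file in read.csv
csv_file <- read.csv(file = args$input_file)

# Do some work with csv_file (add your filtering conditions to filter())
csv_filtered <- csv_file |> dplyr::filter()

# Write output
write.csv(csv_filtered, file = paste0(args$input_file, "_filtered.csv"))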

12.3.2 Running our R Script

Now that we’ve set it up, we can run the R script from the command line as follows:

bash wrap_r_script.sh my_csvfile.csv

In our bash script, wrap_r_script.sh, we’re using a positional argument (for simplicity) to specify our CSV file, and then passing that positional argument on as a named argument (input_file) for our R script.
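
Because the wrapper relies on a positional argument, it will happily call R with an empty file name if we forget to supply one. A minimal sketch of a guard follows; check_input_file is a hypothetical helper, not part of the original script.

```shell
#!/bin/bash
# check_input_file is a hypothetical helper added for illustration.
# It returns non-zero when the positional argument is missing or
# does not point at an existing file.
check_input_file() {
  if [ "$#" -lt 1 ] || [ -z "$1" ]; then
    echo "Usage: bash $0 <csv_file>" >&2
    return 1
  fi
  if [ ! -f "$1" ]; then
    echo "Input file not found: $1" >&2
    return 1
  fi
}
```

In the wrapper, a line such as check_input_file "$@" || exit 1 placed before the Rscript call would stop the script early with a readable message.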

12.4 Quarto Documents

Quarto is the next generation of RMarkdown and supports a number of output formats.

You might want to run a workflow that you’ve built in a Quarto document.

The main difference is that you’d use quarto render rather than Rscript.

#!/bin/bash
quarto render my_quarto_doc.qmd
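
A Quarto document can also take inputs at render time. The sketch below assumes that my_quarto_doc.qmd declares an input_file entry under params: in its YAML header (that entry is an assumption, not something shown in this chapter) and uses the -P option of quarto render to set it. render_with_input is a hypothetical helper name.

```shell
#!/bin/bash
# render_with_input is a hypothetical helper name.
# Assumes my_quarto_doc.qmd declares `input_file` under `params:` in its YAML header.
render_with_input() {
  # -P name:value sets an execution parameter for this render
  quarto render my_quarto_doc.qmd -P input_file:"${1}"
}
```

As with the Rscript wrapper, the positional argument of the bash script becomes a named parameter inside the document.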

12.5 Apptainer

Apptainer (previously Singularity) is a secure way to run Docker containers on an HPC system. The commands are very similar to Docker’s, but they are not identical.
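
As a rough sketch of what this could look like for our R script, the example below uses apptainer exec with a docker:// image reference. The rocker/tidyverse:4.3.3 image tag and the run_in_container helper name are assumptions chosen for illustration, and the sketch assumes the current working directory is visible inside the container, which is the usual default.

```shell
#!/bin/bash
# run_in_container is a hypothetical helper name; the image tag is an assumption.
run_in_container() {
  # Run Rscript inside the container instead of loading an environment module
  apptainer exec docker://rocker/tidyverse:4.3.3 \
    Rscript process_data.R input_file="${1}"
}
```

Here the container takes over the job of the module load step: the software environment comes from the image rather than from the host.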