R / Batch Mode
It is helpful to run R in batch mode, such as in a workflow that runs multiple R scripts.
This documentation provides examples of how to do this.
Overview
Running R in batch mode is useful, in particular when a script has been tested
(for example in RStudio) and needs to be run in an automated way rather than from the R
user interface or RStudio.
A critical concept when running in batch mode is to properly set the working directory for R
so that it can find input and output file locations that are specified as local paths.
Ideally, paths in R scripts are specified as relative paths to the working directory
so that scripts are portable between users and computers.
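As a minimal sketch of this idea, the following R code builds file locations relative to the working directory instead of hard-coding absolute paths. The folder and file names are hypothetical and are used only for illustration.

```r
# Build paths relative to the working directory (hypothetical folder and file names).
input_file <- file.path("data", "input.csv")
output_file <- file.path("results", "summary.csv")

# The same relative paths resolve against whatever folder is current at run time.
cat("Working directory:", getwd(), "\n")
cat("Input resolves to:", file.path(getwd(), input_file), "\n")
cat("Output resolves to:", file.path(getwd(), output_file), "\n")
```

Because only the working directory changes between computers, the script itself does not need to be edited when it is moved.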
R typically assumes that the working directory is the current folder,
and RStudio defaults the working directory to the folder of the script when the script is opened
from the file explorer, for example by double-clicking on the R script.
When the RStudio File / Open File... menu is used,
RStudio defaults the working directory to the user's files (e.g., C:\Users\user\Documents on Windows).
Calling R from other software requires additional care.
The working directory can be set in an R script by using a statement like the following.
Note that forward slashes or double backslashes can be used, but single backslashes cannot,
because a single backslash in an R string is used to escape special characters (e.g., \n for newline).
setwd("C:/some/folder")
setwd("C:\\some\\folder")
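The equivalence of the two forms can be checked in R itself. This short sketch only inspects the strings and does not change the working directory; the folder name is the placeholder used above.

```r
p_forward <- "C:/some/folder"
p_escaped <- "C:\\some\\folder"

# Each "\\" in the source code is stored as one backslash character.
cat(p_escaped, "\n")
cat("Characters:", nchar(p_forward), nchar(p_escaped), "\n")

# Converting backslashes to forward slashes yields the same path text.
converted <- gsub("\\", "/", p_escaped, fixed = TRUE)
cat("Same path text:", identical(converted, p_forward), "\n")
```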
The R documentation indicates that when running in batch mode, the Rscript program should be called
(Rscript.exe on Windows) rather than R CMD BATCH or other scripts that are no longer recommended.
Therefore, the examples below focus on using Rscript.
- See the Rscript documentation.
A goal is to allow running an R script without hard-coding the working directory in the script
because hard-coding the working directory renders the script non-portable.
Rscript can be run in a Windows command prompt window
either by running expressions on the command line or by running a script, but not both.
Therefore, the first two statements below will run, but the third does not work as intended:
the -e expression is run and the script is not, because anything that follows the -e expressions
is passed to R as an argument rather than run as a script.
Note also that the -e expression must be double-quoted if it contains spaces, and
consequently the inner set of quotes must be escaped with single backslashes.
> Rscript -e "setwd(\"C:/some/folder\")"
> Rscript SomeRScript.R
> Rscript -e "setwd(\"C:/some/folder\")" SomeRScript.R
In principle, the third example would set the working directory and then run an R
script that uses that working directory.
However, Rscript does not support this behavior (-e cannot be combined with an R script file).
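One workaround that may be worth testing is to have a single -e expression both set the working directory and run the script with source(), for example Rscript -e "setwd('C:/some/folder'); source('SomeRScript.R')". This is a sketch rather than something taken from the Rscript documentation. The R code below shows the same two steps in a self-contained form, using a temporary folder and a generated script as stand-ins for the real folder and SomeRScript.R.

```r
# Create a stand-in folder and script (assumption: these replace the real ones).
demo_dir <- tempfile("batch-demo-")
dir.create(demo_dir)
writeLines('cat("Script sees working directory:", getwd(), "\\n")',
           file.path(demo_dir, "SomeRScript.R"))

# Step 1: set the working directory, remembering the previous one.
old_wd <- setwd(demo_dir)

# Step 2: run the script by name; it is found relative to the working directory.
source("SomeRScript.R")

# Restore the original working directory.
setwd(old_wd)
```

A script run through source() sees the command line of the enclosing Rscript call rather than its own arguments, so this approach may not suit scripts that rely on commandArgs().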
Windows Examples
Running R in batch mode on Windows can be tested by running R on the command line (Windows command prompt),
then from a *.bat file,
and then from other environments, for example a system call from Java, Python, etc.
Consider the following test script called test.R:
# This is a simple R script to illustrate working directory
cat("Inside of script\n")
# Print the working directory to confirm
wd <- getwd()
cat("Working directory: ", wd, "\n")
The script can exist in any folder and can be run from the Windows command prompt in the following ways:
- Change to the script folder, specify the full path to Rscript, and use the script file name without a leading path. Note that double quotes are used to specify the Rscript path because it contains a space. This is inconvenient because it is easy to mistype the path.
    "C:\Program Files\R\R-3.6.1\bin\x64\Rscript" test.R
- Add the R software folder (C:\Program Files\R\R-3.6.1\bin\x64) to the PATH environment variable (for example by configuring the user PATH environment variable), change to the script folder, and use the script file name without a leading path. Adding the software to the PATH is somewhat inconvenient but may be OK if versions are not frequently updated.
    Rscript test.R
- Change to any folder, specify the full path to Rscript, and use the full path to the script. This is the most explicit way to run the script but is inconvenient because two full paths must be specified. Double quotes should be used around each path if a path contains a space. Note that although single backslashes are used in the path that the operating system uses to find the Rscript program, the path passed to R in this example uses forward slashes. This approach can be used when called from a program such as Java because both paths are explicit.
    "C:\Program Files\R\R-3.6.1\bin\x64\Rscript" "C:/Users/user/some/path/test.R"
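Because the full path is easy to mistype, it can help to ask an installed R where its executables live. The following sketch, meant to be run inside an R session, uses R.home("bin") to report the folder that contains Rscript. The exact folder reported depends on the installation.

```r
# Folder that holds the R executables for the running installation.
bin_dir <- R.home("bin")

# The executable has an .exe extension on Windows only.
exe_name <- if (.Platform$OS.type == "windows") "Rscript.exe" else "Rscript"
rscript_path <- file.path(bin_dir, exe_name)

cat("Rscript location:", rscript_path, "\n")
cat("Exists:", file.exists(rscript_path), "\n")
```

The reported path can then be copied into a command line or batch file.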
The above command lines can be added to a batch file (e.g., testR.bat), for example:
rem Simple batch file to run R command line
rem Example 1 from above
"C:\Program Files\R\R-3.6.1\bin\x64\Rscript" test.R
rem Example 2 from above
Rscript test.R
rem Example 3 from above
"C:\Program Files\R\R-3.6.1\bin\x64\Rscript" "C:/Users/user/some/path/test.R"
The above illustrates the mechanics of running R in batch mode but does not solve the issue
of allowing the working directory to be set dynamically.
To do so (lacking additional features in R), the working directory must be passed to the script
from software that is capable of determining the working directory dynamically.
The resulting R script has the following logic (see runWithWd.R):
# Example R script to accept the working directory on the command line.
# Set working directory
# - this is bad because it is hard-coded and not the same on every computer
#setwd("C:\\Some\\Path\\")
# Get the working directory from the first and only command line argument.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) < 1) {
  # Working directory not specified on the command line, so default to current.
  wd <- getwd()
  cat("Working directory defaulting to: ", wd, "\n")
} else {
  # Set working directory to the first command line argument.
  # - "else" must be on the same line as the closing brace at the top level of a script
  wd <- args[1]
  cat("Working directory to set: ", wd, "\n")
  setwd(wd)
}
cat("Inside of script\n")
# Print the working directory to confirm
wd <- getwd()
cat("Working directory: ", wd, "\n")
quit()
The script can be called from a *.bat file as follows (see runR.bat):
rem Run R script with batch file folder as working directory
rem Determine the folder where this batch file lives
rem - first get the bat file folder (%~dp0 ends with a backslash)
rem - then convert single backslashes to forward slashes since R does not like single backslashes
rem - then remove the trailing slash
rem - the quotes surround each whole assignment so that they are not stored in the variable
set "rwd=%~dp0"
set "rwd=%rwd:\=/%"
set "rwd=%rwd:~0,-1%"
echo R working directory from bat file is %rwd%
rem Run the R script
rem - the full path to Rscript is used here; plain Rscript also works if the R bin folder is in the PATH
rem - the first use of %rwd% tells R where to find the R script
rem - the second use of %rwd% tells the script being run where the script exists
rem - paths in the R script can then be relative to the working directory
rem - a more explicit command line option than the first R script argument could be implemented
"C:\Program Files\R\R-3.6.1\bin\x64\Rscript" "%rwd%/test.R" "%rwd%"
Linux Examples
See the Windows examples. To call an R script and pass the working directory from a Linux script,
create a script with the following lines:
#!/bin/sh
#
# Run R script with Linux script folder as working directory
#
# Determine the folder where this Linux shell script lives
rwd=$(cd "$(dirname "$0")" && pwd)
# Run the R script
# - R is probably in the PATH since it installs to a common bin folder
# - the first use of $rwd tells R where to find the R script
# - the second use of $rwd tells the script being run where the script exists
# - paths in the R script can then be relative to the working directory
# - a more explicit command line option than the first R script argument could be implemented
Rscript "${rwd}/test.R" "${rwd}"
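The comments above mention that a more explicit command line option could be implemented. The following R sketch shows one possible form: a hypothetical --wd=<folder> option, with a fallback to the folder of the script itself, which Rscript reports through a --file= entry in the full command line. The option name --wd and the function name resolve_wd are assumptions made for illustration and are not part of R or of the scripts above.

```r
# Decide which working directory to use.
# - all_args is the full command line, including the --file= entry added by Rscript
resolve_wd <- function(all_args = commandArgs(trailingOnly = FALSE)) {
  # 1. Hypothetical explicit option: --wd=<folder>
  wd_arg <- grep("^--wd=", all_args, value = TRUE)
  if (length(wd_arg) > 0) {
    return(sub("^--wd=", "", wd_arg[1]))
  }
  # 2. Folder of the script being run, taken from the --file= entry
  file_arg <- grep("^--file=", all_args, value = TRUE)
  if (length(file_arg) > 0) {
    script_path <- sub("^--file=", "", file_arg[1])
    # Assumption: some platforms encode spaces in this entry as "~+~"
    script_path <- gsub("~+~", " ", script_path, fixed = TRUE)
    return(dirname(script_path))
  }
  # 3. Otherwise keep the current working directory
  getwd()
}

wd <- resolve_wd()
cat("Working directory to set: ", wd, "\n")
setwd(wd)
```

With this logic in a script, a call such as Rscript test.R --wd=C:/some/folder would use the named folder, while a plain Rscript call with the full path to the script would use the folder of the script.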
Java Examples
Using a Windows bat file or Linux script to run an R script and transparently set the working directory
is a bit cumbersome because a bat file or Linux script must be created for each R script
(although one bat file or Linux script can run multiple R scripts).
The Java language provides features to run other programs using a system call approach.
The solution varies by application because of differences in the handling of stdout and stderr streams,
error handling, threading, etc.
An example of a Java program running an R program in batch mode is
the TSTool software RunR command.
The TSTool software provides features to automate workflow processing and internally
maintains a working directory as the location of the command file that was opened.
Any relative file paths specified to TSTool are relative to the command file location.
Therefore, it is possible to specify the working directory for R.
The RunR command automatically handles the working directory.
Another option might be to set an environment variable in the environment
that is used to run the R process, and then retrieve the environment variable in the R script.
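On the R side, the environment variable approach could look like the following sketch. The variable name R_SCRIPT_WD and the function name wd_from_env are hypothetical; the calling program (for example a Java process) would need to set the variable before starting Rscript.

```r
# Read the working directory from an environment variable (hypothetical name).
# Falls back to the current working directory if the variable is not set.
wd_from_env <- function(var = "R_SCRIPT_WD") {
  value <- Sys.getenv(var, unset = "")
  if (nzchar(value)) value else getwd()
}

wd <- wd_from_env()
cat("Working directory to set: ", wd, "\n")
setwd(wd)
```

This keeps the script free of hard-coded paths while leaving its command line arguments available for other uses.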