B R Package Installation Tips

This appendix introduces some of the mechanics involved in working with R packages. If you are tasked with installing, uninstalling, or upgrading Oracle Machine Learning for R but you do not have extensive experience working with R packages, then you may find the information in this appendix helpful.

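This appendix contains these topics:

R Package Installation Basics
Set the R Repository
About R Package Installation for Oracle Machine Learning for R
About CRAN Task Views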

B.1 R Package Installation Basics

You can install R packages from the R command line or from your system’s command line.

R package installation basics are outlined in Chapter 6 of the R Installation and Administration Guide. The following example installs a package on Oracle Linux using Oracle R Distribution. It installs the arules package as root so that the package goes into the default R system-wide library, /usr/lib64/R/library, where all users can access it.

Within R, the install.packages function always attempts to install the latest version of the requested package available on CRAN:

R> install.packages("arules")

If the arules package depends on other packages that are not already installed locally, the R installer automatically downloads and installs them. This frees you from the task of identifying and resolving those dependencies yourself.

You can also install R packages from the shell command line. This is useful when an internet connection is not available on the target machine or when a package has not been uploaded to CRAN. To install a CRAN package this way, first locate the package on CRAN and then download the package source to your local machine. For example:

$ wget https://cran.r-project.org/src/contrib/arules_1.1-9.tar.gz

Then, install the package using the command R CMD INSTALL:

$ R CMD INSTALL arules_1.1-9.tar.gz

A major difference between the two methods is that the R package installer at the R command line resolves package dependencies for you, whereas at the shell command line you must resolve them manually. Package dependencies are listed in the Depends and Imports fields on the package's CRAN page. If dependencies are not identified and installed before the package itself, you will see an error similar to:

ERROR: dependency 'xxx' is not available for package 'yyy'

As a best practice and to save time, always refer to the package's CRAN site to understand the package dependencies prior to attempting an installation.
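The check described above can also be sketched in R. The following is a minimal illustration, not part of the R or OML4R tooling: the helper names parse_dep_field and missing_deps are hypothetical, and the dependency string is a made-up example of what a Depends field might contain. It strips version constraints from the field and reports which of the named packages are not yet installed.

```r
# Extract package names from a DESCRIPTION-style dependency field,
# for example "R (>= 3.4.0), Matrix (>= 1.2-0), methods".
parse_dep_field <- function(field) {
  if (is.null(field) || is.na(field) || !nzchar(field)) return(character(0))
  parts <- strsplit(field, ",", fixed = TRUE)[[1]]
  # Drop version constraints such as "(>= 1.2-0)" and surrounding whitespace.
  pkgs <- trimws(gsub("\\([^)]*\\)", "", parts))
  # "R" is a version requirement on R itself, not an installable package.
  setdiff(pkgs[nzchar(pkgs)], "R")
}

# Return the dependencies that are absent from the installed packages.
missing_deps <- function(deps, installed = rownames(installed.packages())) {
  setdiff(deps, installed)
}

# Hypothetical field value, used for illustration only.
deps <- parse_dep_field("R (>= 3.4.0), Matrix (>= 1.2-0), methods")
print(deps)
print(missing_deps(deps))
```

Any package names printed by the last call would need to be downloaded and installed before running R CMD INSTALL on the package that requires them.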

If you don't run R as root, you won't have permission to write packages into the default system-wide location, and you will be prompted to create a personal library accessible by your user ID. You can accept the personal library path chosen by R, or specify the library location by passing the lib argument to the install.packages function. For example, to install into a personal library in your home directory:

R> install.packages("arules", lib="/home/username/Rpackages")

or

$ R CMD INSTALL arules_1.1-9.tar.gz --library=/home/username/Rpackages

Refer to the install.packages help file in R or execute R CMD INSTALL --help at the shell command line for a full list of command line options.

To set the library location once and avoid specifying it at every package installation, create the R startup environment file .Renviron in your home directory if it does not already exist, and add the following line to it:

R_LIBS_USER = "/home/username/Rpackages"
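R adds a library directory to its search path only if that directory exists, so it can help to confirm the result from within a session. The sketch below is an illustration under stated assumptions: it uses a temporary directory as a stand-in for /home/username/Rpackages so that it does not touch a real home directory, and it adds the directory for the current session with .libPaths rather than through .Renviron.

```r
# Stand-in for /home/username/Rpackages (an assumption for this sketch).
lib_dir <- file.path(tempdir(), "Rpackages")

# The directory must exist before R will accept it as a library.
dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)

# Put the personal library first on the search path for this session.
.libPaths(c(lib_dir, .libPaths()))

# Inspect the resulting search path and the configured user library.
print(.libPaths())
print(Sys.getenv("R_LIBS_USER"))
```

If the personal library does not appear in the output of .libPaths() after you restart R with the .Renviron entry in place, check that the directory exists and that the path is spelled correctly.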

B.2 Set the R Repository

Instructions for setting the R repository.

If no repository is set, R asks which CRAN mirror, or server, to use when you install a package from the R command line. To set the repository once and avoid being prompted, create the R startup command file .Rprofile in your home directory and specify the CRAN mirror to use. The following code sets the R package repository to the Seattle CRAN mirror at the start of each R session.

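# Announce the setting at the start of each session
cat("Setting Seattle repository\n")
# Copy the current repository settings, override CRAN, and write them back
r <- getOption("repos")
r["CRAN"] <- "http://cran.fhcrc.org/"
options(repos = r)
rm(r)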
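The same pattern can be wrapped in a small function and checked interactively. This is a minimal sketch: the function name set_cran_mirror is hypothetical, and the mirror URL used in the call is an assumption chosen for illustration, not the mirror named above.

```r
# Set the CRAN entry of the "repos" option while leaving other entries intact.
set_cran_mirror <- function(url) {
  r <- getOption("repos")
  r["CRAN"] <- url
  options(repos = r)
  invisible(getOption("repos"))
}

# Illustrative mirror URL (an assumption for this sketch).
set_cran_mirror("https://cloud.r-project.org")

# Confirm which repository install.packages will now use.
print(getOption("repos")["CRAN"])
```

Updating only the CRAN element, rather than replacing the whole option, preserves any additional repositories that are already configured.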

B.3 About R Package Installation for Oracle Machine Learning for R

Embedded R execution with OML4R allows the use of CRAN or other third-party R packages in user-defined R functions executed on the Oracle Database server.

The steps for installing and configuring packages for use with OML4R are the same as for open source R. The database-side R engine just needs to know where to find the R packages.

The OML4R installation is performed by the user oracle, which typically does not have write permission to the default site-wide library, /usr/lib64/R/library. On Linux and UNIX platforms, the OML4R Server installation provides the ORE script, which is executed from the operating system shell to install R packages and to start R. The ORE script is a wrapper for the default R script, which is itself a shell wrapper for the R executable. It can be used to start R, run batch scripts, and build or install R packages. Unlike the default R script, the ORE script installs packages to a location that is writable by the oracle user and accessible by all OML4R users: $ORACLE_HOME/R/library.

To install a package on the database server so that any R user can use it, including in embedded R execution, an Oracle DBA would typically download the package source from CRAN using wget. If the package depends on any packages that are not in the R distribution in use, also download the sources for those packages.

For a single Oracle Database instance, use the ORE script in place of the R script so that the packages are installed in the same location as the OML4R packages.

$ wget https://cran.r-project.org/src/contrib/arules_1.1-9.tar.gz
$ ORE CMD INSTALL arules_1.1-9.tar.gz

Behind the scenes, the ORE script performs the equivalent of setting R_LIBS_USER to the value of $ORACLE_HOME/R/library, and all R packages installed with the ORE script are installed to this location.

To install a package on multiple database servers, such as those in an Oracle Real Application Clusters (Oracle RAC) or a multinode Oracle Exadata Database Machine environment, use the ORE script in conjunction with the Exadata Distributed Command Line Interface (DCLI) utility.

$ dcli -g nodes -l oracle ORE CMD INSTALL arules_1.1-9.tar.gz

The DCLI -g flag designates a file containing a list of nodes to install on, and the -l flag specifies the user id to use when executing the commands.

If you are using an OML4R client, install the package in the same way as any R package, bearing in mind that you must install the same version of the package on both the client and server machines to avoid incompatibilities.
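One way to compare a client-side and a server-side version is to parse both strings with package_version, which treats "1.1-9" and "1.1.9" as the same version. The sketch below is illustrative only: the helper name versions_match is hypothetical, the server-side string is assumed to have been retrieved separately, and the stats package stands in for the package of interest because it ships with R.

```r
# Compare two version strings after parsing them as package versions.
versions_match <- function(client_version, server_version) {
  package_version(client_version) == package_version(server_version)
}

# A tarball-style version and its printed form refer to the same release.
print(versions_match("1.1-9", "1.1.9"))  # TRUE

# Client-side version of a locally installed package ("stats" as a stand-in).
client_version <- as.character(packageVersion("stats"))
print(client_version)
```

Comparing parsed versions rather than raw strings avoids false mismatches caused only by the separator used in the version number.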

B.4 About CRAN Task Views

CRAN maintains a set of Task Views that identify packages associated with a particular task or methodology.

Task Views are helpful in guiding users through the huge set of available R packages. They are actively maintained by volunteers who include detailed annotations for routines and packages. If you find one of the task views is a perfect match, then you can install every package in that view using the ctv package, which automates package installation.

Install the ctv Package and Task Views

To use the ctv package to install a task view, first, install and load the ctv package.

R> install.packages("ctv")
R> library(ctv)

Then query the names of the available task views and install the view you choose.

R> available.views()
R> install.views("TimeSeries")

Use and Manage Packages

To use a package, start R and load packages one at a time with the library function.

Load the arules package in your R session.

R> library(arules)

Verify the version of arules installed.

R> packageVersion("arules")
[1] '1.1.9'

Verify the version of arules installed on the database server using embedded R execution.

R> ore.doEval(function() packageVersion("arules"))

View the help file for the apriori function in the arules package.

R> ?apriori

Over time, your package library will contain more and more packages, especially if you are using the system-wide library to which others are also adding packages. It's good to know the entire set of R packages accessible in your environment. To list all packages installed in the libraries visible to your local R session, use the installed.packages function:

R> myLocalPackages <- row.names(installed.packages())
R> myLocalPackages
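The matrix returned by installed.packages also records where each package lives and which version is installed, which helps distinguish system-wide packages from those in a personal library. The following is a minimal sketch; the stats package is used in the lookup only because it ships with R.

```r
# Keep the columns that identify each package, its library, and its version.
pkgs <- installed.packages()[, c("Package", "LibPath", "Version"), drop = FALSE]

# Count packages per library directory to see what is installed where.
print(table(pkgs[, "LibPath"]))

# Look up a single package by name ("stats" as a stand-in).
print(pkgs[pkgs[, "Package"] == "stats", , drop = FALSE])
```

If the same package appears in more than one library, the lookup returns one row per copy, and R loads the copy from the library that comes first in .libPaths().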