5 Install Oracle Machine Learning for R on Exadata

This chapter explains how to install Oracle R Distribution and OML4R Server on Oracle Exadata Database Machine. This chapter includes these topics:

5.1 About Oracle Machine Learning for R on Exadata

Exadata is an ideal platform for OML4R.

Parallel R computations in OML4R take advantage of the massively parallel grid infrastructure of Exadata.

Note:

The version of OML4R must be the same on the server and on each client computer. Also, the version of R must be the same on the server and on each client computer. See Create a Database User for Oracle Machine Learning for R for supported configurations.
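The version requirement in the note can be checked before installation. The following is a minimal sketch, not an Oracle-provided tool: the function names extract_r_version and same_r_version are hypothetical, and the code assumes that the first line printed by R --version contains the text "R version X.Y.Z". The sample banners are for illustration only.

```shell
#!/bin/sh
# Hypothetical helpers (not part of OML4R): extract the version number from
# the first line that "R --version" prints, then compare server and client.
# Assumption: the banner contains the text "R version X.Y.Z".
extract_r_version() {
  printf '%s\n' "$1" | sed -n 's/.*R version \([0-9][0-9.]*\).*/\1/p'
}

same_r_version() {
  v1=$(extract_r_version "$1")
  v2=$(extract_r_version "$2")
  # Succeed only when a version was found and both values are identical.
  [ -n "$v1" ] && [ "$v1" = "$v2" ]
}

# Sample banners for illustration; on a real system they could be captured
# with: R --version | head -n 1
server='Oracle Distribution of R version 3.6.1'
client='R version 3.6.1'
if same_r_version "$server" "$client"; then
  echo "R versions match"
else
  echo "R versions differ"
fi
```

The comparison treats a banner without a recognizable version as a mismatch, so an unexpected banner format is reported rather than silently accepted.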

To install OML4R on Exadata:

  1. On each node:

    • Install Oracle R Distribution

    • Verify and configure the environment

    • Install OML4R Server and the supporting packages

  2. On the first node only, create an OML4R user, if desired. Alternatively, configure an existing database user to use OML4R.

You can simplify the process of installing OML4R on Exadata by using the Distributed Command Line Interface (DCLI).

5.2 Install Oracle Machine Learning for R on Exadata Using DCLI

Using DCLI can simplify the installation of OML4R on Exadata.

With DCLI, you can use a single command to install Oracle R Distribution and OML4R Server across multiple Exadata compute nodes. The following example shows the output of the DCLI help option, which explains the basic syntax of the utility.

See Also:

For more details about DCLI, go to the My Oracle Support website, log in with your Customer Support Identifier, and type DCLI in the search box.

Example 5-1 DCLI Help Option Output

$ dcli -h
 
Distributed Shell for Oracle Storage
 
This script executes commands on multiple cells in parallel threads.
The cells are referenced by their domain name or ip address.
Local files can be copied to cells and executed on cells.
This tool does not support interactive sessions with host applications.
Use of this tool assumes ssh is running on local host and cells.
The -k option should be used initially to perform key exchange with
cells.  User may be prompted to acknowledge cell authenticity, and
may be prompted for the remote user password.  This -k step is serialized
to prevent overlayed prompts.  After -k option is used once, then
subsequent commands to the same cells do not require -k and will not require
passwords for that user from the host.
Command output (stdout and stderr) is collected and displayed after the
copy and command execution has finished on all cells.
Options allow this command output to be abbreviated.

Return values:
 0 -- file or command was copied and executed successfully on all cells
 1 -- one or more cells could not be reached or remote execution returned
      non-zero status.
 2 -- An error prevented any command execution

Examples:
 dcli -g mycells -k
 dcli -c stsd2s2,stsd2s3 vmstat
 dcli -g mycells cellcli -e alter iormplan active
 dcli -g mycells -x reConfig.scl
 
usage: dcli [options] [command]

options:
 --version           show program's version number and exit
 -c CELLS            comma-separated list of cells
 -d DESTFILE         destination directory or file
 -f FILE             file to be copied
 -g GROUPFILE        file containing list of cells
 -h, --help          show help message and exit
 -k                  push ssh key to cell's authorized_keys file
 -l USERID           user to login as on remote cells (default: celladmin)
 -n                  abbreviate non-error output
 -r REGEXP           abbreviate output lines matching a regular expression
 -s SSHOPTIONS       string of options passed through to ssh
 --scp=SCPOPTIONS    string of options passed through to scp if different from
                     sshoptions
 --serial            serialize execution over the cells
 -t                  list target cells
 --unkey             drop keys from target cells' authorized_keys file
 -v                  print extra messages to stdout
 --vmstat=VMSTATOPS  vmstat command options
 -x EXECFILE         file to be copied and executed
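The return values listed in the help output can drive a simple wrapper script. The following is a minimal sketch under that assumption. The function name describe_dcli_status is hypothetical and is not part of DCLI.

```shell
#!/bin/sh
# Hypothetical helper (not part of DCLI): translate a dcli exit status into
# a message, based on the return values listed in the help output.
describe_dcli_status() {
  case "$1" in
    0) echo "success on all cells" ;;
    1) echo "one or more cells unreachable or remote command failed" ;;
    2) echo "error prevented any command execution" ;;
    *) echo "unexpected status: $1" ;;
  esac
}

# Typical use after a real dcli call (commented out because it needs cells):
#   dcli -c stsd2s2,stsd2s3 vmstat
#   describe_dcli_status $?
describe_dcli_status 1
```

A script that installs software on several nodes can use such a check to stop before the next step when the status is not 0.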

The following topics describe installing OML4R components using DCLI:

5.2.1 Install Oracle R Distribution Across Exadata Compute Nodes Using DCLI

How to run DCLI to install Oracle R Distribution across multiple Exadata Linux compute nodes.

The commands are summarized in DCLI Command Summary for Oracle R Distribution installation on Exadata.

Important:

Before beginning the installation, review the instructions for installing Oracle R Distribution in Install R for Oracle Machine Learning for R.

To install Oracle R Distribution on Exadata using DCLI, follow these steps:

  1. Configure the Exadata environment to enable automatic authentication for DCLI on each compute node.

    1. Generate an SSH public-private key pair for the root user. Execute the following command as root on any node:

      $ ssh-keygen -N '' -f ~/.ssh/id_dsa -t dsa
      

      This command generates public and private key files in the .ssh subdirectory of the home directory of the root user.

    2. In a text editor, create a file that contains the names of all the compute nodes in the rack. Specify each node name on a separate line. For example, the nodes file for a 2-node cluster could contain entries like the following:

      $ cat nodes
      exadb01
      exadb02
      
    3. Run the DCLI command with the -k option to establish SSH trust across all the nodes. The -k option causes DCLI to contact each node sequentially (not in parallel) and prompt you to enter the password for each node.

      $ dcli -t -g nodes -l root -k -s "\-o StrictHostkeyChecking=no"
      

      DCLI with -k establishes SSH Trust and User Equivalence. Subsequent DCLI commands will not prompt for passwords.

  2. Install Oracle R Distribution using yum if an internet connection is available. Otherwise, install the Oracle R Distribution and operating system dependencies manually. Request the file ord-linux-x86_64-Rversion-Exadataversion.tar.gz from Oracle Support, where Rversion is the version of Oracle R Distribution to install and Exadataversion is the Exadata version output from running the imageinfo command.

    1. Log in to My Oracle Support.

    2. Click Contact Us.

    3. If yum and internet access are unavailable, request access to this file through My Oracle Support.

      ord-linux-x86_64-Rversion-Exadataversion.tar.gz
      
    4. When permission is granted, log in as root to any compute node and download the file.

  3. Create a directory and replicate the downloaded file in this directory across all nodes. For example, the following commands create the directory /home/oracle/ORD and replicate the file ord-linux-x86_64-Rversion-Exadataversion.tar.gz in this directory.

    $ dcli -t -g nodes -l root mkdir -p /home/oracle/ORD
    $ dcli -t -g nodes -l root -f \
            ord-linux-x86_64-Rversion-Exadataversion.tar.gz -d \
            /home/oracle/ORD/ord-linux-x86_64-Rversion-Exadataversion.tar.gz
    
  4. Uncompress and untar the file to replicate the dependent RPMs across all nodes.

    $ dcli -t -g nodes -l root tar xvfz \
            /home/oracle/ORD/ord-linux-x86_64-Rversion-Exadataversion.tar.gz \
            -C /home/oracle/ORD
    $ ls /home/oracle/ORD/ord-linux-x86_64-Rversion-Exadataversion.tar.gz
    

    Alternatively, you can download these RPMs from the Oracle public yum server. The locations of the RPMs are listed in "Install Oracle R Distribution on Oracle Linux Using RPMs".

  5. To install the new RPMs and update existing RPMs across nodes, execute the following RPM command:

    $ dcli -t -g nodes -l root rpm -i --force \
            /home/oracle/ORD/ord-linux-x86_64-Rversion-Exadataversion/*.rpm
    

    The --force flag prevents errors from circular dependencies.

  6. Verify the R installation on each node by first displaying the location where R is installed and then starting R.

    $ dcli -g nodes -l oracle R RHOME
    exadb01: /usr/lib64/R
    exadb02: /usr/lib64/R
    

    For each node, the following command returns the output shown.

    $ dcli -g nodes -l oracle R --vanilla
    ...
    exadb01: R is free software and comes with ABSOLUTELY NO WARRANTY.
    exadb01: You are welcome to redistribute it under certain conditions.
    exadb01: Type 'license()' or 'licence()' for distribution details.
    exadb01:
    exadb01: Natural language support but running in an English locale
    exadb01:
    exadb01: R is a collaborative project with many contributors.
    exadb01: Type 'contributors()' for more information and
    exadb01: 'citation()' on how to cite R or R packages in publications.
    exadb01:
    exadb01: Type 'demo()' for some demos, 'help()' for on-line help, or
    exadb01: 'help.start()' for an HTML browser interface to help.
    exadb01: Type 'q()' to quit R.
    exadb01:
    exadb01: You are using Oracle's distribution of R. Please contact
    exadb01: Oracle Support for any problems you encounter with this
    exadb01: distribution.
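The output of the R RHOME check in step 6 can also be compared automatically. The following is a minimal sketch that assumes the "node: path" output format shown above. The function name check_rhome_consistent is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "node: path" lines on stdin (the format shown for
# "dcli -g nodes -l oracle R RHOME") and succeed only when every node
# reports the same R home directory.
check_rhome_consistent() {
  awk '{
         sub(/^[^:]*:[ \t]*/, "")        # drop the "node: " prefix
         if ($0 != "") seen[$0] = 1      # remember each distinct path
       }
       END {
         n = 0
         for (p in seen) n++
         exit (n == 1 ? 0 : 1)           # exactly one distinct path expected
       }'
}

# Demonstration with the sample output from step 6.
printf 'exadb01: /usr/lib64/R\nexadb02: /usr/lib64/R\n' |
  check_rhome_consistent && echo "R home matches on all nodes"
```

On a live system, the function could read the output of the dcli command directly through a pipe.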

5.2.1.1 DCLI Command Summary for Oracle R Distribution installation on Exadata

The DCLI commands used to install Oracle R Distribution on a Linux Exadata system are listed in the following example.

Replace version with the version number of the Oracle R Distribution that you are using.

Example 5-2 DCLI Command Summary for Oracle R Distribution

ssh-keygen -N '' -f ~/.ssh/id_dsa -t dsa
vi nodes # enter node names
dcli -t -g nodes -l root -k -s "\-o StrictHostkeyChecking=no"
dcli -t -g nodes -l root mkdir -p /home/oracle/ORD
dcli -t -g nodes -l root -f ord-linux-x86_64-version.tar.gz -d \
           /home/oracle/ORD/ord-linux-x86_64-version.tar.gz
dcli -t -g nodes -l root tar xvfz \
           /home/oracle/ORD/ord-linux-x86_64-version.tar.gz -C /home/oracle/ORD
dcli -t -g nodes -l root rpm -i --force \
           /home/oracle/ORD/ord-linux-x86_64-version/*.rpm
dcli -g nodes -l root R RHOME
dcli -g nodes -l root R --vanilla
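Because the version string appears in several of these commands, it can help to generate them from a single value. The following sketch only prints the commands and does not run them. The variables NODES and ORD_DIR and the function names are illustrative assumptions, not part of DCLI or Oracle R Distribution.

```shell
#!/bin/sh
# Sketch only: print (do not run) the copy, extract, and install commands
# from the summary for one version string.
NODES=nodes                  # assumed name of the nodes file
ORD_DIR=/home/oracle/ORD     # staging directory used in the examples

# Build the bundle base name from the version string.
ord_bundle_name() {
  printf 'ord-linux-x86_64-%s\n' "$1"
}

print_ord_commands() {
  bundle=$(ord_bundle_name "$1")
  echo "dcli -t -g $NODES -l root mkdir -p $ORD_DIR"
  echo "dcli -t -g $NODES -l root -f $bundle.tar.gz -d $ORD_DIR/$bundle.tar.gz"
  echo "dcli -t -g $NODES -l root tar xvfz $ORD_DIR/$bundle.tar.gz -C $ORD_DIR"
  echo "dcli -t -g $NODES -l root rpm -i --force $ORD_DIR/$bundle/*.rpm"
}

# "version" is the same placeholder that the summary uses.
print_ord_commands "version"
```

Reviewing the printed commands before running them makes it easier to catch a mistyped version or path.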

5.2.2 Install OML4R Server Across Exadata Compute Nodes Using DCLI for 12c and Earlier

How to use DCLI to install OML4R Server across multiple Exadata Linux compute nodes for Oracle Database 12c and earlier.

The DCLI commands are summarized in DCLI Commands Summary for Oracle Machine Learning for R Server.

Note:

Before beginning the installation, review the instructions for installing OML4R Server in Install Oracle Machine Learning for R Server.

To install OML4R Server on Exadata using DCLI for Oracle Database 12c and earlier, follow these steps:

  1. Ensure that the ORACLE_HOME, ORACLE_SID, R_HOME, PATH, and LD_LIBRARY_PATH environment variables are properly set on each node, and are defined in the same shell where the DCLI script will run. For example, you could specify values like the following in a bashrc file:

    export ORACLE_HOME=/hostname/app/oracle/product/release_number/dbhome_1
    export ORACLE_SID=ORCL
    export R_HOME=/usr/lib64/R
    export PATH=$PATH:$R_HOME/bin:$ORACLE_HOME/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib:$R_HOME/lib:$R_HOME/port/Linux-X64/lib
    
  2. Go to the Oracle Machine Learning for R Downloads website.

    On the Downloads page, in the Linux 64-bit row, select Server, accept the license agreement, and download the file. To download the supporting packages, select Supporting, accept the license agreement, and download the file. The following files are downloaded for OML4R, where version is the OML4R release number.

    ore-server-linux-x86-64-version.zip
    ore-supporting-linux-x86-64-version.zip
    
  3. Log in as root, and copy the installers for OML4R Server and the supporting packages across nodes. For example:

    $ dcli -g nodes -l oracle mkdir -p /home/oracle/OML4R
    $ dcli -g nodes -l oracle -f ore-server-linux-x86-64-version.zip -d \
         /home/oracle/OML4R/ore-server-linux-x86-64-version.zip
    $ dcli -g nodes -l oracle -f ore-supporting-linux-x86-64-version.zip -d \
         /home/oracle/OML4R/ore-supporting-linux-x86-64-version.zip
    
  4. Unzip the OML4R Server bundle on each node:

    $ dcli -t -g nodes -l oracle unzip \
         /home/oracle/OML4R/ore-server-linux-x86-64-version.zip -d \
         /my_destination_directory/

  5. Unzip the supporting packages on each node:

    $ dcli -t -g nodes -l oracle unzip \
         /home/oracle/OML4R/ore-supporting-linux-x86-64-version.zip -d \
         /my_destination_directory/
    
  6. Install OML4R Server components:

    $ dcli -t -g nodes -l oracle "cd /my_destination_directory; ./server.sh -y \
          --perm permtablespace --temp temptablespace \
          --user-perm usertablespace --user-temp usertemptablespace \
          --user OML_USER"

    Note:

    The server script creates a user for OML4R. By default, the script does not grant the RQADMIN role to the user.

    Any OML4R user can execute embedded R, but only those with the RQADMIN role can create and drop the R scripts in the database. Use caution when granting the RQADMIN role.

    For more information about the role, see About the RQADMIN Role.

  7. Verify that OML4R loads.

    > library(ORE)
    Loading required package: OREbase
    Attaching package: OREbase
    The following objects are masked from ‘package:base’:
        cbind, data.frame, eval, interaction, order, paste, pmax, pmin,
        rbind, table
    Loading required package: OREembed
    Loading required package: OREstats
    Loading required package: MASS
    Loading required package: OREgraphics
    Loading required package: OREeda
    Loading required package: OREmodels
    Loading required package: OREdm
    Loading required package: lattice
    Loading required package: OREpredict
    Loading required package: ORExml 
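Step 1 of this procedure requires five environment variables to be set in the shell that runs DCLI. The following is a minimal pre-flight sketch, not an Oracle-provided script. The function name missing_env_vars is hypothetical, and the check only confirms that each variable is non-empty, not that its value is correct.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print the name of each variable from step 1
# that is unset or empty in the current shell.
missing_env_vars() {
  for name in ORACLE_HOME ORACLE_SID R_HOME PATH LD_LIBRARY_PATH; do
    # Indirect lookup of the variable called $name (POSIX sh has no ${!name}).
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

missing=$(missing_env_vars)
if [ -n "$missing" ]; then
  echo "Set these variables before running DCLI:"
  echo "$missing"
else
  echo "All required variables are set."
fi
```

Running the check in the same shell session as the DCLI commands matters, because variables that are defined only in another session are not visible here.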

5.2.3 Install OML4R Server Across Exadata Compute Nodes Using DCLI for 18c and Later

How to use DCLI to install OML4R Server across multiple Exadata Linux compute nodes for Oracle Database 18c and later.

To install OML4R Server on Exadata using DCLI for Oracle Database 18c and later, follow these steps:
  1. Get a list of the compute nodes in the rack.

    In the following example, the cat nodes command lists the nodes for a two-node cluster.

    $ cat nodes
    exadb01
    exadb02
  2. In a text editor, create a file that contains the names of all of the compute nodes in the rack. Specify each node name on a separate line. For example, the nodes file for a two-node cluster would contain entries such as the following:
    exadb01
    exadb02
  3. Ensure that the ORACLE_HOME, ORACLE_SID, R_HOME, PATH, and LD_LIBRARY_PATH environment variables are properly set on each node, and are defined in the same shell in which you will run the DCLI script. For example, you could specify values like the following in a bashrc file:
    export ORACLE_HOME=/u01/app/oracle/product/release_number/dbhome_1
    export ORACLE_SID=ORCL
    export R_HOME=/usr/lib64/R
    export PATH=$PATH:$R_HOME/bin:$ORACLE_HOME/bin
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib:$R_HOME/lib:$R_HOME/port/Linux-X64/lib
  4. Option 1: On the first database node only, execute as sysdba the rqcfg.sql script from your PDB.
    $ sqlplus / as sysdba
    SQL> alter session set container=PDBNAME;
    SQL> @$ORACLE_HOME/R/server/rqcfg.sql

    Note:

    The rqcfg.sql script ships with Oracle Database 18c and later and resides in the $ORACLE_HOME/R/server directory. The script installs the OML4R Server components in the database, and you need to execute it only once.

    The rqcfg.sql script prompts you for the following input parameters:

    define permtbl = permanent tablespace name for RQSYS schema
    define temptbl = temporary tablespace name for RQSYS schema
    define orahome = ORACLE_HOME path
    define rhome = R_HOME path

    Option 2: Execute the rqcfg.sql script from the Linux command line.

    In the example, the user is system with the password apassword, the permanent tablespace for the RQSYS schema is SYSAUX, and the temporary tablespace is TEMP. The value for ORACLE_HOME is /u01/app/oracle/product/21.3.0.0/dbhome_1 and the value for R_HOME is the Linux default path, /usr/lib64/R:

    $ sqlplus -L -S system/apassword @$ORACLE_HOME/R/server/rqcfg.sql SYSAUX TEMP /u01/app/oracle/product/21.3.0.0/dbhome_1 /usr/lib64/R
  5. Download and install the OML4R supporting packages.

    To download the supporting packages, go to the Oracle Machine Learning for R Downloads website. Select Supporting in the column for your version of the database and R, accept the license agreement, and download the ore-supporting-linux-x86-64-version.zip file.

    Log in as root and copy the installers for the supporting packages across the nodes. For example:

    $ dcli -g nodes -l oracle mkdir -p /home/oracle/OML4R
    
    $ dcli -g nodes -l oracle -f ore-supporting-linux-x86-64-version.zip -d \
         /home/oracle/OML4R/ore-supporting-linux-x86-64-version.zip

    Unzip the supporting packages on each node:

    $ dcli -t -g nodes -l oracle unzip \
         /home/oracle/OML4R/ore-supporting-linux-x86-64-version.zip -d \
         /my_destination_directory/

    Install the OML4R supporting packages, as in the following example:

    $ dcli -t -g nodes -l oracle R CMD INSTALL /my_destination_directory/supporting/* -l $ORACLE_HOME/R/library/

    Note:

    The rqcfg.sql script creates an OML4R user. By default, the script does not grant the RQADMIN role to the user.

    Any OML4R user can use an embedded R execution function, but only those with the RQADMIN role can create and drop the R scripts in the OML4R script repository in the database. Use caution when granting the RQADMIN role.

  6. Start R with the ORE script, and verify that OML4R loads.
    $ ORE
    
    > library(ORE)
    Loading required package: OREbase
    Attaching package: OREbase
    The following objects are masked from ‘package:base’:
        cbind, data.frame, eval, interaction, order, paste, pmax, pmin,
        rbind, table
    Loading required package: OREembed
    Loading required package: OREstats
    Loading required package: MASS
    Loading required package: OREgraphics
    Loading required package: OREeda
    Loading required package: OREmodels
    Loading required package: OREdm
    Loading required package: lattice
    Loading required package: OREpredict
    Loading required package: ORExml
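Steps 1 and 2 of this procedure depend on a nodes file with one node name per line. The following is a minimal sketch of a check that could run before any DCLI command. The function name validate_nodes_file is hypothetical, and the rules it enforces (no blank lines, no whitespace, no duplicates) are assumptions about what a well-formed file looks like.

```shell
#!/bin/sh
# Hypothetical check for a nodes file: non-empty, one node name per line,
# no blank lines, no embedded whitespace, and no duplicate names.
validate_nodes_file() {
  file=$1
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file"
    return 1
  fi
  # Reject blank lines and lines that contain spaces, tabs, or carriage returns.
  if grep -q -e '^$' -e '[[:space:]]' "$file"; then
    echo "blank line or whitespace in $file"
    return 1
  fi
  # Reject node names that are listed more than once.
  dups=$(sort "$file" | uniq -d)
  if [ -n "$dups" ]; then
    echo "duplicate node name(s): $dups"
    return 1
  fi
  echo "ok: $(grep -c . "$file") node(s)"
}

# Demonstration with a temporary two-node file.
tmp=$(mktemp)
printf 'exadb01\nexadb02\n' > "$tmp"
validate_nodes_file "$tmp"    # prints "ok: 2 node(s)"
rm -f "$tmp"
```

The whitespace rule also catches files saved with Windows line endings, which would otherwise produce node names that end in a carriage return.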

5.2.4 DCLI Commands Summary for Oracle Machine Learning for R Server

The DCLI commands used to install OML4R and the supporting packages on a Linux Exadata system are listed in the following example.

Example 5-3 DCLI Command Summary for OML4R Server

dcli -g nodes -l oracle mkdir -p /home/oracle/ORE
dcli -g nodes -l oracle -f ore-server-linux-x86-64-version.zip -d \
     /home/oracle/ORE/ore-server-linux-x86-64-version.zip
dcli -g nodes -l oracle -f ore-supporting-linux-x86-64-version.zip -d \
     /home/oracle/ORE/ore-supporting-linux-x86-64-version.zip
dcli -t -g nodes -l oracle unzip \
     /home/oracle/ORE/ore-server-linux-x86-64-version.zip -d \
     /home/oracle/ORE/
dcli -t -g nodes -l oracle /home/oracle/ORE/server.sh
sqlplus / as sysdba
grant RQADMIN to OML_USER;
exit;
dcli -t -g nodes -l oracle ORE -e "library(ORE)" 
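When these commands run from a script, it is usually safer to stop at the first failure than to continue on a partially prepared node. The following is a minimal sketch of that pattern. The run_step function and the DRY_RUN variable are hypothetical and are not part of DCLI. With the default setting, the script only prints the commands.

```shell
#!/bin/sh
# Sketch: run installation steps in order and stop at the first failure.
# DRY_RUN=1 (the default here) prints each command without running it.
DRY_RUN=${DRY_RUN:-1}

run_step() {
  echo "+ $*"
  [ "$DRY_RUN" = 1 ] && return 0
  "$@" || {
    status=$?
    echo "step failed with status $status: $*" >&2
    return "$status"
  }
}

# The && chain stops as soon as one step returns a non-zero status.
run_step dcli -g nodes -l oracle mkdir -p /home/oracle/ORE &&
run_step dcli -t -g nodes -l oracle /home/oracle/ORE/server.sh &&
run_step dcli -t -g nodes -l oracle ORE -e "library(ORE)"
```

Setting DRY_RUN=0 in the environment makes the same script run the commands for real, so a printed plan can be reviewed first and then executed unchanged.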

5.3 Install Oracle Machine Learning for R for Oracle RAC Without DCLI

How to install OML4R for an Oracle Real Application Clusters (Oracle RAC) database if DCLI is unavailable.

If the Distributed Command Line Interface (DCLI) is not available, you must install each of the following components individually on each database instance in the Oracle RAC cluster.

  • R or Oracle R Distribution
  • OML4R Server
  • OML4R supporting packages

The first section contains installation instructions for Oracle Database 18c and later. The second section has instructions for Oracle Database 12c and earlier.

Install OML4R in an Oracle 18c and Later RAC Environment

Follow these steps to install Oracle R Distribution, OML4R, and the OML4R supporting packages.

  1. Install Oracle R Distribution. See Install R for Oracle Machine Learning for R.
  2. Start SQL*Plus, log in to your PDB directly and run the rqcfg.sql script. The following example uses the PDB PDB1 and gives example values for the script arguments.
    $ sqlplus / as sysdba
    SQL> alter session set container=PDB1;
    SQL> ALTER PROFILE DEFAULT LIMIT PASSWORD_VERIFY_FUNCTION NULL;
    SQL> @$ORACLE_HOME/R/server/rqcfg.sql
    
    define permtbl = SYSAUX
    define temptbl = TEMP
    define orahome = /u01/app/oracle/product/21.3.0.0/dbhome_1
    define rhome = /usr/lib64/R
  3. At your operating system prompt, go to the $ORACLE_HOME/bin directory and grant read and execute permissions on the ORE script to all users.
    cd $ORACLE_HOME/bin
    chmod 755 ORE
  4. Create a directory to contain the OML4R 1.5.1 supporting packages for your system and change directories to it. To that directory, download the supporting package zip file as described in Install the OML4R Supporting Packages.
  5. Extract the supporting packages.
  6. For each package, at your operating system command prompt, run the following command.
    ORE CMD INSTALL package
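Step 6 repeats one command for each package, which a short loop can handle. The following is a minimal sketch, assuming that the extracted package files sit together in one directory. The function name install_supporting_packages, the DRY_RUN variable, and the ./supporting directory are hypothetical. If the packages depend on each other, the alphabetical order used here might not be the right installation order.

```shell
#!/bin/sh
# Sketch: run "ORE CMD INSTALL" once for every file in a directory of
# extracted supporting packages. DRY_RUN=1 (the default here) only prints
# the commands.
DRY_RUN=${DRY_RUN:-1}

install_supporting_packages() {
  dir=$1
  for pkg in "$dir"/*; do
    [ -f "$pkg" ] || continue      # skip subdirectories and unmatched globs
    if [ "$DRY_RUN" = 1 ]; then
      echo "ORE CMD INSTALL $pkg"
    else
      ORE CMD INSTALL "$pkg" || return 1   # stop at the first failure
    fi
  done
}

# ./supporting is a placeholder for the directory created in step 4.
install_supporting_packages ./supporting
```

The loop must run on each database node, because this procedure applies when DCLI is not available to distribute the command.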

Install OML4R in an Oracle 12c and Earlier RAC Environment

Follow these steps to install Oracle R Distribution, OML4R, and the OML4R supporting packages.

Note:

You can perform steps 2 and 3 simultaneously by first extracting the OML4R supporting packages bundle in the same directory from which you execute the server.sh script. (For Microsoft Windows, the script is server.bat.)
  1. Install Oracle R Distribution. See Install R for Oracle Machine Learning for R.
  2. Execute the server.sh script from the OML4R Server installer bundle. See Install Oracle Machine Learning for R Server.
  3. Install the OML4R supporting packages. See Install Oracle Machine Learning for R Server for Oracle Database 12c and Earlier.

When you execute the server.sh script on node 1, it installs the OML4R packages on the operating system in the $ORACLE_HOME/R/library directory. It also installs and configures the database components of OML4R. While running the script, you can create a new database user when prompted to do so. You can create a user only when you run the server.sh script on the first node.

When you execute the server.sh script on each subsequent node, the script only installs the OML4R packages on the operating system.