1 Overview of Oracle R Enterprise

R is an open source statistical programming language and environment. For information about R, see the R Project for Statistical Computing at http://www.r-project.org.

R provides an environment for statistical computing, including:

  • An easy-to-use language

  • A powerful graphical environment for visualization

  • Many out-of-the-box statistical techniques

  • R packages (An R package is a set of related functions, help files, and data files. As of this writing, there are more than 4000 R packages, and the number grows constantly.)

  • The R Console graphical user interface for analyzing data interactively

R's rapid adoption has earned it a reputation as a new statistical software standard.

Oracle R Enterprise is a component of the Oracle Advanced Analytics Option of Oracle Database Enterprise Edition.

For detailed information about Oracle R Enterprise, go to Oracle R Enterprise at http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/index.html. This site contains links to software downloads, the blog, the discussion forum, and the latest documentation. See Oracle R Enterprise Useful Links for information about the blog and the forum.

Oracle R Enterprise allows users to perform statistical analysis on data stored in an Oracle Database. Oracle R Enterprise has these components:

  • The Oracle R Enterprise R transparency layer. The transparency layer is a collection of packages that support mapping of R data types to Oracle Database objects and generate SQL transparently in response to R expressions on mapped data types. The transparency layer allows an R user to interact directly with database-resident data using R language constructs. One advantage of interacting with database-resident data is that R users can work with data too large to fit into the memory of a user's desktop system.

  • The Oracle R Enterprise statistics engine, a collection of statistical functions and procedures corresponding to commonly used statistical libraries. The statistics engine packages execute in Oracle Database.

  • Embedded R execution, which enables the database server to manage and control the execution of R scripts by spawning server-side R engines. Embedded R execution enables operationalization of R scripts, that is, running R scripts in a lights-out fashion as part of an application. Because the scripts run at the database server, embedded R execution eliminates the need to move data out of Oracle Database. It supports data-parallel and task-parallel execution, generates rich XML output and PNG image streams through the SQL API, and provides a parallel simulation capability.
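The idea behind the transparency layer, expressions on a mapped object generating SQL instead of computing on local data, can be illustrated with a small conceptual sketch. The sketch is written in Python purely for illustration; it is not Oracle R Enterprise code and does not reflect how the product is implemented. The names Expr and TableProxy, the table EMP, and its columns are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """A SQL expression fragment built from column references (hypothetical)."""
    sql: str

    def _binary(self, op, other):
        # Only other expressions and numeric literals are handled in this sketch.
        if isinstance(other, Expr):
            rhs = other.sql
        elif isinstance(other, (int, float)):
            rhs = str(other)
        else:
            raise TypeError("unsupported operand in this sketch")
        return Expr(f"({self.sql} {op} {rhs})")

    def __add__(self, other):
        return self._binary("+", other)

    def __gt__(self, other):
        return self._binary(">", other)


class TableProxy:
    """Stand-in for a database table; it holds metadata only, never rows."""

    def __init__(self, name, columns, where=None):
        self.name = name
        self.columns = list(columns)
        self.where = where

    def __getitem__(self, key):
        if isinstance(key, str):
            # Column reference, for example emp["SAL"].
            if key not in self.columns:
                raise KeyError(key)
            return Expr(key)
        if isinstance(key, Expr):
            # Row filter, for example emp[emp["SAL"] > 1000].
            if self.where is None:
                cond = key
            else:
                cond = Expr(f"({self.where.sql} AND {key.sql})")
            return TableProxy(self.name, self.columns, cond)
        raise TypeError("index must be a column name or a filter expression")

    def to_sql(self):
        """Return the SQL text that the accumulated operations correspond to."""
        query = f"SELECT {', '.join(self.columns)} FROM {self.name}"
        if self.where is not None:
            query += f" WHERE {self.where.sql}"
        return query


emp = TableProxy("EMP", ["ENAME", "SAL", "COMM"])
high_paid = emp[emp["SAL"] > 1000]   # no data is fetched here
print(high_paid.to_sql())
```

The filter is written with ordinary language operators, yet no rows are ever held on the client: the proxy only accumulates the SQL text that a database would run. This is the general pattern that lets a client work with tables too large for desktop memory.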

Oracle R Enterprise includes many packages; for a list see Oracle R Enterprise and Oracle R Distribution Packages.

The rest of this chapter describes Oracle R Enterprise Architecture and Oracle R Enterprise Supported Configurations.

Oracle R Enterprise Training is available free from Oracle Learning Library.

Oracle R Enterprise Useful Links describes the blog and the forum.

Oracle R Enterprise Architecture

Oracle R Enterprise has the following three components, shown in this illustration together with the connector for Hadoop:

Description of the illustration oreug_vm_001.png

  1. The Client R Engine (R Engine in Client) is a collection of R packages that allows you to connect to an Oracle Database and to interact with data in that database.

    You can use any R commands from the client. In addition, the client supplies these functions:

    • The R SQL transparency layer, which intercepts R functions for scalable in-database execution

    • Functions that intercept data transforms, statistical functions, and Oracle R Enterprise-specific functions

    • Interactive display of graphical results and flow control as in open source R

    • Submission of R closures (functions) for execution in Oracle Database

  2. The Server (in Oracle Database) is a collection of PL/SQL procedures and libraries that augment Oracle Database with the capabilities required to support an Oracle R Enterprise client. The R engine is also installed on Oracle Database to support embedded R execution. Oracle Database spawns R engines, which can provide data parallelism.

    The Oracle R Enterprise Database engine provides this functionality:

    • Scaling to large datasets

    • Access to tables, views, and external tables in the database, as well as those accessible through database links

    • Use of SQL query parallel execution

    • Use of in-database statistical and data mining functionality

  3. R engines spawned by Oracle Database support database-managed parallelism and provide lights-out scheduled execution of R scripts, that is, scheduling or triggering R scripts packaged inside a PL/SQL block or SQL query. Oracle R Enterprise provides efficient data transfer to and from the spawned engines. Embedded R execution can be used to emulate MapReduce-style programming.
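The MapReduce-style pattern mentioned above (partition the data by a key, apply a user-supplied function to each partition in parallel, and collect one result per key) can be sketched as follows. This is a conceptual illustration in Python, not Oracle R Enterprise code: worker threads stand in for the R engines that the database would spawn, and the names group_apply and mean_salary and the sample rows are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_apply(rows, key, func, max_workers=4):
    """Partition rows by key, run func on each partition in parallel,
    and return a dict that maps each key value to func's result."""
    partitions = defaultdict(list)
    for row in rows:
        # Route each row to the partition for its key value.
        partitions[row[key]].append(row)

    # Each partition is processed by its own task; the threads here are
    # only a stand-in for separately spawned engines.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {k: pool.submit(func, part) for k, part in partitions.items()}
        return {k: f.result() for k, f in futures.items()}


def mean_salary(partition):
    """User-supplied function that sees only one partition's rows."""
    return sum(r["sal"] for r in partition) / len(partition)


rows = [
    {"dept": 10, "sal": 1000.0},
    {"dept": 10, "sal": 3000.0},
    {"dept": 20, "sal": 1500.0},
]
result = group_apply(rows, "dept", mean_salary)
print(result)
```

The user-supplied function never sees the whole data set, only one partition at a time, which is what allows the partitions to be processed independently and in parallel.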

There are several data types specific to Oracle R Enterprise; see Data Types Supported for details.

Oracle R Enterprise Supported Configurations

Oracle R Enterprise consists of a client and a server. Both the client and the server run on Oracle Linux and Red Hat Linux; the client also runs on 64-bit Microsoft Windows. The server is installed in an Oracle Database, to which the client connects. Client and server are not required to run on the same operating system. For example, the client can run on Microsoft Windows with the server installed on Oracle Linux.

Oracle R Enterprise also runs on Oracle Exadata machines with the Linux and Solaris operating systems. For details, see Oracle R Enterprise Installation and Administration Guide.

GUIs and IDEs for R

Open source R is distributed through The Comprehensive R Archive Network (CRAN). It is available for download, but it is not shipped with Oracle R Enterprise.

The CRAN distribution contains a Graphical User Interface (GUI) for Windows. There are open source GUIs for R on all operating systems, but they require a download from a separate site and a separate install.

If you require an Integrated Development Environment (IDE) for R, you may wish to use RStudio IDE. For an overview of RStudio IDE installation, see Oracle R Enterprise Installation and Administration Guide.

Oracle R Enterprise Training

Oracle R Enterprise Tutorial Series (https://apexapps.oracle.com/pls/apex/f?p=44785:24:17534844732288::NO::P24_CONTENT_ID,P24_PREV_PAGE:6528,1), part of Oracle Learning Library, contains lessons describing open source R basics and Oracle R Enterprise functionality. Topics include R basics, graphing in R, the transparency layer, R scripts, and SQL scripts. There is also a lesson about Oracle R Connector for Hadoop. (Oracle R Connector for Hadoop is a separate product.)

Lessons in Oracle Learning Library are free.

See Also:

The Learning R Series presentations available on the Oracle R Enterprise page on the Oracle Technology Network at http://www.oracle.com/technetwork/database/options/advanced-analytics/r-enterprise/index.html

Oracle R Enterprise Useful Links

The following web sites provide useful information for users of Oracle R Enterprise:

  • The Oracle R Enterprise Discussion Forum (https://forums.oracle.com/forums/forum.jspa?forumID=1397) supports all aspects of Oracle's R-related offerings, including Oracle R Enterprise, Oracle R Connector for Hadoop (part of the Big Data Connectors), and Oracle R Distribution. Use the forum to ask questions and make comments about the software.

  • The Oracle R Enterprise Blog (https://blogs.oracle.com/R/) discusses best practices, tips, and tricks for applying Oracle R Enterprise and Oracle R Connector for Hadoop in both traditional and new Big Data environments.