2 Planning an EDQ Installation

This chapter describes how to plan and prepare to install EDQ. It presents information that you should consider and be familiar with before you begin the installation.

2.1 Selecting Directories for Installation

During the installation process, you must specify locations for one or more of the following home directories:

  • Oracle Fusion Middleware

  • EDQ

After installation, the Oracle Fusion Middleware home directory contains additional directories and files for Fusion Middleware products, such as EDQ and WebLogic Server.

2.1.1 Choosing a Fusion Middleware Home Directory

The Fusion Middleware home directory serves as a repository for common files that are used by multiple Fusion Middleware products installed on the same machine. For this reason, the Middleware home directory can be considered a central support directory for all the Fusion Middleware products installed on your system.

The files in the Middleware home directory are essential to ensuring that Fusion Middleware products operate correctly on your system. They facilitate checking of cross-product dependencies during installation. The directories in the Middleware home directory vary depending on the installer that you are using and the products you selected for installation.

The default installation directory for the Middleware home directory is:

On Linux and UNIX: /opt/Oracle/Middleware/

On Windows: C:\Oracle\Middleware\

The Middleware home directory is referenced as MW_HOME in Fusion Middleware documentation and this guide.

2.1.2 Choosing the EDQ Installation Directory

When you are installing EDQ, you are prompted to choose an existing MW_HOME directory or specify a path to create a new one. If you choose to create a new directory, the installation program automatically creates it for you.

You are then prompted to enter a home directory for EDQ. This home directory contains the components necessary to install and configure the product. The default installation directory for EDQ is:

On Linux and UNIX: MW_HOME/Oracle_EDQ1

On Windows: MW_HOME\Oracle_EDQ1

This directory path is referenced as the EDQ_HOME directory in this document.
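As a minimal sketch of how the default EDQ_HOME path is derived from MW_HOME on each platform, the following uses the Python standard library. The function name default_edq_home is hypothetical and is not part of the EDQ installer.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Default EDQ directory name, as given in the text above.
EDQ_DIR_NAME = "Oracle_EDQ1"


def default_edq_home(mw_home: str, windows: bool = False) -> str:
    """Return the default EDQ_HOME path under the given MW_HOME (hypothetical helper)."""
    path_type = PureWindowsPath if windows else PurePosixPath
    return str(path_type(mw_home) / EDQ_DIR_NAME)


# Default MW_HOME locations from Section 2.1.1.
print(default_edq_home("/opt/Oracle/Middleware/"))
print(default_edq_home("C:\\Oracle\\Middleware\\", windows=True))
```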

2.2 Installation Prerequisites

The following sections describe the installation prerequisites:

2.2.1 Hardware and Software Requirements

Ensure that the following hardware and software requirements are met. These requirements represent the certified and supported server configurations.

Depending on the tasks that EDQ is required to perform, it can place heavy demands on the hardware used to run it. A recommended minimum hardware specification for an EDQ server is:

  • 16GB physical memory, with 8GB allocated to the EDQ Java Virtual Machine (JVM)

  • At least 4 logical CPUs

  • At least 500GB of hard disk space on the database server

To allow flexible use of EDQ across different use cases, ensure that the EDQ Results Database has enough space for at least 20 times the volume of the data it is working with.
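The 20 times guideline can be turned into a quick estimate. The following is a minimal sketch only; the function name and the 25 GB sample volume are assumptions chosen for illustration, not sizing advice.

```python
# Rule of thumb from the text: Results Database space >= 20 x data volume.
RESULTS_SPACE_FACTOR = 20


def min_results_db_space_gb(data_volume_gb: float) -> float:
    """Return the minimum Results Database space (GB) for a given data volume (GB)."""
    if data_volume_gb < 0:
        raise ValueError("data volume must be non-negative")
    return RESULTS_SPACE_FACTOR * data_volume_gb


# Hypothetical example: 25 GB of source data needs at least 500 GB.
print(min_results_db_space_gb(25))
```

With 25 GB of data, the estimate equals the 500GB minimum disk space recommended above.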

The preceding recommendations do not represent sizing advice for any specific deployment, where it may be appropriate to deploy a considerably larger machine or many machines, depending on the processing needs placed on EDQ.

Before installation, review the list of certified platforms and releases for EDQ in the Oracle Enterprise Data Quality Certification Matrix at

http://www.oracle.com/technetwork/middleware/ias/downloads/fusion-certification-100350.html

Locate Oracle Enterprise Data Quality in the Product Area column and then click the System Requirements and Supported Platforms for Oracle Enterprise Data Quality (11.1.1.7.N) Certification Matrix (xls) link.

2.2.1.1 UNIX System Resource Limits

On UNIX systems, the operating system is configured with a default ulimit value (use the ulimit -a command to view the value). Depending on how you installed and configured UNIX, you may find that your application server user is unable to create files larger than 1 GB. This restricts your ability to work with large data sets if you are using files to transfer data. In this case, the hard ulimit on file size may need to be removed for your application server user.
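As a rough way to inspect the file-size limit described above, the following sketch reads it with Python's standard resource module (available on UNIX and Linux only). It assumes that you run it as the application server user; the helper names are hypothetical.

```python
import resource  # UNIX-only standard library module


def file_size_limits():
    """Return (soft, hard) file-size limits in bytes; None means unlimited."""
    soft, hard = resource.getrlimit(resource.RLIMIT_FSIZE)

    def to_bytes(value):
        return None if value == resource.RLIM_INFINITY else value

    return to_bytes(soft), to_bytes(hard)


def describe(limit):
    """Format a limit value for display."""
    return "unlimited" if limit is None else f"{limit} bytes"


soft, hard = file_size_limits()
print("soft file-size limit:", describe(soft))
print("hard file-size limit:", describe(hard))
```

If either value is reported as a number rather than as unlimited, files larger than that size cannot be created by this user.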

2.2.1.2 Virtual Hardware

You can install EDQ on virtualized systems using a virtualization tool, such as Oracle VM Server. Both the virtual system and the physical system it is deployed on must fulfill the minimum hardware requirements.

If load balancing software is used to deploy multiple virtual systems onto a single physical system, ensure that the load balancing software is carefully tuned. In general, EDQ imposes a load similar to an extract, transform, and load tool or data warehousing software. Between batches, very little load is imposed on the system. When processing a batch of data, EDQ rapidly drives hardware to be CPU or I/O bound. Unless the virtualized load balancing is correctly configured, performance will be suboptimal.

2.2.2 Choosing an Installation Combination

You can choose to install one of the following combinations, ensuring that it is supported on your installed operating system (see Section 2.2.1, "Hardware and Software Requirements"):

Application Server    Database

WebLogic              Oracle

WebSphere             Oracle

WebSphere             PostgreSQL

Tomcat                Oracle

Tomcat                PostgreSQL
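The combinations listed above can be encoded as a simple lookup, for example in a pre-installation check script. This is an illustrative sketch; the names SUPPORTED_COMBINATIONS and is_supported are hypothetical, and the check does not replace the Certification Matrix.

```python
# Application server / database pairs taken from the list above.
SUPPORTED_COMBINATIONS = {
    ("WebLogic", "Oracle"),
    ("WebSphere", "Oracle"),
    ("WebSphere", "PostgreSQL"),
    ("Tomcat", "Oracle"),
    ("Tomcat", "PostgreSQL"),
}


def is_supported(app_server: str, database: str) -> bool:
    """Return True if the pair appears in the list of installation combinations."""
    return (app_server, database) in SUPPORTED_COMBINATIONS


print(is_supported("Tomcat", "PostgreSQL"))
print(is_supported("WebLogic", "PostgreSQL"))
```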


2.2.3 Choosing User Accounts

An operating system user account is used to install and upgrade EDQ on your servers. This user must have full permissions (read, write, and execute) to the directories that will contain the EDQ installation files, the target installation directory, and all database directories. This requirement applies to all operating systems. This operating system user account is referred to as the EDQ installation user in this document.

The EDQ installation user is used to install your application server and database.

Note:

When installing on UNIX or Linux operating systems, do not use the root user as your EDQ installation user account.

For Tomcat and WebSphere, an application server user is necessary to create EDQ user accounts, tables, and schemas. For WebLogic, a user is automatically created for your EDQ domain when you run the WebLogic Configuration Wizard and is used to administer your EDQ domain and to log into the EDQ application.

Similarly, you need a database administrator user account that has the privileges to access the database, create schemas, and run the database product. This database administrator user account is used during the installation and configuration processes to create the database accounts specific to EDQ. This requirement applies to any supported database that you want to use with EDQ.

2.2.4 Installing the Java Development Kit

You must install a supported JDK because both EDQ and the application server rely on it. The JDK provides the Java run-time environment (JRE) and tools for compiling and debugging Java applications.

Identify the EDQ supported JDK that you want to install using the following table and the Oracle Enterprise Data Quality Certification Matrix (see Section 2.2.1, "Hardware and Software Requirements").

If You Are Installing:    You must use the:

WebSphere                 IBM JDK that is bundled with WebSphere.

On AIX                    IBM JDK. It is the only supported JDK, so it is used for all application servers.

On HP-UX                  HP JDK.


Download and install the Oracle JDK using the instructions provided at

http://www.oracle.com/technetwork/java/javase/downloads/index.html

During the installation of your application server, you are required to specify the directory into which you installed the JDK, so note its location. For example, the directory may be:

On Linux and UNIX: /opt/jdk1.7.0_40

On Windows: C:\Program Files\Java\jdk1.7.0_40

This directory path to your installation is referenced as the JDK_HOME directory in this document.

Note:

On Solaris systems, you must install both the 32-bit and 64-bit JDKs in order to run Java applications. Install these JDKs by following the instructions at the Oracle Java SE documentation website at

http://docs.oracle.com/javase/7/docs/webnotes/install/solaris/solaris-jdk.html

2.2.5 Installing the Application Server

You must install one of the supported application servers: WebLogic, WebSphere, or Tomcat (see Section 2.2.1, "Hardware and Software Requirements"). This section contains information specific to the installation or configuration of these application servers.

2.2.5.1 Installing WebLogic Server

The installation instructions, including how to obtain the product, are found in the Oracle WebLogic Server Installation Guide at

http://docs.oracle.com/cd/E23943_01/wls.htm

The directory path to your installation is referenced as the WL_HOME directory in this document.

Oracle recommends the use of managed servers in your EDQ domain and that you use WebLogic Node Manager to administer the servers in your domain. For more information, see Oracle Fusion Middleware Node Manager Administrator's Guide for Oracle WebLogic Server 11g Release 1.

2.2.5.2 Installing Tomcat

You can download the Tomcat Application Server, installation instructions, and all documentation, from the Apache Software Foundation Server web site at

http://tomcat.apache.org/

Configuring Tomcat

After you have installed Tomcat, you must configure it to use an Oracle JDK (not OpenJDK) by setting the JAVA_HOME variable. For example, JAVA_HOME="/opt/java/jdk1.7.0_25". Set this variable in your tomcat#.conf file so that it applies specifically to Tomcat; alternatively, you can add it to your setenv.sh file after installation.
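One way to confirm that a given JAVA_HOME does not point at OpenJDK is to inspect the banner printed by java -version. The following is a minimal sketch under that assumption; the function names are hypothetical, and the sample banner strings are illustrative rather than captured from a specific system.

```python
import subprocess
from pathlib import Path


def java_version_output(java_home: str) -> str:
    """Run JAVA_HOME/bin/java -version and return its banner text."""
    java = Path(java_home) / "bin" / "java"
    result = subprocess.run([str(java), "-version"], capture_output=True, text=True)
    # The java launcher writes the version banner to standard error.
    return result.stderr or result.stdout


def is_openjdk(version_output: str) -> bool:
    """Return True if the banner identifies an OpenJDK build."""
    return "openjdk" in version_output.lower()


# Illustrative banners only (not real captured output).
oracle_banner = 'java version "1.7.0_25"\nJava(TM) SE Runtime Environment\nJava HotSpot(TM) 64-Bit Server VM'
openjdk_banner = 'java version "1.7.0_25"\nOpenJDK Runtime Environment\nOpenJDK 64-Bit Server VM'
print(is_openjdk(oracle_banner), is_openjdk(openjdk_banner))
```

To check a real installation, pass the same path that you set for JAVA_HOME to java_version_output and feed the result to is_openjdk.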

Note:

Oracle recommends that you configure Tomcat to start as a service.

2.2.5.3 Installing WebSphere

You can download the WebSphere Application Server and installation instructions from the IBM WebSphere web site at

http://www-03.ibm.com/software/products/us/en/appserv-was/

Configuring WebSphere

After you have installed WebSphere, you must create a new profile that describes your EDQ WebSphere server.

2.2.6 Installing the Database

You must install one of the supported databases, Oracle or PostgreSQL. This section contains any information specific to the installation or configuration of these databases.

2.2.6.1 Oracle Database

You can download the Oracle Database product and installation instructions from the Oracle Database Documentation web site at

http://www.oracle.com/pls/db112/

Installation and configuration considerations:

  • Ensure that you select the Create and configure a database installation option.

  • Oracle recommends the following Oracle Database Memory Structure and tablespace configuration:

    • 4GB Program Global Area (PGA)

    • 4GB System Global Area (SGA)

    • 20GB undo tablespace

    • 20GB temp tablespace

    • Separate user tablespaces for configuration and results schemas

  • You may need to increase the values for the SESSIONS and PROCESSES parameters. The appropriate values for these parameters depend on your Oracle Database installation and intended use of EDQ, though the suggested values are:

    SESSIONS=500

    PROCESSES=500

    If you are unsure of the appropriate settings for these parameters, or how the values should be set, see Oracle Database Concepts 11g Release 1 (11.1) or contact your database administrator. For more information about the integration of Oracle Database with EDQ, see Oracle Enterprise Data Quality Architecture Guide.

  • You must configure your Oracle database to use a Unicode character set to ensure that EDQ is able to capture and process data in the widest range of character sets.

  • If required, multiple EDQ servers may share the same Oracle Database; each server must have dedicated Config and Results schemas within the database.

2.2.6.1.1 Installing Repository Creation Utility

EDQ requires the existence of schemas in your installed Oracle Database prior to installation. These schemas are created and loaded in your database using the Repository Creation Utility (RCU).

Note:

Do not use RCU when upgrading EDQ; use the instructions in Section 7, "Upgrading EDQ."

You must obtain the RCU product using the instructions found in the Oracle Fusion Middleware Repository Creation Utility User's Guide at

http://docs.oracle.com/cd/E28280_01/doc.1111/e14259/toc.htm

Note:

On Windows operating systems, make sure that you do not unzip the RCU .zip file to a directory name containing spaces.

The directory you unzip the product into will be referred to as the RCU_HOME directory in this guide.

2.2.6.2 PostgreSQL

You can download the PostgreSQL product and installation instructions from the PostgreSQL website at

http://www.postgresql.org/

Note:

PostgreSQL may be distributed with your operating system, so you must verify that the release is one of the releases supported by EDQ.

Note:

The EDQ RCU does not run on PostgreSQL databases so you must manually set up your users and databases as described in this section. Then you must configure PostgreSQL, install EDQ, and then run the EDQ Configuration Application to create the required tables as described in the remaining chapters.

Installation and configuration considerations:

  • If you are installing on Windows, Oracle recommends that you use the graphical installer that you can download from the PostgreSQL web site at

    http://www.postgresql.org/download/windows/

  • Allow a maximum of 403 connections by editing the postgresql.conf file in the PostgreSQL data directory (for example, /var/lib/pgsql/data/postgresql.conf).

  • You must configure your PostgreSQL database to use a Unicode character set to ensure that EDQ is able to capture and process data in the widest range of character sets.
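The connection limit mentioned in the list above is controlled by the max_connections parameter in postgresql.conf. The following sketch shows one way to update that setting in the file's text; the function name set_max_connections is hypothetical, and the sketch operates on a string so that you can review the result before writing it back. A change to this parameter takes effect only after PostgreSQL is restarted.

```python
import re

# Matches active (uncommented) max_connections lines only.
_SETTING = re.compile(r"^[ \t]*max_connections[ \t]*=.*$", re.MULTILINE)


def set_max_connections(conf_text: str, value: int = 403) -> str:
    """Return conf_text with max_connections set to value.

    Active settings are replaced (any trailing comment on the line is dropped).
    If no active setting exists, one is appended at the end of the file.
    """
    new_line = f"max_connections = {value}"
    if _SETTING.search(conf_text):
        return _SETTING.sub(new_line, conf_text)
    separator = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return conf_text + separator + new_line + "\n"


sample = "max_connections = 100\t# (change requires restart)\nport = 5432\n"
print(set_max_connections(sample))
```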

Configure your schema for installation as the EDQ repository as follows:

  • Create two new PostgreSQL users, named config and results.

  • Create a schema within your database, named config and owned by the config user.

  • Create a second schema within your database, named results and owned by the results user.
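The users and schemas described above can be created with standard PostgreSQL statements. The following sketch only generates the statements so that a database administrator can review them and run them, for example with psql. The passwords are placeholders that you must supply, and the helper names are hypothetical.

```python
def _quote_literal(value: str) -> str:
    """Quote a string as an SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def repository_setup_sql(passwords: dict) -> list:
    """Return SQL statements that create the config and results users and schemas.

    passwords maps each user name ("config", "results") to its password.
    """
    statements = []
    for name in ("config", "results"):
        statements.append(
            f"CREATE USER {name} WITH PASSWORD {_quote_literal(passwords[name])};"
        )
        # Each schema is named after, and owned by, its user.
        statements.append(f"CREATE SCHEMA {name} AUTHORIZATION {name};")
    return statements


# Placeholder passwords for illustration only.
for statement in repository_setup_sql({"config": "change_me_1", "results": "change_me_2"}):
    print(statement)
```

Run the generated statements while connected to the database that you intend to use as the EDQ repository.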

On Linux systems, configure PostgreSQL to:

  • Use password authentication by editing the pg_hba.conf file in the PostgreSQL data directory (for example, /var/lib/pgsql/data/pg_hba.conf) and changing the ident sameuser entries to md5.
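The pg_hba.conf change described above can be sketched as a text transformation. This example is an illustration only: the function name use_md5_auth is hypothetical, comment lines are left untouched, and you should back up the file and reload PostgreSQL after editing it.

```python
import re


def use_md5_auth(pg_hba_text: str) -> str:
    """Return pg_hba.conf text with 'ident sameuser' entries changed to 'md5'."""
    updated = []
    for line in pg_hba_text.splitlines():
        # Leave comment lines exactly as they are.
        if not line.lstrip().startswith("#"):
            line = re.sub(r"\bident\s+sameuser\b", "md5", line)
        updated.append(line)
    trailing_newline = "\n" if pg_hba_text.endswith("\n") else ""
    return "\n".join(updated) + trailing_newline


sample = (
    "# TYPE  DATABASE  USER  CIDR-ADDRESS   METHOD\n"
    "host    all       all   127.0.0.1/32   ident sameuser\n"
)
print(use_md5_auth(sample))
```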

2.3 Product Distribution

The EDQ installation and configuration files are distributed as a generic package installer, which you download from the Oracle Software Delivery Cloud web site as follows:

  1. Enter the Oracle Software Delivery Cloud URL into a web browser:

    http://edelivery.oracle.com/

  2. Click Sign-in/Register.

    Note:

    If you are not already logged in, the Oracle Single Sign-On page appears. Enter your Oracle user id and password and click Sign In.

    The Terms & Restrictions page appears.

  3. Click the Oracle Software Delivery Cloud Trial License Agreement and the Export Restrictions check boxes, and then click Continue.

    The Media Pack Search page appears.

  4. On the Media Pack Search page, do the following:

    1. Click the Select a Product Pack drop-down list and select E-Business Suite (if you purchased the product from the Application Price List) or Oracle Fusion Middleware (if you purchased the product from the Technology Price List).

    2. Click the Platform drop-down list and select the platform on which you are installing EDQ.

    3. Click Go.

    The Results list expands to show all available media packs that include your search criteria.

  5. Locate and select the Oracle Enterprise Data Quality 11.0 Media Pack (11.1.1) (E-Business Suite Product Pack) or Oracle Enterprise Data Quality 11.1.1.9.# (11.1.1) Media Pack (Oracle Fusion Middleware Product Pack) option, and then click Continue.

  6. Click the Download button for Oracle Enterprise Data Quality 11.1.1.9.#.

  7. Browse to the directory where you want to save the file, and then click Save to start the file download. A ZIP file is downloaded.

  8. Extract the ZIP file to the following directory:

    On Linux and UNIX: /opt/edq_install

    On Windows: C:\edq_install

    The installation directory now contains the edq directory. The installers are in the edq/Disk1/ directory. You now have all of the files necessary to install EDQ, though additional software may be required as described in the following section.
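The extraction in step 8 can also be scripted. The following is a minimal sketch that uses Python's standard zipfile module; the function name extract_installer is hypothetical, and the name of the downloaded ZIP file is not specified in this guide, so you must supply its actual path.

```python
import zipfile
from pathlib import Path


def extract_installer(zip_path: str, target_dir: str) -> Path:
    """Extract the downloaded ZIP file into target_dir and return the installer directory."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Per the text above, the installers are in the edq/Disk1/ directory.
    return target / "edq" / "Disk1"
```

For example, on Linux you might call extract_installer with the path to your downloaded file and /opt/edq_install as the target directory, and then confirm that the returned directory exists.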

2.4 Next Step

Go to Section 3, "Configuring Your EDQ Database Schemas" to continue with the installation.