8 Microsoft Excel

This chapter describes how to work with Microsoft Excel in Oracle Data Integrator.

This chapter includes the following sections:

8.1 Introduction

Oracle Data Integrator (ODI) integrates data stored into Microsoft Excel workbooks. It allows reverse-engineering as well as read and write operations on spreadsheets.

Oracle Data Integrator uses Open Database Connectivity (ODBC) to connect to a Microsoft Excel data server. See Section 8.2.3, "Connectivity Requirements" for more details.

8.1.1 Concepts

A Microsoft Excel data server corresponds to one Microsoft Excel workbook (.xls file) that is accessible through your local network. A single physical schema is created under this data server.

Within this schema, a spreadsheet or a given named zone of the workbook appears as a datastore in Oracle Data Integrator.

8.1.2 Knowledge Modules

Oracle Data Integrator provides no Knowledge Module (KM) specific to the Microsoft Excel technology. You can use the generic SQL KMs to perform the data integration and transformation operations of Microsoft Excel data. See Chapter 4, "Generic SQL" for more information.

Note:

Excel technology cannot be used as the staging area, does not support incremental update or flow/static check. As a consequence, the following KMs will not work with the Excel technology:
  • RKM SQL (JYTHON)

  • LKM File to SQL

  • CKM SQL

  • IKM SQL Incremental Update

  • IKM SQL Control Append

  • LKM SQL to SQL (JYTHON)

8.2 Installation and Configuration

Make sure you have read the information in this section before you start using the Microsoft Excel Knowledge Module:

8.2.1 System Requirements and Certifications

Before performing any installation you should read the system requirements and certification documentation to ensure that your environment meets the minimum installation requirements for the products you are installing.

The list of supported platforms and versions is available on Oracle Technical Network (OTN):

http://www.oracle.com/technology/products/oracle-data-integrator/index.html.

8.2.2 Technology Specific Requirements

There are no technology-specific requirements for using Microsoft Excel files in Oracle Data Integrator.

8.2.3 Connectivity Requirements

This section lists the requirements for connecting to a Microsoft Excel workbook.

To be able to access Microsoft Excel data, you need to:

Install the Microsoft Excel ODBC Driver

Microsoft Excel workbooks can only be accessed through ODBC connectivity. The ODBC Driver for Excel must be installed on your system.

Declare a Microsoft Excel ODBC Data Source

An ODBC data source must be defined for each Microsoft Excel workbook (.xls file) that will be accessed from ODI. ODBC datasources are created with the Microsoft ODBC Data Source Administrator. Refer to your Microsoft Windows operating system documentation for more information on datasource creation.

8.3 Setting up the Topology

Setting up the Topology consists in:

  1. Creating a Microsoft Excel Data Server

  2. Creating a Microsoft Excel Physical Schema

8.3.1 Creating a Microsoft Excel Data Server

A Microsoft Excel data server corresponds to one Microsoft Excel workbook (.xls file) that is accessible through your local network.

Create a data server for the Microsoft Excel technology using the standard procedure, as described in "Creating a Data Server" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator. This section details only the fields required or specific for defining a Microsoft Excel Data Server:

  1. In the Definition tab:

    • Array Fetch Size: 1

    • Batch Update Size: 1

  2. In the JDBC tab:

    • JDBC Driver: sun.jdbc.odbc.JdbcOdbcDriver

    • JDBC URL: jdbc:odbc:<odbc_dsn_alias>

      where <odbc_dsn_alias> is the name of your ODBC data source.

    WARNING:

    To access a Microsoft Excel workbook via ODBC, you must first ensure that this workbook is not currently open in a Microsoft Excel session. This can lead to unexpected results.

8.3.2 Creating a Microsoft Excel Physical Schema

Create a Microsoft Excel Physical Schema using the standard procedure, as described in "Creating a Physical Schema" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.

Note that Oracle Data Integrator needs only one physical schema for each Microsoft Excel data server. If you wish to connect a different workbook, a different data server must be created to connect a ODBC datasource corresponding to this other workbook.

Create for this physical schema a logical schema using the standard procedure, as described in "Creating a Logical Schema" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator and associate it in a given context.

Note:

An Excel physical schema only has a data schema, and no work schema. Microsoft Excel cannot be used as the staging area of an interface.

8.4 Setting Up an Integration Project

Setting up a Project using the Microsoft Excel follows the standard procedure. See "Creating an Integration Project" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.

Import the following generic SQL KMs into your project for getting started with Microsoft Excel:

  • LKM SQL to SQL

  • IKM SQL to SQL Append

See Chapter 4, "Generic SQL" for more information about these KMs.

8.5 Creating and Reverse-Engineering a Microsoft Excel Model

This section contains the following topics:

8.5.1 Create a Microsoft Excel Model

A Microsoft Excel Model is a set of datastores that correspond to the tables contained in a Microsoft Excel workbook.

Create a Microsoft Excel Model using the standard procedure, as described in "Creating a Model" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.

8.5.2 Reverse-engineer a Microsoft Excel Model

Microsoft Excel supports only the Standard reverse-engineering, which uses only the abilities of the ODBC driver.

Oracle Data Integrator reverse-engineers:

  • Spreadsheets: Spreadsheets appear as system tables. Such a table is named after the spreadsheet name, followed with a dollar sign ($). This table's columns are named after the first line of the spreadsheet. Note that new records are added at the end of the spreadsheet.

  • Named Cell Ranges in a spreadsheet. These will appear as tables named after the cell range name. Depending on the scope of a name, the table name may be prefixed by the name of the spreadsheet (in the following format: <spreadsheet_name>$<zone_name>). The columns for such a table are named after the first line of the cell range. Note that new records are added automatically below the named cell. It is possible to create a blank named cell range that will be loaded using ODI by naming a cell range that contains only the first header line.

    In most Microsoft Excel versions, you can simply select a cell range and use the Name a Range... popup menu to name this range. See the Microsoft Excel documentation for conceptual information about Names and how to define a cell range in a spreadsheet.

Standard Reverse-Engineering

To perform a Standard Reverse-Engineering on Microsoft Excel use the usual procedure, as described in "Reverse-engineering a Model" of the Oracle Fusion Middleware Developer's Guide for Oracle Data Integrator.

Note:

On the Reverse Engineer tab of your Model, select in the Types of objects to reverse-engineer section Table and System Table to reverse-engineer spreadsheets and named cell ranges.

8.6 Designing an Interface

You can use a Microsoft Excel file as a source or a target of an integration interface, but NOT as the staging area

The KM choice for an interface or a check determines the abilities and performances of this interface or check. The recommendations below help in the selection of the KM for different situations concerning a Microsoft Excel server.

8.6.1 Loading Data From and to Microsoft Excel

Microsoft Excel can be used as a source or a target of an interface. The LKM choice in the Interface Flow tab to load data between Microsoft Excel and another type of data server is essential for the performance of an interface.

8.6.1.1 Loading Data from Microsoft Excel

Oracle Data Integrator does not provide specific knowledge modules for Microsoft Excel. Use the Generic SQL KMs or the KMs specific to the technology used as the staging area. The following table lists some generic SQL KMs that can be used for loading data from Microsoft Excel to any staging area.

Table 8-1 KMs to Load from Microsoft Excel

Target or Staging Area KM Notes

Oracle

LKM SQL to Oracle

Loads data from any ISO-92 database to an Oracle target database. Uses statistics.

SQL

LKM SQL to SQL

Loads data from any ISO-92 database to any ISO-92 compliant target database.

Sybase

LKM SQL to Sybase (bcp)

Loads data from any ISO-92 compliant database to a Sybase ASE Server database. Uses Bulk Loading.

Microsoft SQL Server

LKM SQL to MSSQL (bulk)

Loads data from any ISO-92 database to a Microsoft SQL Server target database. Uses Bulk Loading.


8.6.1.2 Loading Data to Microsoft Excel

Because Microsoft Excel cannot be used as staging area you cannot use a LKM to load data into Microsoft Excel. See Section 8.6.2, "Integrating Data in Microsoft Excel" for more information on how to integrate data into Microsoft Excel.

8.6.2 Integrating Data in Microsoft Excel

Oracle Data Integrator does not provide specific knowledge modules for Microsoft Excel. Use the Generic SQL KMs or the KMs specific to the technology used as the staging area. For integrating data from a staging area to Microsoft Excel, you can use, for example the IKM SQL to SQL Append.

8.7 Troubleshooting

This section provides information on how to troubleshoot problems that you might encounter when using the Microsoft Excel technology in Oracle Data Integrator. It contains the following topics:

8.7.1 Decoding Error Messages

Errors appear often in Oracle Data Integrator in the following way:

java.sql.SQLException: java.sql.SQLException: [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified RC=Oxb
at ... ...

the java.sql.SQLException code simply indicates that a query was made through the JDBC-ODBC bridge, which has returned an error. This error is frequently a database or driver error, and must be interpreted in this direction.

Only the part of text in italic must first be taken in account. It must be searched in the ODBC driver or Excel documentation. If its contains a specific error code, like here in bold italic, the error can be immediately identified.

If such an error is identified in the execution log, it is necessary to analyze the SQL code to find the source of the error. The code is displayed in the description tab of the task in error.

The most common errors with Excel are detailed below, with their principal causes.

8.7.2 Common Problems and Solutions

This section describes common problems and solutions.

  • UnknownDriverException

    The JDBC driver is incorrect. Check the name of the driver.

  • [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified RC=0xb Datasource not found or driver name not specified

    The ODBC Datasource specified in the JDBC URL is incorrect.

  • The Microsoft Jet Database engine could not find the object <object name>

    The table you are trying to access does not exist or is not defined in the Excel spreadsheet.

  • Too few parameters. Expected 1.

    You are trying to access an nonexisting column in the Excel spreadsheet.

  • Operation must use an updateable query.

    This error is probably due to the fact that you have not unchecked the "read only" option when defined the Excel DSN. Unselect this option and re-execute your interface.

  • DBCS or UTF-16 data is corrupted when loaded.

    This error is due to the fact that the JDBC-ODBC Bridge of the Java machine does not support UTF-16 data. This is a known issue in the Sun JVM that is solved in the later releases (1.7).