Oracle® Business Intelligence Server Administration Guide > Using XML as a Data Source for the Oracle BI Server > Using the Oracle BI Server XML Gateway >

Accessing HTML Tables


The Oracle BI Server XML Gateway also supports the use of tables in HTML files as a data source. The HTML file may be identified as a URL pointing to a file on the internet (including intranet or extranet) or as a file on a local or network drive.

Even though tables, defined by the <table> and </table> tag pair, are native constructs of the HTML 4.0 specification, they are often used by Web designers as a general formatting device to achieve specific visual effects rather than as a data structure. The Oracle BI Server XML Gateway is currently the most effective in extracting tables that include specific column headers, defined by <th> and </th> tag pairs.

For tables that do not contain specific column headers, the Oracle BI Server XML Gateway employs some simple heuristics to make a best effort to determine the portions of an HTML file that appear to be genuine data tables.

The following is a sample HTML file with one table.

<html>
   <body>
      <table border=1 cellpadding=2 cellspacing=0>
         <tr>
            <th colspan=1>Transaction</th>
            <th colspan=2>Measurements</th>
         </tr>
         <tr>
            <th>Quality</th>
            <th>Count</th>
            <th>Percent</th>
         </tr>
         <tr>
            <td>Failed</td>
            <td>66,672</td>
            <td>4.1%</td>
         </tr>
         <tr>
            <td>Poor</td>
            <td>126,304</td>
            <td>7.7%</td>
         </tr>
         <tr>
            <td>Warning</td>
            <td>355,728</td>
            <td>21.6%</td>
         </tr>
         <tr>
            <td>OK</td>
            <td>1,095,056</td>
            <td>66.6%</td>
         </tr>
         <tr>
            <td colspan=1>Grand Total</td>
            <td>1,643,760</td>
            <td>100.0%</td>
         </tr>
      </table>
   </body>
</html>

The table name is derived from the HTML filename, and the column names are formed by concatenating the headings (defined by the <th> and </th> tag pairs) for the corresponding columns, separated by an underscore.

Assuming that our sample file is named 18.htm, the table name would be 18_0 (because it is the first table in that HTML file), with the following column names and their corresponding fully qualified names.

Column
Fully Qualified Name

Transaction_Quality

\\18_0\Transaction_Quality

Measurements_Count

\\18_0\Measurements_Count

Measurements_Percent

\\18_0\Measurements_Percent

If the table column headings appear in more than one row, the column names are formed by concatenating the corresponding field contents of those header rows.

For tables without any heading tag pairs, the Oracle BI Server XML Gateway assumes the field values (as delimited by the <td> and </td> tag pairs) in the first row to be the column names. The columns are named by the order in which they appear (c0, c1, and so on).

For additional examples of XML, refer to XML Examples.

Oracle® Business Intelligence Server Administration Guide Copyright © 2007, Oracle. All rights reserved.