|Oracle9i OLAP Developer's Guide to the OLAP DML
Release 2 (9.2)
Part Number A95298-01
Reading Data from Files, 6 of 9
The records in a data file often contain dimension values, which are used to identify the cell in which the data values should be stored. When all of the dimension values in the file already exist in your analytic workspace, you can use the default attribute
MATCH in the dimension field description.
MATCH accepts only dimension values that already are in the analytic workspace.
FILEREAD finds an unrecognized value, the command signals an error that warns you about the bad data. Your data-reading program can handle the error by skipping the data and continuing processing, or by halting the processing and letting you check the validity of the data file.
The following example shows a data file that contains 6-character values for the dimension
productid, names for each product, and the number of units sold.
1234AA00CHOCOLATE CHIP COOKIES 123 1099BB00OATMEAL COOKIES 145 2344CC00SUGAR COOKIES 223 3222DD00BROWNIES 432 5553EE00GINGER SNAP COOKIES 233
The following workspace objects are used by the example program.
DEFINE productid DIMENSION ID DEFINE productname VARIABLE TEXT <productid> DEFINE units.sold VARIABLE INTEGER <month productid>
dr.prog program reads the file. The values of
productid with the associated product name are already part of the analytic workspace, so the program uses the
productid values only to set status and assign the units data to the right cells of the
MATCH attribute is left out of the field description because it is the default. When the program finds a value for
productid that is not in the analytic workspace, it branches to the trap label. If the user interrupts the program (that is, the error name is
attn) or the data file cannot be opened, then the program ends. Otherwise, the program resets the error trap and branches back to
FILEREAD to continue with the next record.
The example program, named
dr.prog, has the following definition.
DEFINE dr.prog PROGRAM LD Reads a file with existing dimension values PROGRAM VARIABLE funit INTEGER TRAP ON error PUSHLEVEL 'save' PUSH month productid LIMIT month TO FIRST 1 funit = FILEOPEN('olapfiles/dr.dat' READ) next: FILEREAD funit - COLUMN 1 WIDTH 6 productid - COLUMN 39 WIDTH 3 units.sold FILECLOSE funit POPLEVEL 'save' RETURN error: "Skip current record and continue processing IF funit NE na and ERRORNAME NE 'ATTN' THEN DO TRAP ON error GOTO next DOEND "Close the file IF funit NE na THEN FILECLOSE funit POPLEVEL 'save' END
When your data file contains a mixture of existing and new dimension values, you can add the new values and all the associated data to the analytic workspace by using the
APPEND attribute in the field description.
FILEREAD command in the
dr.prog2 program uses
APPEND to add any new
productid values to the analytic workspace. The second
FILEREAD command includes a field to read the product name so the new data will be complete.
The dimension maintenance performed by
APPEND might be done in the same
FILEREAD command that reads the data, but that would cause inefficient handling of the data. The data is handled more efficiently when the dimension maintenance and data reading are performed in two separate passes over the file.
The error processing in this version is shorter because there is no need to skip nonexistent product values and branch back. If there is an error, then the program closes the file, restores any pushed values, and terminates.
The program, named
dr.prog2, has the following definition.
DEFINE dr.prog2 PROGRAM LD Reads a file with new dimension values PROGRAM VARIABLE funit INTEGER TRAP ON error PUSHLEVEL 'save' PUSH month productid LIMIT month TO FIRST 1 funit = FILEOPEN('olapfiles/dr.dat' READ) FILEREAD funit COLUMN 1 APPEND WIDTH 6 productid FILECLOSE funit funit = FILEOPEN('olapfiles/dr.dat' READ) FILEREAD funit - COLUMN 1 WIDTH 6 productid - COLUMN 9 WIDTH 30 productname - COLUMN 39 WIDTH 3 units.sold FILECLOSE funit POPLEVEL 'save' RETURN error: IF funit NE na THEN FILECLOSE funit POPLEVEL 'save' END
If the target dimension has a data type of
ID and the input field in the file contains dimension position numbers (rather than dimension values), then you must specify a conversion type of
INTEGER in the field description. The conversion type specifies how input data should be converted to values of the target dimension.
Suppose the target dimension is
month, then you can use the following command to read input values that represent positions within the default status of
When the input field contains position numbers, you cannot use the
APPEND keyword to add new values to a target dimension.
Composites are automatically maintained. The way in which you define and use composites can dramatically improve or hinder performance. The more you know about analytic workspace design, especially in regard to the applications that will be used with an analytic workspace, the more effective your use of composites will be.
When you have conjoint dimensions in your analytic workspace, you can set the status of those dimensions while reading a file with the
FILEREAD command. Typically, the records in the data file will have a separate field for each base dimension of your conjoint dimension. For example, a file might have a market name in the first field, a product name in the second, and then one or more fields containing sales data.
To read the sales data into a variable dimensioned by a conjoint dimension, for example
markprod, you can use a
FILEREAD command as follows.
This command will read a value of the
market dimension from the first 8-character field of the record and a value of the
product dimension from the next 8-character field.
The command will then use the results to set the status of
markprod, which is a conjoint dimension defined as follows.
The command then reads the last field and assigns the value to the variable
sales, which is dimensioned by
By including the
APPEND keyword in the field description, you can add new values to
markprod, when the
FILEREAD command encounters values in the file that do not match existing dimension values.
The fields containing dimension information in your data file might have values that are not identical to the dimension values in your analytic workspace. The file values might be abbreviated or otherwise encoded. The way you translate a coded dimension value varies depending on whether the code is merely an abbreviation (for example, "P" for
product) or if the code is more complicated.
When the file contains an abbreviated code, you can sometimes complete the value by using the
LSET attribute to add text to the right or left of the value in the file.
For example, products in the file might be identified by all-numeric product numbers, while in your analytic workspace, the values of the
product dimension might be these same product numbers preceded by the letter
P. In this case, you can use the
LSET attribute to add the letter
P to the values in the file.
P is added when the value is read from the file; it is not added when the modified value is matched with or assigned to the
To correctly read values that have less straightforward codes, you can set up another dimension containing the coded values found in the data file, along with a relation to the real dimension.
FILEREAD can then use the relation to determine the actual dimension value. Or you can use any OLAP DML function to alter or manipulate the coded value to make it match a value in your analytic workspace.
When reading coded data that must be manipulated in some way before being stored in the target, use an assignment statement (shown below) in the field description.
expression argument specifies the processing or calculation to be performed. If you want to include the value just read from the file as part of the
expression, then use the
Both of the following field descriptions function identically.
This example illustrates the use of an expression for translating codes into dimension values.
The following example shows the data file, which has 3-character codes for months, and 2-character codes for districts and products.
SEP BO CH 113945 115 OCT BO CH 118934 115 SEP BO CO 92013 119 OCT BO CO 95820 119 SEP BO WI 83201 110 OCT BO WI 82986 110 SEP DA CH 111792 115 OCT DA CH 136031 114 SEP DA CO 91641 121 OCT DA CO 96347 120 SEP DA WI 89734 109 OCT DA WI 88264 109
The following OLAP DML objects are used by the example program.
DEFINE distcode DIMENSION ID DEFINE district.dcode RELATION district <distcode> DEFINE prodcode DIMENSION ID DEFINE Product.pcode RELATION product <prodcode>
The example program, named
dr.prog3, has the following definition.
DEFINE dr.prog3 PROGRAM LD Translates coded values into valid dimension values PROGRAM VARIABLE funit INT funit = FILEOPEN('olapfiles/dr3.dat' READ) FILEREAD funit - COLUMN 1 WIDTH 3 APPEND RSET '96' month FILECLOSE funit funit = FILEOPEN('olapfiles/dr3.dat' READ) FILEREAD funit - COLUMN 1 WIDTH 3 RSET '96' month - COLUMN 5 WIDTH 2 district = district.dcode - (distcode VALUE) - COLUMN 8 WIDTH 2 product = product.pcode - (prodcode VALUE) - COLUMN 11 WIDTH 6 STRIP units - COLUMN 18 WIDTH 3 SCALE 2 price FILECLOSE funit END
The program translates the 2-character codes for districts and products into values of a
district dimension and a
product dimension. The program also appends a 2-digit year to the months.
In the first
FILEREAD command, the
APPEND keyword is used so that new months are added to the MONTH dimension.
For the district and product fields, the program reads the value from the data file and finds the corresponding dimension value using the relations
COLUMN 5 WIDTH 2 district = district.dcode distcode VALUE) COLUMN 8 WIDTH 2 product = product.pcode (prodcode VALUE)
The program uses a QDR with the keyword
VALUE representing the code read from the data file. For the districts, the
distcode VALUE QDR modifies the relation
district.dcode, which holds district names. It specifies the district that corresponds to the value of
distcode just read from the data file. The QDR for
product works the same way.
The program assumes the
district dimension values are already in the analytic workspace, along with the
prodcode dimensions and the relations connecting them to
product. Once the coded values have been processed, the resulting values of
product are used to limit the dimension status so that the data is put in the right cells of the
Finally, you can see in the data file that the price data, which starts in column 18, does not have a decimal point. The
SCALE attribute on the last line of the
FILEREAD command puts two decimal places in each price value.