13 Limits in EDQ

This chapter describes the guidelines on various limits that EDQ applies when reading in and writing data:

  • Oracle VARCHAR Columns in EDQ Results Schema

    Oracle databases 12c and later can be configured to support 32767 bytes or 4000 bytes as the maximum size of VARCHAR columns. When EDQ starts up the maximum size is detected automatically and all VARCHAR columns in results tables are created with this size.

    EDQ version 12.2.1.4.3 onwards you can configure individual columns to be marked as "long". Columns marked as long are created as VARCHAR(32767) whereas other columns are created as VARCHAR(4000).

    You can control the default column sizes by setting the following properties in the file director.properties:
    • oracle.default.string.size: Sets the size used for any column not marked as "long". The default value is 4000.
    • oracle.max.string.size: Sets the size used for any column marked as "long". The default value is 32767 for databases with extended strings enabled, and 4000 otherwise.

    Note that the long column selection flags are not shown if the values for oracle.default.string.size and oracle.max.string.size are equal. This could happen, for example, with databases that do not have extended strings enabled.

    This column size cannot be larger than the limit imposed by the database (4000 or 32767).

    When executing a snapshot, values longer than the maximum string size are truncated automatically, and marked as truncated in the results display for the snapshot. All other data (processor results and match data, for example) are not truncated and the process fails with an ORA-01461: can bind a LONG value only for insert into a LONG column or ORA-12899: value too large for column error, if a written value is longer than the maximum value.

    If a results table column contains long values, indexing for the column fails with an ORA-01450: maximum key length (string) exceeded error. This is recorded in the EDQ logs as a warning but will not cause a process to fail.

    If you wish to truncate snapshot columns to a length less than the configured maximum, set the snapshot.max.string.size in the file director.properties. For example:
    snapshot.max.string.size = 3000

    This is useful in avoiding the ORA-01450 errors on snapshot column indexes or if data calculated from snapshot column may exceed the limit for results tables.

    Note:

    All VARCHAR columns in tables created for exports are limited to 4000 bytes. If you want to export to tables with longer columns, create the table first before configuring the export.
  • Row limit when using the xls format for Microsoft Excel - When configuring a Microsoft Excel Data Store that uses a .xls extension, exports are limited to 65536 rows. This is because such file types support only up to this specified limit. Larger volumes of data should always use the .csv export format, though it is also possible to use the .xlsx format, provided there is sufficient memory.

    Go through the best practice guidelines listed below for a better understanding.

  • Practical row limits when using Microsoft Excel - EDQ supports direct reading and writing of Excel files using both client-side and server-side data stores.

    Follow the below best practice guidelines:

    • The buttons on the Results Browser that enable easy sharing of results to Excel are designed for sharing any results summaries (results views showing statistics rather than data) and small samples (up to 1000 records) of data. They should not be used to attempt to export large volumes of data as this is very likely to breach client-side memory limits

    • Always use CSV file formats for writing out large volumes of data. CSV files can imported easily into Excel for data viewing.

    • If Excel is used, specify an XLSX (not XLS) file extension, and enable the option to always overwrite the file (stream data) on export. Files are always streamed during snapshots (during data import). Streaming mode uses significantly less memory when writing large XLSX files, but does not preserve worksheets and does not support append mode.