Schema Definition
Schema Definition provides control over dataset structures while maintaining flexibility.
- Allows predefined or overridden data types and formats for datasets.
- Provides uniformity across multiple datasets while permitting necessary adjustments.
- Enables you to define a schema separately as a template or embedded within a dataset as an inline statement.
- Allows you to reference by name within a dataset when you've a schema defined separately.
- Becomes part of a code block when you've embedded it. See Code Block.
Syntax
For information on schema statements (template_schema_statement or builtin_schema_statement) in the dataset definition code_block, see Generic Dataset Definition.
Schema Definition Rules
- Column data types are mandatory;
PRIMARYKEYis optional. - By default, all columns are nullable unless you explicitly specify otherwise.
- If you don't define
PRIMARYKEYin the schema, it must be provided by the source dataset.
- Primary Key
- When you define
PRIMARYKEYin both the schema and the dataset, the dataset value takes precedence. - For datasets with row source joins, you can omit
PRIMARYKEYfrom the dataset if you've already specified it in the schema.
- When you define
- Data Type
- When you define a column's data type in both the schema and the dataset, the data type that you provide in the dataset overrides the dataset in the schema.
Column Mismatch Handling
- Extra Columns in the Dataset: If a column appears in the dataset but not in the schema, its properties are derived from the source.
- Extra Schema Columns: If you define a column in the schema but don't map it in the dataset, it's ignored unless it's part of the primary key. If so, a warning is issued.
Schema Template Definition
You can use DEFINE SCHEMA, the dataset definition code block, to define a schema separately from a dataset definition, refer to it, and apply it to any dataset.
schema_definition ::= DEFINE SCHEMA schema_name
'[ '
column_name data_type [ PRIMARYKEY ] [ nullable_flag ]
[,column_name data_type [ PRIMARYKEY ] [ nullable_flag ]] ...
']'
END
schema_name ::= identifier
template_schema_statement ::= SCHEMA schema_name ; //used in Dataset Definitiontemplate_schema_statement:IMPORT SOURCE CUSTOMERS
DEFINE SCHEMA CUSTOMERS_D_SCHEMA
[
CUST_ID NUMBER(38,0) PRIMARYKEY,
CUST_LAST_NAME VARCHAR2(32),
CUST_CITY_ID NUMBER(38,0),
CUST_VALID VARCHAR2(32),
CUST_EFF_FROM DATE NOT NULL,
CUST_EFF_TO DATE
]
END
//Usage in Dataset Definition
DEFINE DATASET CUSTOMERS_D
SCHEMA CUSTOMERS_D_SCHEMA;
ROWSOURCE CUSTOMERS;
THIS = CUSTOMERS;
END
Inline Schema Definition
Within the dataset definition code block, you can define a schema directly within a dataset definition, applying it only to that specific dataset.
builtin_schema_statement ::= SCHEMA
'[ '
column_name data_type [ PRIMARYKEY ] [ nullable_flag ]
[,column_name data_type [ PRIMARYKEY ] [ nullable_flag ]] ...
']' ;
builtin_schema_statement:DEFINE DATASET INS_CUSTOMERS_D
SCHEMA
[
CUST_ID NUMBER(38,0) PRIMARYKEY,
CUST_LAST_NAME. VARCHAR2(32),
CUST_CITY_ID NUMBER(38,0),
CUST_VALID VARCHAR2(32),
CUST_EFF_FROM DATE NOT NULL,
CUST_EFF_TO DATE
];
ROWSOURCE CUSTOMERS;
THIS = CUSTOMERS;
END
Code Block Load and Schema Definition
When you separate the full load code from the incremental load within a dataset definition, the inline schema is defined at the beginning of the code_block_load.
code_block_load defined with template_schema_statement:DEFINE DATASET FL_CUSTOMERS_D
SCHEMA CUSTOMERS_D_SCHEMA;
ON FULL LOAD
ROWSOURCE CUSTOMERS WHERE CUSTOMERS.CUST_VALID = ‘A’;
THIS = CUSTOMERS;
INCREMENTAL LOAD
ROWSOURCE CUSTOMERS;
THIS = CUSTOMERS;
ENDLOAD
END
code_block_load defined with builtin_schema_statement:DEFINE VERSIONED DATASET INSCH_FL_CUSTOMERS_CF
SCHEMA
[
CUST_ID NUMBER(38,0) PRIMARYKEY,
CUST_LAST_NAME VARCHAR2(32),
CUST_CITY_ID NUMBER(38,0),
CUST_VALID VARCHAR2(32),
CUST_EFF_FROM DATE NOT NULL,
CUST_EFF_TO DATE
];
ON FULL LOAD
ROWSOURCE CUSTOMERS WHERE CUSTOMERS.CUST_VALID = ‘A’;
THIS = CUSTOMERS;
INCREMENTAL LOAD
ROWSOURCE CUSTOMERS;
THIS = CUSTOMERS;
ENDLOAD
END