General Schema Mapping Rule
This section explains the general Avro conversion rules for the following scenarios:
- Complete and partial matches
- Duplicates
- Duplicates with complete and partial matches
- Combining fields across different data structures
- Undefined fields
All examples in this section are based on the following sample schema:
{
"type": "record",
"name": "ContactAndKids",
"fields": [
{
"name": "MyContact",
"siebelName": "Contact",
"type": {
"type": "record",
"name": "Contact",
"fields": [
{
"name": "ContactId",
"type": "string"
},
{
"name": "First_Name",
"type": "string",
"siebelName": "First Name"
}
]
}
}
]
}
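The examples that follow suggest that a schema field is looked up in the input payload by its siebelName when one is given, and by its name otherwise (ContactId has no siebelName and is matched as ContactId). A minimal sketch of that lookup, assuming this fallback behaviour; the helper names source_name and field_mapping are hypothetical and not part of any product API:

```python
import json

# The sample schema from this section, embedded so the sketch is self-contained.
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "ContactAndKids",
  "fields": [
    {
      "name": "MyContact",
      "siebelName": "Contact",
      "type": {
        "type": "record",
        "name": "Contact",
        "fields": [
          {"name": "ContactId", "type": "string"},
          {"name": "First_Name", "type": "string", "siebelName": "First Name"}
        ]
      }
    }
  ]
}
""")


def source_name(field):
    """Payload key for a schema field: siebelName if present, else name (assumed)."""
    return field.get("siebelName", field["name"])


def field_mapping(record_type):
    """Map payload key -> Avro field name for the fields of one record type."""
    return {source_name(f): f["name"] for f in record_type["fields"]}


top = SCHEMA["fields"][0]
print(source_name(top), "->", top["name"])
print(field_mapping(top["type"]))
```

The resulting dictionary is the payload-to-Avro field mapping that the conversion examples in this section rely on.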
- Complete and partial matches:
- A complete path match occurs when the entire path defined in the schema is
found at the root level of the input payload. For example:
Input Payload:
{ "Contact": { "First Name": "Jack", "ContactId": "1234" } }
Output Payload:
{ "MyContact": { "ContactId": "1234", "First_Name": "Jack" } }
Explanation: In this example, Contact is at the root level. This is considered a complete path match, and the fields in the input payload are converted to Avro as follows:
Contact → ContactId to MyContact → ContactId
Contact → First Name to MyContact → First_Name
- A partial path match occurs when the entire path defined in the schema is
found under an element within the input payload, rather than at the root
level. When processing an Avro schema, a complete path match takes precedence
over a partial path match. For example:
Input Payload:
{ "Quote": { "Name": "qt1234", "Contact": { "First Name": "Jack", "Last Name": "Doe", "ContactId": "1234" } } }
Output Payload:
{ "MyContact": { "ContactId": "1234", "First_Name": "Jack" } }
Explanation: In this example, it is only a partial match because Contact is under Quote, not at the root level. The fields in the input payload are converted to Avro as follows:
Quote → Contact → ContactId to MyContact → ContactId
Quote → Contact → First Name to MyContact → First_Name
- Duplicates: A duplicate is defined as any case where scanning the payload
returns multiple matches for all fields specified in the schema. For
example:
Input Payload:
{ "Quote": { "Name": "qt1234", "Contact": { "First Name": "Jack", "ContactId": "1234" } }, "Communication": { "Name": "qt1234", "Contact": { "First Name": "Jill", "ContactId": "12345" } } }
Output Payload:
{ "MyContact": { "ContactId": "", "First_Name": "" } }
Explanation: In this example, the fields First Name and ContactId are found under both Quote → Contact and Communication → Contact and are therefore considered duplicates. As a result, the output contains blank values for First Name and ContactId, because the payload holds duplicate records for these fields.
Input Payload:
{ "Quote": { "Name": "qt1234", "Contact": { "City": "BLR" } }, "Communication": { "Name": "qt1234", "Contact": { "First Name": "Jill", "ContactId": "12345" } } }
Output Payload:
{ "MyContact": { "ContactId": "12345", "First_Name": "Jill" } }
Explanation: In this example, two instances of Contact are found, one under Quote and one under Communication. However, the fields First Name and ContactId, as defined in the schema, are found only under Communication → Contact. Therefore, this is not a case of duplicate records.
- Duplicates with complete and partial matches: In case of duplicates, the
following rules apply:
- A complete path match automatically takes precedence. For example:
Input Payload:
{ "Contact": { "First Name": "Jack", "ContactId": "1234" }, "Communication": { "Name": "qt1234", "Contact": { "First Name": "Jill", "ContactId": "12345" } } }
Output Payload:
{ "MyContact": { "ContactId": "1234", "First_Name": "Jack" } }
Explanation: In this example, the fields ContactId and First Name are found both under the root-level Contact and under Communication → Contact and are therefore considered duplicates. However, the root-level Contact is a complete path match according to the schema and takes precedence over the partial path match (Communication → Contact). As a result, the output sets MyContact → First_Name to Jack and MyContact → ContactId to 1234, taken from the root-level Contact.
- A complete path match also takes precedence over the type of the record. For this specific example, the following sample schema is used:
{ "type": "record", "name": "ContactAndKids", "fields": [ { "name": "MyContact", "siebelName": "Contact", "type": { "type": "record", "name": "Contact", "fields": [ { "name": "ContactId", "type": "string" }, { "name": "First_Name", "type": "string", "siebelName": "First Name" }, { "name": "Account", "type": { "type": "record", "name": "Account", "fields": [ { "name": "Name", "type": "string" } ] } } ] } } ] }
Input Payload:
{ "Contact": { "First Name": "Jack", "Last Name": "Doe", "ContactId": "1234", "Account": [ { "Name": "Test Acc1", "CSN": "88-36MIAD" } ] }, "Communication": { "Name": "qt1234", "Contact": { "First Name": "Jill", "ContactId": "12345", "Account": { "Name": "Test Acc12", "CSN": "88-36MKAD" } } } }
Output Payload:
{ "MyContact": { "ContactId": "1234", "First_Name": "Jack", "Account": { "Name": "Test Acc1" } } }
Explanation: In this example, the conversion scans for Contact and for an Account within Contact. Both Contact and Account are defined as records in the schema. In the input payload, the root-level Contact is a record, but the Account inside it is a single-element array. Even so, the Avro conversion selects the single-element array Account under the root-level Contact, although both Communication → Contact and Communication → Contact → Account are records.
- A complete path match will take precedence only if it includes at
least one required field from the schema. If no required fields are
present in the complete path match, then a partial path match that
includes at least one required field will be considered instead. For
example:
Input Payload:
{ "Contact": { "First Name": "Jack" }, "Communication": { "Name": "qt1234", "Contact": { "First Name": "Jill", "SingleRecordAsJSONObject": "true", "ContactId": "12345" } } }
Output Payload:
{ "MyContact": { "ContactId": "", "First_Name": "Jack" } }
Explanation: In this example, the root-level Contact includes only First Name. Because it is a complete path match and includes one required field defined in the schema, First_Name is set to Jack in the output. The scan ignores the partial path match Communication → Contact, even though it includes both First Name and ContactId from the schema.
- If a complete path match is not found, then an element with at least
one non-duplicate field from the schema will be considered. If all
the fields in a schema have duplicates, then empty values are
returned for those fields. For example:
Input Payload:
{ "Quote": { "Name": "qt1234", "Contact": { "First Name": "Jack", "ContactId": "1234567" } }, "Communication": { "Name": "qt1234", "Contact": { "First Name": "Jill", "ContactId": "1234567" } } }
Output Payload:
{ "MyContact": { "ContactId": "", "First_Name": "" } }
Explanation: In this example, all the fields are duplicated. As a result, the output contains only blank values.
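The matching and precedence rules above can be sketched roughly as follows. This is an illustrative approximation, not the product's implementation: it handles only the flat ContactId and First Name fields of the sample schema, it treats a match as complete when Contact sits directly at the root, and it assumes blank values when nothing matches. The names find_elements and convert_contact are hypothetical. Where several partial matches each hold a non-duplicate field, this section does not say which element is selected, so the sketch raises an error instead of guessing.

```python
# Payload key -> Avro field name for the sample Contact record.
MAPPING = {"ContactId": "ContactId", "First Name": "First_Name"}


def find_elements(node, key, path=()):
    """Yield (path, element) for every dict stored under `key`, at any depth."""
    if isinstance(node, dict):
        for name, value in node.items():
            here = path + (name,)
            if name == key and isinstance(value, dict):
                yield here, value
            else:
                yield from find_elements(value, key, here)
    elif isinstance(node, list):
        for item in node:
            yield from find_elements(item, key, path)


def convert_contact(payload):
    """Hypothetical helper: choose one Contact element and emit its schema fields."""
    def fill(element):
        # Fields absent from the chosen element are left blank.
        return {avro: element.get(src, "") for src, avro in MAPPING.items()}

    # Only elements holding at least one schema field count as matches.
    candidates = [(p, e) for p, e in find_elements(payload, "Contact")
                  if any(src in e for src in MAPPING)]

    # Rule: a complete (root-level) match takes precedence over partial matches.
    complete = [e for p, e in candidates if len(p) == 1]
    if complete:
        return fill(complete[0])

    # A single partial match is not a duplicate, so it is used as-is.
    if len(candidates) == 1:
        return fill(candidates[0][1])

    # Count how many matching elements carry each schema field.
    counts = {src: sum(src in e for _, e in candidates) for src in MAPPING}
    if not any(count == 1 for count in counts.values()):
        # Every field is duplicated (or, by assumption, nothing matched): blanks.
        return fill({})

    raise ValueError("several partial matches hold non-duplicate fields; "
                     "the selection rule is not specified in this section")


duplicates = {
    "Quote": {"Name": "qt1234",
              "Contact": {"First Name": "Jack", "ContactId": "1234567"}},
    "Communication": {"Name": "qt1234",
                      "Contact": {"First Name": "Jill", "ContactId": "1234567"}},
}
print({"MyContact": convert_contact(duplicates)})
```

Running the helper on the other input payloads in this subsection is a quick way to compare the sketch with the documented outputs; nested records such as Account are deliberately out of scope.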
- Combining fields across different structures: Creating a match by combining schema
fields across different data elements is not allowed. For example:
Input Payload:
{ "Quote": { "Name": "qt1234", "Contact": { "First Name": "Jack" } }, "Communication": { "Name": "qt1234", "Contact": { "ContactId": "12345" } } }
Output Payload:
{ "MyContact": { "ContactId": "12345", "First_Name": "" } }
Explanation: In this example, the system scans for the First Name and ContactId fields under Contact. However, First Name is available only under Quote → Contact and ContactId appears only under Communication → Contact. Fields from different elements, Quote and Communication, are not combined. As a result, the final output includes fields from only one data element. You must review and update the schema as needed.
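Because fields are never combined across elements, it can help to check a payload for split fields before relying on the conversion. The following is a hypothetical pre-check, not a product feature: field_locations is an invented helper that records which element holds each schema field, and an empty intersection means that no single element carries all of them.

```python
def field_locations(node, wanted, path=(), found=None):
    """Map each wanted field name to the paths of the elements that contain it."""
    if found is None:
        found = {name: [] for name in wanted}
    if isinstance(node, dict):
        for name in wanted:
            if name in node:
                found[name].append(path)
        for key, value in node.items():
            field_locations(value, wanted, path + (key,), found)
    elif isinstance(node, list):
        for item in node:
            field_locations(item, wanted, path, found)
    return found


payload = {
    "Quote": {"Name": "qt1234", "Contact": {"First Name": "Jack"}},
    "Communication": {"Name": "qt1234", "Contact": {"ContactId": "12345"}},
}

locations = field_locations(payload, ["ContactId", "First Name"])
# Elements that hold every schema field; empty when the fields are split.
shared = set.intersection(*(set(paths) for paths in locations.values()))

for name, paths in locations.items():
    print(name, "found under", [" → ".join(p) for p in paths])
print("single element with all fields:", bool(shared))
```

For the payload above, the check reports that the fields live under different elements, which is the situation where the schema needs to be reviewed.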
- Undefined fields: The fields that are not defined in the schema are not converted
and will not be included in the output payload. For example:
Input Payload:
{ "Contact": { "First Name": "Jack", "Last Name": "Dow", "ContactId": "1234" } }
Output Payload:
{ "MyContact": { "ContactId": "1234", "First_Name": "Jack" } }
Explanation: In this example, the input payload includes Last Name, which is not defined in the schema. As a result, Last Name is not included in the output payload.
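The undefined-fields rule amounts to projecting the input record onto the fields that the schema defines. A minimal sketch under that reading; the project helper and the hard-coded MAPPING are illustrative assumptions rather than product code:

```python
# Payload key -> Avro field name, as defined by the sample schema.
MAPPING = {"ContactId": "ContactId", "First Name": "First_Name"}


def project(record, mapping):
    """Return (converted, dropped): schema-defined fields renamed, the rest listed."""
    converted = {avro: record[src] for src, avro in mapping.items() if src in record}
    dropped = sorted(set(record) - set(mapping))
    return converted, dropped


contact = {"First Name": "Jack", "Last Name": "Dow", "ContactId": "1234"}
converted, dropped = project(contact, MAPPING)
print({"MyContact": converted})
print("not converted:", dropped)
```

Listing the dropped keys is optional, but it makes it easy to spot fields that need to be added to the schema.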