General Schema Mapping Rule

This section explains the general Avro conversion rules for the following scenarios:

  • Complete and partial matches
  • Duplicates
  • Duplicates with complete and partial matches
  • Combining fields across different data structures
  • Undefined fields

All examples in this section are based on the following sample schema:

{
  "type": "record",
  "name": "ContactAndKids",
  "fields": [
    {
      "name": "MyContact",
      "siebelName": "Contact",
      "type": {
        "type": "record",
        "name": "Contact",
        "fields": [
          {
            "name": "ContactId",
            "type": "string"
          },
          {
            "name": "First_Name",
            "type": "string",
            "siebelName": "First Name"
          }
        ]
      }
    }
  ]
}

  • Complete and partial matches:
    • A complete path match occurs when the entire path defined in the schema is found at the root level of the input payload. For example:
      Input payload:

      {
          "Contact": {
              "First Name": "Jack",
              "ContactId": "1234"
          }
      }

      Output payload:

      {
          "MyContact": {
              "ContactId": "1234",
              "First_Name": "Jack"
          }
      }

      Explanation: In this example, Contact is at the root level of the input payload. This is a complete path match, and the fields in the input payload are converted to Avro as follows:

      Contact → ContactId to MyContact → ContactId

      Contact → First Name to MyContact → First_Name
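
The mapping above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the product's implementation: the helper names `source_name` and `convert_complete_match` are hypothetical, the payload name of a field is assumed to be its `siebelName` when present and its `name` otherwise, and a schema field that is missing from the payload is assumed to produce an empty string.

```python
# Illustrative sketch only: the helper names are hypothetical, not a Siebel API.
SCHEMA = {
    "type": "record",
    "name": "ContactAndKids",
    "fields": [
        {
            "name": "MyContact",
            "siebelName": "Contact",
            "type": {
                "type": "record",
                "name": "Contact",
                "fields": [
                    {"name": "ContactId", "type": "string"},
                    {"name": "First_Name", "type": "string",
                     "siebelName": "First Name"},
                ],
            },
        }
    ],
}


def source_name(field):
    """Name to look for in the payload (assumption: siebelName, else name)."""
    return field.get("siebelName", field["name"])


def convert_complete_match(schema, payload):
    """Map root-level payload elements onto the output names of the schema."""
    output = {}
    for record_field in schema["fields"]:
        element = payload.get(source_name(record_field))
        if not isinstance(element, dict):
            continue  # no complete path match for this record
        output[record_field["name"]] = {
            leaf["name"]: element.get(source_name(leaf), "")
            for leaf in record_field["type"]["fields"]
        }
    return output


payload = {"Contact": {"First Name": "Jack", "ContactId": "1234"}}
result = convert_complete_match(SCHEMA, payload)
print(result)  # {'MyContact': {'ContactId': '1234', 'First_Name': 'Jack'}}
```

Because the output is built only from the fields listed in the schema, any extra field in the payload is dropped by this sketch.
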
    • A partial path match occurs when the entire path defined in the schema is found under an element within the input payload, rather than at the root level. When an Avro schema is processed, a complete path match takes precedence over a partial path match. For example:

      Input payload:

      {
          "Quote": {
              "Name": "qt1234",
              "Contact": {
                  "First Name": "Jack",
                  "Last Name": "Doe",
                  "ContactId": "1234"
              }
          }
      }

      Output payload:

      {
          "MyContact": {
              "ContactId": "1234",
              "First_Name": "Jack"
          }
      }

      Explanation: In this example, the match is only partial because Contact is under Quote, not at the root level. The fields in the input payload are converted to Avro as follows:

      Quote → Contact → ContactId to MyContact → ContactId

      Quote → Contact → First Name to MyContact → First_Name
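
The difference between the two kinds of match comes down to where the element is found. The following Python sketch illustrates one way to express that distinction; `find_matches` and `classify` are hypothetical names introduced for illustration, and the recursive scan is an assumption about how the payload might be searched.

```python
# Illustrative sketch only: helper names are hypothetical.
def find_matches(node, element_name, path=()):
    """Yield (path, value) for every element named element_name in the payload."""
    if not isinstance(node, dict):
        return
    for key, value in node.items():
        here = path + (key,)
        if key == element_name:
            yield here, value
        else:
            yield from find_matches(value, element_name, here)


def classify(path):
    """A path of length 1 sits at the root (complete); anything deeper is partial."""
    return "complete" if len(path) == 1 else "partial"


payload = {
    "Quote": {
        "Name": "qt1234",
        "Contact": {"First Name": "Jack", "Last Name": "Doe", "ContactId": "1234"},
    }
}
matches = [(path, classify(path)) for path, _ in find_matches(payload, "Contact")]
print(matches)  # [(('Quote', 'Contact'), 'partial')]
```
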
    • Duplicates: A duplicate is defined as any case where scanning the payload returns multiple matches for all fields specified in the schema. For example:
      Input payload:

      {
          "Quote": {
              "Name": "qt1234",
              "Contact": {
                  "First Name": "Jack",
                  "ContactId": "1234"
              }
          },
          "Communication": {
              "Name": "qt1234",
              "Contact": {
                  "First Name": "Jill",
                  "ContactId": "12345"
              }
          }
      }

      Output payload:

      {
          "MyContact": {
              "ContactId": "",
              "First_Name": ""
          }
      }

      Explanation: In this example, the fields First Name and ContactId are found under both Quote → Contact and Communication → Contact and are therefore considered duplicates. As a result, the output contains blank values for First Name and ContactId.

      Input payload:

      {
          "Quote": {
              "Name": "qt1234",
              "Contact": {
                  "City": "BLR"
              }
          },
          "Communication": {
              "Name": "qt1234",
              "Contact": {
                  "First Name": "Jill",
                  "ContactId": "12345"
              }
          }
      }

      Output payload:

      {
          "MyContact": {
              "ContactId": "12345",
              "First_Name": "Jill"
          }
      }

      Explanation: In this example, two instances of Contact are found, one under Quote and one under Communication. However, the fields First Name and ContactId, as defined in the schema under Contact, are found only under Communication → Contact. Therefore, this is not a case of duplicate records.
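
The two duplicate examples can be reproduced with a small per-field count. The sketch below is an interpretation of the rule, not the documented algorithm: it assumes a field is a duplicate when more than one Contact element carries it, it ignores the precedence of complete path matches, and it deliberately raises an error when several elements qualify, because the text does not state which one is selected in that case. The names `find_contacts` and `resolve_duplicates` are hypothetical.

```python
# Illustrative sketch only: an interpretation of the duplicate rule.
FIELDS = ["ContactId", "First Name"]  # payload names from the sample schema


def find_contacts(node):
    """Collect every 'Contact' object in the payload."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Contact" and isinstance(value, dict):
                found.append(value)
            else:
                found.extend(find_contacts(value))
    return found


def resolve_duplicates(payload):
    candidates = find_contacts(payload)
    # Count how many candidate elements carry each schema field.
    counts = {f: sum(f in c for c in candidates) for f in FIELDS}
    # Keep elements holding at least one schema field that no other element holds.
    unique = [c for c in candidates
              if any(f in c and counts[f] == 1 for f in FIELDS)]
    if len(unique) > 1:
        raise ValueError("several elements qualify; not covered by this sketch")
    chosen = unique[0] if unique else {}  # all duplicates: blank values
    return {"ContactId": chosen.get("ContactId", ""),
            "First_Name": chosen.get("First Name", "")}


both_complete = {
    "Quote": {"Name": "qt1234",
              "Contact": {"First Name": "Jack", "ContactId": "1234"}},
    "Communication": {"Name": "qt1234",
                      "Contact": {"First Name": "Jill", "ContactId": "12345"}},
}
only_one_has_fields = {
    "Quote": {"Name": "qt1234", "Contact": {"City": "BLR"}},
    "Communication": {"Name": "qt1234",
                      "Contact": {"First Name": "Jill", "ContactId": "12345"}},
}
print(resolve_duplicates(both_complete))
print(resolve_duplicates(only_one_has_fields))
```
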
    • Duplicates with complete and partial matches: In case of duplicates, the following rules apply:
      • A complete path match automatically takes precedence. For example:
        Input payload:

        {
            "Contact": {
                "First Name": "Jack",
                "ContactId": "1234"
            },
            "Communication": {
                "Name": "qt1234",
                "Contact": {
                    "First Name": "Jill",
                    "ContactId": "12345"
                }
            }
        }

        Output payload:

        {
            "MyContact": {
                "ContactId": "1234",
                "First_Name": "Jack"
            }
        }

        Explanation: In this example, the fields ContactId and First Name are found both under the root-level Contact and under Communication → Contact and are therefore considered duplicates. However, the root-level Contact is a complete path match according to the schema and takes precedence over the partial path match (Communication → Contact). As a result, the output takes First Name (Jack) and ContactId (1234) from the root-level Contact.

      • A complete path match also takes precedence over the type of the record. This example uses the following sample schema:

        {
          "type": "record",
          "name": "ContactAndKids",
          "fields": [
            {
              "name": "MyContact",
              "siebelName": "Contact",
              "type": {
                "type": "record",
                "name": "Contact",
                "fields": [
                  {
                    "name": "ContactId",
                    "type": "string"
                  },
                  {
                    "name": "First_Name",
                    "type": "string",
                    "siebelName": "First Name"
                  },
                  {
                    "name": "Account",
                    "type": {
                      "type": "record",
                      "name": "Account",
                      "fields": [
                        {
                          "name": "Name",
                          "type": "string"
                        }
                      ]
                    }
                  }
                ]
              }
            }
          ]
        }
        
        Input payload:

        {
            "Contact": {
                "First Name": "Jack",
                "Last Name": "Doe",
                "ContactId": "1234",
                "Account": [
                    {
                        "Name": "Test Acc1",
                        "CSN": "88-36MIAD"
                    }
                ]
            },
            "Communication": {
                "Name": "qt1234",
                "Contact": {
                    "First Name": "Jill",
                    "ContactId": "12345",
                    "Account": {
                        "Name": "Test Acc12",
                        "CSN": "88-36MKAD"
                    }
                }
            }
        }

        Output payload:

        {
            "MyContact": {
                "ContactId": "1234",
                "First_Name": "Jack",
                "Account": {
                    "Name": "Test Acc1"
                }
            }
        }

        Explanation: In this example, the schema defines Contact and an Account within Contact, both as records. In the input payload, the root-level Contact is a record, but the Account inside it is a single-element array. The Avro conversion still selects the single-element array Account under the root-level Contact, even though both Contact and Contact → Account under Communication are records.
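
One way to picture this behavior is a helper that accepts either a record or a single-element array wherever the schema expects a record. The sketch below is an assumption about how such unwrapping could work; `as_record` and `convert_contact` are hypothetical names, and the handling of arrays with more than one element is not described in the text, so the helper simply returns `None` for them.

```python
# Illustrative sketch only: helper names are hypothetical.
def as_record(value):
    """Return a dict for a record-typed field, unwrapping a single-element array."""
    if isinstance(value, dict):
        return value
    if isinstance(value, list) and len(value) == 1 and isinstance(value[0], dict):
        return value[0]
    return None  # other shapes are not covered by this sketch


def convert_contact(contact):
    """Build the MyContact record from a Contact element of the payload."""
    account = as_record(contact.get("Account")) or {}
    return {
        "ContactId": contact.get("ContactId", ""),
        "First_Name": contact.get("First Name", ""),
        "Account": {"Name": account.get("Name", "")},
    }


# Root-level Contact from the example: the complete path match.
root_contact = {
    "First Name": "Jack",
    "Last Name": "Doe",
    "ContactId": "1234",
    "Account": [{"Name": "Test Acc1", "CSN": "88-36MIAD"}],
}
output = {"MyContact": convert_contact(root_contact)}
print(output)
```
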

      • A complete path match will take precedence only if it includes at least one required field from the schema. If no required fields are present in the complete path match, then a partial path match that includes at least one required field will be considered instead. For example:
        Input payload:

        {
            "Contact": {
                "First Name": "Jack"
            },
            "Communication": {
                "Name": "qt1234",
                "Contact": {
                    "First Name": "Jill",
                    "SingleRecordAsJSONObject": "true",
                    "ContactId": "12345"
                }
            }
        }

        Output payload:

        {
            "MyContact": {
                "ContactId": "",
                "First_Name": "Jack"
            }
        }

        Explanation: In this example, the root-level Contact includes only First Name. Because it is a complete path match and includes one required field defined in the schema, First Name is set to Jack in the output. The scan ignores the partial path match Communication → Contact, even though it includes both First Name and ContactId from the schema.
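
The precedence rule can be sketched as a two-step selection: use the root-level element if it carries at least one schema field, otherwise fall back to a nested one. This is a simplified illustration, not the product's algorithm. It assumes every field in the sample schema counts as required, it leaves all fields blank when several partial matches qualify, and the names `has_schema_field`, `nested_contacts`, `select_contact`, and `convert` are hypothetical.

```python
# Illustrative sketch only: helper names are hypothetical.
FIELD_MAP = {"ContactId": "ContactId", "First_Name": "First Name"}  # output -> payload


def has_schema_field(element):
    """True if the element carries at least one field named in the schema."""
    return isinstance(element, dict) and any(
        src in element for src in FIELD_MAP.values())


def nested_contacts(node):
    """Collect every 'Contact' element found inside node."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "Contact":
                found.append(value)
            else:
                found.extend(nested_contacts(value))
    return found


def select_contact(payload):
    root = payload.get("Contact")
    if has_schema_field(root):
        return root  # complete path match takes precedence
    partials = [
        contact
        for key, value in payload.items()
        if key != "Contact"
        for contact in nested_contacts(value)
        if has_schema_field(contact)
    ]
    # Simplification: several partial matches are treated as unresolved.
    return partials[0] if len(partials) == 1 else {}


def convert(payload):
    chosen = select_contact(payload)
    return {out: chosen.get(src, "") for out, src in FIELD_MAP.items()}


payload = {
    "Contact": {"First Name": "Jack"},
    "Communication": {
        "Name": "qt1234",
        "Contact": {"First Name": "Jill", "SingleRecordAsJSONObject": "true",
                    "ContactId": "12345"},
    },
}
print(convert(payload))  # {'ContactId': '', 'First_Name': 'Jack'}
```

If the root-level Contact carried no schema field at all (for example, only a City), this sketch would fall back to Communication → Contact, which matches the rule stated above.
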

      • If a complete path match is not found, then an element with at least one non-duplicate field from the schema will be considered. If all the fields in a schema have duplicates, then empty values are returned for those fields. For example:
        Input payload:

        {
            "Quote": {
                "Name": "qt1234",
                "Contact": {
                    "First Name": "Jack",
                    "ContactId": "1234567"
                }
            },
            "Communication": {
                "Name": "qt1234",
                "Contact": {
                    "First Name": "Jill",
                    "ContactId": "1234567"
                }
            }
        }

        Output payload:

        {
            "MyContact": {
                "ContactId": "",
                "First_Name": ""
            }
        }

        Explanation: In this example, no complete path match exists and all the schema fields are duplicated across Quote → Contact and Communication → Contact. As a result, the output contains only blank values.

  • Combining fields across different structures: Creating a match by combining schema fields across different data elements is not allowed. For example:
    Input payload:

    {
        "Quote": {
            "Name": "qt1234",
            "Contact": {
                "First Name": "Jack"
            }
        },
        "Communication": {
            "Name": "qt1234",
            "Contact": {
                "ContactId": "12345"
            }
        }
    }

    Output payload:

    {
        "MyContact": {
            "ContactId": "12345",
            "First_Name": ""
        }
    }

    Explanation: In this example, the system scans for the First Name and ContactId fields under Contact. However, First Name is available only under Quote → Contact, and ContactId appears only under Communication → Contact. Fields from different elements, Quote and Communication, are not combined. As a result, the final output includes fields from only one data element. You must review and update the schema as needed.
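
The following Python sketch contrasts converting each element on its own with the merge that the conversion does not perform. It is only an illustration: `convert` is a hypothetical helper, and the sketch does not model how the conversion decides which single element to use, because the text does not state that rule.

```python
# Illustrative sketch only: shows that outputs come from one element at a time.
FIELD_MAP = {"ContactId": "ContactId", "First_Name": "First Name"}  # output -> payload


def convert(contact):
    """Convert a single Contact element; missing fields become empty strings."""
    return {out: contact.get(src, "") for out, src in FIELD_MAP.items()}


quote_contact = {"First Name": "Jack"}           # Quote -> Contact
communication_contact = {"ContactId": "12345"}   # Communication -> Contact

# Each element converted separately: the only outputs this sketch considers valid.
per_element = [convert(quote_contact), convert(communication_contact)]

# Merging fields across elements is NOT allowed; built here only for contrast.
merged = convert({**quote_contact, **communication_contact})

print(per_element)
print(merged in per_element)  # False
```

The output payload shown in the example corresponds to the second entry of `per_element`, that is, the conversion of Communication → Contact alone.
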

  • Undefined fields: The fields that are not defined in the schema are not converted and will not be included in the output payload. For example:
    Input payload:

    {
        "Contact": {
            "First Name": "Jack",
            "Last Name": "Dow",
            "ContactId": "1234"
        }
    }

    Output payload:

    {
        "MyContact": {
            "ContactId": "1234",
            "First_Name": "Jack"
        }
    }

    Explanation: In this example, the input payload includes Last Name, which is not defined in the schema. As a result, Last Name is not included in the output payload.