8 Mapping Connector Fields to Hub Fields

This chapter describes how records are transformed between the Hub and PIM servers.

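This chapter includes the following topics:

  • Section 8.1, "Overview of Data Transformation"

  • Section 8.2, "About Domain Schemas"

  • Section 8.3, "Data Side Effects Caused by Synchronization"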

8.1 Overview of Data Transformation

To enable the exchange of information between different types of PIM servers, the Hub defines a schema representation for each connector domain type in a schema (XSD) file. The data that is extracted from the PIM servers and returned to the Hub is transformed into the format defined by the Hub schema. Connectors transform the data into Hub XML format and also into the format of the PIM server for both inbound and outbound transactions. For every inbound record extracted, the connector must convert it from the native PIM API format to an XML message that conforms to a PIM XML schema defined for the connector domain. After the connector constructs the PIM XML message, it then uses XSLT to transform the message into a Hub XML message. This message conforms to a provided Hub XML schema.

To map data between the Hub domain and the connector domain, a connector's Transformer component includes the schemas for both domains and the XSLT documents that transform between them. As described in Appendix C, "Connector API," an UpsertRecord is a representation of a Hub record that flows between the Hub and connectors. This data structure includes a member called hubRecordData, which contains the XML representation of the Hub record. Because the Hub record's XML must conform to the XSD for the Hub domain, connector developers must create the following:

  • An XSD for a corresponding connector domain that maps to the Hub domain

  • An XSL definition for transforming the Hub XML representation of the record to the PIM XML representation of the record

  • An XSL definition for transforming the PIM XML representation of the record to the Hub XML representation of the record

During extractions from the PIM server, a connector must first convert each extracted record into a PIM XML message. It then uses XSLT to convert the message from PIM XML to Hub XML. For record events pushed to the PIM server (that is, Create and Update operations), the connector uses XSLT to convert the Hub XML record to a PIM XML record before writing to the PIM server.
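The two inbound steps described above (native record to PIM XML, then PIM XML to Hub XML) can be sketched as follows. This is only an illustration under stated assumptions: the Python standard library has no XSLT processor, so a small field-mapping table stands in for the XSLT document, namespaces are omitted for brevity, and the PimTask element and the Title and Notes field names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from PIM XML element names to Hub XML element names.
# In a real connector this mapping is expressed in an XSLT document.
PIM_TO_HUB = {"Title": "Subject", "Notes": "Comments"}


def native_to_pim_xml(native_record: dict) -> ET.Element:
    """Step 1: convert a native PIM API record into a PIM XML message."""
    root = ET.Element("PimTask")  # hypothetical PIM root element
    for name, value in native_record.items():
        ET.SubElement(root, name).text = str(value)
    return root


def pim_xml_to_hub_xml(pim_root: ET.Element) -> ET.Element:
    """Step 2: transform the PIM XML message into a Hub XML message."""
    hub_root = ET.Element("HubTaskDomain")
    for child in pim_root:
        hub_name = PIM_TO_HUB.get(child.tag)
        if hub_name is not None:  # PIM fields without a mapping are dropped
            ET.SubElement(hub_root, hub_name).text = child.text
    return hub_root


pim_message = native_to_pim_xml({"Title": "Call supplier", "Notes": "Before noon"})
hub_message = pim_xml_to_hub_xml(pim_message)
hub_xml = ET.tostring(hub_message, encoding="unicode")
```

The outbound direction (Create and Update operations) would apply the inverse mapping before the connector writes to the PIM server.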

8.2 About Domain Schemas

During record transformation, the BDSS connectors apply an XSLT document that contains references to the Hub and connector domain schemas, thus providing robust document validation for the record data contained in the messages that pass between the Hub and the connectors.

A domain schema has the following attributes defined for each domain field:

  • Schema Data Types

    A connector must support the following schema data types:

    • W3C Schema Data Types

    • Hub Schema Data Types

    • Connector Schema Data Types

  • PIM field class

    A connector can use a PIM field class during transformation or to control flow. For more information, see Section C.1.4, "Field Class."

8.2.1 Schema Data Types

A schema uses the following schema data types:

8.2.1.1 W3C Schema Data Types

The data types used by the Hub are a subset of the native W3C XML Schema (WXS) data types. These data types are defined at http://www.w3.org. Connectors can use any WXS data types in their schema documents.

The subset of the native W3C XML Schema (WXS) data types used by the Hub schemas is as follows:

  • xsd:string

  • xsd:boolean

  • xsd:dateTime

    Note:

    Because xsd:dateTime generally allows the ISO 8601 date and time representations, all varieties of date and time can be specified in standard coordinated universal time (UTC).

    All date and time Hub fields that are passed between the Engine and the connector must:

    1. Be in UTC time and in the format of YYYY-MM-DDThh:mm:ss.fff, where T is the literal separator that xsd:dateTime requires between the date and the time. In this format:

      • YYYY is the 4 digit year, such as 2007.

      • MM is the month (January is 01 and December is 12).

      • DD is the day of the month (starting from 01).

      • hh is the hour of day in 24-hour clock time (00 through 23, inclusive).

      • mm is the minute of the hour (00 through 59, inclusive).

      • ss is the second of the minute (00 through 59, inclusive).

      • fff is the fractional seconds in milliseconds (000 through 999, inclusive).

    2. Have a corresponding UTC offset field in the format of +HH:MM or -HH:MM. In this format:

      • + indicates a positive UTC offset.

      • - indicates a negative UTC offset.

      • HH is the UTC offset hour in 24-hour clock time (00 through 23, inclusive).

      • MM is the UTC offset minute (00 through 59, inclusive).

    By providing both the UTC date-time and UTC offset for each of the PIM date-time fields, a receiving connector can calculate a local time if the PIM server uses local time or it can use the UTC date-time if the PIM server uses UTC date-time for the field.

  • xsd:nonNegativeInteger

  • xsd:unsignedLong

  • xsd:positiveInteger
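The date-time rules in the note under xsd:dateTime can be illustrated with a short sketch. It assumes the standard xsd:dateTime lexical form (with the literal T between date and time) and millisecond precision. The function names to_hub_fields and to_local are hypothetical and are not part of any BDSS API.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def to_hub_fields(local_dt: datetime) -> Tuple[str, str]:
    """Split an aware datetime into a UTC date-time string and a UTC offset string."""
    offset = local_dt.utcoffset()
    if offset is None:
        raise ValueError("datetime must be timezone-aware")
    utc = local_dt.astimezone(timezone.utc)
    # Keep millisecond precision: the first three digits of the microseconds.
    utc_text = utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}"
    total_minutes = int(offset.total_seconds()) // 60
    sign = "+" if total_minutes >= 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return utc_text, f"{sign}{hours:02d}:{minutes:02d}"


def to_local(utc_text: str, offset_text: str) -> datetime:
    """Rebuild the local wall-clock time (naive) from the two Hub field values."""
    utc = datetime.strptime(utc_text, "%Y-%m-%dT%H:%M:%S.%f")
    sign = 1 if offset_text[0] == "+" else -1
    hours, minutes = int(offset_text[1:3]), int(offset_text[4:6])
    return utc + sign * timedelta(hours=hours, minutes=minutes)
```

A receiving connector whose PIM server stores local time would call something like to_local on the two values; one whose PIM server stores UTC can use the date-time value directly and ignore the offset.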

8.2.1.2 Hub Schema Data Types

Hub domains are constructed using W3C schema data types.

Caution:

If you alter the Hub schemas or Hub schema data types, you must also make corresponding changes to the XSLT documents for each connector. The XSLT documents must change to ensure proper transformation between the Hub and PIM schema representations.

Hub-Defined Data Types

In addition to the native W3C XML Schema (WXS) data types, BDSS provides the XSDs for such common PIM data types as phone numbers and addresses. These data types have only basic restrictions. For example, HubPhoneNumber is restricted to a simple string, as illustrated in Example 8-1.

Example 8-1 Predefined Restrictions to a Hub-Defined Data Type

<?xml version="1.0" encoding="windows-1252" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
              xmlns="http://xmlns.oracle.com/pss/hub/ContactDomain"
              elementFormDefault="qualified">
   <xsd:simpleType name="HubPhoneNumber">
    <xsd:restriction base="xsd:string"></xsd:restriction>
   </xsd:simpleType>
</xsd:schema>

Although BDSS does not define other restrictions, you can define more elaborate restrictions, if needed. The xsd:pattern element in Example 8-2 adds a restriction on the length and format of the HubPhoneNumber data type.

Example 8-2 Additional Restrictions to a Hub-Defined Data Type

<?xml version="1.0" encoding="windows-1252" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns="http://xmlns.oracle.com/pss/hub/ContactDomain"
             elementFormDefault="qualified">
   <xsd:simpleType name="HubPhoneNumber">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="([0-9]){0,2}\([0-9]{3}\)[0-9]{3}\-[0-9]{4}( Ext:[0-9]{4})?" />
    </xsd:restriction>
   </xsd:simpleType>
</xsd:schema>
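To see which values the pattern in Example 8-2 accepts, the same regular expression can be tried outside a schema validator. The sketch below assumes that Python's re syntax behaves like the XSD regular-expression syntax for this particular pattern. Because an XSD pattern must match the entire value, re.fullmatch is used. The helper name is hypothetical.

```python
import re

# Pattern copied from Example 8-2. XSD patterns are implicitly anchored,
# so fullmatch (not search) is the closest Python equivalent.
HUB_PHONE_PATTERN = re.compile(
    r"([0-9]){0,2}\([0-9]{3}\)[0-9]{3}\-[0-9]{4}( Ext:[0-9]{4})?"
)


def is_valid_hub_phone(value: str) -> bool:
    """Return True if the value satisfies the HubPhoneNumber pattern."""
    return HUB_PHONE_PATTERN.fullmatch(value) is not None
```

For instance, "(650)555-0100" and "1(650)555-0100 Ext:0042" satisfy the pattern, whereas "650-555-0100" or a number followed by free-text notes does not.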

The Hub-defined data types include:

  • xsd:HubPercent

  • xsd:HubUnsignedLong

  • xsd:HubTokens

  • xsd:HubUtcOffset

  • xsd:HubPhoneNumber

  • xsd:HubAddress

  • xsd:HubState

  • xsd:HubCountry

  • xsd:HubEmailAddress

  • xsd:HubNamePrefix

  • xsd:HubNameSuffix

8.2.1.3 Connector Schema Data Types

Each connector publishes a set of schema data types for defining the domain schemas of its PIM server. The following restrictions apply to these data types:

  • Any new connector domain field added to a connector domain must be assigned exactly one schema data type, and that type must be one published by the corresponding connector.

  • The schema data types published by the connector cannot be modified, restricted, or extended. Do not change them.

8.2.2 Hub Schema Documents

A connector must process the following schemas provided by the Hub:

8.2.2.1 Hub Schema Type Library

The Hub defines a library of custom XML schema types in the HubTypeLibrary.xsd document (Example 8-3). These schema types are available for use by any Hub or connector domain schema.

Example 8-3 Custom XML Schema Types

<?xml version="1.0" encoding="windows-1252" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://xmlns.oracle.com/pss/hub/TypeLibrary"
            elementFormDefault="qualified">
  <xsd:simpleType name="HubPercent">
    <xsd:restriction base="xsd:nonNegativeInteger">
      <xsd:maxInclusive value="100"/>
      <xsd:minInclusive value="0"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="HubUnsignedLong">
    <xsd:annotation>
      <xsd:documentation>Specifies an amount of work in terms of minutes.</xsd:documentation>
    </xsd:annotation>
    <xsd:restriction base="xsd:unsignedLong"/>
  </xsd:simpleType>
  <xsd:complexType name="HubTokens">
    <xsd:sequence>
      <xsd:element name="Token" maxOccurs="unbounded" nillable="false"
                   minOccurs="0" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:simpleType name="HubUtcOffset">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value='[+-][0-9]{2}:[0-9]{2}'/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>

8.2.2.2 Hub Task Schema

The Hub defines an XML schema as the central data format for passing task record data between the Hub and the connectors. The schema is defined in the HubTask.xsd document (Example 8-4).

Example 8-4 Hub Task XML Schema Listing

<?xml version="1.0" encoding="windows-1252" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns="http://xmlns.oracle.com/pss/hub/TaskDomain"
            targetNamespace="http://xmlns.oracle.com/pss/hub/TaskDomain"
            elementFormDefault="qualified">
  <xsd:include schemaLocation="HubTypeLibrary.xsd"/>
  <xsd:simpleType name="HubTaskPriority">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="High"/>
      <xsd:enumeration value="MediumHigh"/>
      <xsd:enumeration value="Medium"/>
      <xsd:enumeration value="MediumLow"/>
      <xsd:enumeration value="Low"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="HubTaskStatus">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="NotStarted"/>
      <xsd:enumeration value="InProgress"/>
      <xsd:enumeration value="Completed"/>
      <xsd:enumeration value="WaitingOnSomeoneElse"/>
      <xsd:enumeration value="Deferred"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:element name="HubTaskDomain">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="ActualWork" type="xsd:unsignedLong" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="BillingInfo" type="xsd:string" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="Category" type="HubTokens" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="Comments" type="xsd:string" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="Companies" type="HubTokens" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="CompletedFlag" type="xsd:boolean" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="DateCompleted" type="xsd:dateTime" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="DateCompletedUtcOffset" type="HubUtcOffset" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="DueDate" type="xsd:dateTime" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="DueDateUtcOffset" type="HubUtcOffset" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="Mileage" type="xsd:string" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="PercentComplete" type="HubPercent"/>
        <xsd:element name="Priority" type="HubTaskPriority"/>
        <xsd:element name="Private" type="xsd:boolean"/>
        <xsd:element name="Reminder" type="xsd:boolean" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="ReminderOffset" type="xsd:dateTime" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="ReminderOffsetUtcOffset" type="HubUtcOffset" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="StartDate" type="xsd:dateTime"/>
        <xsd:element name="StartDateUtcOffset" type="HubUtcOffset" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="Status" type="HubTaskStatus"/>
        <xsd:element name="Subject" type="xsd:string"  minOccurs="0"
                     maxOccurs="1" nillable="true"/>
        <xsd:element name="TotalWork" type="HubUnsignedLong" minOccurs="0"
                     maxOccurs="1" nillable="true"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
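As an illustration of what a Hub XML message for this schema might look like, the following sketch builds a minimal HubTaskDomain instance. It assumes that the elements declared without minOccurs="0" (PercentComplete, Priority, Private, StartDate, and Status) are required, and it emits elements in the order of the xsd:sequence, because a sequence is order-sensitive. The function name and the sample values are hypothetical, and the sketch does not validate the result against HubTask.xsd.

```python
import xml.etree.ElementTree as ET

HUB_TASK_NS = "http://xmlns.oracle.com/pss/hub/TaskDomain"

# Subset of the fields in Example 8-4, listed in xsd:sequence order.
FIELD_ORDER = ["PercentComplete", "Priority", "Private", "StartDate",
               "StartDateUtcOffset", "Status", "Subject"]


def build_hub_task(fields: dict) -> str:
    """Serialize the given fields as a namespace-qualified HubTaskDomain element."""
    ET.register_namespace("", HUB_TASK_NS)  # emit xmlns="..." as the default namespace
    root = ET.Element(f"{{{HUB_TASK_NS}}}HubTaskDomain")
    for name in FIELD_ORDER:
        if name in fields:
            ET.SubElement(root, f"{{{HUB_TASK_NS}}}{name}").text = fields[name]
    return ET.tostring(root, encoding="unicode")


task_xml = build_hub_task({
    "Subject": "Call supplier",
    "Status": "InProgress",
    "PercentComplete": "50",
    "Priority": "High",
    "Private": "false",
    "StartDate": "2007-03-15T14:30:00.500",
})
```

Although the dictionary lists Subject first, the output places it last, after Status, as the sequence requires.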

8.3 Data Side Effects Caused by Synchronization

Data loss can occur for the following reasons:

8.3.1 Data Model Incompatibility

Data loss may occur when data models for Hub and PIM XSD schemas cannot represent the same set of data. For example, if a Hub Task domain field (field-X) is not represented in Task domain schema for PIM server B and a Hub Task record extracted from PIM server A contains a value for Hub field-X, then synchronization still propagates the Task record from PIM server A to PIM Server B. However, the value of Hub field-X is not propagated to PIM server B. A subsequent update of the Task record from PIM server B to PIM Server A causes the data for field-X to be removed from PIM server A. This loss occurs because the data is not available from the updated record coming from PIM server B.
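The round trip described above can be simulated with plain dictionaries. This is a simplified sketch: the field names, the choice of Mileage as field-X, and the assumption that an update from PIM server B replaces the whole record on PIM server A are all illustrative.

```python
# Hypothetical: the Task schema of PIM server B has no "Mileage" field (field-X).
SERVER_B_FIELDS = {"Subject", "Comments"}


def write_to_server_b(hub_record: dict) -> dict:
    """Only fields that server B's schema can represent survive the write."""
    return {k: v for k, v in hub_record.items() if k in SERVER_B_FIELDS}


def update_server_a(record_from_b: dict) -> dict:
    """Assumed full-record update: A's copy becomes whatever B sends back."""
    return dict(record_from_b)


record_on_a = {"Subject": "Site visit", "Comments": "Bring badge", "Mileage": "42"}

record_on_b = write_to_server_b(record_on_a)       # Mileage is not propagated
record_on_b["Comments"] = "Bring badge and laptop"  # a user edits the task on B
record_on_a = update_server_a(record_on_b)          # Mileage is now lost on A
```

After the last step, record_on_a no longer contains a Mileage entry, which mirrors the loss of field-X on PIM server A.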

8.3.2 Differing Data Type Facets

Data loss can result as transformations occur between different data types with different constraining facets. Some data type facets that may result in data loss include:

8.3.2.1 Length

Data loss may occur when field length differs between the Hub and the connector domains for the same corresponding field. For example, if PIM Server A can store more task comment text than PIM server B, then the synchronization of task records causes the comment text to be truncated to the length allowed by PIM server B.

8.3.2.2 Pattern

Data may change unexpectedly or content may be lost if any PIM server or connector imposes a data pattern or data value restriction that changes the data value content. For example, suppose the connector for PIM server A extracts a PIM contact record with customer text notes appended to a phone number. Although synchronization still propagates the record to PIM server B, the business logic for PIM Server B strips the text notes as the record is written to PIM server B. A subsequent update for this record from PIM server B results in the removal of the text notes from the contact record in PIM server A.

8.3.2.3 Enumeration

Enumeration granularity is reduced to the most coarse-grained level available from any PIM server or Hub domain. For example, if PIM server A has task priority levels of High and Low and PIM server B has task priority levels of High, Medium and Low, then the granularity of all task priority levels is reduced to the lower granularity of PIM server A, as synchronization occurs between the two PIM servers.
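The priority example can be sketched with two lookup tables. The mappings below are assumptions made for illustration (in particular, mapping Medium to Low is an arbitrary choice), and the sketch skips the intermediate Hub representation.

```python
# Assumed mappings; in a connector these would be expressed in XSLT documents.
B_TO_A = {"High": "High", "Medium": "Low", "Low": "Low"}  # three levels to two
A_TO_B = {"High": "High", "Low": "Low"}                   # two levels back to B


def round_trip(priority_on_b: str) -> str:
    """Priority seen on server B after the task syncs to A and back."""
    return A_TO_B[B_TO_A[priority_on_b]]


levels_after_sync = sorted({round_trip(p) for p in B_TO_A})
```

After one round trip, a task that was Medium on PIM server B comes back as Low, and only the two levels known to PIM server A remain in use.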

8.3.2.4 White Space

Extraneous white space may be truncated during XSLT transformation by a connector or during synchronization if it is truncated by the business logic of the PIM server. For example, suppose PIM server A trims white space, but PIM server B does not. As synchronizations occur between the two PIM servers, the white space in PIM server B is trimmed because of updates coming from PIM server A.

8.3.3 List Transformations

In general, data loss may occur if a list or set of values from a source domain cannot be represented in a target domain.

The categories of list transformations that can result in data loss are as follows:

8.3.3.1 Multivalue Field

A multivalue field is a single field that contains a list of tokens (that is, a list of values). The Exchange category field is an example of this data type. Data loss occurs if the entire list of values from the source domain field cannot be represented in the target domain. For example, if a task category field from Microsoft Exchange has multiple value tokens but the Hub task domain can only represent a single value for the corresponding category field, then transformation to the Hub schema format strips all but one token value.
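The difference between a single-value target and a token-list target can be sketched as follows. The function names are hypothetical, keeping the first token is an arbitrary assumption, and the Token child element follows the HubTokens type shown in Example 8-3.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional


def category_as_single_value(tokens: List[str]) -> Optional[str]:
    """Lossy: a single-value target field can keep only one token."""
    return tokens[0] if tokens else None


def category_as_token_list(tokens: List[str]) -> ET.Element:
    """Lossless: one Token child element per value, as in HubTokens."""
    category = ET.Element("Category")
    for token in tokens:
        ET.SubElement(category, "Token").text = token
    return category


exchange_categories = ["Business", "Travel", "Urgent"]
kept = category_as_single_value(exchange_categories)
as_tokens = category_as_token_list(exchange_categories)
```

With the single-value form, Travel and Urgent are stripped; with the token-list form, all three values survive the transformation.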

8.3.3.2 Field Group

A field group is a collection or group of related fields. Information loss may occur if the number of fields in the group differs between the Hub and PIM XSD schemas. For example, PIM server A allows three address lines, but PIM server B allows four address lines. Although such a difference does not necessarily cause data loss, special transformation processing may be necessary to prevent it.
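One possible form of such special processing is to fold the overflow lines into the last available line rather than dropping them. The sketch below is an assumption about how a transformation could handle the address example, not a description of what any BDSS connector does. The function name and the separator are hypothetical.

```python
from typing import List


def fit_address_lines(lines: List[str], max_lines: int, separator: str = ", ") -> List[str]:
    """Fit address lines into max_lines by joining the overflow into the last line."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    lines = [line for line in lines if line]  # ignore empty lines
    if len(lines) <= max_lines:
        return lines
    head = lines[:max_lines - 1]
    tail = separator.join(lines[max_lines - 1:])
    return head + [tail]


four_lines = ["Suite 4", "Building C", "100 Main St", "North Campus"]
three_lines = fit_address_lines(four_lines, max_lines=3)
```

This keeps all of the text, but the merged line can exceed a length limit on the target field (see Section 8.3.2.1), and the original line boundaries cannot be recovered on the way back.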

8.3.4 Record Collisions

The types of record collisions that result in data loss are as follows:

  • Engine Merge Collision: The collisions that occur when the Engine merges changes from multiple records can cause the loss of data for the record that does not win the collision merge competition.

  • Connector PIM Collision: The collisions that occur when a connector writes changes to a PIM Server that conflict with new PIM record changes can cause data loss. At some point, a change is overwritten, resulting in a loss of data.