ONC+ Developer's Guide

Canonical Standard

The XDR approach to standardizing data representations is canonical. That is, XDR defines a single byte order, a single floating-point representation (IEEE), and so on. Any program running on any machine can use XDR to create portable data by translating its local representation to the XDR standard representations. Similarly, any program running on any machine can read portable data by translating the XDR standard representations to its local equivalents.

The single standard completely decouples programs that create or send portable data from those that use or receive portable data. A new machine learns how to convert the standard representations and its local representations. The local representations of other machines are irrelevant. Conversely, the local representations of the new machine are also irrelevant to existing programs running on other machines. Such programs can immediately read portable data produced by the new machine because such data conforms to the canonical standards that they already understand.

Strong precedents are in place for XDR's canonical approach. For example, TCP/IP, UDP/IP, XNS, Ethernet, and, indeed, all protocols below layer five of the ISO model, are canonical protocols. The advantage of any canonical approach is simplicity. In the case of XDR, a single set of conversion routines is written once and is never touched again.

The canonical approach has the disadvantage of unnecessary conversion to and from the XDR standard when data is transferred between two machines with identical byte order. Suppose two Intel machines are transferring integers according to the XDR standard. The sending machine converts the integers from Intel host byte order to XDR byte order. The receiving machine performs the reverse conversion. Because both machines observe the same byte order, their conversions are unnecessary.

The time spent converting to and from a canonical representation is insignificant, especially in distributed applications. Most of the time required to prepare a data structure for transfer is not spent in conversion but in traversing the elements of the data structure.

To transmit a tree, for example, each leaf must be visited and each element in a leaf record must be copied to a buffer and aligned there. Storage for the leaf might have to be de-allocated as well. Similarly, to receive a tree, you must allocate storage for each leaf, move data from the buffer to the leaf and properly align it, and construct pointers to link the leaves together.

Every machine pays the cost of traversing and copying data structures whether or not conversion is required. In distributed applications, communications overhead, which is the time required to move the data down the sender's protocol layers, across the network, and up the receiver's protocol layers, dwarfs conversion overhead.