E The PIF-POF Binary Format

Learn the binary streams for the Portable Object Format (POF) and the Portable Invocation Format (PIF), which are used to serialize objects in a platform-neutral and language-neutral way.

Overview of the PIF-POF Binary Format

The Portable Object Format (POF) allows object values to be encoded into a binary stream in such a way that the platform and language origin of the object value is both irrelevant and unknown. The Portable Invocation Format (PIF) allows method invocations to be similarly encoded into a binary stream. These two formats (referred to as PIF-POF) are derived from a common binary encoding substrate. The binary format is provided here for informative purposes and is not a requirement for using PIF-POF. See Using Portable Object Format.

Stream Format

The PIF-POF stream format is octet-based; a PIF-POF stream is a sequence of octet values. For the sake of clarity, this documentation treats all octets as unsigned 8-bit integer values in the range 0x00 to 0xFF (decimal 0 to 255). Byte-ordering is explicitly not a concern since (in PIF-POF) a given octet value that is represented by an unsigned 8-bit integer value is always written and read as the same unsigned 8-bit integer value.

A PIF stream contains exactly one Invocation. An Invocation consists of an initial POF stream that contains an Integer Value for the remaining length of the Invocation, immediately followed by a POF stream that contains an Integer Value that is the conversation identifier, immediately followed by a POF stream that contains a User Type value that is the message object. The remaining length indicates the total number of octets used to encode the conversation identifier and the message object; the remaining length is provided so that a process receiving an Invocation can determine when the Invocation has been fully received. The conversation identifier is used to support multiple logical clients and services multiplexed through a single connection, just as TCP/IP provides multiple logical port numbers for a given IP address. The message object is defined by the particular high-level conversational protocol.

A POF stream contains exactly one Value. The Value contains a Type Identifier, and if the Type Identifier does not imply a value, then it is immediately trailed by a data structure whose format is defined by the Type Identifier.

Integer Values

The stream format relies extensively on the ability to encode integer values in a compact form. Coherence refers to this integer binary format as a packed integer. This format uses an initial octet and one or more trailing octets as necessary; it is a variable-length format.

Table E-1 describes the three regions in the first octet.

Table E-1 Regions in the First Octet of a Packed Integer

Region Mask   Description
0x80          Continuation indicator
0x40          Negative indicator
0x3F          Integer value (6 binary LSDs)

Table E-2 describes the two regions in the trailing octets.

Table E-2 Regions in the Trailing Octet of a Packed Integer

Region Mask   Description
0x80          Continuation indicator
0x7F          Integer value (next 7 binary LSDs)

Example E-1 illustrates writing a 32-bit integer value to an octet stream as supported in Coherence.

Example E-1 Writing a 32-bit Integer Value to an Octet Stream

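// requires java.io.DataOutput and java.io.IOException
public static void writeInt(DataOutput out, int n)
        throws IOException
    {
    int b = 0;
    if (n < 0)
        {
        // set the negative indicator and encode the one's complement
        b = 0x40;
        n = ~n;
        }
    // the first octet carries the 6 least significant bits
    b |= (byte) (n & 0x3F);
    n >>>= 6;
    while (n != 0)
        {
        // more bits remain: set the continuation indicator and emit the octet
        b |= 0x80;
        out.writeByte(b);
        // each trailing octet carries the next 7 bits
        b = (n & 0x7F);
        n >>>= 7;
        }
    out.writeByte(b);
    }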

Example E-2 illustrates reading a 32-bit integer value from an octet stream as supported in Coherence.

Example E-2 Reading a 32-bit Integer Value from an Octet Stream

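// requires java.io.DataInput and java.io.IOException
public static int readInt(DataInput in)
        throws IOException
    {
    int b = in.readUnsignedByte();
    // the first octet carries the 6 least significant bits
    int n = b & 0x3F;
    int cBits = 6;
    // remember the negative indicator from the first octet
    boolean fNeg = (b & 0x40) != 0;
    // keep reading while the continuation indicator is set
    while ((b & 0x80) != 0)
        {
        b = in.readUnsignedByte();
        n |= ((b & 0x7F) << cBits);
        cBits += 7;
        }
    if (fNeg)
        {
        // negative values are stored as their one's complement
        n = ~n;
        }
    return n;
    }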

Integer values used within this documentation without an explicit Type Identifier are assumed to be 32-bit signed integer values that have a decimal range of -2^31 to 2^31-1.

Table E-3 illustrates some integer value examples.

Table E-3 Binary Formats for Integer Values Without a Type Identifier

Value     Binary Format
0         0x00
1         0x01
2         0x02
99        0xA301
9999      0x8F9C01
-1        0x40
-2        0x41
-99       0xE201
-9999     0xCE9C01
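As a quick check of the packed integer format, the following sketch re-implements the logic of Example E-1 and Example E-2 against in-memory byte arrays and prints the encodings of the values in Table E-3. The class name PackedIntRoundTrip and the helper names encode, decode, and hex are illustrative assumptions and are not part of the Coherence API.

```java
import java.io.ByteArrayOutputStream;

final class PackedIntRoundTrip {
    /** Encodes n as a packed integer (same steps as Example E-1, in memory). */
    static byte[] encode(int n) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int b = 0;
        if (n < 0) {
            b = 0x40;            // negative indicator
            n = ~n;              // store the one's complement
        }
        b |= n & 0x3F;           // 6 least significant bits
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80); // continuation indicator
            b = n & 0x7F;        // next 7 bits
            n >>>= 7;
        }
        out.write(b);
        return out.toByteArray();
    }

    /** Decodes a packed integer from the start of bytes (same steps as Example E-2). */
    static int decode(byte[] bytes) {
        int pos = 0;
        int b = bytes[pos++] & 0xFF;
        int n = b & 0x3F;
        int bits = 6;
        boolean negative = (b & 0x40) != 0;
        while ((b & 0x80) != 0) {
            b = bytes[pos++] & 0xFF;
            n |= (b & 0x7F) << bits;
            bits += 7;
        }
        return negative ? ~n : n;
    }

    /** Formats octets as upper-case hexadecimal without a prefix. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int v : new int[] {0, 1, 2, 99, 9999, -1, -2, -99, -9999}) {
            byte[] enc = encode(v);
            System.out.println(v + " -> 0x" + hex(enc) + " -> " + decode(enc));
        }
    }
}
```

Because a negative value is stored as its one's complement, -1 and 0 share the same value bits and differ only in the negative indicator.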

Type Identifiers

A Type Identifier is encoded in the binary stream as an Integer Value. Type Identifiers greater than or equal to zero are user Type Identifiers. Type Identifiers less than zero are predefined ("intrinsic") type identifiers.

Table E-4 lists the predefined identifiers.

Table E-4 Predefined Type Identifiers

Type ID        Description
-1 (0x40)      int16
-2 (0x41)      int32
-3 (0x42)      int64
-4 (0x43)      int128*
-5 (0x44)      float32
-6 (0x45)      float64
-7 (0x46)      float128*
-8 (0x47)      decimal32*
-9 (0x48)      decimal64*
-10 (0x49)     decimal128*
-11 (0x4A)     boolean
-12 (0x4B)     octet
-13 (0x4C)     octet-string
-14 (0x4D)     char
-15 (0x4E)     char-string
-16 (0x4F)     date
-17 (0x50)     year-month-interval*
-18 (0x51)     time
-19 (0x52)     time-interval*
-20 (0x53)     datetime
-21 (0x54)     day-time-interval*
-22 (0x55)     collection
-23 (0x56)     uniform-collection
-24 (0x57)     array
-25 (0x58)     uniform-array
-26 (0x59)     sparse-array
-27 (0x5A)     uniform-sparse-array
-28 (0x5B)     map
-29 (0x5C)     uniform-keys-map
-30 (0x5D)     uniform-map
-31 (0x5E)     identity
-32 (0x5F)     reference

Type Identifiers less than or equal to -33 are a combination of a type and a value. This form is used to reduce space for these commonly used values.

Table E-5 lists the type identifiers that combine type and value.

Table E-5 Type Identifiers that Combine a Type and a Value

Type ID        Description
-33 (0x60)     boolean:false
-34 (0x61)     boolean:true
-35 (0x62)     string:zero-length
-36 (0x63)     collection:empty
-37 (0x64)     reference:null
-38 (0x65)     floating-point:+infinity
-39 (0x66)     floating-point:-infinity
-40 (0x67)     floating-point:NaN
-41 (0x68)     int:-1
-42 (0x69)     int:0
-43 (0x6A)     int:1
-44 (0x6B)     int:2
-45 (0x6C)     int:3
-46 (0x6D)     int:4
-47 (0x6E)     int:5
-48 (0x6F)     int:6
-49 (0x70)     int:7
-50 (0x71)     int:8
-51 (0x72)     int:9
-52 (0x73)     int:10
-53 (0x74)     int:11
-54 (0x75)     int:12
-55 (0x76)     int:13
-56 (0x77)     int:14
-57 (0x78)     int:15
-58 (0x79)     int:16
-59 (0x7A)     int:17
-60 (0x7B)     int:18
-61 (0x7C)     int:19
-62 (0x7D)     int:20
-63 (0x7E)     int:21
-64 (0x7F)     int:22
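The ranges in Table E-4 and Table E-5 follow a regular pattern, which the following sketch makes explicit. It classifies a decoded Type Identifier, maps the combined identifiers int:-1 through int:22 to the integer they carry, and computes the single octet that encodes an identifier in the range -1 to -64. The class name TypeIdKinds and its method names are hypothetical helpers for illustration only.

```java
final class TypeIdKinds {
    /** Classifies a decoded Type Identifier by the ranges in Tables E-4 and E-5. */
    static String kind(int typeId) {
        if (typeId >= 0) {
            return "user";
        }
        return typeId >= -32 ? "intrinsic" : "intrinsic+value";
    }

    /** Returns the int carried by a combined identifier int:-1 .. int:22 (-41 .. -64). */
    static int compactIntValue(int typeId) {
        if (typeId > -41 || typeId < -64) {
            throw new IllegalArgumentException("not a combined int identifier: " + typeId);
        }
        return -42 - typeId;     // int:0 is -42, so the value is the distance from -42
    }

    /** Returns the single octet that encodes an identifier in the range -1 .. -64. */
    static int singleOctet(int typeId) {
        if (typeId > -1 || typeId < -64) {
            throw new IllegalArgumentException("not a single-octet identifier: " + typeId);
        }
        return 0x40 | ~typeId;   // negative indicator plus the 6-bit one's complement
    }

    public static void main(String[] args) {
        for (int id : new int[] {-2, -33, -42, -64}) {
            System.out.printf("%d: %s, octet 0x%02X%n", id, kind(id), singleOctet(id));
        }
    }
}
```

Every predefined identifier in the two tables fits in one octet because its one's complement needs at most 6 bits.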

Binary Formats for Predefined Types

Learn the binary formats for the predefined ("intrinsic") type identifiers that are supported with PIF-POF. The types are: Int, Decimal, Floating Point, Boolean, Octet, Octet String, Char, Char String, Date, Year-Month Interval, Time, Time Interval, Date-Time, Day-Time Interval, Collections, Arrays, Sparse Arrays, Key-Value Maps (Dictionaries), Identity, and Reference.

Int

Four signed integer types are supported: int16, int32, int64, and int128. If a type identifier for an integer type is encountered in the stream, it is immediately followed by an Integer Value.

The four signed integer types vary only by the length that is required to support the largest value of the type using the common "two's complement" binary format. The Type Identifier, one of int16, int32, int64, or int128, is followed by an Integer Value in the stream. If the Integer Value is outside of the range supported by the type (-2^15 to 2^15-1 for int16, -2^31 to 2^31-1 for int32, -2^63 to 2^63-1 for int64, or -2^127 to 2^127-1 for int128), then the result is undefined and may be bitwise truncation or an exception.

Additionally, some Type Identifiers combine the int designation with a value into a single octet for compactness. These Type Identifiers are not followed by an Integer Value in the stream, since the value is included in the Type Identifier.

Table E-6 illustrates these type identifiers.

Table E-6 Type Identifiers that Combine an int Data Type with a Value

Value     int16          int32          int64          int128
0         0x69           0x69           0x69           0x69
1         0x6A           0x6A           0x6A           0x6A
2         0x6B           0x6B           0x6B           0x6B
99        0x40A301       0x41A301       0x42A301       0x43A301
9999      0x408F9C01     0x418F9C01     0x428F9C01     0x438F9C01
-1        0x68           0x68           0x68           0x68
-2        0x4041         0x4141         0x4241         0x4341
-99       0x40E201       0x41E201       0x42E201       0x43E201
-9999     0x40CE9C01     0x41CE9C01     0x42CE9C01     0x43CE9C01
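The int32 column of Table E-6 can be reproduced with a short sketch that picks the combined type and value identifier when the value lies in the range -1 to 22, and otherwise writes the int32 Type Identifier followed by a packed integer. The class Int32Encoding and its methods are illustrative assumptions, not Coherence classes.

```java
import java.io.ByteArrayOutputStream;

final class Int32Encoding {
    /** Writes n in the packed integer format described in Integer Values. */
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    /** Encodes value as an int32 Value, using the single-octet form when possible. */
    static byte[] encodeInt32(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (value >= -1 && value <= 22) {
            writePackedInt(out, -42 - value);  // combined identifier int:-1 .. int:22
        } else {
            writePackedInt(out, -2);           // int32 Type Identifier (0x41)
            writePackedInt(out, value);        // the Integer Value itself
        }
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int v : new int[] {0, 1, 2, 99, 9999, -1, -2, -99, -9999}) {
            System.out.println(v + " -> 0x" + hex(encodeInt32(v)));
        }
    }
}
```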

The Java type equivalents are short (int16), int (int32), long (int64) and BigInteger (int128). Since BigInteger can represent much larger values, it is not possible to encode all BigInteger values in the int128 form; values outside the int128 range are not supported by this form, and would result in an exception or would use a different encoding, such as a string encoding.

Coercion of Integer Types

To enable the efficient representation of numeric data types, a stream recipient can coerce an encoded integer value into any of the following types:

Table E-7 Type IDs of Integer Types that can be Coerced into Other Types

Type ID        Description
-1 (0x40)      int16
-2 (0x41)      int32
-3 (0x42)      int64
-4 (0x43)      int128
-5 (0x44)      float32
-6 (0x45)      float64
-7 (0x46)      float128
-8 (0x47)      decimal32
-9 (0x48)      decimal64
-10 (0x49)     decimal128
-12 (0x4B)     octet
-14 (0x4D)     char

In other words, if the recipient reads any of the above types from the stream and it encounters an encoded integer value, it automatically converts that value into the expected type. This capability allows a set of common (that is, small-magnitude) octet, character, integer, decimal and floating-point values to be encoded using the single-octet integer form (Type Identifiers in the range -41 to -64).

For purposes of unsigned types, the integer value -1 is translated to 0xFF for the octet type, and to 0xFFFF for the char type. (In the case of the char type, this does unfortunately seem to imply a UTF-16 platform encoding; however, it does not violate any of the explicit requirements of the stream format.)

Decimal

There are three floating-point decimal types supported: decimal32, decimal64, and decimal128. If a type identifier for a decimal type is encountered in the stream, it is immediately followed by two packed integer values. The first integer value is the unscaled value, and the second is the scale. These values are equivalent to the parameters to the constructor of Java's BigDecimal class: java.math.BigDecimal(BigInteger unscaledVal, int scale).
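The correspondence with BigDecimal can be sketched as follows: a decimal value is split into its unscaled value and scale, and each is written as a packed integer after the Type Identifier. This is a minimal sketch under two assumptions that are not stated above: decimal64 is chosen arbitrarily as the type, and the unscaled value is required to fit in 32 bits so that the 32-bit packed integer writer can be reused. The class DecimalEncoding and its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.math.BigDecimal;
import java.math.BigInteger;

final class DecimalEncoding {
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    /** Writes decimal64 (0x48), then the unscaled value, then the scale. */
    static byte[] encodeDecimal64(BigDecimal d) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -9);
        // Assumption of this sketch: the unscaled value fits in an int;
        // intValueExact throws ArithmeticException otherwise.
        writePackedInt(out, d.unscaledValue().intValueExact());
        writePackedInt(out, d.scale());
        return out.toByteArray();
    }

    /** Rebuilds the decimal from the two integers read back from the stream. */
    static BigDecimal fromParts(int unscaled, int scale) {
        return new BigDecimal(BigInteger.valueOf(unscaled), scale);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        BigDecimal d = new BigDecimal("123.45");   // unscaled 12345, scale 2
        System.out.println(d + " -> 0x" + hex(encodeDecimal64(d)));
        System.out.println(fromParts(12345, 2));
    }
}
```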

In addition to the coercion of integer values into decimal values supported as described in Coercion of Integer Types, the constant type+value identifiers listed in Table E-8 are used to indicate special values supported by IEEE 754r.

Table E-8 Type Identifiers that can Indicate Decimal Values

Type ID        Description
-38 (0x65)     floating-point:+infinity
-39 (0x66)     floating-point:-infinity
-40 (0x67)     floating-point:NaN

Java does not provide a standard (that is, portable) decimal type; rather, it has the awkward BigDecimal implementation that was intended originally for internal use in Java's cryptographic infrastructure. In Java, the decimal values for positive and negative infinity, and not-a-number (NaN), are not supported.

Floating Point

Three base-2 floating point types are supported: float32, float64, and float128. If a type identifier for a floating point type is encountered in the stream, it is immediately followed by a fixed-length floating point value, whose binary form is defined by IEEE 754 or IEEE 754r. IEEE 754 format is used to write floating point numbers to the stream, and IEEE 754r format is used for the float128 type.

In addition to the coercion of integer values into floating point values as described in Coercion of Integer Types, the constants in Table E-9 are used to indicate special values supported by IEEE 754.

Table E-9 Type Identifiers that can Indicate IEEE 754 Special Values

Type ID        Description
-38 (0x65)     floating-point:+infinity
-39 (0x66)     floating-point:-infinity
-40 (0x67)     floating-point:NaN

Other special values defined by IEEE-754 are encoded using the full 32-bit, 64-bit or 128-bit format, and may not be supported on all platforms. Specifically, by not providing any means to differentiate among them, Java only supports one NaN value.

Boolean

If the type identifier for Boolean occurs in the stream, it is followed by an integer value, which represents the Boolean value false for the integer value of zero, or true for all other integer values.

While it is possible to encode Boolean values as described in Coercion of Integer Types, the only values for the Boolean type are true and false. As such, the only expected binary formats for Boolean values are the predefined (and compact) forms described in Table E-10.

Table E-10 Type Identifiers that can Indicate Boolean Values

Type ID        Description
-33 (0x60)     boolean:false
-34 (0x61)     boolean:true

Octet

If the type identifier for Octet occurs in the stream, it is followed by the octet value itself, which is by definition in the range 0 to 255 (0x00 to 0xFF). The compact form of integer values can be used for Octet values, with the integer value -1 being translated as 0xFF. See Coercion of Integer Types.

Table E-11 lists the integer values that may be used as Octet values.

Table E-11 Integer Values that may be Used for Octet Values

Value          Octet
0 (0x00)       0x69
1 (0x01)       0x6A
2 (0x02)       0x6B
99 (0x63)      0x4B63
254 (0xFE)     0x4BFE
255 (0xFF)     0x68
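The rows of Table E-11 can be reproduced with the following sketch, which treats 0xFF as the integer -1, uses the single-octet integer identifiers for -1 through 22, and otherwise writes the octet Type Identifier (0x4B) followed by the raw octet. The class OctetEncoding and its method names are illustrative assumptions.

```java
import java.io.ByteArrayOutputStream;

final class OctetEncoding {
    /** Encodes an octet (0..255) as a POF Value, preferring the compact form. */
    static byte[] encodeOctet(int octet) {
        if (octet < 0 || octet > 0xFF) {
            throw new IllegalArgumentException("octet out of range: " + octet);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int asInt = (octet == 0xFF) ? -1 : octet;   // 0xFF travels as int:-1
        if (asInt <= 22) {
            // Combined identifiers int:-1 .. int:22 are type IDs -41 .. -64,
            // each of which packs into the single octet 0x40 | ~typeId.
            int typeId = -42 - asInt;
            out.write(0x40 | ~typeId);
        } else {
            out.write(0x4B);                        // octet Type Identifier (-12)
            out.write(octet);                       // the octet value itself
        }
        return out.toByteArray();
    }

    /** Coerces a decoded integer value to an octet, so that -1 becomes 0xFF. */
    static int coerceToOctet(int n) {
        return n & 0xFF;
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int v : new int[] {0, 1, 2, 99, 254, 255}) {
            System.out.println(v + " -> 0x" + hex(encodeOctet(v)));
        }
    }
}
```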

Octet String

If the type identifier for Octet String occurs in the stream, it is followed by an Integer Value for the length n of the string, and then n octet values.

An Octet String of zero length is encoded using the "string:zero-length" Type Identifier.

Char

If the type identifier for Char occurs in the stream, it is followed by a UTF-8 encoded character. The compact form of integer values may be used for Char values, with the integer value -1 being translated as 0xFFFF. See Coercion of Integer Types.

Note:

POF optimizes the storage of String data by using only one byte for each character when possible. Custom POF character codecs (ASCII for example) are not required and do not result in better performance.

Example E-3 illustrates writing a character value to an octet stream.

Example E-3 Writing a Character Value to an Octet Stream

public static void writeChar(DataOutput out, int ch)
        throws IOException
    {
    if (ch >= 0x0001 && ch <= 0x007F)
        {
        // 1-byte format: 0xxx xxxx
        out.write((byte) ch);
        }
    else if (ch <= 0x07FF)
        {
        // 2-byte format: 110x xxxx, 10xx xxxx
        out.write((byte) (0xC0 | ((ch >>> 6) & 0x1F)));
        out.write((byte) (0x80 | ((ch ) & 0x3F)));
        }
    else
        {
        // 3-byte format: 1110 xxxx, 10xx xxxx, 10xx xxxx
        out.write((byte) (0xE0 | ((ch >>> 12) & 0x0F)));
        out.write((byte) (0x80 | ((ch >>> 6) & 0x3F)));
        out.write((byte) (0x80 | ((ch ) & 0x3F)));
        }
    }

Example E-4 illustrates reading a character value from an octet stream.

Example E-4 Reading a Character Value from an Octet Stream

public static char readChar(DataInput in)
        throws IOException
    {
    char ch;

    int b = in.readUnsignedByte();
    switch ((b & 0xF0) >>> 4)
        {
        case 0x0: case 0x1: case 0x2: case 0x3:
        case 0x4: case 0x5: case 0x6: case 0x7:
            // 1-byte format: 0xxx xxxx
            ch = (char) b;
            break;

        case 0xC: case 0xD:
            {
            // 2-byte format: 110x xxxx, 10xx xxxx
            int b2 = in.readUnsignedByte();
            if ((b2 & 0xC0) != 0x80)
                {
                throw new UTFDataFormatException();
                }
            ch = (char) (((b & 0x1F) << 6) | b2 & 0x3F);
            break;
            }

        case 0xE:
            {
            // 3-byte format: 1110 xxxx, 10xx xxxx, 10xx xxxx
            int n = in.readUnsignedShort();
            int b2 = n >>> 8;
            int b3 = n & 0xFF;
            if ((b2 & 0xC0) != 0x80 || (b3 & 0xC0) != 0x80)
                {
                throw new UTFDataFormatException();
                }
            ch = (char) (((b & 0x0F) << 12) |
                        ((b2 & 0x3F) << 6) |
                          b3 & 0x3F);
            break;
            }

        default:
            throw new UTFDataFormatException(
                    "illegal leading UTF byte: " + b);
        }

    return ch;
    }

Char String

If the type identifier for Char String occurs in the stream, it is followed by an Integer Value for the length n of the UTF-8 representation string in octets, and then n octet values composing the UTF-8 encoding described above. Note that the format length-encodes the octet length, not the character length.

A Char String of zero length is encoded using the string:zero-length Type Identifier. Table E-12 illustrates the Char String formats.

Table E-12 Values for Char String Formats

Values          Char String Format
zero length     0x62 (or 0x4E00)
"ok"            0x4E026F6B
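The difference between octet length and character length is easiest to see with a non-ASCII character. The following sketch encodes a Char String with the standard UTF-8 charset of the JDK, which is an assumption of this example: it produces the same octets as the writer in Example E-3 for characters from U+0001 through U+FFFF outside the surrogate range, but not for U+0000 or for supplementary characters. The class CharStringEncoding is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class CharStringEncoding {
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    /** Encodes s as a Char String Value; the length prefix counts octets. */
    static byte[] encodeCharString(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (s.isEmpty()) {
            writePackedInt(out, -35);              // string:zero-length (0x62)
            return out.toByteArray();
        }
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writePackedInt(out, -15);                  // char-string (0x4E)
        writePackedInt(out, utf8.length);          // octet length, not s.length()
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("0x" + hex(encodeCharString("")));
        System.out.println("0x" + hex(encodeCharString("ok")));
        // One character (U+00E9) that occupies two octets in UTF-8.
        System.out.println("0x" + hex(encodeCharString("\u00E9")));
    }
}
```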

Date

Date values are passed using ISO8601 semantics. If the type identifier for Date occurs in the stream, it is followed by three Integer Values for the year, month and day, in the ranges as defined by ISO8601.

Year-Month Interval

If the type identifier for Year-Month Interval occurs in the stream, it is followed by two Integer Values for the number of years and the number of months in the interval.

Time

Time values are passed using ISO8601 semantics. If the type identifier for Time occurs in the stream, it is followed by five Integer Values, which may be followed by two more Integer Values. The first four Integer Values are the hour, minute, second and fractional second values. Fractional seconds are encoded in one of three ways:

  • 0 indicates no fractional seconds.

  • [1..999] indicates the number of milliseconds.

  • [-1..-999999999] indicates the negated number of nanoseconds.

The fifth Integer Value is a time zone indicator, encoded in one of three ways:

  • 0 indicates no time zone.

  • 1 indicates Universal Coordinated Time (UTC).

  • 2 indicates a time zone offset, which is followed by two more Integer Values for the hour offset and minute offset, as described by ISO8601.

The variable encoding of fractional seconds and time zones adds complexity to the parsing of a Time Value, but provides much more complete support for the ISO8601 standard and for the variability in the precision of clocks, while achieving a high degree of binary compactness. While time values tend to have no fractional encoding or millisecond encoding, the trend over time is toward higher time resolution.
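The fractional second and time zone rules above can be sketched as follows. The sketch assumes that the caller supplies the fraction as a nanosecond count from 0 to 999999999, and that a fraction is written as milliseconds whenever it is a whole number of milliseconds; both choices, the class TimeEncoding, and its method names are illustrative assumptions.

```java
import java.io.ByteArrayOutputStream;

final class TimeEncoding {
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    /** Maps nanoseconds to the fraction field: 0, milliseconds, or negated nanoseconds. */
    static int encodeFraction(int nanos) {
        if (nanos == 0) {
            return 0;
        }
        return nanos % 1_000_000 == 0 ? nanos / 1_000_000 : -nanos;
    }

    /** Maps a decoded fraction field back to nanoseconds. */
    static int decodeFractionToNanos(int fraction) {
        return fraction >= 0 ? fraction * 1_000_000 : -fraction;
    }

    private static ByteArrayOutputStream start(int h, int m, int s, int nanos) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -18);                  // time Type Identifier (0x51)
        writePackedInt(out, h);
        writePackedInt(out, m);
        writePackedInt(out, s);
        writePackedInt(out, encodeFraction(nanos));
        return out;
    }

    /** Time in UTC: time zone indicator 1, no offset values. */
    static byte[] encodeUtc(int h, int m, int s, int nanos) {
        ByteArrayOutputStream out = start(h, m, s, nanos);
        writePackedInt(out, 1);
        return out.toByteArray();
    }

    /** Time with an offset: indicator 2 followed by hour and minute offsets. */
    static byte[] encodeWithOffset(int h, int m, int s, int nanos, int offH, int offM) {
        ByteArrayOutputStream out = start(h, m, s, nanos);
        writePackedInt(out, 2);
        writePackedInt(out, offH);
        writePackedInt(out, offM);
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("0x" + hex(encodeUtc(12, 30, 45, 250_000_000)));
        System.out.println("0x" + hex(encodeWithOffset(12, 30, 45, 0, 5, 30)));
    }
}
```

A reader inspects the fifth Integer Value before deciding whether two more values follow, which is why the time zone indicator precedes the offsets.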

Time Interval

If the type identifier for Time Interval occurs in the stream, it is followed by four Integer Values for the number of hours, minutes, seconds and nanoseconds in the interval.

Date-Time

Date-Time values are passed using ISO8601 semantics. If the type identifier for Date-Time occurs in the stream, it is followed by eight or ten Integer Values, which correspond to the Integer Values that compose the Date and Time values.

Coercion of Date and Time Types

A Date Value can be coerced into a Date-Time Value. A Time Value can be coerced into a Date-Time Value. A Date-Time Value can be coerced into either a Date Value or a Time Value.

Day-Time Interval

If the type identifier for Day-Time Interval occurs in the stream, it is followed by five Integer Values for the number of days, hours, minutes, seconds and nanoseconds in the interval.

Collections

A collection of values, such as a bag, a set, or a list, is encoded in a POF stream using the Collection type. Immediately following the Type Identifier, the stream contains the Collection Size, an Integer Value indicating the number of values in the Collection, which is greater than or equal to zero. Following the Collection Size is the first value in the Collection (if any), which is itself encoded as a Value. The values in the Collection are contiguous, and there are exactly n values in the stream, where n equals the Collection Size.

If all the values in the Collection have the same type, then the Uniform Collection format is used. Immediately following the Type Identifier (uniform-collection), the uniform type of the values in the collection is written to the stream, followed by the Collection Size n as an Integer Value, followed by n values without their Type Identifiers. Note that values in a Uniform Collection cannot be assigned an identity, and that (as a side-effect of the explicit type encoding) an empty Uniform Collection has an explicit content type.

Table E-13 illustrates examples of Collection and Uniform Collection formats for several values.

Table E-13 Collection and Uniform Collection Formats for Various Values

Values       Collection Format      Uniform Collection Format
no value     0x63 (or 0x5500)       not applicable (n/a)
1            0x55016A               0x56410101
1,2,3        0x55036A6B6C           0x564103010203
1, "ok"      0x55026A4E026F6B       n/a
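The two layouts in Table E-13 differ only in where the element type is written. The following sketch encodes a list of int32 values both ways: the Collection form writes each element as a full Value, while the Uniform Collection form writes the element type once and then only the packed integers. The class CollectionEncoding and its methods are hypothetical, and only int32 elements are handled.

```java
import java.io.ByteArrayOutputStream;

final class CollectionEncoding {
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    /** Writes an int32 as a full Value (compact identifier or 0x41 plus packed integer). */
    static void writeIntValue(ByteArrayOutputStream out, int v) {
        if (v >= -1 && v <= 22) {
            writePackedInt(out, -42 - v);
        } else {
            writePackedInt(out, -2);
            writePackedInt(out, v);
        }
    }

    /** Collection form: type, size, then each element with its own type. */
    static byte[] encodeCollection(int... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (values.length == 0) {
            writePackedInt(out, -36);              // collection:empty (0x63)
            return out.toByteArray();
        }
        writePackedInt(out, -22);                  // collection (0x55)
        writePackedInt(out, values.length);
        for (int v : values) {
            writeIntValue(out, v);
        }
        return out.toByteArray();
    }

    /** Uniform form: type, element type, size, then untyped elements. */
    static byte[] encodeUniformInt32Collection(int... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -23);                  // uniform-collection (0x56)
        writePackedInt(out, -2);                   // element type int32 (0x41)
        writePackedInt(out, values.length);
        for (int v : values) {
            writePackedInt(out, v);
        }
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("0x" + hex(encodeCollection(1, 2, 3)));
        System.out.println("0x" + hex(encodeUniformInt32Collection(1, 2, 3)));
    }
}
```

For small integers the uniform form is not shorter, because the compact identifiers already occupy one octet each; the saving appears for values that need an explicit Type Identifier.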

Arrays

An indexed array of values is encoded in a POF stream using the Array type. Immediately following the Type Identifier, the stream contains the Array Size, an Integer Value indicating the number of elements in the Array, which must be greater than or equal to zero. Following the Array Size is the value of the first element of the Array (the zero index), if there is at least one element in the array, which is itself encoded as a Value. The values of the elements of the Array are contiguous, and there must be exactly n values in the stream, where n equals the Array Size.

If all the values of the elements of the Array have the same type, then the Uniform Array format is used. Immediately following the Type Identifier (uniform-array), the uniform type of the values of the elements of the Array is written to the stream, followed by the Array Size n as an Integer Value, followed by n values without their Type Identifiers. Note that values in a Uniform Array cannot be assigned an identity, and that (as a side-effect of the explicit type encoding) an empty Uniform Array has an explicit array element type.

Table E-14 illustrates examples of Array and Uniform Array formats for several values.

Table E-14 Array and Uniform Array Formats for Various Values

Values       Array Format           Uniform Array Format
no value     0x63 (or 0x5700)       0x63 (or 0x584100, assuming an element type of int32)
1            0x57016A               0x58410101
1,2,3        0x57036A6B6C           0x584103010203
1, "ok"      0x57026A4E026F6B       n/a

Sparse Arrays

For arrays whose element values are sparse, the Sparse Array format allows indexes to be explicitly encoded, implying that any missing indexes have a default value. The default value is false for the Boolean type, zero for all numeric, octet and char types, and null for all reference types. The format for the Sparse Array is the Type Identifier (sparse-array), followed by the Array Size n as an Integer Value, followed by not more than n index/value pairs, each of which is composed of an array index encoded as an Integer Value i (0 <= i < n) whose value is greater than the previous element's array index, and an element value encoded as a Value; the Sparse Array is finally terminated with an illegal index of -1.

If all the values of the elements of the Sparse Array have the same type, then the Uniform Sparse Array format is used. Immediately following the Type Identifier (uniform-sparse-array), the uniform type of the values of the elements of the Sparse Array is written to the stream, followed by the Array Size n as an Integer Value, followed by not more than n index/value pairs, each of which is composed of an array index encoded as an Integer Value i (0 <= i < n) whose value is greater than the previous element's array index, and an element value encoded as a Value without a Type Identifier; the Uniform Sparse Array is finally terminated with an illegal index of -1. Note that values in a Uniform Sparse Array cannot be assigned an identity, and that (as a side-effect of the explicit type encoding) an empty Uniform Sparse Array has an explicit array element type.

Table E-15 illustrates examples of Sparse Array and Uniform Sparse Array formats for several values.

Table E-15 Sparse Array and Uniform Sparse Array Formats for Various Values

Values          Sparse Array Format         Uniform Sparse Array Format
no value        0x63 (or 0x590040)          0x63 (or 0x5A410040, assuming an element type of int32)
1               0x5901006A40                0x5A4101000140
1,2,3           0x5903006A016B026C40        0x5A410300010102020340
1,,,,5,,,,9     0x5909006A046E087240        0x5A410900010405080940
1,,,,"ok"       0x5905006A044E026F6B40      n/a
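The index/value layout and the -1 terminator can be seen in the following sketch, which encodes an Integer array whose null entries stand for missing (default) elements. The class SparseArrayEncoding and its methods are illustrative assumptions, and only int32 elements are handled.

```java
import java.io.ByteArrayOutputStream;

final class SparseArrayEncoding {
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    /** Writes an int32 as a full Value (compact identifier or 0x41 plus packed integer). */
    static void writeIntValue(ByteArrayOutputStream out, int v) {
        if (v >= -1 && v <= 22) {
            writePackedInt(out, -42 - v);
        } else {
            writePackedInt(out, -2);
            writePackedInt(out, v);
        }
    }

    /** Encodes elements in either form; null entries are skipped. */
    static byte[] encode(Integer[] elements, boolean uniform) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (uniform) {
            writePackedInt(out, -27);              // uniform-sparse-array (0x5A)
            writePackedInt(out, -2);               // element type int32 (0x41)
        } else {
            writePackedInt(out, -26);              // sparse-array (0x59)
        }
        writePackedInt(out, elements.length);      // Array Size n
        for (int i = 0; i < elements.length; i++) {
            if (elements[i] == null) {
                continue;                          // missing index: default value
            }
            writePackedInt(out, i);                // strictly increasing index
            if (uniform) {
                writePackedInt(out, elements[i]);  // value without Type Identifier
            } else {
                writeIntValue(out, elements[i]);   // value as a full Value
            }
        }
        writePackedInt(out, -1);                   // terminator (0x40)
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Integer[] sparse = {1, null, null, null, 5, null, null, null, 9};
        System.out.println("0x" + hex(encode(sparse, false)));
        System.out.println("0x" + hex(encode(sparse, true)));
    }
}
```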

Key-Value Maps (Dictionaries)

For key/value pairs, a Key-Value Map (also known as a Dictionary data structure) format is used. There are three forms of the Key-Value Map binary encoding:

  • The generic map encoding is a sequence of keys and values;

  • The uniform-keys-map encoding is a sequence of keys of a uniform type and their corresponding values;

  • The uniform-map encoding is a sequence of keys of a uniform type and their corresponding values of a uniform type.

The format for the Key-Value Map is the Type Identifier (map), followed by the Key-Value Map Size n as an Integer Value, followed by n key/value pairs, each of which is composed of a key encoded as Value, and a corresponding value encoded as a Value.

Table E-16 illustrates several examples of key/value pairs and their corresponding binary format.

Table E-16 Binary Formats for Key/Value Pairs

Values              Binary Format
no value            0x63 (or 0x5B00)
1="ok"              0x5B016A4E026F6B
1="ok", 2="no"      0x5B026A4E026F6B6B4E026E6F

If all of the keys of the Key-Value Map are of a uniform type, then the encoding uses a more compact format, starting with the Type Identifier (uniform-keys-map), followed by the Type Identifier for the uniform type of the keys of the Key-Value Map, followed by the Key-Value Map Size n as an Integer Value, followed by n key/value pairs, each of which is composed of a key encoded as a Value without a Type Identifier, and a corresponding value encoded as a Value.

Table E-17 illustrates several examples of the binary formats for Key/Value pairs where the Keys are of uniform type.

Table E-17 Binary Formats for Key/Value Pairs where Keys are of Uniform Type

Values              Binary Format
no value            0x63 (or 0x5C4100)
1="ok"              0x5C4101014E026F6B
1="ok", 2="no"      0x5C4102014E026F6B024E026E6F

If all of the keys of the Key-Value Map are of a uniform type, and all the corresponding values of the map are also of a uniform type, then the encoding uses a more compact format, starting with the Type Identifier (uniform-map), followed by the Type Identifier for the uniform type of the keys of the Key-Value Map, followed by the Type Identifier for the uniform type of the values of the Key-Value Map, followed by the Key-Value Map Size n as an Integer Value, followed by n key/value pairs, each of which is composed of a key encoded as a Value without a Type Identifier, and a corresponding value encoded as a Value without a Type Identifier.

Table E-18 illustrates several examples of the binary formats for Key/Value pairs where the Keys and Values are of uniform type.

Table E-18 Binary Formats for Key/Value Pairs where Keys and Values are of Uniform Type

Values              Binary Format
no value            0x63 (or 0x5D414E00)
1="ok"              0x5D414E0101026F6B
1="ok", 2="no"      0x5D414E0201026F6B02026E6F
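The three map forms differ only in which Type Identifiers are hoisted out of the entries. The following sketch encodes a map from int32 keys to Char String values in each form. The class MapEncoding, its Form enum, and the restriction to these key and value types are assumptions made for illustration; the strings are encoded with the standard UTF-8 charset of the JDK.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class MapEncoding {
    enum Form { GENERIC, UNIFORM_KEYS, UNIFORM }

    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    /** Writes an int32 key, with or without its Type Identifier. */
    static void writeKey(ByteArrayOutputStream out, int key, boolean typed) {
        if (!typed) {
            writePackedInt(out, key);
        } else if (key >= -1 && key <= 22) {
            writePackedInt(out, -42 - key);        // compact int identifier
        } else {
            writePackedInt(out, -2);               // int32 (0x41)
            writePackedInt(out, key);
        }
    }

    /** Writes a Char String value, with or without its Type Identifier. */
    static void writeString(ByteArrayOutputStream out, String s, boolean typed) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (typed && utf8.length == 0) {
            writePackedInt(out, -35);              // string:zero-length (0x62)
            return;
        }
        if (typed) {
            writePackedInt(out, -15);              // char-string (0x4E)
        }
        writePackedInt(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    static byte[] encode(Map<Integer, String> map, Form form) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        switch (form) {
            case GENERIC:
                writePackedInt(out, -28);          // map (0x5B)
                break;
            case UNIFORM_KEYS:
                writePackedInt(out, -29);          // uniform-keys-map (0x5C)
                writePackedInt(out, -2);           // key type int32
                break;
            default:
                writePackedInt(out, -30);          // uniform-map (0x5D)
                writePackedInt(out, -2);           // key type int32
                writePackedInt(out, -15);          // value type char-string
                break;
        }
        writePackedInt(out, map.size());
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            writeKey(out, e.getKey(), form == Form.GENERIC);
            writeString(out, e.getValue(), form != Form.UNIFORM);
        }
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> map = new LinkedHashMap<>();
        map.put(1, "ok");
        map.put(2, "no");
        for (Form form : Form.values()) {
            System.out.println(form + " -> 0x" + hex(encode(map, form)));
        }
    }
}
```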

Identity

If the type identifier for Identity occurs in the stream, it is followed by an Integer Value, which is the Identity. Following the Identity is the value that is being identified, which is itself encoded as a Value.

Any value within a POF stream that occurs multiple times is labeled with an Identity, and subsequent instances of that value within the same POF stream are replaced with a Reference. For platforms that support "by reference" semantics, the identity represents a serialized form of the actual object identity.

An Identity is an Integer Value that is greater than or equal to zero. A value within the POF stream has at most one Identity. Values within a uniform data structure cannot be assigned an identity, because they are written without a Type Identifier.

Reference

A Reference is a pointer to an Identity that has been encountered inside the current POF stream, or a null pointer.

For platforms that support "by reference" semantics, the reference in the POF stream becomes a reference in the realized (deserialized) object, and a null reference in the POF stream becomes a null reference in the realized object. For platforms that do not support "by reference" semantics, and for cases in which a null reference is encountered in the POF stream for a non-reference value (for example, a primitive property in Java), the default value for the type of value is used.

Table E-19 illustrates examples of binary formats for several "by reference" semantics.

Table E-19 Binary Formats for "By Reference" Semantics

Value       Binary Format
Id #1       0x5F01
Id #350     0x5F9E05
null        0x64

Support for forward and outer references is not required by POF. In POF, both the identity that is referenced and the value that it identifies have already occurred within the POF stream. That is, a reference is not made to an identity that has not yet been encountered (a forward reference), and a reference is not made within a complex value (such as a collection or a user type) to that complex value itself (an outer reference).
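The Identity and Reference layouts can be sketched in a few lines. A Reference is the reference Type Identifier (-32, 0x5F) followed by the Identity as a packed integer, a null reference is the single reference:null identifier from Table E-5 (-37, 0x64), and an identified value is the identity Type Identifier (-31, 0x5E), the Identity, and then the already encoded Value. The class ReferenceEncoding and its methods are hypothetical helpers.

```java
import java.io.ByteArrayOutputStream;

final class ReferenceEncoding {
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    private static void checkIdentity(int identity) {
        if (identity < 0) {
            throw new IllegalArgumentException("identity must be >= 0: " + identity);
        }
    }

    /** Labels an already encoded Value with an Identity. */
    static byte[] encodeIdentified(int identity, byte[] encodedValue) {
        checkIdentity(identity);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -31);                  // identity (0x5E)
        writePackedInt(out, identity);
        out.write(encodedValue, 0, encodedValue.length);
        return out.toByteArray();
    }

    /** Points back at a Value that was labeled earlier in the same stream. */
    static byte[] encodeReference(int identity) {
        checkIdentity(identity);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -32);                  // reference (0x5F)
        writePackedInt(out, identity);
        return out.toByteArray();
    }

    /** Encodes a null reference with the combined identifier reference:null. */
    static byte[] encodeNullReference() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -37);                  // reference:null (0x64)
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("0x" + hex(encodeIdentified(1, new byte[] {0x6A})));
        System.out.println("0x" + hex(encodeReference(1)));
        System.out.println("0x" + hex(encodeReference(350)));
        System.out.println("0x" + hex(encodeNullReference()));
    }
}
```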

Binary Format for User Types

All non-intrinsic types are referred to as User Types. User Types are composed of zero or more indexed values (also known as fields, properties, and attributes), each of which has a Type Identifier. Furthermore, User Types are versioned, supporting both forward and backward compatibility.

User Types have a Type Identifier with a value greater than or equal to zero. The Type Identifier has no explicit or self-describing meaning within the stream itself; in other words, a Value does not contain a type (or "class") definition. Instead, the encoder (the sender) and the decoder (the receiver) share an implicit understanding, called a Context, which includes the necessary metadata, including the user type definitions.

The binary format for a User Type is very similar to that of a Sparse Array; conceptually, a User Type can be considered a Sparse Array of property values. The format for User Types is the Type Identifier (an Integer Value greater than or equal to zero), followed by the Version Identifier (an Integer Value greater than or equal to zero), followed by index/value pairs, each of which is composed of a Property Index encoded as an Integer Value i (0 <= i) whose value is greater than the previous Property Index, and a Property Value encoded as a Value; the User Type is finally terminated with an illegal Property Index of -1.

Like the Sparse Array, any property that is not included as part of the User Type encoding is assumed to have a default value. The default value is false for the Boolean type, zero for all numeric, octet and char types, and null for all reference types.
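To make the layout concrete, the following sketch encodes a hypothetical user type with two properties: property 0 is an int32 age and property 1 is a Char String name. The type identifier 5, the version 0, the property layout, and the class UserTypeEncoding are all invented for this example; in practice they would come from the Context shared by the sender and the receiver. Properties that hold their default value are omitted, which the format permits.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class UserTypeEncoding {
    static final int PERSON_TYPE_ID = 5;   // hypothetical user Type Identifier
    static final int PERSON_VERSION = 0;   // hypothetical Version Identifier

    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    static void writeIntValue(ByteArrayOutputStream out, int v) {
        if (v >= -1 && v <= 22) {
            writePackedInt(out, -42 - v);          // compact int identifier
        } else {
            writePackedInt(out, -2);               // int32 (0x41)
            writePackedInt(out, v);
        }
    }

    static void writeStringValue(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length == 0) {
            writePackedInt(out, -35);              // string:zero-length (0x62)
            return;
        }
        writePackedInt(out, -15);                  // char-string (0x4E)
        writePackedInt(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    /** Encodes the hypothetical person type; default-valued properties are skipped. */
    static byte[] encodePerson(int age, String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, PERSON_TYPE_ID);
        writePackedInt(out, PERSON_VERSION);
        if (age != 0) {
            writePackedInt(out, 0);                // Property Index 0
            writeIntValue(out, age);
        }
        if (name != null) {
            writePackedInt(out, 1);                // Property Index 1
            writeStringValue(out, name);
        }
        writePackedInt(out, -1);                   // terminator (0x40)
        return out.toByteArray();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte x : bytes) {
            sb.append(String.format("%02X", x & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("0x" + hex(encodePerson(1, "ok")));
        System.out.println("0x" + hex(encodePerson(0, null)));
    }
}
```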

Versioning of User Types

Versioning of User Types supports the addition of properties to a User Type, but not the replacement or removal of properties that existed in previous versions of the User Type. By including the versioning capability as part of the general binary contract, it is possible to support both backward and forward compatibility.

When a sender sends a User Type value of a version v1 to a receiver that supports version v2 of the same User Type, the receiver uses default values for the additional properties of the User Type that exist in v2 but do not exist in v1.

When a sender sends a User Type value of a version v2 to a receiver that only supports version v1 of the same User Type, the receiver treats the additional properties of the User Type that exist in v2 but do not exist in v1 as opaque. If the receiver must store the value (persistently), or if the possibility exists that the value is ever sent at a later point, then the receiver stores those additional opaque properties for later encoding. Sufficient type information is included to allow the receiver to store the opaque property values in either a typed or binary form; when the receiver re-encodes the User Type, it must do so using the Version Identifier v2, since it is including the unaltered v2 properties.