E The PIF-POF Binary Format
This appendix includes the following sections:
- Overview of the PIF-POF Binary Format
  The Portable Object Format (POF) allows object values to be encoded into a binary stream in such a way that the platform/language origin of the object value is both irrelevant and unknown.
- Stream Format
  The PIF-POF stream format is octet-based; a PIF-POF stream is a sequence of octet values.
- Binary Formats for Predefined Types
  Learn the binary formats for the predefined ("intrinsic") type identifiers that are supported with PIF-POF.
- Binary Format for User Types
  All non-intrinsic types are referred to as User Types.
Overview of the PIF-POF Binary Format
Stream Format
A PIF stream contains exactly one Invocation. An Invocation consists of an initial POF stream that contains an Integer Value for the remaining length of the Invocation, immediately followed by a POF stream that contains an Integer Value that is the conversation identifier, immediately followed by a POF stream that contains a User Type value that is the message object. The remaining length indicates the total number of octets used to encode the conversation identifier and the message object; the remaining length is provided so that a process receiving an Invocation can determine when the Invocation has been fully received. The conversation identifier is used to support multiple logical clients and services multiplexed through a single connection, just as TCP/IP provides multiple logical port numbers for a given IP address. The message object is defined by the particular high-level conversational protocol.
A POF stream contains exactly one Value. The Value contains a Type Identifier, and if the Type Identifier does not imply a value, then it is immediately trailed by a data structure whose format is defined by the Type Identifier.
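The Invocation layout described above can be sketched as follows. This is a minimal illustration, not Coherence code: it assumes that the remaining length and the conversation identifier are written as bare packed integers (the format defined under Integer Values) and that the message object has already been encoded by the caller. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration; not part of the Coherence API.
class InvocationFraming {
    // Packed-integer writer, same logic as Example E-1.
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;            // negative indicator
            n = ~n;
        }
        b |= n & 0x3F;           // 6 least significant bits
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80); // continuation indicator
            b = n & 0x7F;        // next 7 bits
            n >>>= 7;
        }
        out.write(b);
    }

    // Frames an already-encoded message object as one Invocation:
    // remaining length, conversation identifier, message object.
    static byte[] frame(int conversationId, byte[] encodedMessage) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        writePackedInt(body, conversationId);
        body.write(encodedMessage, 0, encodedMessage.length);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, body.size()); // octets used by id + message
        out.write(body.toByteArray(), 0, body.size());
        return out.toByteArray();
    }
}
```

A receiver can read the first packed integer and then wait until that many further octets have arrived before decoding the rest of the Invocation.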
This section includes the following topics:
- Integer Values
- Type Identifiers
Integer Values
The stream format relies extensively on the ability to encode integer values in a compact form. Coherence refers to this integer binary format as a packed integer. This format uses an initial octet and zero or more trailing octets as necessary; it is a variable-length format.
Table E-1 describes the three regions in the first octet.
Table E-1 Regions in the First Octet of a Packed Integer
Region Mask | Description |
---|---|
0x80 | Continuation indicator |
0x40 | Negative indicator |
0x3F | Integer value (6 binary LSDs) |
Table E-2 describes the two regions in the trailing octets.
Table E-2 Regions in the Trailing Octet of a Packed Integer
Region Mask | Description |
---|---|
0x80 | Continuation indicator |
0x7F | Integer value (next 7 binary LSDs) |
Example E-1 illustrates writing a 32-bit integer value to an octet stream as supported in Coherence.
Example E-1 Writing a 32-bit Integer Value to an Octet Stream
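    public static void writeInt(DataOutput out, int n) throws IOException {
        int b = 0;
        if (n < 0) {
            // set the negative indicator and encode the complement
            b = 0x40;
            n = ~n;
        }
        // the first octet carries the 6 least significant bits
        b |= (byte) (n & 0x3F);
        n >>>= 6;
        while (n != 0) {
            // more bits remain: set the continuation indicator
            b |= 0x80;
            out.writeByte(b);
            // each trailing octet carries the next 7 bits
            b = (n & 0x7F);
            n >>>= 7;
        }
        out.writeByte(b);
    }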
Example E-2 illustrates reading a 32-bit integer value from an octet stream as supported in Coherence.
Example E-2 Reading a 32-bit Integer Value from an Octet Stream
    public static int readInt(DataInput in) throws IOException {
        int b = in.readUnsignedByte();
        int n = b & 0x3F;               // 6 least significant bits
        int cBits = 6;
        boolean fNeg = (b & 0x40) != 0; // negative indicator
        while ((b & 0x80) != 0) {       // continuation indicator
            b = in.readUnsignedByte();
            n |= ((b & 0x7F) << cBits); // next 7 bits
            cBits += 7;
        }
        if (fNeg) {
            n = ~n;
        }
        return n;
    }
Integer values used within this documentation without an explicit Type Identifier are assumed to be 32-bit signed integer values that have a decimal range of -2^31 to 2^31-1.
Table E-3 illustrates some integer value examples.
Table E-3 Binary Formats for Integer Values Without a Type Identifier
Value | Binary Format |
---|---|
0 | 0x00 |
1 | 0x01 |
2 | 0x02 |
99 | 0xA301 |
9999 | 0x8F9C01 |
-1 | 0x40 |
-2 | 0x41 |
-99 | 0xE201 |
-9999 | 0xCE9C01 |
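The values in Table E-3 can be reproduced with a small round-trip check. The sketch below restates the logic of Example E-1 and Example E-2 against in-memory byte arrays; the class name and the `hex` helper are illustrative additions, not part of Coherence.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch; class and method names are hypothetical.
class PackedIntDemo {
    // Same logic as Example E-1, writing to an in-memory buffer.
    static byte[] encode(int n) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int b = 0;
        if (n < 0) {
            b = 0x40;            // negative indicator
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80); // continuation indicator
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
        return out.toByteArray();
    }

    // Same logic as Example E-2, reading from a byte array.
    static int decode(byte[] octets) {
        int pos = 0;
        int b = octets[pos++] & 0xFF;
        int n = b & 0x3F;
        int bits = 6;
        boolean negative = (b & 0x40) != 0;
        while ((b & 0x80) != 0) {
            b = octets[pos++] & 0xFF;
            n |= (b & 0x7F) << bits;
            bits += 7;
        }
        return negative ? ~n : n;
    }

    static String hex(byte[] octets) {
        StringBuilder sb = new StringBuilder("0x");
        for (byte o : octets) {
            sb.append(String.format("%02X", o & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int v : new int[] {0, 1, 2, 99, 9999, -1, -2, -99, -9999}) {
            byte[] octets = encode(v);
            System.out.println(v + " -> " + hex(octets) + " -> " + decode(octets));
        }
    }
}
```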
Type Identifiers
A Type Identifier is encoded in the binary stream as an Integer Value. Type Identifiers greater than or equal to zero are user Type Identifiers. Type Identifiers less than zero are predefined ("intrinsic") type identifiers.
Table E-4 lists the predefined identifiers.
Table E-4 Predefined Type Identifiers
Type ID | Description |
---|---|
-1 (0x40) | int16 |
-2 (0x41) | int32 |
-3 (0x42) | int64 |
-4 (0x43) | int128* |
-5 (0x44) | float32 |
-6 (0x45) | float64 |
-7 (0x46) | float128* |
-8 (0x47) | decimal32* |
-9 (0x48) | decimal64* |
-10 (0x49) | decimal128* |
-11 (0x4A) | boolean |
-12 (0x4B) | octet |
-13 (0x4C) | octet-string |
-14 (0x4D) | char |
-15 (0x4E) | char-string |
-16 (0x4F) | date |
-17 (0x50) | year-month-interval* |
-18 (0x51) | time |
-19 (0x52) | time-interval* |
-20 (0x53) | datetime |
-21 (0x54) | day-time-interval* |
-22 (0x55) | collection |
-23 (0x56) | uniform-collection |
-24 (0x57) | array |
-25 (0x58) | uniform-array |
-26 (0x59) | sparse-array |
-27 (0x5A) | uniform-sparse-array |
-28 (0x5B) | map |
-29 (0x5C) | uniform-keys-map |
-30 (0x5D) | uniform-map |
-31 (0x5E) | identity |
-32 (0x5F) | reference |
Type Identifiers less than or equal to -33 are a combination of a type and a value. This form is used to reduce space for these commonly used values.
Table E-5 lists the type identifiers that combine type and value.
Table E-5 Type Identifiers that Combine a Type and a Value
Type ID | Description |
---|---|
-33 (0x60) | boolean:false |
-34 (0x61) | boolean:true |
-35 (0x62) | string:zero-length |
-36 (0x63) | collection:empty |
-37 (0x64) | reference:null |
-38 (0x65) | floating-point:+infinity |
-39 (0x66) | floating-point:-infinity |
-40 (0x67) | floating-point:NaN |
-41 (0x68) | int:-1 |
-42 (0x69) | int:0 |
-43 (0x6A) | int:1 |
-44 (0x6B) | int:2 |
-45 (0x6C) | int:3 |
-46 (0x6D) | int:4 |
-47 (0x6E) | int:5 |
-48 (0x6F) | int:6 |
-49 (0x70) | int:7 |
-50 (0x71) | int:8 |
-51 (0x72) | int:9 |
-52 (0x73) | int:10 |
-53 (0x74) | int:11 |
-54 (0x75) | int:12 |
-55 (0x76) | int:13 |
-56 (0x77) | int:14 |
-57 (0x78) | int:15 |
-58 (0x79) | int:16 |
-59 (0x7A) | int:17 |
-60 (0x7B) | int:18 |
-61 (0x7C) | int:19 |
-62 (0x7D) | int:20 |
-63 (0x7E) | int:21 |
-64 (0x7F) | int:22 |
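The ranges in Table E-4 and Table E-5 suggest a simple way for a decoder to classify a Type Identifier once it has been read as an Integer Value. The following sketch is only an illustration of those ranges; the class, the method names, and the returned labels are hypothetical. For the identifiers -41 through -64, the embedded integer value follows directly from the table as -42 minus the identifier.

```java
// Illustrative sketch; names and labels are hypothetical.
class TypeIds {
    // Classifies a decoded Type Identifier by the ranges in Tables E-4 and E-5.
    static String classify(int typeId) {
        if (typeId >= 0) {
            return "user type";
        }
        if (typeId >= -32) {
            return "predefined type";
        }
        if (typeId >= -64) {
            return "predefined type and value";
        }
        return "not listed in this appendix";
    }

    // For identifiers -41..-64 (int:-1 .. int:22), recovers the integer value.
    static int compactIntValue(int typeId) {
        if (typeId > -41 || typeId < -64) {
            throw new IllegalArgumentException("not a compact int identifier: " + typeId);
        }
        return -42 - typeId;
    }
}
```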
Binary Formats for Predefined Types
This section includes the following topics:
- Int
- Coercion of Integer Types
- Decimal
- Floating Point
- Boolean
- Octet
- Octet String
- Char
- Char String
- Date
- Year-Month Interval
- Time
- Time Interval
- Date-Time
- Coercion of Date and Time Types
- Day-Time Interval
- Collections
- Arrays
- Sparse Arrays
- Key-Value Maps (Dictionaries)
- Identity
- Reference
Int
Four signed integer types are supported: int16, int32, int64, and int128. If a type identifier for an integer type is encountered in the stream, it is immediately followed by an Integer Value.
The four signed integer types vary only by the length that is required to support the largest value of the type using the common "two's complement" binary format. The Type Identifier, one of int16, int32, int64, or int128, is followed by an Integer Value in the stream. If the Integer Value is outside of the range supported by the type (-2^15 to 2^15-1 for int16, -2^31 to 2^31-1 for int32, -2^63 to 2^63-1 for int64, or -2^127 to 2^127-1 for int128), then the result is undefined and may be bitwise truncation or an exception.
Additionally, there are some Type Identifiers that combine the int designation with a value into a single byte for the purpose of compactness. As a result, these Type Identifiers are not followed by an Integer Value in the stream, since the value is included in the Type Identifier.
Table E-6 illustrates these type identifiers.
Table E-6 Type Identifiers that Combine an int Data Type with a Value
Value | int16 | int32 | int64 | int128 |
---|---|---|---|---|
0 | 0x69 | 0x69 | 0x69 | 0x69 |
1 | 0x6A | 0x6A | 0x6A | 0x6A |
2 | 0x6B | 0x6B | 0x6B | 0x6B |
99 | 0x40A301 | 0x41A301 | 0x42A301 | 0x43A301 |
9999 | 0x408F9C01 | 0x418F9C01 | 0x428F9C01 | 0x438F9C01 |
-1 | 0x68 | 0x68 | 0x68 | 0x68 |
-2 | 0x4041 | 0x4141 | 0x4241 | 0x4341 |
-99 | 0x40E201 | 0x41E201 | 0x42E201 | 0x43E201 |
-9999 | 0x40CE9C01 | 0x41CE9C01 | 0x42CE9C01 | 0x43CE9C01 |
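The int32 column of Table E-6 can be reproduced with a short encoder that prefers the single-octet type-and-value form when one exists. This is a sketch under the assumption that an encoder always chooses the most compact form; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch; class and method names are hypothetical.
class Int32Value {
    static final int T_INT32 = -2; // from Table E-4

    // Packed-integer writer, same logic as Example E-1.
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    // Encodes an int32 Value: compact form for -1..22, otherwise
    // the int32 Type Identifier followed by a packed integer.
    static byte[] encode(int n) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (n >= -1 && n <= 22) {
            writePackedInt(out, -42 - n); // int:-1 .. int:22
        } else {
            writePackedInt(out, T_INT32);
            writePackedInt(out, n);
        }
        return out.toByteArray();
    }
}
```

Because the compact identifiers carry no width, the same single octet is produced regardless of which of the four integer types the sender had in mind; the receiver coerces it to the type it expects.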
The Java type equivalents are short (int16), int (int32), long (int64) and BigInteger (int128). Since BigInteger can represent much larger values, it is not possible to encode all BigInteger values in the int128 form; values outside the int128 range are unsupported, and either result in an exception or require a different encoding, such as a string encoding.
Coercion of Integer Types
To enable the efficient representation of numeric data types, an integer value can be coerced by a stream recipient into any of the following types:
Table E-7 Type IDs of Integer Types that can be Coerced into Other Types
Type ID | Description |
---|---|
-1 (0x40) | int16 |
-2 (0x41) | int32 |
-3 (0x42) | int64 |
-4 (0x43) | int128 |
-5 (0x44) | float32 |
-6 (0x45) | float64 |
-7 (0x46) | float128 |
-8 (0x47) | decimal32 |
-9 (0x48) | decimal64 |
-10 (0x49) | decimal128 |
-12 (0x4B) | octet |
-14 (0x4D) | char |
In other words, if the recipient reads any of the above types from the stream and it encounters an encoded integer value, it automatically converts that value into the expected type. This capability allows a set of common (that is, small-magnitude) octet, character, integer, decimal and floating-point values to be encoded using the single-octet integer form (Type Identifiers in the range -41 to -64).
For purposes of unsigned types, the integer value -1 is translated to 0xFF for the octet type, and to 0xFFFF for the char type. (In the case of the char type, this does unfortunately seem to imply a UTF-16 platform encoding; however, it does not violate any of the explicit requirements of the stream format.)
Decimal
There are three floating-point decimal types supported: decimal32, decimal64, and decimal128. If a type identifier for a decimal type is encountered in the stream, it is immediately followed by two packed integer values. The first integer value is the unscaled value, and the second is the scale. These values are equivalent to the parameters to the constructor of Java's BigDecimal class: java.math.BigDecimal(BigInteger unscaledVal, int scale).
In addition to the coercion of integer values into decimal values described in Coercion of Integer Types, the constant type+value identifiers listed in Table E-8 are used to indicate special values supported by IEEE 754r.
Table E-8 Type Identifiers that can Indicate Decimal Values
Type ID | Description |
---|---|
-38 (0x65) | floating-point:+infinity |
-39 (0x66) | floating-point:-infinity |
-40 (0x67) | floating-point:NaN |
Java does not provide a standard (that is, portable) decimal type; rather, it has the awkward BigDecimal implementation that was intended originally for internal use in Java's cryptographic infrastructure. In Java, the decimal values for positive and negative infinity, and not-a-number (NaN), are not supported.
Floating Point
Three base-2 floating point types are supported: float32, float64, and float128. If a type identifier for a floating point type is encountered in the stream, it is immediately followed by a fixed-length floating point value, whose binary form is defined by IEEE 754/IEEE 754r. IEEE 754 format is used to write floating point numbers to the stream, and IEEE 754r format is used for the float128 type.
In addition to the coercion of integer values into floating point values as described in Coercion of Integer Types, the constants in Table E-9 are used to indicate special values supported by IEEE 754.
Table E-9 Type Identifiers that can Indicate IEEE 754 Special Values
Type ID | Description |
---|---|
-38 (0x65) | floating-point:+infinity |
-39 (0x66) | floating-point:-infinity |
-40 (0x67) | floating-point:NaN |
Other special values defined by IEEE 754 are encoded using the full 32-bit, 64-bit or 128-bit format, and may not be supported on all platforms. Specifically, by not providing any means to differentiate among them, Java only supports one NaN value.
Boolean
If the type identifier for Boolean occurs in the stream, it is followed by an integer value, which represents the Boolean value false for the integer value of zero, or true for all other integer values.
While it is possible to encode Boolean values as described in Coercion of Integer Types, the only values for the Boolean type are true and false. As such, the only expected binary formats for Boolean values are the predefined (and compact) forms described in Table E-10.
Table E-10 Type Identifiers that can Indicate Boolean Values
Type ID | Description |
---|---|
-33 (0x60) | boolean:false |
-34 (0x61) | boolean:true |
Octet
If the type identifier for Octet occurs in the stream, it is followed by the octet value itself, which is by definition in the range 0 to 255 (0x00 to 0xFF). The compact form of integer values can be used for Octet values, with the integer value -1 being translated as 0xFF. See Coercion of Integer Types.
Table E-11 lists the integer values that may be used as Octet values.
Table E-11 Integer Values that may be Used for Octet Values
Value | Octet |
---|---|
0 (0x00) | 0x69 |
1 (0x01) | 0x6A |
2 (0x02) | 0x6B |
99 (0x63) | 0x4B63 |
254 (0xFE) | 0x4BFE |
255 (0xFF) | 0x68 |
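From the receiving side, the rows of Table E-11 amount to two cases: an explicit octet Type Identifier followed by the raw octet, or a compact integer identifier that is coerced to an octet. The following sketch illustrates that reading logic; it assumes the whole Value is already available in a byte array, treats the masking with 0xFF as one possible way to realize the -1 to 0xFF translation, and uses hypothetical names.

```java
// Illustrative sketch; class and method names are hypothetical.
class OctetValue {
    // Decodes one Octet Value that starts at the beginning of the array.
    static int read(byte[] stream) {
        int first = stream[0] & 0xFF;
        if (first == 0x4B) {
            // octet Type Identifier (-12): the raw octet follows
            return stream[1] & 0xFF;
        }
        if (first >= 0x68 && first <= 0x7F) {
            // compact forms int:-1 (0x68) through int:22 (0x7F)
            int n = first - 0x69;
            return n & 0xFF; // coercion: -1 becomes 0xFF
        }
        throw new IllegalArgumentException("not an octet value: " + first);
    }
}
```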
Octet String
If the type identifier for Octet String occurs in the stream, it is followed by an Integer Value for the length n of the string, and then n octet values.
An Octet String of zero length is encoded using the "string:zero-length" Type Identifier.
Char
If the type identifier for Char occurs in the stream, it is followed by a UTF-8 encoded character. The compact form of integer values may be used for Char values, with the integer value -1 being translated as 0xFFFF. See Coercion of Integer Types.
Note:
POF optimizes the storage of String data by using only one byte for each character when possible. Custom POF character codecs (ASCII for example) are not required and do not result in better performance.
Example E-3 illustrates writing a character value to an octet stream.
Example E-3 Writing a Character Value to an Octet Stream
    public static void writeChar(DataOutput out, int ch) throws IOException {
        if (ch >= 0x0001 && ch <= 0x007F) {
            // 1-byte format: 0xxx xxxx
            out.write((byte) ch);
        } else if (ch <= 0x07FF) {
            // 2-byte format: 110x xxxx, 10xx xxxx
            out.write((byte) (0xC0 | ((ch >>> 6) & 0x1F)));
            out.write((byte) (0x80 | ((ch) & 0x3F)));
        } else {
            // 3-byte format: 1110 xxxx, 10xx xxxx, 10xx xxxx
            out.write((byte) (0xE0 | ((ch >>> 12) & 0x0F)));
            out.write((byte) (0x80 | ((ch >>> 6) & 0x3F)));
            out.write((byte) (0x80 | ((ch) & 0x3F)));
        }
    }
Example E-4 illustrates reading a character value from an octet stream.
Example E-4 Reading a Character Value from an Octet Stream
    public static char readChar(DataInput in) throws IOException {
        char ch;
        int b = in.readUnsignedByte();
        switch ((b & 0xF0) >>> 4) {
            case 0x0: case 0x1: case 0x2: case 0x3:
            case 0x4: case 0x5: case 0x6: case 0x7:
                // 1-byte format: 0xxx xxxx
                ch = (char) b;
                break;
            case 0xC: case 0xD: {
                // 2-byte format: 110x xxxx, 10xx xxxx
                int b2 = in.readUnsignedByte();
                if ((b2 & 0xC0) != 0x80) {
                    throw new UTFDataFormatException();
                }
                ch = (char) (((b & 0x1F) << 6) | b2 & 0x3F);
                break;
            }
            case 0xE: {
                // 3-byte format: 1110 xxxx, 10xx xxxx, 10xx xxxx
                int n = in.readUnsignedShort();
                int b2 = n >>> 8;
                int b3 = n & 0xFF;
                if ((b2 & 0xC0) != 0x80 || (b3 & 0xC0) != 0x80) {
                    throw new UTFDataFormatException();
                }
                ch = (char) (((b & 0x0F) << 12) | ((b2 & 0x3F) << 6) | b3 & 0x3F);
                break;
            }
            default:
                throw new UTFDataFormatException("illegal leading UTF byte: " + b);
        }
        return ch;
    }
Char String
If the type identifier for Char String occurs in the stream, it is followed by an Integer Value for the length n, in octets, of the UTF-8 representation of the string, and then n octet values composing the UTF-8 encoding described above. Note that the format length-encodes the octet length, not the character length.
A Char String of zero length is encoded using the string:zero-length Type Identifier. Table E-12 illustrates the Char String formats.
Table E-12 Values for Char String Formats
Values | Char String Format |
---|---|
zero length | 0x62 (or 0x4E00) |
"ok" | 0x4E026F6B |
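The Char String layout in Table E-12 can be sketched as follows. Note the assumption: this example uses the JDK's standard UTF-8 encoder, which matches Example E-3 for ordinary characters but can differ for U+0000 and for characters outside the Basic Multilingual Plane. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative sketch; class and method names are hypothetical.
class CharStringValue {
    // Packed-integer writer, same logic as Example E-1.
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (s.isEmpty()) {
            writePackedInt(out, -35);          // string:zero-length
            return out.toByteArray();
        }
        // Assumption: standard UTF-8 rather than the exact rules of Example E-3.
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writePackedInt(out, -15);              // char-string
        writePackedInt(out, utf8.length);      // length in octets, not characters
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }
}
```

A one-character string such as "\u00e9" has an encoded length of 2, which shows why the length prefix counts octets rather than characters.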
Date
Date values are passed using ISO8601 semantics. If the type identifier for Date occurs in the stream, it is followed by three Integer Values for the year, month and day, in the ranges as defined by ISO8601.
Year-Month Interval
If the type identifier for Year-Month Interval occurs in the stream, it is followed by two Integer Values for the number of years and the number of months in the interval.
Time
Time values are passed using ISO8601 semantics. If the type identifier for Time occurs in the stream, it is followed by five Integer Values, which may be followed by two more Integer Values. The first four Integer Values are the hour, minute, second and fractional second values. Fractional seconds are encoded in one of three ways:
- 0 indicates no fractional seconds.
- [1..999] indicates the number of milliseconds.
- [-1..-999999999] indicates the negated number of nanoseconds.
The fifth Integer Value is a time zone indicator, encoded in one of three ways:
- 0 indicates no time zone.
- 1 indicates Universal Coordinated Time (UTC).
- 2 indicates a time zone offset, which is followed by two more Integer Values for the hour offset and minute offset, as described by ISO8601.
The variable encoding of the fractional second and the time zone adds complexity to the parsing of a Time Value, but provides much more complete support for the ISO8601 standard and for the variability in the precision of clocks, while achieving a high degree of binary compactness. While time values tend to have no fractional encoding or millisecond encoding, the trend over time is toward higher time resolution.
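The fractional-second and time zone rules above can be illustrated with a small encoder. This is a sketch, not Coherence code: it assumes the caller supplies the fraction as nanoseconds and chooses milliseconds whenever the value is a whole number of milliseconds, and it passes the time zone integers (0, or 1, or 2 followed by hour and minute offsets) straight through as trailing arguments. All names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch; class and method names are hypothetical.
class TimeValue {
    // Packed-integer writer, same logic as Example E-1.
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    // Maps nanoseconds (0..999999999) to the fractional-second field.
    static int fraction(int nanos) {
        if (nanos == 0) {
            return 0;                    // no fractional seconds
        }
        if (nanos % 1_000_000 == 0) {
            return nanos / 1_000_000;    // 1..999 milliseconds
        }
        return -nanos;                   // negated nanoseconds
    }

    // zone is {0}, {1}, or {2, hourOffset, minuteOffset}.
    static byte[] encode(int hour, int minute, int second, int nanos, int... zone) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -18);        // time Type Identifier
        writePackedInt(out, hour);
        writePackedInt(out, minute);
        writePackedInt(out, second);
        writePackedInt(out, fraction(nanos));
        for (int z : zone) {
            writePackedInt(out, z);
        }
        return out.toByteArray();
    }
}
```

For example, `TimeValue.encode(13, 30, 0, 0, 1)` describes 13:30:00 UTC using five Integer Values after the Type Identifier, while an offset time zone adds two more.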
Time Interval
If the type identifier for Time Interval occurs in the stream, it is followed by four Integer Values for the number of hours, minutes, seconds and nanoseconds in the interval.
Date-Time
Date-Time values are passed using ISO8601 semantics. If the type identifier for Date-Time occurs in the stream, it is followed by eight or ten Integer Values, which correspond to the Integer Values that compose the Date and Time values.
Coercion of Date and Time Types
A Date Value can be coerced into a Date-Time Value. A Time Value can be coerced into a Date-Time Value. A Date-Time Value can be coerced into either a Date Value or a Time Value.
Day-Time Interval
If the type identifier for Day-Time Interval occurs in the stream, it is followed by five Integer Values for the number of days, hours, minutes, seconds and nanoseconds in the interval.
Collections
A collection of values, such as a bag, a set, or a list, is encoded in a POF stream using the Collection type. Immediately following the Type Identifier, the stream contains the Collection Size, an Integer Value indicating the number of values in the Collection, which is greater than or equal to zero. Following the Collection Size is the first value in the Collection (if any), which is itself encoded as a Value. The values in the Collection are contiguous, and there are exactly n values in the stream, where n equals the Collection Size.
If all the values in the Collection have the same type, then the Uniform Collection format is used. Immediately following the Type Identifier (uniform-collection), the uniform type of the values in the collection is written to the stream, followed by the Collection Size n as an Integer Value, followed by n values without their Type Identifiers. Note that values in a Uniform Collection cannot be assigned an identity, and that (as a side-effect of the explicit type encoding) an empty Uniform Collection has an explicit content type.
Table E-13 illustrates examples of Collection and Uniform Collection formats for several values.
Table E-13 Collection and Uniform Collection Formats for Various Values
Values | Collection Format | Uniform Collection Format |
---|---|---|
no value | 0x63 (or 0x5500) | not applicable (n/a) |
1 | 0x55016A | 0x56410101 |
1,2,3 | 0x55036A6B6C | 0x564103010203 |
1, "ok" | 0x55026A4E026F6B | n/a |
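The difference between the two formats in Table E-13 is easiest to see side by side. The following sketch encodes a list of int32 values both ways; it is limited to integer elements for brevity, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch; class and method names are hypothetical.
class CollectionDemo {
    // Packed-integer writer, same logic as Example E-1.
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    // Writes an int32 Value with its Type Identifier (compact form for -1..22).
    static void writeInt32Value(ByteArrayOutputStream out, int n) {
        if (n >= -1 && n <= 22) {
            writePackedInt(out, -42 - n);
        } else {
            writePackedInt(out, -2);
            writePackedInt(out, n);
        }
    }

    // Collection: every element carries its own Type Identifier.
    static byte[] collection(int... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (values.length == 0) {
            writePackedInt(out, -36);          // collection:empty
            return out.toByteArray();
        }
        writePackedInt(out, -22);              // collection
        writePackedInt(out, values.length);
        for (int v : values) {
            writeInt32Value(out, v);
        }
        return out.toByteArray();
    }

    // Uniform Collection: the element type is written once, up front.
    static byte[] uniformCollection(int... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -23);              // uniform-collection
        writePackedInt(out, -2);               // element type: int32
        writePackedInt(out, values.length);
        for (int v : values) {
            writePackedInt(out, v);            // no Type Identifier per element
        }
        return out.toByteArray();
    }
}
```

For small integers the compact type-and-value identifiers make the non-uniform form just as short; the uniform form pays off when elements need a full Type Identifier each.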
Arrays
An indexed array of values is encoded in a POF stream using the Array type. Immediately following the Type Identifier, the stream contains the Array Size, an Integer Value indicating the number of elements in the Array, which must be greater than or equal to zero. Following the Array Size is the value of the first element of the Array (the zero index), if there is at least one element in the array, which is itself encoded as a Value. The values of the elements of the Array are contiguous, and there must be exactly n values in the stream, where n equals the Array Size.
If all the values of the elements of the Array have the same type, then the Uniform Array format is used. Immediately following the Type Identifier (uniform-array), the uniform type of the values of the elements of the Array is written to the stream, followed by the Array Size n as an Integer Value, followed by n values without their Type Identifiers. Note that values in a Uniform Array cannot be assigned an identity, and that (as a side-effect of the explicit type encoding) an empty Uniform Array has an explicit array element type.
Table E-14 illustrates examples of Array and Uniform Array formats for several values.
Table E-14 Array and Uniform Array Formats for Various Values
Values | Array Format | Uniform Array Format |
---|---|---|
no value | 0x63 (or 0x5700) | 0x63 (or 0x584100) – This example assumes an element type of Int32. |
1 | 0x57016A | 0x58410101 |
1,2,3 | 0x57036A6B6C | 0x584103010203 |
1, "ok" | 0x57026A4E026F6B | n/a |
Sparse Arrays
For arrays whose element values are sparse, the Sparse Array format allows indexes to be explicitly encoded, implying that any missing indexes have a default value. The default value is false for the Boolean type, zero for all numeric, octet and char types, and null for all reference types. The format for the Sparse Array is the Type Identifier (sparse-array), followed by the Array Size n as an Integer Value, followed by not more than n index/value pairs, each of which is composed of an array index encoded as an Integer Value i (0 <= i < n) whose value is greater than the previous element's array index, and an element value encoded as a Value; the Sparse Array is finally terminated with an illegal index of -1.
If all the values of the elements of the Sparse Array have the same type, then the Uniform Sparse Array format is used. Immediately following the Type Identifier (uniform-sparse-array), the uniform type of the values of the elements of the Sparse Array is written to the stream, followed by the Array Size n as an Integer Value, followed by not more than n index/value pairs, each of which is composed of an array index encoded as an Integer Value i (0 <= i < n) whose value is greater than the previous element's array index, and an element value encoded as a Value without a Type Identifier; the Uniform Sparse Array is finally terminated with an illegal index of -1. Note that values in a Uniform Sparse Array cannot be assigned an identity, and that (as a side-effect of the explicit type encoding) an empty Uniform Sparse Array has an explicit array element type.
Table E-15 illustrates examples of Sparse Array and Uniform Sparse Array formats for several values.
Table E-15 Sparse Array and Uniform Sparse Array Formats for Various Values
Values | Sparse Array format | Uniform Sparse Array format |
---|---|---|
no value | 0x63 (or 0x590040) | 0x63 (or 0x5A410040) – This example assumes an element type of Int32. |
1 | 0x5901006A40 | 0x5A4101000140 |
1,2,3 | 0x5903006A016B026C40 | 0x5A410300010102020340 |
1,,,,5,,,,9 | 0x5909006A046E087240 | 0x5A410900010405080940 |
1,,,,"ok" | 0x5905006A044E026F6B40 | n/a |
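The index/value pairing and the -1 terminator shown in Table E-15 can be sketched as follows. This example is limited to int32 elements and uses a null entry in an `Integer[]` to stand for a missing index; both choices, like the names, are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch; class and method names are hypothetical.
class SparseArrayDemo {
    // Packed-integer writer, same logic as Example E-1.
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    // Writes an int32 Value with its Type Identifier (compact form for -1..22).
    static void writeInt32Value(ByteArrayOutputStream out, int n) {
        if (n >= -1 && n <= 22) {
            writePackedInt(out, -42 - n);
        } else {
            writePackedInt(out, -2);
            writePackedInt(out, n);
        }
    }

    // uniform == false: sparse-array; uniform == true: uniform-sparse-array of int32.
    static byte[] encode(Integer[] elements, boolean uniform) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (uniform) {
            writePackedInt(out, -27);          // uniform-sparse-array
            writePackedInt(out, -2);           // element type: int32
        } else {
            writePackedInt(out, -26);          // sparse-array
        }
        writePackedInt(out, elements.length);  // Array Size n
        for (int i = 0; i < elements.length; i++) {
            if (elements[i] == null) {
                continue;                      // missing index: default value implied
            }
            writePackedInt(out, i);            // strictly increasing index
            if (uniform) {
                writePackedInt(out, elements[i]);
            } else {
                writeInt32Value(out, elements[i]);
            }
        }
        writePackedInt(out, -1);               // terminating illegal index
        return out.toByteArray();
    }
}
```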
Key-Value Maps (Dictionaries)
For key/value pairs, a Key-Value Map (also known as a Dictionary data structure) format is used. There are three forms of the Key-Value Map binary encoding:
- The generic map encoding is a sequence of keys and values;
- The uniform-keys-map encoding is a sequence of keys of a uniform type and their corresponding values;
- The uniform-map encoding is a sequence of keys of a uniform type and their corresponding values of a uniform type.
The format for the Key-Value Map is the Type Identifier (map), followed by the Key-Value Map Size n as an Integer Value, followed by n key/value pairs, each of which is composed of a key encoded as a Value, and a corresponding value encoded as a Value.
Table E-16 illustrates several examples of key/value pairs and their corresponding binary format.
Table E-16 Binary Formats for Key/Value Pairs
Values | Binary format |
---|---|
no value | 0x63 (or 0x5B00) |
1="ok" | 0x5B016A4E026F6B |
1="ok", 2="no" | 0x5B026A4E026F6B6B4E026E6F |
If all of the keys of the Key-Value Map are of a uniform type, then the encoding uses a more compact format, starting with the Type Identifier (uniform-keys-map), followed by the Type Identifier for the uniform type of the keys of the Key-Value Map, followed by the Key-Value Map Size n as an Integer Value, followed by n key/value pairs, each of which is composed of a key encoded as a Value without a Type Identifier, and a corresponding value encoded as a Value.
Table E-17 illustrates several examples of the binary formats for Key/Value pairs where the Keys are of uniform type.
Table E-17 Binary Formats for Key/Value Pairs where Keys are of Uniform Type
Values | Binary format |
---|---|
no value | 0x63 (or 0x5C4100) |
1="ok" | 0x5C4101014E026F6B |
1="ok", 2="no" | 0x5C4102014E026F6B024E026E6F |
If all of the keys of the Key-Value Map are of a uniform type, and all the corresponding values of the map are also of a uniform type, then the encoding uses a more compact format, starting with the Type Identifier (uniform-map), followed by the Type Identifier for the uniform type of the keys of the Key-Value Map, followed by the Type Identifier for the uniform type of the values of the Key-Value Map, followed by the Key-Value Map Size n as an Integer Value, followed by n key/value pairs, each of which is composed of a key encoded as a Value without a Type Identifier, and a corresponding value encoded as a Value without a Type Identifier.
Table E-18 illustrates several examples of the binary formats for Key/Value pairs where the Keys and Values are of uniform type.
Table E-18 Binary Formats for Key/Value Pairs where Keys and Values are of Uniform Type
Values | Binary format |
---|---|
no value | 0x63 (or 0x5D414E00) |
1="ok" | 0x5D414E0101026F6B |
1="ok", 2="no" | 0x5D414E0201026F6B02026E6F |
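As an illustration of the most compact of the three forms, the following sketch writes a uniform-map with int32 keys and char-string values, matching the rows of Table E-18. It assumes the JDK's standard UTF-8 encoder for the string contents and relies on the map's iteration order; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Illustrative sketch; class and method names are hypothetical.
class UniformMapDemo {
    // Packed-integer writer, same logic as Example E-1.
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    static byte[] uniformMap(Map<Integer, String> map) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -30);              // uniform-map
        writePackedInt(out, -2);               // key type: int32
        writePackedInt(out, -15);              // value type: char-string
        writePackedInt(out, map.size());       // Key-Value Map Size n
        for (Map.Entry<Integer, String> e : map.entrySet()) {
            writePackedInt(out, e.getKey());   // key without Type Identifier
            byte[] utf8 = e.getValue().getBytes(StandardCharsets.UTF_8);
            writePackedInt(out, utf8.length);  // value without Type Identifier
            out.write(utf8, 0, utf8.length);
        }
        return out.toByteArray();
    }
}
```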
Identity
If the type identifier for Identity occurs in the stream, it is followed by an Integer Value, which is the Identity. Following the Identity is the value that is being identified, which is itself encoded as a Value.
Any value within a POF stream that occurs multiple times is labeled with an Identity, and subsequent instances of that value within the same POF stream are replaced with a Reference. For platforms that support "by reference" semantics, the identity represents a serialized form of the actual object identity.
An Identity is an Integer Value that is greater than or equal to zero. A value within the POF stream has at most one Identity. Values within a uniform data structure cannot be assigned an identity.
Reference
A Reference is a pointer to an Identity that has been encountered inside the current POF stream, or a null pointer.
For platforms that support "by reference" semantics, the reference in the POF stream becomes a reference in the realized (deserialized) object, and a null reference in the POF stream becomes a null reference in the realized object. For platforms that do not support "by reference" semantics, and for cases in which a null reference is encountered in the POF stream for a non-reference value (for example, a primitive property in Java), the default value for the type of value is used.
Table E-19 illustrates examples of binary formats for several "by reference" semantics.
Table E-19 Binary Formats for "By Reference" Semantics
Value | Binary Format |
---|---|
Id #1 | 0x5F01 |
Id #350 | 0x5F9E05 |
null | 0x64 |
Support for forward and outer references is not required by POF. A forward reference is a reference to an identity that has not yet been encountered in the POF stream; an outer reference is a reference made within a complex value (such as a collection or a user type) to that complex value itself. In POF, both the identity that is referenced and the value that it identifies must already have occurred within the POF stream.
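The Identity and Reference encodings can be sketched with three small helpers. The identifiers used are those listed in Table E-4 and Table E-5 (identity is -31, reference is -32, reference:null is -37); the identified value is assumed to be already encoded by the caller, and the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Illustrative sketch; class and method names are hypothetical.
class ReferenceDemo {
    // Packed-integer writer, same logic as Example E-1.
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    // First occurrence: identity Type Identifier, the Identity, then the Value.
    static byte[] identity(int id, byte[] encodedValue) {
        if (id < 0) {
            throw new IllegalArgumentException("identity must be >= 0");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -31);
        writePackedInt(out, id);
        out.write(encodedValue, 0, encodedValue.length);
        return out.toByteArray();
    }

    // Later occurrences: reference Type Identifier and the Identity only.
    static byte[] reference(int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -32);
        writePackedInt(out, id);
        return out.toByteArray();
    }

    // Null pointer: the single-octet reference:null identifier.
    static byte[] nullReference() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, -37);
        return out.toByteArray();
    }
}
```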
Binary Format for User Types
User Types have a Type Identifier with a value greater than or equal to zero. The Type Identifier has no explicit or self-describing meaning within the stream itself; in other words, a Value does not contain a type (or "class") definition. Instead, the encoder (the sender) and the decoder (the receiver) share an implicit understanding, called a Context, which includes the necessary metadata, including the user type definitions.
The binary format for a User Type is very similar to that of a Sparse Array; conceptually, a User Type can be considered a Sparse Array of property values. The format for User Types is the Type Identifier (an Integer Value greater than or equal to zero), followed by the Version Identifier (an Integer Value greater than or equal to zero), followed by index/value pairs, each of which is composed of a Property Index encoded as an Integer Value i (0 <= i) whose value is greater than the previous Property Index, and a Property Value encoded as a Value; the User Type is finally terminated with an illegal Property Index of -1.
Like the Sparse Array, any property that is not included as part of the User Type encoding is assumed to have a default value. The default value is false for the Boolean type, zero for all numeric, octet and char types, and null for all reference types.
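The User Type layout described above can be sketched as follows. To keep the example short, each property value is supplied as an already-encoded Value in a sorted map keyed by Property Index; omitted properties are simply absent from the map. The type identifier 1001 used in the usage note, like the class and method names, is purely hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.SortedMap;

// Illustrative sketch; class and method names are hypothetical.
class UserTypeDemo {
    // Packed-integer writer, same logic as Example E-1.
    static void writePackedInt(ByteArrayOutputStream out, int n) {
        int b = 0;
        if (n < 0) {
            b = 0x40;
            n = ~n;
        }
        b |= n & 0x3F;
        n >>>= 6;
        while (n != 0) {
            out.write(b | 0x80);
            b = n & 0x7F;
            n >>>= 7;
        }
        out.write(b);
    }

    // properties maps Property Index -> already-encoded Property Value.
    static byte[] encode(int typeId, int versionId, SortedMap<Integer, byte[]> properties) {
        if (typeId < 0 || versionId < 0) {
            throw new IllegalArgumentException("type and version identifiers must be >= 0");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePackedInt(out, typeId);           // user Type Identifier
        writePackedInt(out, versionId);        // Version Identifier
        // A SortedMap iterates in ascending key order, as the format requires.
        for (Map.Entry<Integer, byte[]> e : properties.entrySet()) {
            writePackedInt(out, e.getKey());   // Property Index
            out.write(e.getValue(), 0, e.getValue().length);
        }
        writePackedInt(out, -1);               // terminating illegal Property Index
        return out.toByteArray();
    }
}
```

For instance, a value of hypothetical type 1001, version 0, with property 0 set to int:7 (0x70) and property 2 set to the int32 value 42 (0x412A) encodes as 0xA90F00007002412A40; property 1 is omitted and therefore takes its default value.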
This section includes the following topic:
- Versioning of User Types
Versioning of User Types
Versioning of User Types supports the addition of properties to a User Type, but not the replacement or removal of properties that existed in previous versions of the User Type. By including the versioning capability as part of the general binary contract, it is possible to support both backward and forward compatibility.
When a sender sends a User Type value of a version v1 to a receiver that supports version v2 of the same User Type, the receiver uses default values for the additional properties of the User Type that exist in v2 but do not exist in v1.
When a sender sends a User Type value of a version v2 to a receiver that only supports version v1 of the same User Type, the receiver treats the additional properties of the User Type that exist in v2 but do not exist in v1 as opaque. If the receiver must store the value (persistently), or if the possibility exists that the value is ever sent at a later point, then the receiver stores those additional opaque properties for later encoding. Sufficient type information is included to allow the receiver to store the opaque property values in either a typed or binary form; when the receiver re-encodes the User Type, it must do so using the Version Identifier v2, since it is including the unaltered v2 properties.