5 Data Types
This chapter describes the ACORD XML data types.
ACORD XML Data Types
The following data types are used to represent all data passed between clients and servers using the messages defined in the ACORD specification. All information elements are based on these data types. Supported data types are discussed in the sections that follow.
Character
Character indicates an element that allows character data up to a maximum number of characters, regardless of the number of bytes required to represent each character. The number after the hyphen specifies the maximum number of characters. For example, C-12 specifies an element of characters with maximum length 12 characters. C-Infinite indicates an element with no maximum length.
Narrow Character
Elements of type Narrow Character are elements of character data type with the additional restriction that the only allowable characters are those contained within the ISO Latin-1 character set.
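The two character rules above can be sketched as simple checks. This is a minimal illustration, not part of the specification: the function names are hypothetical, `None` is used here to model C-Infinite, a "character" is taken to be a Unicode code point, and ISO Latin-1 is taken to be the range that Python's `latin-1` codec can encode.

```python
from typing import Optional


def fits_character(value: str, max_chars: Optional[int]) -> bool:
    """Check a value against C-<max_chars>; None models C-Infinite (assumption)."""
    # len() on a Python str counts code points, not encoded bytes,
    # which matches "regardless of the number of bytes required".
    return max_chars is None or len(value) <= max_chars


def is_narrow(value: str) -> bool:
    """Check that every character belongs to ISO Latin-1 (ISO 8859-1)."""
    try:
        value.encode("latin-1")
    except UnicodeEncodeError:
        return False
    return True


def fits_narrow_character(value: str, max_chars: Optional[int]) -> bool:
    """Check a value against NC-<max_chars>: both restrictions combined."""
    return is_narrow(value) and fits_character(value, max_chars)
```

For example, `fits_character("héllo", 5)` holds because the value has five characters even though its UTF-8 encoding needs six bytes, while `is_narrow("€")` fails because the euro sign lies outside Latin-1.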
Boolean
The Boolean data type has two states, true and false. True is represented by the literal character 1 (one), while false is represented by the literal character 0 (zero). Unless otherwise indicated in this specification, an optional element of type Boolean is implied to be not answered if it is absent.
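A receiver might map the three possible states (true, false, not answered) as in the following sketch. The function name `parse_boolean` and the use of `None` for an absent element are assumptions made for illustration.

```python
from typing import Optional


def parse_boolean(text: Optional[str]) -> Optional[bool]:
    """Map the literal "1"/"0" to True/False; None means the element was absent."""
    if text is None:
        return None  # optional element absent: treated as not answered
    if text == "1":
        return True
    if text == "0":
        return False
    raise ValueError(f"not a Boolean value: {text!r}")
```

Literals such as "true" or "yes" are rejected here because the text above names only the characters 1 and 0.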
Date and Time Formats
This specification uses the date and time format specified in the ISO 8601 standard. The specification includes five time-related compound data types: YrMon, Date, Time, DateTime, and Timestamp. In all types that describe Date information, the specification uses the Gregorian calendar. Data types including time information refer to a 24-hour clock.
There is one format for representing dates, times, and time zones. The complete form is:
YYYY-MM-DDTHH:mm:ss.ffffff±HH:mm
where all punctuation and the T are literal characters; “YYYY” represents a four-digit year; “MM” represents a two-digit month; “DD” represents a two-digit day; the first “HH” represents a two-digit, 24-hour format hour; the first “mm” represents a two-digit minute; “ss” represents a two-digit second; and “ffffff” represents fractional seconds, and may be of any length. The second “HH” and “mm” describe the time zone offset from Coordinated Universal Time (UTC), in hours and minutes, respectively. The “±” can be either a “+” or a “-” depending on whether the time zone offset is positive or negative.
All date and time types include (with the largest units given first) year, month, day, hour, minute, second, and fractions of a second. Any particular type may include a subset of these possible values. Types including time information (hour, minute, and so on) may also include an offset from Coordinated Universal Time (UTC).
As a general rule for date and time compound data types, values may be entered that omit the smallest logical elements. In every case, the value is taken to mean the same thing as if the minimum values (such as zeros) were included. (The default is always the start of an otherwise ambiguous range for types other than YrMon.) For example, a DateTime value omitting the time portion means the start of the day (12:00 midnight). Note that time zone qualifiers (in Time and DateTime values) are exceptions to this rule, as they may be included even if times are not specified to the millisecond.
The logical elements appearing in each of these compound data types are summarized in the following table. “Required” means that the element must occur in all instances of the data type. “Recommended” means that the element should be included in all instances of the data type. “Optional” elements may be omitted from an instance of the data type. Optional elements must be included if smaller elements are to be included. For example, month must not be omitted from a date value if day is included.
Contains | Format and range | YrMon | Date | Time | DateTime | Timestamp |
---|---|---|---|---|---|---|
Year | YYYY 0000-9999 | Required | Required | N/A | Required | Required |
Month | MM 01-12 | Required | Optional | N/A | Required | Required |
Day | DD 01-31 | N/A | Optional | N/A | Required | Required |
Hours | HH 0-23 | N/A | N/A | Required | Optional | Required |
Minutes | MM 0-59 | N/A | N/A | Optional | Optional | Required |
Seconds | SS 0-60 | N/A | N/A | Optional | Optional | Required |
Fractional Seconds | XXX (minimum). Precision is determined by the implementation | N/A | N/A | Optional | Optional | Optional |
UTC offset (time zone indication) | Hours/Minutes -12:59 to +12:59 | N/A | N/A | Recommended | Recommended | Recommended |
YrMon
Elements of data type YrMon contain an indication of a particular month. This data type describes a unique period of time (not a repeating portion of every year).
Tags specified as type YrMon accept years and months in the YYYY-MM format.
Date
Elements of data type Date contain an indication of a particular day. This data type describes a unique period of time, normally 24 hours (not a repeating portion of every year).
Tags specified as type Date accept dates in the YYYY-MM-DD format.
Time
Elements of data type Time contain an indication of a particular time during a date. This data type describes a repeating portion of a day. That is, each time described (ignoring leap seconds) occurs once per calendar date. In the specification, it is required that a time data type be able to represent a specific period with indefinite precision. Milliseconds are the minimum required precision of the time data type.
Tags specified as type Time accept times in the following format:
HH:mm:ss.ffffff±HH:mm
A time represented using this data type must not be ambiguous with respect to morning and afternoon. That is, the time must occur once and only once each 24-hour period.
In addition, the Time data type must not be ambiguous with respect to location at which the time occurs. If unspecified, the time zone defaults to Coordinated Universal Time (UTC). Generally, use of a specific time zone in the representation is preferred. The time zone should always be specified to avoid ambiguous communication between clients and servers.
DateTime
Tags specified as type DateTime accept a fully formatted date/time/time zone string. For example, “1996-10-05T13:22:00.124-05:00” represents October 5, 1996, at 1:22:00.124 p.m. Eastern Standard Time. This is the same as 6:22:00.124 p.m. Coordinated Universal Time (UTC).
Several portions of a DateTime element are optional. The following table describes the optional components and the meaning if they are absent.
Component | Meaning If Absent |
---|---|
±HH:mm (time zone offset) | +00:00 (UTC) |
THH:mm:ss.ffffff±HH:mm (time component) | T00:00:00+00:00 (midnight, UTC) |
:ss.ffffff (seconds and fractional seconds) | :00.000000 (zero seconds) |
.ffffff (fractional seconds) | .000000 (zero fractional seconds) |
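The defaults in the table above can be sketched as a small parser. This is an illustration under stated assumptions rather than a reference implementation: the name `parse_datetime` is hypothetical, fractional seconds beyond microseconds are truncated because Python's `datetime` stores no finer precision, a seconds value of 60 (leap second) is rejected by `datetime` itself, and a single-digit offset hour is tolerated as a leniency that the format string does not require.

```python
import re
from datetime import datetime, timedelta, timezone

# Date is required; the time component, its smaller parts, and the offset are optional.
_DATETIME_RE = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})"
    r"(?:T(\d{2})(?::(\d{2})(?::(\d{2})(?:\.(\d+))?)?)?"
    r"([+-]\d{1,2}:\d{2})?)?"
)


def parse_datetime(text: str) -> datetime:
    """Parse a DateTime value, filling absent components with their defaults."""
    match = _DATETIME_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a DateTime value: {text!r}")
    year, month, day, hour, minute, second, frac, offset = match.groups()
    # Absent fractional seconds mean zero; keep at most six digits (microseconds).
    micro = int((frac or "0")[:6].ljust(6, "0"))
    if offset is None:
        tz = timezone.utc  # absent offset means +00:00 (UTC)
    else:
        sign = -1 if offset[0] == "-" else 1
        off_hours, off_minutes = offset[1:].split(":")
        tz = timezone(sign * timedelta(hours=int(off_hours), minutes=int(off_minutes)))
    return datetime(
        int(year), int(month), int(day),
        int(hour or 0), int(minute or 0), int(second or 0),
        micro, tzinfo=tz,
    )
```

With this sketch, a value such as "1996-10-05" is read as midnight UTC on that day, and a value with an explicit offset can be converted to UTC with `astimezone(timezone.utc)`.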
Timestamp
Elements of data type Timestamp contain the same information as DateTime values. Unlike that data type, Timestamp information is not intended to have meaning at the other end of the communication. In addition, microseconds are the minimum required precision of the time portion of this data type.
The intent here is to describe a type identical to DateTime but without semantic meaning between two machines. The general DateTime data type has meaning on both ends of the protocol (even though time synchronization is not required by this specification). Timestamp indicates an exact point in time with respect to the generating application.
For example, a Timestamp value may be generated at a server when creating an audit response. The client application may return that value to the server in later requests, but the client software should not interpret the information.
Phone Number
Phone Number indicates a string of up to 32 narrow characters in length (NC-32). It must begin with a plus sign (+) followed by country code, a hyphen, city/area code, another hyphen, and then the local phone number. If a PBX extension is to be included, it must appear at the end of the field, separated from the rest of the telephone number by a plus sign.
For example, “+1-800-5551212+739” indicates PBX extension 739 at phone number 5551212 within area code 800 of North America (country code 1).
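The layout described above can be checked with a regular expression, as in the following sketch. The names `PhoneNumber` and `parse_phone` are hypothetical, and the pattern assumes that each part consists only of ASCII digits, which the text above does not state explicitly.

```python
import re
from typing import NamedTuple, Optional


class PhoneNumber(NamedTuple):
    country: str
    area: str
    local: str
    extension: Optional[str]


# "+" country "-" city/area "-" local, then an optional "+" extension.
_PHONE_RE = re.compile(r"\+([0-9]+)-([0-9]+)-([0-9]+)(?:\+([0-9]+))?")


def parse_phone(text: str) -> PhoneNumber:
    """Split a Phone Number value (NC-32) into its parts."""
    if len(text) > 32:
        raise ValueError("Phone Number is limited to 32 characters")
    match = _PHONE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a Phone Number value: {text!r}")
    return PhoneNumber(*match.groups())
```

Applied to the example above, `parse_phone("+1-800-5551212+739")` yields country code 1, area code 800, local number 5551212, and extension 739.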
Decimal
Decimal indicates a numeric value that meets the following rules:
The value is up to 15 digits in length, excluding any punctuation (sign, decimal, currency symbol, and so on).
The value is not restricted to integer values and has a decimal point that may be placed anywhere from the start to the second last digit in the value, but not after the last digit (for example, +.12345678901234 is acceptable while 12345678901234567 is not).
The sign is always optional. If it is absent, the value is assumed to be positive.
Absence of a decimal point implies one after the last digit (that is, an integer).
The Decimal data type is always expressed as a Base-10, ASCII-character-set string.
For example: +1234567890.12345 is acceptable, while 12345678901234567 is not.
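The rules above can be sketched as a validating parser. The name `parse_decimal` is hypothetical, and the sketch assumes that the sign and the decimal point are the only punctuation that appears in the value.

```python
import re
from decimal import Decimal

# Optional sign, then either an integer or digits with a point that is
# followed by at least one digit (so the point is never after the last digit).
_DECIMAL_RE = re.compile(r"[+-]?(?:[0-9]+|[0-9]*\.[0-9]+)")


def parse_decimal(text: str) -> Decimal:
    """Parse a Decimal value, enforcing the 15-digit limit."""
    if _DECIMAL_RE.fullmatch(text) is None:
        raise ValueError(f"not a Decimal value: {text!r}")
    digit_count = sum(1 for ch in text if ch in "0123456789")
    if digit_count > 15:
        raise ValueError("Decimal values are limited to 15 digits")
    return Decimal(text)  # an absent sign yields a positive value
```

Returning `decimal.Decimal` rather than `float` is a design choice in this sketch: it keeps all 15 digits exact.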
Long
The Long data type is an Integer expressed as a Base-10, ASCII-character-set string representation of a 32-bit signed integer in the range -2147483648 to +2147483647. Elements of type Long do not permit a decimal point.
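A range-checked parser for Long might look like the following sketch. The name `parse_long` is hypothetical, and accepting an optional leading sign is an assumption based on the signed bounds given above.

```python
import re

_LONG_RE = re.compile(r"[+-]?[0-9]+")  # no decimal point permitted
LONG_MIN = -2147483648
LONG_MAX = 2147483647


def parse_long(text: str) -> int:
    """Parse a Long value and check the 32-bit signed range."""
    if _LONG_RE.fullmatch(text) is None:
        raise ValueError(f"not a Long value: {text!r}")
    value = int(text)
    if not LONG_MIN <= value <= LONG_MAX:
        raise ValueError(f"Long value out of range: {text!r}")
    return value
```

The explicit range check matters in Python because its integers are unbounded and would otherwise accept values that a 32-bit receiver cannot hold.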
Enum
Enum is a Narrow Character type that has a limited number of specified valid values, each of which is represented by a tag of up to 80 characters. The Enum data type is either a Closed Enum or an Open Enum. Adding a value to a Closed Enum requires a spec update, while adding a value to an Open Enum only requires out-of-band agreement by the end points. Open Enums may also be extended using SPX.
Wherever it is appropriate to reference a non-ACORD code list, a reference to the <CodeList> aggregate can be created as listed in the following table. (See the Common Aggregates section of the Business Message Specification.)
Tag | Type | Usage | Description |
---|---|---|---|
@CodeInfoRef | Identifier Reference | Optional | A Reference to the Identifier of the <CodeList> |
Closed Enum
A Closed Enum is an element where a number of valid values are defined within this specification. All other values should be rejected as invalid.
Open Enum
An Open Enum is an element where a number of valid values are defined within this specification, but other values should not be rejected as invalid by any system other than the final message destination. Open Enums provide a mechanism for a client and final destination server to communicate with values that may be known to both endpoints but not to all intermediate servers that route the message. Open Enums are typically used for elements related to system message processing and have been defined as open to support extensibility and customization of the specification.
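The difference between Closed and Open Enums shows up in how each node on the message path treats an unknown value. The following sketch is one possible reading of the rules above; the function name, its parameters, and the sample values are hypothetical.

```python
from typing import AbstractSet


def accept_enum_value(
    value: str,
    spec_values: AbstractSet[str],
    closed: bool,
    final_destination: bool,
    agreed_values: AbstractSet[str] = frozenset(),
) -> bool:
    """Decide whether this node should accept an Enum value."""
    if len(value) > 80:
        return False  # Enum tags are limited to 80 characters
    if value in spec_values:
        return True  # defined in the specification
    if closed:
        return False  # Closed Enum: all other values are invalid
    if not final_destination:
        return True  # Open Enum: intermediaries pass unknown values through
    # Open Enum at the final destination: accept only values agreed out of band.
    return value in agreed_values
```

For instance, with hypothetical specification values {"Add", "Delete"}, an intermediate server would pass the unknown value "Archive" on an Open Enum, while the final destination would accept it only if both endpoints had agreed on it.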
Identifiers
This specification provides three different types of identifiers:
Assigned Identifiers
Transient Unique Identifiers
Universally Unique Identifiers
Assigned Identifiers
An assigned identifier is created by an organization, carrier, agent, state, or other body. These include policy numbers, social security numbers, passport IDs, driver’s license numbers, and so on.
Object identifiers in the specification are of the data type “Assigned Identifier”. This is a Character data type with a maximum length of 36.
Transient Unique Identifiers
An ACORD document provides a unique identifier with the XML stream that is used for referencing information within the document. This is a transient identifier, as listed in the following table, that is only used to link information within a document stream. As the word transient implies, the identifiers are not meant for use once the message has been processed by the receiving system.
Transient identifiers in the specification are of the data type “Identifier.” This is Character data that matches the XML rules for ID attribute data type values.
Tag | Type | Usage | Description |
---|---|---|---|
@id | Identifier | Optional | A document unique identifier used when an object (element) needs to be referenced elsewhere in the document. An ID should only be present on an element when it is being referenced within the stream. |
Transient identifiers are not used on framework tags. These identifiers are used on all elements and aggregates that appear after the business message level except:
<ActionCd>
<PreviousValue>
<ChangeDesc>
<RqUID>
<SystemID>
The transient identifier is optional, except when used with the following tags, when it is required:
<SPFieldEditDefinition>
<SPRelationalEditDefinition>
Universally Unique Identifier (UUID)
UUID elements are Narrow Character with a maximum length of 36. Applications can often obtain conforming UUIDs by calls to the operating system or the run-time environment.
The following information on UUID is based on Internet-Draft <leach-uuids-guids-01.txt>:
A UUID is an identifier that is unique across both space and time, with respect to the space of all UUIDs. To be precise, the UUID consists of a finite bit space. Thus, the time value used for constructing a UUID is limited and will roll over in the future (approximately at A.D. 3400, based on the specified algorithm). A UUID may be used for multiple purposes, from tagging objects with an extremely short lifetime to reliably identifying very persistent objects across a network.
The generation of UUIDs does not require that a registration authority be contacted for each identifier. Instead, it requires a unique value over space for each UUID generator. This spatially unique value is specified as an IEEE 802 address, which is usually already available to network-connected systems. This 48-bit address may be assigned based on an address block obtained through the IEEE registration authority. This section of the UUID specification assumes the availability of an IEEE 802 address to a system desiring to generate a UUID, but if one is not available, section 4 specifies a way to generate a probabilistically unique one that cannot conflict with any properly assigned IEEE 802 address.
In its most general form, all that may be said of the UUID format is that a UUID is 16 octets, and that some bits of octet 8 of the UUID called the variant field (specified in the next section) determine finer structure.
For use in human-readable text, a UUID string representation is specified as a sequence of fields, some of which are separated by single dashes. Each field is treated as an integer and has its value printed as a zero-filled hexadecimal digit string with the most significant digit first. The hexadecimal values a to f inclusive are output as lowercase characters, and are not case sensitive on input. The sequence is the same as the UUID constructed type. The formal definition of the UUID string representation is provided by the following extended BNF:
String | Value |
---|---|
UUID | <time_low> “-” <time_mid> “-” <time_high_and_version> “-” <clock_seq_and_reserved> <clock_seq_low> “-” <node> |
time_low | 4*<hexOctet> |
time_mid | 2*<hexOctet> |
time_high_and_version | 2*<hexOctet> |
clock_seq_and_reserved | <hexOctet> |
clock_seq_low | <hexOctet> |
node | 6*<hexOctet> |
hexOctet | <hexDigit> <hexDigit> |
hexDigit | “0” \| “1” \| “2” \| “3” \| “4” \| “5” \| “6” \| “7” \| “8” \| “9” \| “a” \| “b” \| “c” \| “d” \| “e” \| “f” \| “A” \| “B” \| “C” \| “D” \| “E” \| “F” |
The following is an example of the string representation of a UUID:
f81d4fae-7dec-11d0-a765-00a0c91e6bf6
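The string form above can be checked and taken apart with Python's standard `uuid` module, as in the following sketch. The helper names `parse_uuid` and `new_uuid` are hypothetical. The regular expression is applied first because `uuid.UUID` on its own also accepts other spellings (for example, with braces or without hyphens) that the BNF does not allow.

```python
import re
import uuid

# 8-4-4-4-12 hexadecimal digits; either case is accepted on input.
_UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def parse_uuid(text: str) -> uuid.UUID:
    """Parse the 36-character string representation of a UUID."""
    if _UUID_RE.fullmatch(text) is None:
        raise ValueError(f"not a UUID string: {text!r}")
    return uuid.UUID(text)


def new_uuid() -> str:
    """Generate a time- and node-based UUID; str() emits lowercase hex."""
    return str(uuid.uuid1())
```

Parsing the example string gives `time_low` 0xf81d4fae, `time_mid` 0x7dec, and `node` 0x00a0c91e6bf6 through the attributes of the returned `uuid.UUID` object.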
Identifier References
An Identifier Reference is a technique for referencing an identifier on an aggregate or element that is elsewhere in the stream. This specification provides two different types of identifier references:
Identifier Reference
Multiple Identifier References
Identifier Reference
Related to the transient identifier is its matching reference. Identifier References in the specification are of the data type “Identifier Reference.” This is Character data that matches the XML rules for ID attribute data type values and it must match a value in the current data stream. These values are typically shown as @xxxRef, where xxx is replaced with a value that describes the type of object or tag that the item references.
There is a special Identifier Reference, called CodeListRef, used on all tags of type Open and Closed Enum. Its usage is always optional. When used, it should reference the ID of a CodeList aggregate that identifies (among other things) the owner of the code list. Although it is not shown in the rest of this document, it is defined in the next section.
Multiple Identifier References
When an aggregate or element references more than one item in the stream, a “Multiple Identifier References” data type is defined and is typically shown as @xxxRefs, where xxx is replaced with a value that describes the type of object that the item references.
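A receiver can verify that every reference resolves to an identifier in the same stream. The sketch below makes several assumptions that the text above does not confirm: reference attributes are recognized purely by the suffixes "Ref" and "Refs", multiple references are separated by white space (as in the XML IDREFS type), and the function name and the element names in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def unresolved_refs(xml_text: str) -> List[Tuple[str, str]]:
    """Return (attribute name, value) pairs that match no id in the document."""
    root = ET.fromstring(xml_text)
    # Collect every transient identifier present in the stream.
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    missing: List[Tuple[str, str]] = []
    for el in root.iter():
        for name, value in el.attrib.items():
            if name.endswith("Refs"):
                targets = value.split()  # Multiple Identifier References
            elif name.endswith("Ref"):
                targets = [value]  # single Identifier Reference
            else:
                continue
            missing.extend((name, target) for target in targets if target not in ids)
    return missing
```

Given a document in which `<Vehicle id="V1" DriverRef="D1"/>` points at `<Driver id="D1"/>`, the function returns an empty list; a reference to an identifier that is not in the stream is reported with its attribute name.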
URL
A Uniform Resource Locator (URL) is of the Narrow Character data type with a length of up to 1024 characters (NC-1024). URLs are defined in RFC 1738, which is a subset of the Uniform Resource Identifier (URI) specification (RFC 2396). URLs contain only the printable US-ASCII characters 32 through 126 decimal.
An element of the URL data type specifies the URL where a customer may access information. A URL begins with a string that identifies the protocol used to access the information, such as “http://”.
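The constraints on URL values can be sketched as a simple check. The function name `is_valid_url` is hypothetical, and requiring only a non-empty scheme is a deliberately loose reading of "begins with a string that identifies which protocol is to be used"; the sketch does not attempt full RFC 1738 validation.

```python
from urllib.parse import urlsplit


def is_valid_url(text: str) -> bool:
    """Check the NC-1024 length limit, the character range, and the scheme."""
    if len(text) > 1024:
        return False
    # Only US-ASCII characters 32 through 126 decimal are allowed.
    if any(not 32 <= ord(ch) <= 126 for ch in text):
        return False
    try:
        scheme = urlsplit(text).scheme
    except ValueError:
        return False  # urlsplit rejects some malformed inputs
    return bool(scheme)
```

A value such as "http://www.example.com/info" passes, while "www.example.com" fails because it names no protocol.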