Oracle® Communications Services Gatekeeper Patch Release Notes
Release 5.0.0.1

Part Number E24004-03

12 Data Coding

This chapter provides additional information about how the Parlay X 2.1 Short Messaging/SMPP communication service handles data coding. It does not describe any new or updated features.

Character Set Encoding

The SMPP protocol expects the sender name value in ASCII characters. The use of non-ASCII characters can cause the request to become garbled or even to be removed at the SMSC.

The maximum size of an SMS message is 140 bytes, regardless of the type of data coding used. If the content exceeds 140 bytes, Services Gatekeeper sends it as multiple SMS messages.

Standard and Extended GSM Alphabets

The standard GSM 03.38 alphabet uses 7 bits per character, allowing for 128 different characters with hexadecimal values 0x00 to 0x7F.

If all the characters in an SMS message are from the standard GSM alphabet, one SMS message of 140 bytes can carry 160 of these 7-bit encoded characters. This is because 140 bytes equals 1120 bits, and at 7 bits per character, 160 (1120/7) characters fit into the message.

There is also an extended GSM alphabet that defines an additional 10 characters along with the original 128. These characters are sent as two 7-bit encoded characters, starting with the 7-bit encoded escape character (0x1B) from the standard alphabet. For example, if a message contains the character { from the extended alphabet, this character is encoded as 1B 28 where 1B is the escape character and 28 is the { extended character.

Each extended character requires two 7-bit encoded characters (escape character + extended character). Therefore, an SMS message containing a combination of characters from the standard GSM alphabet and characters from the extended GSM alphabet will hold fewer than 160 characters. The exact number depends on the particular mix of standard and extended characters.
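The effect of extended characters on message capacity can be sketched in Python. This is only an illustration and not part of Services Gatekeeper: the function names are hypothetical, and the sketch assumes that every character outside the extended set belongs to the standard GSM alphabet.

```python
# The 10 characters of the extended GSM alphabet. Each one costs two
# 7-bit units: the escape character (0x1B) plus the extended code.
# The last entry, "\f", is the form feed character.
GSM_EXTENDED = set("^{}\\[~]|€\f")

# 140 bytes = 1120 bits; at 7 bits per unit that is 160 units.
MAX_SEPTETS = 140 * 8 // 7


def septet_count(text):
    """Return the number of 7-bit units needed to encode text.

    Assumption: characters not in GSM_EXTENDED are in the standard
    GSM alphabet and therefore cost one unit each.
    """
    return sum(2 if ch in GSM_EXTENDED else 1 for ch in text)


def fits_in_one_sms(text):
    """Return True if text fits into a single 140-byte SMS message."""
    return septet_count(text) <= MAX_SEPTETS
```

For example, a message of 159 standard characters followed by one { needs 161 units, so it no longer fits into a single SMS message.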

For a list of the characters defined in the GSM standard and extended alphabets see:

http://www.csoft.co.uk/sms/character_sets/gsm.htm

To indicate that SMS messages contain only characters from the standard or extended GSM alphabet, set the DefaultDataCoding attribute to 0. This is the default setting. If the DefaultDataCoding attribute is set to 0 and the SMS message contains characters that are not in the standard or extended GSM alphabets, Services Gatekeeper rejects the message and throws an exception.
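An application may want to check a message before submitting it with data coding 0. The following Python sketch shows one way to do that. It is a hypothetical client-side check, not the validation that Services Gatekeeper itself performs, and the character table is transcribed from GSM 03.38 for illustration, so verify it against the specification before relying on it.

```python
# Standard GSM 03.38 alphabet in code order, 0x00 to 0x7F (128 entries).
# Position 0x1B is the escape character used for the extended alphabet.
GSM_STANDARD = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

# Extended GSM alphabet: 10 characters, including form feed ("\f").
GSM_EXTENDED = "^{}\\[~]|€\f"

_ALLOWED = set(GSM_STANDARD) | set(GSM_EXTENDED)


def is_gsm_text(text):
    """Return True if every character is in the standard or extended
    GSM alphabet, that is, the message is suitable for data coding 0."""
    return all(ch in _ALLOWED for ch in text)
```

A message for which is_gsm_text returns False would need a different data coding, such as 8 for UCS2.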

Other Alphabets

It is possible to send characters that are not in the standard or extended GSM alphabets if the DefaultDataCoding attribute is configured appropriately.

In addition to the standard and extended GSM alphabets (called the "SMSC Default Alphabet" in the SMPP v3.4 specification), two other common character sets are the IA5/ASCII character set and the UCS2 character set.

In the IA5/ASCII alphabet, the characters are 8-bit encoded, in other words one byte per character, so it is possible to send 140 of these 8-bit encoded characters in one SMS message that uses this coding scheme. If you are using the IA5/ASCII alphabet, set the DefaultDataCoding attribute for the plug-in to 1.

Characters in the UCS2 alphabet are 16-bit encoded, requiring two bytes per character, so it is possible to send only 70 of these characters in a single SMS message. If you are using the UCS2 alphabet, set the DefaultDataCoding attribute for the plug-in to 8.
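The capacity figures for the three coding schemes follow from the same arithmetic, sketched here in Python. The names are hypothetical, and the table covers only the three data coding values discussed in this chapter.

```python
SMS_PAYLOAD_BYTES = 140

# data coding value -> bits per character, as described in this chapter
BITS_PER_CHAR = {
    0: 7,   # standard GSM alphabet (SMSC Default Alphabet)
    1: 8,   # IA5/ASCII
    8: 16,  # UCS2
}


def max_chars(data_coding):
    """Return how many characters fit into one 140-byte SMS message."""
    return SMS_PAYLOAD_BYTES * 8 // BITS_PER_CHAR[data_coding]
```

Calling max_chars with 0, 1, and 8 gives 160, 140, and 70 characters respectively. The figure for data coding 0 assumes that the message contains no extended GSM characters.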

For a complete list of supported character set values, see the "data_coding" section in the SMPP v3.4 specification.

Overriding the DefaultDataCoding Attribute

You can override the DefaultDataCoding attribute in requests using an xparameter or an SLA setting. This makes it possible, for example, to use the standard 7-bit GSM alphabet as the default but to send specific SMS messages using a different character set.

Use the data_coding xparameter for parameter tunneling in the header of the request or the com.bea.wlcp.wlng.plugin.sms.DataCoding parameter for defining the coding scheme in the <requestContext> element of an SLA.

For example, although the DefaultDataCoding parameter may be set to 0 for a plug-in instance, the following SOAP header sets the data coding scheme for its SMS message to 8, stipulating that the UCS2 character set should be used for encoding the SMS message in this particular request:

<soapenv:Header>
. . . 
    <xparams>
        <param key="data_coding" value="8" />
    </xparams>
. . . 
</soapenv:Header>

In the next example, the <requestContext> element in an SLA sets the data coding scheme to 1, stipulating that the IA5/ASCII character set should be used for encoding SMS messages initiated by the application associated with this particular SLA:

<requestContext>
  <contextAttribute>
    <attributeName>com.bea.wlcp.wlng.plugin.sms.DataCoding</attributeName>
    <attributeValue>1</attributeValue>
  </contextAttribute>
</requestContext>