Document Information


Part I Introduction

1.  Overview

2.  Using the Tutorial Examples

Part II The Web Tier

3.  Getting Started with Web Applications

4.  Java Servlet Technology

5.  JavaServer Pages Technology

6.  JavaServer Pages Documents

7.  JavaServer Pages Standard Tag Library

8.  Custom Tags in JSP Pages

9.  Scripting in JSP Pages

10.  JavaServer Faces Technology

11.  Using JavaServer Faces Technology in JSP Pages

12.  Developing with JavaServer Faces Technology

13.  Creating Custom UI Components

14.  Configuring JavaServer Faces Applications

15.  Internationalizing and Localizing Web Applications

Part III Web Services

16.  Building Web Services with JAX-WS

17.  Binding between XML Schema and Java Classes

18.  Streaming API for XML

19.  SOAP with Attachments API for Java

Part IV Enterprise Beans

20.  Enterprise Beans

21.  Getting Started with Enterprise Beans

22.  Session Bean Examples

23.  A Message-Driven Bean Example

Part V Persistence

24.  Introduction to the Java Persistence API

25.  Persistence in the Web Tier

26.  Persistence in the EJB Tier

27.  The Java Persistence Query Language

Part VI Services

28.  Introduction to Security in the Java EE Platform

29.  Securing Java EE Applications

30.  Securing Web Applications

31.  The Java Message Service API

32.  Java EE Examples Using the JMS API

33.  Transactions

34.  Resource Connections

35.  Connector Architecture

Part VII Case Studies

36.  The Coffee Break Application

37.  The Duke's Bank Application

Part VIII Appendixes

Further Information about Character Encoding

B.  About the Authors



Appendix A

Java Encoding Schemes

This appendix describes the character-encoding schemes that are supported by the Java platform.


US-ASCII is a 7-bit character set and encoding that covers the English-language alphabet. It is not large enough to cover the characters used in other languages, however, so it is not very useful for internationalization.


ISO-8859-1 is the character set for Western European languages. It’s an 8-bit encoding scheme in which every encoded character takes exactly 8 bits. (With the remaining character sets, on the other hand, some codes are reserved to signal the start of a multibyte character.)


UTF-8 is an 8-bit encoding scheme. Characters from the English-language alphabet are all encoded using an 8-bit byte. Characters for other languages are encoded using 2, 3, or even 4 bytes. UTF-8 therefore produces compact documents for the English language, but for other languages, documents tend to be half again as large as they would be if they used UTF-16. If the majority of a document’s text is in a Western European language, then UTF-8 is generally a good choice because it allows for internationalization while still minimizing the space required for encoding.


UTF-16 is a 16-bit encoding scheme. It is large enough to encode all the characters from all the alphabets in the world. It uses 16 bits for most characters but includes 32-bit characters for ideogram-based languages such as Chinese. A Western European-language document that uses UTF-16 will be twice as large as the same document encoded using UTF-8. But documents written in far Eastern languages will be far smaller using UTF-16.

Note - UTF-16 depends on the system’s byte-ordering conventions. Although in most systems, high-order bytes follow low-order bytes in a 16-bit or 32-bit “word,” some systems use the reverse order. UTF-16 documents cannot be interchanged between such systems without a conversion.