The Jar Format



THE JAR FORMAT

This document describes the Java archive (JAR) format, a set of conventions for associating digital signatures, installer scripts, and other information with files in a directory. Signing tools such as the JAR Packager use this format to create JAR archive files, which are used by Communicator client software to support automatic software installation, user-controlled access to local system resources by Java applets, and other features that help address potential security problems.

The JAR file type is a registered Internet MIME type based on the standard cross-platform ZIP archive format. A JAR file functions as a digital envelope for a compressed collection of files. The JAR file type is distinct from the JAR format, which is simply a way of organizing information in a directory.

This document is intended for developers who want to package files into a JAR archive without using the JAR Packager or who want to write software that can create JAR archives.

For information about hashing algorithms, digital signatures, and other aspects of cryptography mentioned in this document, see Applied Cryptography by Bruce Schneier (Wiley, 1996).

CONTENTS OF A JAR ARCHIVE

Every JAR archive is organized according to the scheme outlined in Figure 1.1.

Figure 1.1 The JAR format

A JAR archive has a subdirectory of meta-information named META-INF. This subdirectory contains the following:

  • A single manifest file named MANIFEST.MF. Manifest files can contain arbitrary information about the files in the archive, such as their encoding or language. If the JAR archive is intended for use with the SmartUpdate feature of Communicator, the manifest file must include, at a minimum, the address of the installation file. Manifest files are discussed in "Creating a Manifest File".
  • Zero or more signature instruction files named name.SF. There is one of these files for each entity that has signed files in the archive. Signature instruction files are discussed in "Creating a Signature Instruction File".
  • Zero or more digital signature files named name.suf, where the suffix is determined by the digital signature format. There is at least one of these files for each signature instruction file. Digital signature files are discussed in "Creating a Digital Signature File".

You can use the JAR archive format to package files even if you don't want to sign any of them. In this case, you won't have any signature instruction files or digital signature files.

In addition to the META-INF subdirectory, the archive contains whatever files you want to package, such as the files to be installed using the automatic software installation feature of Communicator. These files can be arranged hierarchically. For automatic software installation, one of those files must be the installer for the package.
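The layout described above can be illustrated with a short Python sketch that sorts archive entry names by their role. This is only an illustration, not part of the format: the function name `classify_meta_inf` and the role labels are hypothetical, and the 1-to-3-character suffix rule is taken from the description of digital signature files later in this document.

```python
import posixpath


def classify_meta_inf(entry_names):
    """Group archive entry names by the role they play in a JAR archive."""
    roles = {
        "manifest": [],
        "signature_instructions": [],
        "digital_signatures": [],
        "other": [],      # unrecognized files inside META-INF
        "payload": [],    # everything outside META-INF
    }
    for name in entry_names:
        if name.endswith("/"):
            continue  # skip directory entries
        directory, base = posixpath.split(name)
        if directory.upper() != "META-INF":
            roles["payload"].append(name)
            continue
        stem, _, suffix = base.rpartition(".")
        if base.upper() == "MANIFEST.MF":
            roles["manifest"].append(name)
        elif stem and suffix.upper() == "SF":
            roles["signature_instructions"].append(name)
        elif stem and 1 <= len(suffix) <= 3 and suffix.isalnum():
            roles["digital_signatures"].append(name)
        else:
            roles["other"].append(name)
    return roles
```

An unsigned archive would simply produce empty lists for the two signature roles.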


CREATING A JAR ARCHIVE

To create a JAR file from scratch (that is, without using the JAR Packager or a similar signing tool), follow these steps:

  1. Identify the components of the software package that need to be installed on the end user's machine. If you're going to sign the files, you may want to create a ZIP file for all components signed by a single entity. In this way, you can sign the single ZIP file instead of signing the individual files that make up the complete installation. That signed ZIP file is then itself "zipped" as part of the larger JAR archive.
  2. Create an installer for your files.
  3. Create a manifest file for your archive. To support automatic software installation, at minimum the manifest file must contain the location of the installer. It can also contain arbitrary extra information about the content of the archive. See "Creating a Manifest File" for information on how to create this file.
  4. If you're going to sign your files, create the appropriate files for the META-INF directory. See "Associating Digital Signatures with Files in a JAR Archive" for how to create these files.
  5. Combine the META-INF files with the other files of your package into a single ZIP file.
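As a rough sketch of step 5, the following Python code uses the standard `zipfile` module to combine the META-INF files with the payload files. The function name `build_jar` and its parameters are hypothetical; the manifest text and any signature files are assumed to have been prepared already in the earlier steps.

```python
import io
import zipfile


def build_jar(manifest_text, meta_inf_files, payload_files):
    """Return the bytes of a ZIP archive laid out as a JAR archive.

    meta_inf_files and payload_files map entry names to bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as jar:
        # The single manifest file always lives in META-INF.
        jar.writestr("META-INF/MANIFEST.MF", manifest_text.encode("utf-8"))
        # Signature instruction and digital signature files, if any.
        for name, data in meta_inf_files.items():
            jar.writestr("META-INF/" + name, data)
        # The packaged files, which may be arranged hierarchically.
        for path, data in payload_files.items():
            jar.writestr(path, data)
    return buffer.getvalue()
```

Writing the manifest first is a choice made for this sketch; this document does not state that entry order matters.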


CREATING A MANIFEST FILE

A JAR archive contains exactly one manifest file. The manifest file is an ASCII file named MANIFEST.MF in the META-INF directory. Tools should write the filename in uppercase but should recognize it in any case.

A manifest file is created from the files in the archive, as shown in Figure 1.2.

Figure 1.2 The manifest file

In Figure 1.2, lines in the archive box indicate individual files, as do the filenames. In the manifest box, the lines indicate lines in the actual file.

The manifest file has sections for a subset of the files in the archive. Not all files in the archive need be included in the manifest. These sections can use RFC822-style headers to store arbitrary information associated with a file, such as the encoding of the file, its language, or its creator.

For automatic software installation, the manifest file must contain a section for the installer. This section must contain the name of the installer and the Install-Script keyword indicating that it is the installer. This section can contain any other appropriate information about the installer.

The manifest file must include sections for all files that are to be signed. The manifest file must not be one of the files referred to in itself.


SAMPLE MANIFEST FILE

The following is an example manifest file, including the base 64 representation of the relevant hashes:

Manifest-Version: 1.0
Install-Script: myinstall.js

Name: com/cl1.class
Digest-Algorithms: MD5 SHA1
MD5-Digest: CmnX58sSgpufbr2RW/WgStS=
SHA1-Digest: 6goIqkIAUwu6T+sOs3O+hURpLRQ=

Name: com/cl2.class
Digest-Algorithms: MD5
MD5-Digest: EnpY95pTfqytbr1MV/RgKwR=

Name: myinstall.js
Digest-Algorithms: MD5
MD5-Digest: DnpX49tTgqygak1LH/YgTw==


FILE FORMAT DESCRIPTION

A preliminary portion appears at the top of the manifest file containing this standard's version number:

Manifest-Version: 1.0

Optionally, a version required for use may also be specified. This indicates that only tools of the given version or later can be used to manipulate the file. It looks like:

Required-Version: 1.0

If Manifest-Version is higher than Required-Version, extensions may be used.

After the preliminaries, the manifest file consists of sections, each of which represents a file in the archive. In general, a section specifies a file in the archive, the names of one or more hashing algorithms used on that file (although this is optional), and a hash (also called digest) value of the file for each of those algorithms.

For use with Netscape products, the file should be hashed with either the MD5 or SHA1 algorithm. It is beyond the scope of this document to tell you how to produce the hash value for your file using your chosen hashing algorithm; for more information, see Applied Cryptography by Bruce Schneier (Wiley, 1996).
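Although the hashing algorithms themselves are out of scope here, producing a section from a file's bytes is straightforward with a standard library. The following Python sketch builds one manifest section using `hashlib` and `base64`; the function name `manifest_section` is hypothetical, and the use of LF newlines is an assumption made for the example (the grammar below allows other newline forms).

```python
import base64
import hashlib


def manifest_section(name, data, algorithms=("MD5", "SHA1")):
    """Build the text of one manifest section for the file `name`.

    `data` is the file's contents as bytes. The section ends with a
    blank line so that sections can be concatenated directly.
    """
    lines = [
        "Name: " + name,
        "Digest-Algorithms: " + " ".join(algorithms),
    ]
    for algorithm in algorithms:
        digest = hashlib.new(algorithm.lower(), data).digest()
        encoded = base64.b64encode(digest).decode("ascii")
        lines.append(algorithm + "-Digest: " + encoded)
    return "\n".join(lines) + "\n\n"
```

A complete manifest would be the preliminary headers, a blank line, and then one such section per file.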

A section can also contain other headers not covered here. These may include information used by applications for their own purposes.


FILE FORMAT GRAMMAR

The following is a grammar for a manifest file. In the grammar, items in quotation marks are literals to be entered exactly as they appear, without the surrounding quotation marks. Other items are placeholders defined elsewhere in the grammar.

manifest-file: "Manifest-Version: 1.0" newline
   optional-header newline
   manifest-start
   *manifest-entry

manifest-start: section

manifest-entry: section

optional-header: "Required-Version: " number

section:
   "Name: " filename newline
   optional-"Digest-Algorithms: " algorithm-list newline
   +{algorithm "-Digest: " (base64 representation of hash) newline}
   *header +newline

algorithm-list: algorithm *whitespace algorithm-list | algorithm

algorithm: +headerchar

filename: any filename supported by the format used to package the JAR archive

header: alphanum *headerchar ":" SPACE *otherchar newline *continuation

continuation: SPACE *otherchar newline

alphanum: {"A"-"Z"} | {"a"-"z"} | {"0"-"9"}

headerchar: alphanum | "-" | "_"

otherchar: any Unicode character except NUL, CR and LF

number: +{"0"-"9"} "." +{"0"-"9"}

newline: CR LF | LF | CR (not followed by LF)

whitespace: "," | SPACE | TAB | NEWLINE

Additional restrictions:

For further information about this file format, see "Additional File Information".
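To make the grammar more concrete, here is a minimal Python sketch of a reader for this format. It is deliberately lenient and does not enforce every production: it accepts the three newline forms, joins continuation lines, and splits the text into sections at blank lines. The function name `parse_sections` and the choice to return a list of dictionaries are assumptions of this example.

```python
import re


def parse_sections(text):
    """Split manifest-style text into a list of header dictionaries.

    The first dictionary holds the preliminary headers; each later
    dictionary holds the headers of one section.
    """
    # Normalize CR LF and bare CR to LF, as the newline rule allows.
    text = re.sub(r"\r\n|\r", "\n", text)
    sections, current, last_key = [], {}, None
    for line in text.split("\n"):
        if line == "":
            # A blank line ends the current section, if there is one.
            if current:
                sections.append(current)
            current, last_key = {}, None
        elif line.startswith(" "):
            # A continuation line extends the previous header value.
            if last_key is None:
                raise ValueError("continuation line without a header")
            current[last_key] += line[1:]
        else:
            key, separator, value = line.partition(": ")
            if not separator:
                raise ValueError("malformed header: %r" % line)
            current[key] = value
            last_key = key
    if current:
        sections.append(current)
    return sections
```

A stricter implementation would also check header names against the headerchar rule and reject files that do not begin with the version header.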


ASSOCIATING DIGITAL SIGNATURES WITH FILES IN A JAR ARCHIVE

To associate digital signatures with files in a JAR archive, you need several files in the META-INF directory of the archive. Figure 1.3 shows the necessary files.

Figure 1.3 Files involved in a signed archive

In Figure 1.3, each line in the archive box represents a file. In all of the other boxes, each line represents a section of the individual file.

These are the files listed in the META-INF directory:

  • A manifest file. Manifest files are discussed in "Creating a Manifest File". This ASCII file contains a section for each file in the archive that is signed. (Not all files in the archive need to be signed.) A section contains the name of the file, the hashing algorithm used on the file, and a hash of the file.
  • A signature instruction file for each signer of a file in the archive. This ASCII file contains a section for each section of the manifest file corresponding to a file this signer is signing. A section contains the name of the original file in the archive, the hashing algorithm used on the section of the manifest file, and a hash of that section.
  • A digital signature file for each signer of a file in the archive. This binary file is not intended to be interpreted by humans. It contains a digital signature of the corresponding signature instruction file, that is, a hash of that file's contents signed with the private key of that signer.

The following sections explain in more detail the contents of each file and how each is created.

CREATING A SIGNATURE INSTRUCTION FILE

The manifest file simply lists the files in the archive that are to be signed. It does not associate signatures with any of those files. That association is the job of the signature instruction files and the digital signature files together.

A JAR archive has a pair of signature instruction and digital signature files for each signer of one or more files in the archive. For example, an archive that contains two files signed by Netscape and two files signed by MyCo contains these four files in its META-INF directory:

NETSCAPE.SF
NETSCAPE.RSA
MYCO.SF
MYCO.RSA

You create the signature instruction file first and then create the digital signature file from it. This section describes the contents of the signature instruction file; see "Creating a Digital Signature File" for a description of the digital signature file.

As shown in Figure 1.4, each signature instruction file corresponds to one signer of the archive. It is an error to have multiple signers in the same signature instruction file. If more than one signature is found, the additional signatures may be ignored.

The purpose of signature instruction files is to allow policy to be embedded in headers supplied by the signer rather than by the manifest owner (who is normally the person who generates the archive).

Figure 1.4 The signature instruction files

A signature instruction file has a name of the form name.SF where name contains only the characters A-Z, 0-9, and dash or underscore. name must not be more than 8 characters.
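The naming rule can be checked with a single regular expression. The following Python sketch is illustrative only; the function name `is_valid_sf_filename` is hypothetical, and the requirement of at least one character in the name is an assumption, since the text gives only the upper limit.

```python
import re

# name: 1 to 8 characters drawn from A-Z, 0-9, dash, and underscore.
_SF_FILENAME = re.compile(r"[A-Z0-9_-]{1,8}\.SF")


def is_valid_sf_filename(filename):
    """Return True if `filename` has the form name.SF described above."""
    return _SF_FILENAME.fullmatch(filename) is not None
```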

Note that the header section of the MANIFEST.MF file may optionally have a corresponding section in each signature instruction file that contains a hash of the header section. This can be useful if the manifest header contains sensitive information that you wish to be able to validate.

Sample Signature Instruction File

The following is a sample signature instruction file:

Signature-Version: 1.0

Name: com/cl1.class
Digest-Algorithms: MD5
MD5-Digest: DnpZ49tTfqyfbr1MV/YgTw==

Name: com/cl2.class
Digest-Algorithms: MD5
MD5-Digest: QmsW83uUfpzgad2NH/WkMx==

File Format Description

A preliminary portion appears at the top of a signature instruction file containing, at minimum, this standard's version number:

Signature-Version: 1.0

Policy information supplied by the signer but not specific to any particular pathname should be included here.

The sections in a signature instruction file correspond to sections in the manifest file (and hence to files in the archive). There is one section for each file for which this signer claims responsibility. These sections are similar in form to those in the manifest file.

In general, a section specifies a file in the archive, the names of one or more hashing algorithms used on the manifest section corresponding to that file, and hashes of the section using each of those algorithms.

Once again, you must use the MD5 or SHA1 hashing algorithm. It is beyond the scope of this guide to tell you how to produce hash values.

As with a manifest file, a section can also contain other headers that are not covered here and that Netscape products ignore.

Sections appearing in the manifest file but not in the signature file are not used in calculating the security of the archive. This allows subsets of the archive to be signed.
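The relationship between the two files can be sketched in Python as follows. This example makes two assumptions that this document does not spell out: the hash of a manifest section is computed over the section's raw text from its Name line through the terminating blank line, and the manifest uses LF newlines with exactly one blank line between sections. The function names are hypothetical, and Name headers that use continuation lines are not handled.

```python
import base64
import hashlib


def split_raw_sections(manifest_text):
    """Map each file name to the raw text of its manifest section."""
    raw = {}
    for chunk in manifest_text.split("\n\n"):
        chunk = chunk.strip("\n")
        if chunk.startswith("Name: "):
            name = chunk.split("\n", 1)[0][len("Name: "):]
            raw[name] = chunk + "\n\n"  # assumed hashing unit
    return raw


def build_signature_instructions(manifest_text, signed_names, algorithm="MD5"):
    """Build signature instruction file text for the files one signer signs."""
    raw = split_raw_sections(manifest_text)
    parts = ["Signature-Version: 1.0\n\n"]
    for name in signed_names:
        digest = hashlib.new(algorithm.lower(), raw[name].encode("utf-8")).digest()
        encoded = base64.b64encode(digest).decode("ascii")
        parts.append(
            "Name: %s\nDigest-Algorithms: %s\n%s-Digest: %s\n\n"
            % (name, algorithm, algorithm, encoded)
        )
    return "".join(parts)
```

Passing only some of the manifest's file names in `signed_names` yields a signature instruction file that covers a subset of the archive, as described above.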

File Format Grammar

The following is the grammar for a signature instruction file. In the grammar, items in quotation marks are literals that are to be entered exactly as they appear, without the surrounding quotation marks. Other items are placeholders defined elsewhere in the grammar.

signature-file: "Signature-Version: 1.0" newline
   optional-header newline
   signature-start
   *signature-entry

signature-start: section

signature-entry: section

optional-header: "Required-Version: " number

section: 
   "Name: " filename newline
   optional-"Digest-Algorithms: " algorithm-list newline
   +{algorithm "-Digest: " (base64 representation of hash) newline}
   *header +newline

algorithm-list: algorithm *whitespace algorithm-list | algorithm

algorithm: +headerchar

filename: any filename supported by the format used to package the JAR archive

header: alphanum *headerchar ":" SPACE *otherchar newline *continuation

continuation: SPACE *otherchar newline

alphanum: {"A"-"Z"} | {"a"-"z"} | {"0"-"9"}

headerchar: alphanum | "-" | "_"

otherchar: any Unicode character except NUL, CR and LF

number: +{"0"-"9"} "." +{"0"-"9"}

newline: CR LF | LF | CR (not followed by LF)

whitespace: "," | SPACE | TAB | NEWLINE

Additional restrictions:

Validating a File in the Archive

The signature of the signature instruction file is verified when the manifest is first parsed. The result of this verification can be cached for efficiency. This step validates only the signature instruction file itself, not the actual archive files.

To validate a file, the hash value in the signature instruction file is compared against a hash computed over the corresponding section of the manifest file. Then the hash value in the manifest file is compared against a hash computed over the actual data referenced in the Name header, which is either a relative path or a URL.

Some URLs may refer to data that cannot be signed; for example, telnet URLs cannot be signed. Hashes on URLs where signing is sensible are calculated on data only and do not include HTTP or FTP overhead.
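The two comparisons used to validate a file can be sketched in Python as follows. The sketch assumes that the digital signature of the signature instruction file has already been verified separately, that the caller has extracted the relevant digest values and the raw bytes of the manifest section, and that the same algorithm is used at both levels. The function names are hypothetical.

```python
import base64
import hashlib


def digest_b64(algorithm, data):
    """Return the base64 text of the hash of `data` (for example "MD5")."""
    return base64.b64encode(hashlib.new(algorithm.lower(), data).digest()).decode("ascii")


def validate_file(file_data, manifest_section, manifest_digest, sf_digest,
                  algorithm="MD5"):
    """Check one file against its manifest and signature instruction entries.

    manifest_section: raw bytes of the file's section in the manifest
    manifest_digest:  digest value stored in that manifest section
    sf_digest:        digest value stored in the signature instruction file
    """
    # Step 1: the signature instruction file vouches for the manifest section.
    if digest_b64(algorithm, manifest_section) != sf_digest:
        return False
    # Step 2: the manifest section vouches for the actual file data.
    return digest_b64(algorithm, file_data) == manifest_digest
```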

CREATING A DIGITAL SIGNATURE FILE

A JAR archive has at least one digital signature file for each signature instruction file in the archive, as shown in Figure 1.5. A digital signature file is a signed version of the signature instruction file. It is a binary file not intended to be interpreted by humans.

Figure 1.5 Digital signature files

A digital signature file has the same name as its corresponding signature instruction file, but with a suffix corresponding to the type of digital signature. For example, if the archive contains the signature instruction file NETSCAPE.SF, it must also contain a digital signature file, such as NETSCAPE.RSA.

Netscape products understand PKCS #7 signatures that use MD5 and RSA. These files have the suffix .RSA. Other possible suffixes are .DSA for a PKCS #7 signature using DSA and .PGP for a Pretty Good Privacy signature. In general, suffixes may be 1 to 3 alphanumeric characters. Files with unrecognized suffixes are ignored.
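A reader might locate the digital signature files for a given signature instruction file along the following lines. This Python sketch is illustrative: the table `SIGNATURE_FORMATS` lists only the three suffixes named above, the function name `signature_files_for` is hypothetical, and case-insensitive matching of names is an assumption.

```python
# Suffixes named in this document, mapped to the signature type they denote.
SIGNATURE_FORMATS = {
    "RSA": "PKCS #7 signature using MD5 and RSA",
    "DSA": "PKCS #7 signature using DSA",
    "PGP": "Pretty Good Privacy signature",
}


def signature_files_for(sf_name, meta_inf_names):
    """Return {filename: signature type} for the files that sign `sf_name`.

    Files whose suffix is not recognized are ignored.
    """
    stem = sf_name.rsplit(".", 1)[0].upper()
    found = {}
    for candidate in meta_inf_names:
        base, dot, suffix = candidate.rpartition(".")
        if dot and base.upper() == stem and suffix.upper() in SIGNATURE_FORMATS:
            found[candidate] = SIGNATURE_FORMATS[suffix.upper()]
    return found
```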

For those formats that do not support external signed data, the digital signature file consists of a signed copy of the signature instruction file. Thus some data may be duplicated and a verifier ought to compare the two files.

Formats that support external data either refer to the signature instruction file or perform calculations on it with implicit reference.

Each signature instruction file may have multiple digital signatures, but those signatures ought to be generated by the same legal entity.

It is beyond the scope of this document to tell you how to create a digital signature file from a private key and a given file.

ADDITIONAL FILE INFORMATION

This section describes additional information about the manifest and signature instruction file formats.

Before parsing:

  • If the last character of the file is an EOF character (code 26), the EOF is treated as whitespace.
  • Two newlines are appended to the end of each file.
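The two preprocessing rules above might be applied to the raw bytes of a file as in this Python sketch. The function name `prepare_for_parsing` is hypothetical, and replacing the EOF character with a newline is one way of treating it as whitespace, chosen here so that it cannot be mistaken for the leading space of a continuation line.

```python
def prepare_for_parsing(data):
    """Apply the pre-parse rules to the raw bytes of a manifest-style file."""
    # A trailing EOF character (code 26) is treated as whitespace.
    if data.endswith(b"\x1a"):
        data = data[:-1] + b"\n"
    # Two newlines are appended so the final section is always terminated.
    return data + b"\n\n"
```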

Binary data:

  • Binary data of any form is represented as base 64. Continuation lines are required for binary data that would cause a line to exceed 72 bytes. Examples of binary data are hashes and signatures.

Line length:

  • No line may be longer than 72 bytes (not characters), in its UTF8-encoded form. If a value would make the initial line longer than this, it should be continued on extra lines (each starting with a single SPACE).
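The line-length rule can be sketched in Python as follows. The function name `fold_header` is hypothetical, and the sketch assumes that the 72-byte limit does not count the line terminator and that the header name itself fits within the limit.

```python
def fold_header(name, value, limit=72):
    """Return the lines of a header, none longer than `limit` UTF-8 bytes.

    Every line after the first is a continuation line starting with a
    single space.
    """
    lines = []
    current = name + ": "
    for character in value:
        # Measure in bytes, not characters, and never split a character.
        if len((current + character).encode("utf-8")) > limit:
            lines.append(current)
            current = " "
        current += character
    lines.append(current)
    return lines
```

Joining the first line with the remaining lines, each stripped of its leading space, recovers the original header.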

Errors:

  • If a file cannot be parsed, a warning should be output and none of the signatures should be trusted.

Limitations:

  • NUL, CR, and LF cannot be embedded in header values. NUL, CR, LF and ":" cannot be embedded in header names.
  • It's more difficult to test conformance without limits on the complexity of a file. So, implementations should support 65535-byte (not character) header values, and 65535 headers per file. They might run out of memory, but there should not be hard-coded limits below these values.

Signers:

  • A single signature file must correspond to a single signer. It is technically possible for different entities, using different signing algorithms, to share a single signature file. In this case, extra signatures may be ignored.

Algorithms:

  • When multiple hashing algorithms are presented for validation, a client application need only choose one, even if the client understands multiple algorithms. A client is also free to choose to check with any other hashing algorithms it knows, but this is not mandated.
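A client's choice among several digests might look like the following Python sketch. The preference order (SHA1 before MD5) is an assumption of this example, not something this document mandates, and the function name `pick_digest` is hypothetical. The section is assumed to be a dictionary of header names to values.

```python
# Assumed preference order for this example only.
PREFERRED_ALGORITHMS = ("SHA1", "MD5")


def pick_digest(section):
    """Return (algorithm, digest value) for the first supported digest found.

    Returns None if the section carries no digest this client understands.
    """
    for algorithm in PREFERRED_ALGORITHMS:
        key = algorithm + "-Digest"
        if key in section:
            return algorithm, section[key]
    return None
```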

Magic:

