ZIP files are supported as follows:
ZIP files can use either no compression or the standard Deflate compression algorithm. ZIP files that use any other compression scheme are not treated as ZIP files. In this case, one record is created for the file, with the Endeca.File.IsArchive property set to false.
There is no support for ZIP files with password-protected entries. ZIP files that contain password-protected entries are not fully processed. The actual behavior depends on the form of password protection:
If the ZIP uses the AES-128 or AES-256 form of password encryption, the file is not recognized as a ZIP file. One record is created for the file, with the Endeca.File.IsArchive property set to false.
If the ZIP uses ZipCrypto password protection, the ZIP is recognized, and a record is created for each entry, in the order encountered, that is not password-protected. Once a password-protected entry is encountered, processing of the ZIP stops, and no further records are created.
Many ZIP utilities do not password-protect directory entries (so that only the files are encrypted), and they often place directory entries at the beginning of a ZIP. In that case, one record is created for the file, with the Endeca.File.IsArchive property set to true, and additional records are created for those entries (directories) that are not encrypted.
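The ZipCrypto behavior described above (records for entries in order, stopping at the first password-protected entry) can be sketched as follows. This is a rough illustration using Python's standard zipfile module, not the product's implementation. The function names readable_entries and records_for_zipcrypto are hypothetical, and the sketch assumes that the order of the central directory matches the order in which entries are encountered. Because AES-encrypted entries set the same flag bit, the sketch models only the ZipCrypto case.

```python
import zipfile

# Bit 0 of an entry's general-purpose flags marks it as encrypted.
ENCRYPTED_FLAG = 0x1


def readable_entries(infos):
    """Return the names of entries up to, but not including, the first encrypted one."""
    names = []
    for info in infos:
        if info.flag_bits & ENCRYPTED_FLAG:
            break  # processing stops at the first password-protected entry
        names.append(info.filename)
    return names


def records_for_zipcrypto(path):
    """Hypothetical helper: list the entries that would still produce records."""
    with zipfile.ZipFile(path) as zf:
        return readable_entries(zf.infolist())
```

For an archive whose entries are an unencrypted directory, then an encrypted file, then an unencrypted file, readable_entries returns only the directory name, which matches the case where directory entries come first and are not encrypted.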
There is no support for entries that are split across multiple ZIP files. Splitting a file over multiple ZIP files results in two kinds of ZIP files: those that store the partial data for the underlying file, and a "last" one that also stores the entry information. Different tools use different naming conventions, so the partials sometimes have a .zip extension and sometimes do not. However, the last file will be a .zip file. These files are handled as follows:
The partial files will not be recognized as ZIP files. One record is created for the file, with the Endeca.File.IsArchive property set to false.
The last file will be recognized as a ZIP file, but its entry will be unreadable. One record is created for the file, with the Endeca.File.IsArchive property set to true.
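As a rough aid for spotting the "last" file of a split set before processing, the sketch below reads the disk-number fields of the end-of-central-directory record defined by the ZIP format. It is an assumption-laden illustration, not part of the product: the function names are hypothetical, the whole file is expected in memory as bytes, and the record is located with a simple search for its signature, which could be fooled by an archive comment that happens to contain the same bytes.

```python
import struct

EOCD_SIGNATURE = b"PK\x05\x06"
# signature, this disk, disk with central directory, entries on this disk,
# total entries, central directory size, central directory offset, comment length
EOCD_FORMAT = "<4s4H2LH"
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)  # 22 bytes


def eocd_disk_info(data):
    """Return (this_disk, central_dir_disk) from the last EOCD record, or None."""
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0 or pos + EOCD_SIZE > len(data):
        return None
    fields = struct.unpack_from(EOCD_FORMAT, data, pos)
    return fields[1], fields[2]


def is_last_part_of_split_zip(data):
    """Hypothetical check: a nonzero disk number suggests a multi-part archive."""
    info = eocd_disk_info(data)
    return info is not None and info[0] > 0
```

A single-file archive reports disk numbers (0, 0). Partial files of a split set carry no end-of-central-directory record, so eocd_disk_info is expected to return None for them.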
When a ZIP file is not treated as a valid ZIP file for any reason, the log file contains a warning that the ZIP file in question contains an "invalid CEN header", and the record generated for the file does not indicate that it is an archive.
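To anticipate which files are likely to fall into this category, a pre-check along the following lines may help. It is only an approximation built on Python's standard zipfile module, covering two of the causes described in this section: the file cannot be opened as a ZIP at all, or some entry uses a compression method other than stored or Deflate. The names uses_supported_compression and looks_like_supported_zip are hypothetical, and the product's own checks may accept or reject files that this sketch does not.

```python
import zipfile

# The two compression methods described as supported: none (stored) and Deflate.
SUPPORTED_METHODS = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}


def uses_supported_compression(infos):
    """True if every entry is stored uncompressed or compressed with Deflate."""
    return all(info.compress_type in SUPPORTED_METHODS for info in infos)


def looks_like_supported_zip(source):
    """Hypothetical pre-check; source is a file path or a binary file-like object."""
    try:
        with zipfile.ZipFile(source) as zf:
            return uses_supported_compression(zf.infolist())
    except (zipfile.BadZipFile, OSError):
        # Not readable as a ZIP at all.
        return False
```

A result of False suggests that the file would get a single record that does not mark it as an archive. A result of True does not guarantee full processing, because password protection and split archives are not examined here.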
Note
JAR files are handled the same way as ZIP files, so any caveats that apply to ZIP files also apply to JAR files.