Use Catalog with XML Processors
Use the XML Catalog API with various Java XML processors.
The XML Catalog API is supported throughout the JDK XML processors. The following sections describe how to enable it for each type of processor.
Use Catalog with DOM
To use a catalog with DOM, set the FILES property on a DocumentBuilderFactory instance, as demonstrated in the following code:
static final String CATALOG_FILE = CatalogFeatures.Feature.FILES.getPropertyName();
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
if (catalog != null) {
    dbf.setAttribute(CATALOG_FILE, catalog);
}
Note that catalog is a URI to a catalog file. For example, it could be something like "file:///users/auser/catalog/catalog.xml".
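The DOM setup above can be tried end to end with the following minimal sketch. It is an illustration only: the class name DomCatalogSketch, the temporary directory, and the files example.dtd, doc.xml, and catalog.xml are assumptions invented for this sample, and it assumes Java 11 or later for Files.writeString.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class DomCatalogSketch {
    static final String CATALOG_FILE = CatalogFeatures.Feature.FILES.getPropertyName();

    /** Parses a document whose DTD is located through a catalog and returns the root text. */
    static String parseWithCatalog() throws Exception {
        // Hypothetical sample files, written to a temporary directory.
        Path dir = Files.createTempDirectory("dom-catalog");
        Files.writeString(dir.resolve("example.dtd"), "<!ENTITY greeting \"hello\">");
        Files.writeString(dir.resolve("catalog.xml"),
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
            + "<system systemId=\"pathto/example.dtd\" uri=\"example.dtd\"/>"
            + "</catalog>");
        Path xml = Files.writeString(dir.resolve("doc.xml"),
            "<!DOCTYPE root SYSTEM \"pathto/example.dtd\"><root>&greeting;</root>");

        // The catalog is passed as a URI string, for example file:///tmp/.../catalog.xml.
        String catalog = dir.resolve("catalog.xml").toUri().toString();

        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        dbf.setAttribute(CATALOG_FILE, catalog);

        DocumentBuilder builder = dbf.newDocumentBuilder();
        Document doc = builder.parse(xml.toFile());
        // The entity is declared only in the DTD, so its text appears here
        // only if the parser found example.dtd through the catalog.
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseWithCatalog());
    }
}
```

There is no pathto folder on disk in this sketch, so the DTD can only be found through the system entry in the catalog.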
It’s best to deploy the target files along with the catalog entry file, so that the files can be resolved relative to the catalog file. For example, if the following is a uri entry in the catalog file, then the XSLImport_html.xsl file will be located at /users/auser/catalog/XSLImport_html.xsl.
<uri name="pathto/XSLImport_html.xsl" uri="XSLImport_html.xsl"/>
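To check how a relative uri attribute is resolved against the catalog file's location, you can query the catalog directly through the javax.xml.catalog API. The following is a sketch under stated assumptions: the class name RelativeUriSketch is hypothetical, and a temporary directory stands in for /users/auser/catalog.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

final class RelativeUriSketch {
    /** Returns the URI that the catalog maps "pathto/XSLImport_html.xsl" to. */
    static String resolveImport() throws Exception {
        // A temporary directory stands in for /users/auser/catalog.
        Path dir = Files.createTempDirectory("catalog");
        Files.writeString(dir.resolve("XSLImport_html.xsl"),
            "<xsl:stylesheet version=\"1.0\" "
            + "xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"/>");
        Path catalogFile = Files.writeString(dir.resolve("catalog.xml"),
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
            + "<uri name=\"pathto/XSLImport_html.xsl\" uri=\"XSLImport_html.xsl\"/>"
            + "</catalog>");

        Catalog catalog =
            CatalogManager.catalog(CatalogFeatures.defaults(), catalogFile.toUri());
        // The relative uri attribute is resolved against the catalog file,
        // so the result points into the same directory as catalog.xml.
        return catalog.matchURI("pathto/XSLImport_html.xsl");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(resolveImport());
    }
}
```

The returned string should be an absolute file URI that ends with XSLImport_html.xsl and sits next to catalog.xml.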
Use Catalog with SAX
To use the Catalog feature on a SAX parser, set the catalog file on the SAXParser instance:
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
spf.setXIncludeAware(true);
SAXParser parser = spf.newSAXParser();
parser.setProperty(CATALOG_FILE, catalog);
In the prior sample code, note the statement spf.setXIncludeAware(true). When this is enabled, any XInclude is resolved using the catalog as well.
Given an XML file XI_simple.xml:
<simple>
  <test xmlns:xinclude="http://www.w3.org/2001/XInclude">
    <latin1>
      <firstElement/>
      <xinclude:include href="pathto/XI_text.xml" parse="text"/>
      <insideChildren/>
      <another>
        <deeper>text</deeper>
      </another>
    </latin1>
    <test2>
      <xinclude:include href="pathto/XI_test2.xml"/>
    </test2>
  </test>
</simple>
Additionally, given another XML file XI_test2.xml:
<?xml version="1.0"?>
<!-- comment before root -->
<!DOCTYPE red SYSTEM "pathto/XI_red.dtd">
<red xmlns:xinclude="http://www.w3.org/2001/XInclude">
  <blue>
    <xinclude:include href="pathto/XI_text.xml" parse="text"/>
  </blue>
</red>
Assume another text file, XI_text.xml, contains a simple string, and the file XI_red.dtd is as follows:
<!ENTITY red "it is read">
In these XML files, there is an XInclude element inside a document that is itself included through XInclude, and a reference to a DTD. Assuming the files are located in the same folder as the catalog file CatalogSupport.xml, add the following catalog entries to map them:
<uri name="pathto/XI_text.xml" uri="XI_text.xml"/>
<uri name="pathto/XI_test2.xml" uri="XI_test2.xml"/>
<system systemId="pathto/XI_red.dtd" uri="XI_red.dtd"/>
When the parser.parse method is called to parse the XI_simple.xml file, it’s able to locate the XI_test2.xml file referenced in the XI_simple.xml file, and the XI_text.xml file and the XI_red.dtd file referenced in the XI_test2.xml file, through the specified catalog.
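A reduced version of this scenario can be run as the following sketch. It keeps only the text include of XI_text.xml, leaving out the nested include and the DTD. The class name SaxXIncludeSketch, the temporary directory, and the file contents are assumptions made for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

final class SaxXIncludeSketch {
    static final String CATALOG_FILE = CatalogFeatures.Feature.FILES.getPropertyName();

    /** Parses a file with a text XInclude and returns all character data seen. */
    static String parseWithCatalog() throws Exception {
        // Hypothetical sample files; contents are made up for this sketch.
        Path dir = Files.createTempDirectory("sax-catalog");
        Files.writeString(dir.resolve("XI_text.xml"), "included text");
        Files.writeString(dir.resolve("CatalogSupport.xml"),
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
            + "<uri name=\"pathto/XI_text.xml\" uri=\"XI_text.xml\"/>"
            + "</catalog>");
        Path xml = Files.writeString(dir.resolve("XI_simple.xml"),
            "<simple xmlns:xinclude=\"http://www.w3.org/2001/XInclude\">"
            + "<xinclude:include href=\"pathto/XI_text.xml\" parse=\"text\"/>"
            + "</simple>");
        String catalog = dir.resolve("CatalogSupport.xml").toUri().toString();

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setXIncludeAware(true);   // XInclude hrefs are resolved through the catalog too
        SAXParser parser = spf.newSAXParser();
        parser.setProperty(CATALOG_FILE, catalog);

        // Collect character data so the included text becomes visible.
        StringBuilder text = new StringBuilder();
        parser.parse(xml.toFile(), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseWithCatalog());
    }
}
```

Because no pathto folder exists next to XI_simple.xml, the include can succeed only if the href is mapped by the uri entry in the catalog.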
Use Catalog with StAX
To use the catalog feature with a StAX parser, set the catalog file on the XMLInputFactory instance before creating the XMLStreamReader object:
XMLInputFactory factory = XMLInputFactory.newInstance();
factory.setProperty(CatalogFeatures.Feature.FILES.getPropertyName(), catalog);
XMLStreamReader streamReader =
    factory.createXMLStreamReader(xml, new FileInputStream(xml));
When the XMLStreamReader streamReader object is used to parse the XML source, external references in the source are then resolved in accordance with the specified entries in the catalog.
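The following sketch runs the StAX setup against a small document whose DTD is mapped by a system entry. The class name StaxCatalogSketch and the files example.dtd, doc.xml, and catalog.xml are hypothetical, and the sketch relies on the default JDK StAX implementation loading the external DTD and replacing entity references.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

final class StaxCatalogSketch {
    /** Reads a document through StAX and returns the concatenated character data. */
    static String readWithCatalog() throws Exception {
        // Hypothetical sample files, written to a temporary directory.
        Path dir = Files.createTempDirectory("stax-catalog");
        Files.writeString(dir.resolve("example.dtd"), "<!ENTITY greeting \"hello\">");
        Files.writeString(dir.resolve("catalog.xml"),
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
            + "<system systemId=\"pathto/example.dtd\" uri=\"example.dtd\"/>"
            + "</catalog>");
        Path xml = Files.writeString(dir.resolve("doc.xml"),
            "<!DOCTYPE root SYSTEM \"pathto/example.dtd\"><root>&greeting;</root>");
        String catalog = dir.resolve("catalog.xml").toUri().toString();

        // Set the catalog on the factory before creating the reader.
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(CatalogFeatures.Feature.FILES.getPropertyName(), catalog);

        StringBuilder text = new StringBuilder();
        try (InputStream in = Files.newInputStream(xml)) {
            XMLStreamReader reader =
                factory.createXMLStreamReader(xml.toUri().toString(), in);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
            reader.close();
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readWithCatalog());
    }
}
```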
Note that unlike the DocumentBuilderFactory class, which has both setFeature and setAttribute methods, the XMLInputFactory class defines only a setProperty method. The XML Catalog API features, including XMLConstants.USE_CATALOG, are all set through this setProperty method. For example, to disable USE_CATALOG for the XMLStreamReader objects that the factory creates, you can do the following:
factory.setProperty(XMLConstants.USE_CATALOG, false);
Use Catalog with Schema Validation
To use a catalog to resolve any external resources in a schema, such as XSD import and include, set the catalog on the SchemaFactory object:
SchemaFactory factory =
    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
factory.setProperty(CatalogFeatures.Feature.FILES.getPropertyName(), catalog);
Schema schema = factory.newSchema(schemaFile);
The XMLSchema schema document contains a reference to an external DTD:
<!DOCTYPE xs:schema PUBLIC "-//W3C//DTD XMLSCHEMA 200102//EN" "pathto/XMLSchema.dtd" [
...
]>
It also contains an XSD import:
<xs:import
    namespace="http://www.w3.org/XML/1998/namespace"
    schemaLocation="http://www.w3.org/2001/pathto/xml.xsd">
  <xs:annotation>
    <xs:documentation>
      Get access to the xml: attribute groups for xml:lang
      as declared on 'schema' and 'documentation' below
    </xs:documentation>
  </xs:annotation>
</xs:import>
Following along with this example, to use local resources to improve your application performance by reducing calls to the W3C server:
- Include these entries in the catalog set on the SchemaFactory object:
  <public publicId="-//W3C//DTD XMLSCHEMA 200102//EN" uri="XMLSchema.dtd"/>
  <!-- XMLSchema.dtd refers to datatypes.dtd -->
  <systemSuffix systemIdSuffix="datatypes.dtd" uri="datatypes.dtd"/>
  <uri name="http://www.w3.org/2001/pathto/xml.xsd" uri="xml.xsd"/>
- Download the source files XMLSchema.dtd, datatypes.dtd, and xml.xsd and save them along with the catalog file.
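The same idea can be tried without the W3C files. The following sketch compiles a small schema whose import is mapped to a local file by a uri entry. The class name SchemaImportSketch, the namespace urn:example:types, and the files main.xsd and types.xsd are assumptions invented for this sample.

```java
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

final class SchemaImportSketch {
    /** Compiles a schema whose xs:import is located through a catalog. */
    static Schema compileWithCatalog() throws Exception {
        // Hypothetical sample files, written to a temporary directory.
        Path dir = Files.createTempDirectory("schema-catalog");
        Files.writeString(dir.resolve("types.xsd"),
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" "
            + "targetNamespace=\"urn:example:types\">"
            + "<xs:simpleType name=\"Code\"><xs:restriction base=\"xs:string\"/>"
            + "</xs:simpleType></xs:schema>");
        Files.writeString(dir.resolve("catalog.xml"),
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
            + "<uri name=\"pathto/types.xsd\" uri=\"types.xsd\"/>"
            + "</catalog>");
        // main.xsd uses a type from the imported schema, so compilation is
        // expected to fail if the import cannot be located.
        Path schemaFile = Files.writeString(dir.resolve("main.xsd"),
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" "
            + "xmlns:t=\"urn:example:types\">"
            + "<xs:import namespace=\"urn:example:types\" "
            + "schemaLocation=\"pathto/types.xsd\"/>"
            + "<xs:element name=\"root\" type=\"t:Code\"/></xs:schema>");
        String catalog = dir.resolve("catalog.xml").toUri().toString();

        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setProperty(CatalogFeatures.Feature.FILES.getPropertyName(), catalog);
        return factory.newSchema(schemaFile.toFile());
    }

    public static void main(String[] args) throws Exception {
        Schema schema = compileWithCatalog();
        // Validate a tiny instance against the compiled schema.
        schema.newValidator().validate(
            new StreamSource(new StringReader("<root>abc</root>")));
        System.out.println("schema compiled and instance validated");
    }
}
```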
As already discussed, the XML Catalog API lets you use any of the entry types that you prefer. In the prior case, instead of the uri entry, you could also use either one of the following:
- A public entry, because the namespace attribute in the import element is treated as the publicId:
  <public publicId="http://www.w3.org/XML/1998/namespace" uri="xml.xsd"/>
- A system entry:
  <system systemId="http://www.w3.org/2001/pathto/xml.xsd" uri="xml.xsd"/>
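To see what each of these alternative entries maps to, you can load a catalog and query it directly. This is a minimal sketch: the class name AlternativeEntriesSketch is hypothetical, and a temporary directory holds a placeholder xml.xsd file rather than the real one.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.Catalog;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.catalog.CatalogManager;

final class AlternativeEntriesSketch {
    /** Returns the matches for the public entry and the system entry, in that order. */
    static String[] matchAlternatives() throws Exception {
        Path dir = Files.createTempDirectory("entries-catalog");
        // Placeholder content; a real deployment would store the downloaded xml.xsd here.
        Files.writeString(dir.resolve("xml.xsd"),
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\"/>");
        Path catalogFile = Files.writeString(dir.resolve("catalog.xml"),
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
            + "<public publicId=\"http://www.w3.org/XML/1998/namespace\" "
            + "uri=\"xml.xsd\"/>"
            + "<system systemId=\"http://www.w3.org/2001/pathto/xml.xsd\" "
            + "uri=\"xml.xsd\"/>"
            + "</catalog>");

        Catalog catalog =
            CatalogManager.catalog(CatalogFeatures.defaults(), catalogFile.toUri());
        return new String[] {
            // Matches the public entry by the import's namespace.
            catalog.matchPublic("http://www.w3.org/XML/1998/namespace"),
            // Matches the system entry by the import's schemaLocation.
            catalog.matchSystem("http://www.w3.org/2001/pathto/xml.xsd")
        };
    }

    public static void main(String[] args) throws Exception {
        for (String match : matchAlternatives()) {
            System.out.println(match);
        }
    }
}
```

Both lookups should return the same local file URI, which is why either entry type can stand in for the uri entry.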
Note:
When experimenting with the XML Catalog API, it might be useful to ensure that none of the URIs or system IDs used in your sample files point to any actual resources on the internet, and especially not to the W3C server. This lets you catch mistakes early should the catalog resolution fail, and avoids putting a burden on W3C servers by sparing them unnecessary connections. All the examples in this topic and other related topics about the XML Catalog API have an arbitrary string "pathto" added to any URI for that purpose, so that no URI could possibly resolve to an external W3C resource.
To use the catalog to resolve any external resources in an XML source to be validated, set the catalog on the Validator object:
SchemaFactory schemaFactory =
    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema();
Validator validator = schema.newValidator();
validator.setProperty(CatalogFeatures.Feature.FILES.getPropertyName(), catalog);
StreamSource source = new StreamSource(new File(xml));
validator.validate(source);
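The following sketch exercises the Validator setup with an XML source that references a DTD through a system identifier mapped in the catalog. Unlike the snippet above, it compiles an explicit schema file so that the root element has a declaration. The class name ValidatorCatalogSketch and the files doc.xsd, doc.xml, and example.dtd are assumptions made for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

final class ValidatorCatalogSketch {
    /** Validates a document whose DTD reference is resolved through a catalog. */
    static void validateWithCatalog() throws Exception {
        // Hypothetical sample files, written to a temporary directory.
        Path dir = Files.createTempDirectory("validator-catalog");
        Files.writeString(dir.resolve("example.dtd"), "<!ENTITY greeting \"hello\">");
        Files.writeString(dir.resolve("catalog.xml"),
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
            + "<system systemId=\"pathto/example.dtd\" uri=\"example.dtd\"/>"
            + "</catalog>");
        Path xsd = Files.writeString(dir.resolve("doc.xsd"),
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"root\" type=\"xs:string\"/></xs:schema>");
        Path xml = Files.writeString(dir.resolve("doc.xml"),
            "<!DOCTYPE root SYSTEM \"pathto/example.dtd\"><root>&greeting;</root>");
        String catalog = dir.resolve("catalog.xml").toUri().toString();

        SchemaFactory schemaFactory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(xsd.toFile());

        // The catalog is set on the Validator, which parses the XML source.
        Validator validator = schema.newValidator();
        validator.setProperty(CatalogFeatures.Feature.FILES.getPropertyName(), catalog);
        validator.validate(new StreamSource(xml.toFile()));
    }

    public static void main(String[] args) throws Exception {
        validateWithCatalog();
        System.out.println("validation completed");
    }
}
```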
Use Catalog with Transform
To use the XML Catalog API in an XSLT transform process, set the catalog file on the TransformerFactory object:
TransformerFactory factory = TransformerFactory.newInstance();
factory.setAttribute(CatalogFeatures.Feature.FILES.getPropertyName(), catalog);
Transformer transformer = factory.newTransformer(xslSource);
If the XSL source that the factory is using to create the Transformer object contains DTD, import, and include statements similar to these:
<!DOCTYPE HTMLlat1 SYSTEM "http://openjdk.java.net/xml/catalog/dtd/XSLDTD.dtd">
<xsl:import href="pathto/XSLImport_html.xsl"/>
<xsl:include href="pathto/XSLInclude_header.xsl"/>
Then the following catalog entries can be used to resolve these references:
<system
    systemId="http://openjdk.java.net/xml/catalog/dtd/XSLDTD.dtd"
    uri="XSLDTD.dtd"/>
<uri name="pathto/XSLImport_html.xsl" uri="XSLImport_html.xsl"/>
<uri name="pathto/XSLInclude_header.xsl" uri="XSLInclude_header.xsl"/>
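As a runnable illustration of the import case, the following sketch compiles a stylesheet whose xsl:import is mapped by a uri entry and then runs a trivial transform. The class name TransformCatalogSketch, the files main.xsl and imported.xsl, and the greet template are hypothetical names chosen for this sample.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.catalog.CatalogFeatures;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TransformCatalogSketch {
    private static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

    /** Runs a transform whose stylesheet import is located through a catalog. */
    static String transformWithCatalog() throws Exception {
        // Hypothetical sample files, written to a temporary directory.
        Path dir = Files.createTempDirectory("xslt-catalog");
        Files.writeString(dir.resolve("imported.xsl"),
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"" + XSL_NS + "\">"
            + "<xsl:template name=\"greet\">hello from import</xsl:template>"
            + "</xsl:stylesheet>");
        Files.writeString(dir.resolve("catalog.xml"),
            "<catalog xmlns=\"urn:oasis:names:tc:entity:xmlns:xml:catalog\">"
            + "<uri name=\"pathto/imported.xsl\" uri=\"imported.xsl\"/>"
            + "</catalog>");
        Path mainXsl = Files.writeString(dir.resolve("main.xsl"),
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"" + XSL_NS + "\">"
            + "<xsl:import href=\"pathto/imported.xsl\"/>"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"/\"><xsl:call-template name=\"greet\"/>"
            + "</xsl:template></xsl:stylesheet>");
        String catalog = dir.resolve("catalog.xml").toUri().toString();

        TransformerFactory factory = TransformerFactory.newInstance();
        factory.setAttribute(CatalogFeatures.Feature.FILES.getPropertyName(), catalog);
        // The import is resolved while the stylesheet is compiled.
        Transformer transformer =
            factory.newTransformer(new StreamSource(mainXsl.toFile()));

        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader("<doc/>")), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transformWithCatalog());
    }
}
```

The greet template exists only in imported.xsl, so the output can contain its text only if the import was found through the catalog.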