Market Maker 1.0 Catalog Import Guide

Chapter 3   Representing Catalog Information

Chapter 2 "Catalog Concepts" described the concepts that relate to mapping the categories from seller catalogs to the Public Master catalog (PMC). You need to be familiar with these concepts before you can prepare to import your catalog. The next step is to understand the conventions the Import utility uses to represent the catalog ontologies and do the mapping.

Each catalog ontology is a hierarchy of categories and sub-categories. The Import utility uses the extensible markup language (XML) to represent the ontologies of a seller catalog and the PMC.

Representing Catalogs

A seller catalog is represented in two files. The first file contains the unstructured catalog data, which is the standard iPlanet Market Maker catalog input. The second file provides the structural information to interpret the data in the first file. Table 3-1 describes these files.

Table 3-1    Seller Catalog Files

Seller Catalog File

Character-separated fields (CSF): Contains the catalog data, in which the hierarchy is represented as a table. Each row is a line and each column is a field. Fields are separated by a special character, such as a tilde ("~").

Column structure specification (CSS): Contains the structural information that the Import utility uses to interpret the catalog data in the CSF file.

The result of interpreting the CSF and CSS files is a load file that the Import utility uses to create the catalog database for an iPlanet Market Maker marketplace. The seller catalog categories are represented by XML elements in the load file. The XML category names are derived from the seller catalog category names. The XML element hierarchy represents the seller catalog hierarchy.

The load file contains the most compact representation of the seller catalog categories. The default load file name is mm.xml.

Creating the Seller Catalog and PMC in XML

There are two steps to convert the seller catalog data to an XML file that the Import utility can load into the iPlanet Market Maker catalog database. The first step is to create an XML representation of the seller catalog using the CSF and CSS files. The Import utility creates this XML representation in a file called vendor.xml.

The second step is to apply ontology mapping and attribute normalization rules to the vendor.xml file. The result of this step is an mm.xml file. The Import utility uses UTF-8 encoding to process the vendor.xml and mm.xml files.

Figure 3-1 summarizes the process of creating the XML files.

Figure 3-1    Flow To Create the Seller Catalog and PMC in XML Format

Adding Column Structure to the CSF File

An example of a CSF file is shown in Figure 3-2. Note that this text file does not contain any structural information.

Figure 3-2    CSF File Example

Hard Drives~SCSI~1234452~case~15000.00~9.1 Gig HD~MassFastSCSI~add

Hard Drives~EIDE~1234472~case~15000.00~9.1 Gig HD~MassFastSCSI~add

The Import utility treats each row in the CSF file as an item. The columns define the categories and attributes. To define the column structure of the CSF file in Figure 3-2, you need to determine which columns represent the categories and attributes. Figure 3-3 shows an example of how you might apply column names to a CSF file.
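As an illustration of how rows become items and columns become categories and attributes, the following Python sketch parses the two sample rows from Figure 3-2. It is not part of the Import utility. The column layout, the attribute names (in particular "Price" and "Description"), and the function name parse_csf are assumptions made for this example.

```python
import csv
import io

# Assumed column layout for the sample data (0-based indexes):
# two category columns, five attribute columns, one action column.
CATEGORY_COLUMNS = (0, 1)
ATTRIBUTE_COLUMNS = {
    2: "SKU",
    3: "UOM",
    4: "Price",
    5: "Description",
    6: "Product Name",
}
ACTION_COLUMN = 7


def parse_csf(text, separator="~"):
    """Turn each CSF row into an item with categories, attributes, and an action."""
    items = []
    for row in csv.reader(io.StringIO(text), delimiter=separator):
        if not row:
            continue  # skip blank lines
        items.append({
            "categories": [row[i] for i in CATEGORY_COLUMNS],
            "attributes": {name: row[i] for i, name in ATTRIBUTE_COLUMNS.items()},
            "action": row[ACTION_COLUMN],
        })
    return items


SAMPLE = (
    "Hard Drives~SCSI~1234452~case~15000.00~9.1 Gig HD~MassFastSCSI~add\n"
    "Hard Drives~EIDE~1234472~case~15000.00~9.1 Gig HD~MassFastSCSI~add\n"
)

items = parse_csf(SAMPLE)
```

Each entry in items holds the category path of the row, its named attributes, and the action to take, which is the same information that the column structure adds to the raw CSF data.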

Figure 3-3    CSF File Annotated with Structural Information

To represent this structure in XML format, it is useful to think of the column structure in terms of a table, such as the example shown in Table 3-2. Note that the "Key" column identifies the SKU attribute as the unique identifier for the items. You must define at least one attribute as a key in the CSS file, but you can use multiple columns to define a key.

Table 3-2    Table Format Representation of the CSF File

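Column Name                  Column Number    Column Type           Key

Hard Drives                  1                Category (level 1)
                             2                Category (level 2)
SKU                          3                Attribute             Yes
Unit of Measurement (UOM)    4                Attribute
Price                        5                Attribute
Description                  6                Attribute
Product Name                 7                Attribute
Action column                8                Action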

To implement a column structure for the CSF file such as the one shown in Table 3-2, you create a CSS file in XML format. The CSS file must conform to the ColumnStructureSpecification.dtd file in the <install_dir>/catalog/import/dtd directory.

Figure 3-4 shows an example of a CSS file that describes the column structure of the CSF file shown in Figure 3-2. The columns are numbered from left to right. Each column specification defines a category or attribute, except for column 8 (see the following section).

Adding, Deleting, and Updating Items

Column 8 in Table 3-2 is an action column that defines which action to take when loading this item in the PMC. The result of the "add" action is an item defined in the PMC just as it was defined in the load file. You can also specify "delete" or "update" in an action column to delete an item or update an existing item in the database.

You can specify only one action per row. If you do not specify an action column, the default action is to add items.

Suppose, for example, that you define an "add" action for item "x" in the load file, and item "x" already exists in the PMC with the same key as defined in the load file. As a result of the "add" action, the attributes of item "x" are changed in the PMC as necessary so that they match the attributes of item "x" in the load file. For more information about how the actions work, see Table 3-3.

Table 3-3    Action Results




The result of each action depends on whether the PMC contains an item with the same key as the one defined in the load file.

add

  • PMC contains an item with the same key: Item attributes in the PMC are replaced by the item attributes in the load file.

  • PMC does not contain an item with the same key: A new item is created in the PMC.

  • As a result of the "add" action, the PMC contains an item with the identical load file attribute names, numbers, and units. This means that existing attributes in the PMC can be added, deleted, and updated to match those in the load file.

delete

  • PMC contains an item with the same key: Item is deleted from the PMC.

  • PMC does not contain an item with the same key: An error message

update

  • PMC contains an item with the same key: Item attributes in the PMC are updated to match the attributes in the load file.

  • PMC does not contain an item with the same key: An error message

  • Only the attribute values and units changed in the load file are updated in the PMC. Existing attributes in the PMC are not added or deleted.
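The action results in Table 3-3 can be modeled with a small Python sketch. This is only an interpretation of the table, not the Import utility's implementation. The PMC is modeled as a plain dictionary, the "error message" cases are modeled as exceptions, and the function name apply_action is hypothetical. The sketch also assumes that "update" ignores load-file attributes that do not already exist on the item.

```python
def apply_action(pmc, action, key, attributes=None):
    """Apply one load-file action to pmc, a dict of item key -> attribute dict."""
    attributes = attributes or {}
    if action == "add":
        # Replaces an existing item wholesale, or creates a new one.
        pmc[key] = dict(attributes)
    elif action == "delete":
        if key not in pmc:
            raise KeyError(f"cannot delete {key!r}: no such item in the PMC")
        del pmc[key]
    elif action == "update":
        if key not in pmc:
            raise KeyError(f"cannot update {key!r}: no such item in the PMC")
        # Assumption: only attributes the item already has are changed;
        # no attribute is added or removed.
        for name, value in attributes.items():
            if name in pmc[key]:
                pmc[key][name] = value
    else:
        raise ValueError(f"unknown action {action!r}")


pmc = {"1234452": {"Price": "15000.00", "UOM": "case"}}

apply_action(pmc, "update", "1234452", {"Price": "14000.00", "Color": "black"})
after_update = dict(pmc["1234452"])

apply_action(pmc, "add", "1234452", {"Price": "13000.00"})
after_add = dict(pmc["1234452"])

apply_action(pmc, "delete", "1234452")
```

In this model, "update" changes the price but keeps the set of attributes, while "add" on an existing key leaves the item with exactly the attributes given in the load file.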

Figure 3-4    CSS File Example

<?xml version="1.0" ?>

<!DOCTYPE column-specifications SYSTEM



       <column-specification number="1" >

               <category level="1" />


       <column-specification number="2" >

               <category level="2" />


       <column-specification number="3" >

        <value key="yes"><name><fixed>SKU</fixed></name></value>


       <column-specification number="4" >



       <column-specification number="5" >





       <column-specification number="6" >





       <column-specification number="7" >


                <name><fixed>Product Name</fixed></name>



       <column-specification number="8" >

               <action />



Figure 3-5 shows a graphical representation of the hierarchy structure implemented by the CSS file in Figure 3-4.

Figure 3-5    Seller Catalog Hierarchy Structure

Using Column References

The CSS file shown in Figure 3-4 specifies fixed columns, which means that the Import utility interprets all the attribute values or units in a column the same way for every row. But there are cases when the attribute names or units in a column are different in each row, such as when there are different currencies for different items. When the attribute names or units are different in each row, you use a column reference.

Suppose, for example, that you wanted to specify different currency types for the items shown in Figure 3-6.

Figure 3-6    CSF File Example with Different Currency Types

         1    2      3     4       5          6        7          8

Hard Drives~SCSI~1234452~case~15000.00~9.1 Gig HD~MassFastSCSI~USD

Hard Drives~EIDE~1234472~case~45000.00~9.1 Gig HD~MassFastSCSI~JPY

Assume that in Figure 3-6 column 5 contains the values of the "Price" attribute, and column 8 contains the units (currencies) for this same attribute. See Figure 3-7.

Figure 3-7    CSF File with Structural Information

Figure 3-8 shows the section of the CSS file entry to specify that the prices in column 5 have their units (USD for U.S. dollars, JPY for Japanese yen) defined in column 8.

Figure 3-8    Column Reference Section of a CSS File

<column-specification number="5" >



      <unit><column-ref column-number="8"/></unit>
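The effect of a column reference can be sketched in Python as follows. The dictionary form of the column specification and the function names are assumptions made for this example; they do not reflect how the Import utility stores a CSS file internally.

```python
# Hypothetical in-memory form of a column specification: the unit is either
# fixed for the whole column or read from another column of the same row.
PRICE_SPEC = {"value_column": 5, "unit": {"column_ref": 8}}


def resolve_unit(spec, row):
    """Return the unit for one row, following a column reference if present."""
    unit = spec["unit"]
    if "column_ref" in unit:
        return row[unit["column_ref"] - 1]  # CSS columns are numbered from 1
    return unit["fixed"]


def read_price(spec, line, separator="~"):
    """Return (value, unit) for the price column of one CSF line."""
    row = line.split(separator)
    return row[spec["value_column"] - 1], resolve_unit(spec, row)


usd = read_price(
    PRICE_SPEC,
    "Hard Drives~SCSI~1234452~case~15000.00~9.1 Gig HD~MassFastSCSI~USD",
)
jpy = read_price(
    PRICE_SPEC,
    "Hard Drives~EIDE~1234472~case~45000.00~9.1 Gig HD~MassFastSCSI~JPY",
)
```

With a column reference, the unit is looked up per row, so the two sample rows from Figure 3-6 yield different currencies for the same "Price" column.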



Creating the Seller Catalog

The seller catalog created from processing the CSF and CSS files is a vendor.xml file. This file maps categories to XML elements. If the category names contain spaces or special characters, those characters are replaced with an underscore ("_") character. See Figure 3-9.

Figure 3-9    Seller Catalog Represented in vendor.xml
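The renaming rule described above can be sketched in Python. The exact set of characters that the Import utility treats as special is not listed in this section, so the sketch assumes that every character other than an ASCII letter, a digit, or an underscore is replaced. The function name to_element_name is hypothetical.

```python
import re


def to_element_name(category_name):
    """Derive an XML element name from a seller catalog category name."""
    # Assumption: anything that is not a letter, digit, or underscore
    # counts as a special character and becomes an underscore.
    return re.sub(r"[^A-Za-z0-9_]", "_", category_name)


hard_drives = to_element_name("Hard Drives")
y_star = to_element_name("Y*")
y_question = to_element_name("Y?")
```

Under this assumption, "Hard Drives" becomes Hard_Drives, while "Y*" and "Y?" both become Y_, which is how two distinct category names can end up with the same element name.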

XML Representation Issues

In the process of representing a seller catalog in XML, the following two issues can arise.

  • Sibling categories with unique names in the seller catalog are renamed to have identical names in the vendor.xml file. This renaming causes what is known as a name clash.

  • When two or more categories in the vendor.xml file have identical names, the result is what is known as a path clash. This situation can occur because two categories in the CSF file have the same name, or as a consequence of renaming categories in the vendor.xml file.

To identify clashes, run the Import utility with the CHECK option. If your catalog has clashes, this option creates name_clashes.xml and path_clashes.xml report files that contain instructions for resolving the clashes. In the report files, the names with clash problems are identified with a <tag-name> tag.

Detecting Clashes

The following two types of clashes can occur.

  • Name clashes

    Occur when sibling categories are renamed to the same name. For a list of special characters that cause this renaming, see the following URL.

  • Path clashes

    Occur when categories have the same name, but their "child" categories have different names.
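The two definitions above can be turned into a small detection sketch in Python. This is one reading of the definitions, not the logic of the CHECK option. Category paths are given as tuples of original category names, and the renaming rule (any character other than a letter, digit, or underscore becomes "_") is an assumption.

```python
import re
from collections import defaultdict


def element_name(name):
    # Assumed renaming rule; the real list of special characters may differ.
    return re.sub(r"[^A-Za-z0-9_]", "_", name)


def find_name_clashes(paths):
    """Return {(parent path, element name): original names} for clashing siblings."""
    siblings = defaultdict(set)
    for path in paths:
        for depth, name in enumerate(path):
            siblings[(tuple(path[:depth]), element_name(name))].add(name)
    return {k: sorted(v) for k, v in siblings.items() if len(v) > 1}


def find_path_clashes(paths):
    """Return element names used at several locations with different children."""
    children = defaultdict(set)  # location (tuple of element names) -> child names
    for path in paths:
        tags = [element_name(n) for n in path]
        for depth in range(len(tags)):
            # The slice holds the next tag, or nothing for a leaf category.
            children[tuple(tags[:depth + 1])].update(tags[depth + 1:depth + 2])
    by_tag = defaultdict(dict)
    for location, kids in children.items():
        by_tag[location[-1]][location] = kids
    return {
        tag: locations
        for tag, locations in by_tag.items()
        if len(locations) > 1
        and len({frozenset(k) for k in locations.values()}) > 1
    }


name_clashes = find_name_clashes([("X", "Y*"), ("X", "Y?")])
path_clashes = find_path_clashes([("X", "Y", "A"), ("X", "Z", "Y", "B")])
```

In the first call, the siblings "Y*" and "Y?" under "X" collapse to the same element name. In the second call, the element name Y appears at two locations whose child categories differ.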

Resolving Name Clashes

If there are name clashes in your hierarchy, you need to resolve them to create the PMC.

Figure 3-10 shows an example of a hierarchy with a name clash, where the names "Y*" and "Y?" create a name clash.

Figure 3-10    Name Clash

By default, the "Y*" and "Y?" categories are renamed to the same category in the vendor.xml file. See Figure 3-11 and Figure 3-12.

Figure 3-11    Result of Name Clash

Figure 3-12    vendor.xml File with a Name Clash

<?xml version="1.0" encoding="UTF-8"?>


   <X vortex-type="category" name="X">

     <Y_ vortex-type="category" name="Y*">

        <Item vortex-type="item" action="add">

         <Attribute vortex-type="attribute" value="1" name="SKU" key="yes" />




   <X vortex-type="category" name="X">

     <Y_ vortex-type="category" name="Y?">

        <Item vortex-type="item" action="add">

         <Attribute vortex-type="attribute" value="2" name="SKU" key="yes" />





If name clashes exist, the Import utility creates a report file called name_clashes.xml. This file contains an XML description of the name clashes that you can use as a template file to resolve the clashes. Figure 3-13 shows an example of such a file, edited to perform the renaming shown in Figure 3-14. The <tag-name> tags that define the renamed names are in bold in Figure 3-13.

Figure 3-13    Edited Name Clash File

<?xml version="1.0" encoding="UTF-8"?>











Figure 3-14    Name Clash Resolution

Figure 3-15 shows the vendor.xml with resolved name clashes.

Figure 3-15    vendor.xml File with Resolved Name Clash

<?xml version="1.0" encoding="UTF-8"?>


  <X vortex-type="category" name="X">

    <Y_star vortex-type="category" name="Y*">

      <Item vortex-type="item" action="add">

        <Attribute vortex-type="attribute" value="1" name="SKU" key="yes" />




  <X vortex-type="category" name="X">

    <Y_q vortex-type="category" name="Y?">

      <Item vortex-type="item" action="add">

        <Attribute vortex-type="attribute" value="2" name="SKU" key="yes" />





Resolving Path Clashes

If there are path clashes in your hierarchy, you need to resolve them to create an mm.xml file you can load into the catalog database. To find out if there are path clashes, always specify the CHECK option the first time you run the Import utility.

Figure 3-16 shows an example of a path clash.

Figure 3-16    Path Clash

In Figure 3-16, the two "parent" categories named "Y" have different sets of "child" categories. Under XML validation rules, however, both "Y" categories are treated as containing the "child" categories "A" and "B". As a result of validation, the "Y" categories are considered to be the same category, represented as <Y> in the vendor.xml file shown in Figure 3-17.

Figure 3-17    vendor.xml File with a Path Clash

<?xml version="1.0" encoding="UTF-8"?>


  <X vortex-type="category" name="X">

    <Y vortex-type="category" name="Y/">

      <A vortex-type="category" name="A">

        <Item vortex-type="item" action="add">

          <Attribute vortex-type="attribute" value=" 1" name="SKU" key="yes" />





  <X vortex-type="category" name="X">

    <Z vortex-type="category" name="Z">

      <Y vortex-type="category" name="Y?">

        <B vortex-type="category" name="B">

          <Item vortex-type="item" action="add">

            <Attribute vortex-type="attribute" value="2" name="SKU" key="yes" />

The category hierarchy in Figure 3-17 is defined too broadly. To apply strict XML validation, the "Y" categories need to be renamed so that they are unique. For example, see Figure 3-19.

If path clashes exist, the Import utility creates a report file called path_clashes.xml. This file contains an XML description of the path clashes that you can use as a template file to resolve the clashes. Figure 3-18 shows an example of such a file, edited to perform the renaming shown in Figure 3-19. The <tag-name> tags that define the renamed names are in bold in Figure 3-18.

Figure 3-18    Edited Path Clash File

<?xml version="1.0" encoding="UTF-8"?>


     <path-clash clashing-tag="Y">











Figure 3-19    Path Clash Resolution

Figure 3-20 shows the vendor.xml with resolved path clashes.

Figure 3-20    vendor.xml File with Resolved Path Clashes

<?xml version="1.0" encoding="UTF-8"?>


  <X vortex-type="category" name="X">

    <Y_slash vortex-type="category" name="Y/">

      <A vortex-type="category" name="A">

        <Item vortex-type="item" action="add">

          <Attribute vortex-type="attribute" value=" 1" name="SKU" key="yes" />





  <X vortex-type="category" name="X">

    <Z vortex-type="category" name="Z">

      <Y_star vortex-type="category" name="Y?">

        <B vortex-type="category" name="B">

          <Item vortex-type="item" action="add">

            <Attribute vortex-type="attribute" value="2" name="SKU" key="yes" />







Defining the Ontology Mapping

The "Mapping Catalog Hierarchies" section in Chapter 2 "Catalog Concepts" describes the concept of mapping items from a seller catalog to the PMC. Seller ontologies often vary from seller to seller, but the PMC has only one ontology, which is likely to be different from that of a seller catalog. A mapping mechanism is therefore necessary to ensure that a seller catalog ontology conforms to the PMC ontology. The instructions to do this mapping are described in an ontology mapping description file named omd.xml.

When the ontology in the PMC is the same as that of a seller catalog, the Import utility can generate the omd.xml file automatically. The Import utility still needs to do the mapping in this case because the load file is a compact version of the catalog input data, and the load program uses this compact version to optimize processing.

When the ontology of the PMC is different from that of a seller catalog, you must create the file omd.xml manually.

Changing the Ontology Mapping

The Import utility creates an ontology mapping file, omd.xml, that assumes the mm.xml and vendor.xml ontologies are identical. Figure 3-21 shows an example of a vendor.xml ontology.

Figure 3-21    vendor.xml Hierarchy Example

Figure 3-22 shows the resulting omd.xml file that the Import utility creates. The categories defined in the omd.xml file are the categories that the Import utility creates in the mm.xml file. The <path> tags tell the Import utility where in the vendor.xml file to find the items to be contained in a category in the mm.xml file.

In Figure 3-22, for example, there is a category named "SCSI":

<SCSI vortex-type="category" name="SCSI">

In this example, the "SCSI" category in the vendor.xml file also appears in the mm.xml file. The <path> tag defines the path to locate the items (in this case only one item) in the vendor.xml file to be contained in the "SCSI" category in the mm.xml file. The "Hard_Drives" category is the root category.


Figure 3-22    Resulting omd.xml Ontology Mapping File

<?xml version="1.0" encoding="UTF-8"?>

<root version="1.0">

    <Hard_Drives vortex-type="category" name="Hard Drives">


         <SCSI vortex-type="category" name="SCSI">



          <EIDE vortex-type="category" name="EIDE">





When the ontologies in the vendor.xml and mm.xml files are different, you need to edit the omd.xml file. Figure 3-23 illustrates such a case.

Figure 3-23    Different vendor.xml and mm.xml Hierarchies

In Figure 3-23, the items in vendor.xml are mapped to a different ontology in mm.xml in which there is only one category. This means that the "SCSI" and "EIDE" categories are not in the mm.xml file. The items under these two categories need to be mapped to the "Hard Drives" category. To do this mapping, you need to edit the omd.xml file.
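The effect of such a mapping can be sketched in Python. The dictionary below is a hypothetical in-memory stand-in for the mapping information, not the omd.xml syntax, and the function name apply_mapping is an assumption. Each target category in mm.xml lists the vendor.xml category paths whose items it should contain.

```python
# Hypothetical mapping: target category path -> source category paths.
MAPPING = {
    ("Hard_Drives",): [("Hard_Drives", "SCSI"), ("Hard_Drives", "EIDE")],
}


def apply_mapping(vendor_items, mapping):
    """Collect the items found at each source path under its target category."""
    result = {}
    for target, sources in mapping.items():
        collected = []
        for source in sources:
            collected.extend(vendor_items.get(source, []))
        result[target] = collected
    return result


# Items as they appear in the seller hierarchy, keyed by category path.
vendor_items = {
    ("Hard_Drives", "SCSI"): [{"SKU": "1234452"}],
    ("Hard_Drives", "EIDE"): [{"SKU": "1234472"}],
}

mm_items = apply_mapping(vendor_items, MAPPING)
```

After the mapping is applied, both items sit directly under the single "Hard Drives" category, and the "SCSI" and "EIDE" categories no longer appear in the result.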

Figure 3-24 shows the edited version of the omd.xml file needed to map the items in the vendor.xml file to the mm.xml file. Note that there is only a "Hard Drives" category. The <path> tags tell the Import utility to take the items under the "SCSI" and "EIDE" categories in the vendor.xml file and map them to the "Hard Drives" category in the mm.xml file.



Figure 3-24    PMC omd.xml Ontology Mapping File

<?xml version="1.0" encoding="UTF-8"?>

<root version="1.0">

    <Hard_Drives vortex-type="category" name="Hard Drives">





As a result of ontology mapping, the mm.xml file defines the hierarchy of category, item, and attribute elements. Figure 3-25 shows an example of an mm.xml file. The elements are highlighted in bold.

Figure 3-25    Sample mm.xml File

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE vortex-data-load SYSTEM "file:////usr/netscape/server4/iMM/labs/mm.dtd">
<vortex-data-load version="1.0">
    <Hard_Drives name="Hard Drives" vortex-type="category">
        <Item vortex-type="item" action="add">
            <Attribute vortex-type="attribute" value="1234452" name="SKU" key="yes"/>
            <Attribute vortex-type="attribute" value="case" name="UOM" key="no"/>
            <Attribute vortex-type="attribute" value="15000.00" dataType="currency" name="Price" key="no"/>
            <Attribute vortex-type="attribute" value="9.1 Gig HD" name="Description" key="no"/>
            <Attribute vortex-type="attribute" value="MassFastSCSI" name="Name" key="no"/>
        </Item>
        <Item vortex-type="item" action="add">
            <Attribute vortex-type="attribute" value="1234472" name="SKU" key="yes"/>
            <Attribute vortex-type="attribute" value="case" name="UOM" key="no"/>
            <Attribute vortex-type="attribute" value="15000.00" dataType="currency" name="Price" key="no"/>
            <Attribute vortex-type="attribute" value="9.1 Gig HD" name="Description" key="no"/>
            <Attribute vortex-type="attribute" value="MassFastSCSI" name="Name" key="no"/>
        </Item>
    </Hard_Drives>
</vortex-data-load>

