7 Configuring EDQ to Process XML Data Files

This chapter describes how EDQ can be configured to read and write XML data files.

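This chapter includes the following sections:

  • Section 7.1, "Using Simple XML Data Stores"

  • Section 7.2, "Using XML and Stylesheet Data Stores"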

You can use XML data files in snapshots to read and write the data contained in the file. A snapshot is a staged copy of data in a data store that is used in one or more processes. EDQ provides two types of data stores for working with XML data files: Simple XML and XML and Stylesheet. Both are available for server-side and client-side data stores.

7.1 Using Simple XML Data Stores

Simple XML data stores can read and write XML files that have a simple two-level structure, in which each top-level tag represents an entity and the lower-level tags represent the attributes of that entity. XML files exported from Microsoft Access are an example.

Following is an example of a simple XML file format that could be used with EDQ:


<dataroot>
  <Person>
    <Id>1</Id>
    <FirstName>Fred</FirstName>
    <LastName>Bloggs</LastName>
    <DateOfBirth>1972-01-31T00:00:00.000+0000</DateOfBirth>
    <Weight>85</Weight>
  </Person>
  <Person>
    <Id>2</Id>
    <FirstName>Jane</FirstName>
    <LastName>Smith</LastName>
    <DateOfBirth>1985-07-16T00:00:00.000+0100</DateOfBirth>
    <Weight>63</Weight>
  </Person>
</dataroot>
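
The two-level structure can be illustrated outside EDQ. The following is a minimal Python sketch, using only the standard library, that turns each record element into a dictionary keyed by the lower-level tag names. It is not how EDQ reads the file; the function name read_simple_xml is hypothetical, and all values are kept as text.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the Simple XML example above.
SIMPLE_XML = """<dataroot>
  <Person>
    <Id>1</Id>
    <FirstName>Fred</FirstName>
    <LastName>Bloggs</LastName>
    <DateOfBirth>1972-01-31T00:00:00.000+0000</DateOfBirth>
    <Weight>85</Weight>
  </Person>
  <Person>
    <Id>2</Id>
    <FirstName>Jane</FirstName>
    <LastName>Smith</LastName>
    <DateOfBirth>1985-07-16T00:00:00.000+0100</DateOfBirth>
    <Weight>63</Weight>
  </Person>
</dataroot>"""


def read_simple_xml(text):
    """Return one dict per record element, keyed by child tag name."""
    root = ET.fromstring(text)
    records = []
    for record in root:  # each element below the root is one entity
        # Each lower-level tag is one attribute of that entity.
        records.append({field.tag: (field.text or "") for field in record})
    return records


rows = read_simple_xml(SIMPLE_XML)
print(rows[0]["FirstName"], rows[1]["LastName"])
```

A check of this kind can be useful for confirming that a file really has the two-level shape before pointing a Simple XML data store at it.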

7.1.1 Reading Simple XML Files

When EDQ reads Simple XML files the following occurs:

7.1.2 Writing Simple XML Files

When generating Simple XML files using an EDQ export to the data store, the name of the data store defines the record XML element name. The element Person in the example in Section 7.1 shows how this appears in the XML.

The XML element names of the lower level tags are taken from the EDQ attribute names. EDQ names are encoded to ensure that invalid XML is not generated. For example, space characters in names are replaced by the character sequence _x0020_, so an EDQ attribute named Date Of Birth would generate XML elements in the following format:

<Date_x0020_Of_x0020_Birth>
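
The encoding can be sketched in a few lines of Python. Only the replacement of a space by _x0020_ is documented here; the sketch assumes that other characters outside a conservative safe set are escaped using the same _xHHHH_ pattern, which may not match what EDQ does in every case. The function name encode_name is hypothetical.

```python
import re

# Characters kept as-is. This safe set is an assumption: the text above
# documents only the space -> _x0020_ replacement.
_SAFE_CHAR = re.compile(r"[A-Za-z0-9_.-]")


def encode_name(name):
    """Escape each unsafe character as _xHHHH_ (four hex digits of its code point)."""
    return "".join(
        ch if _SAFE_CHAR.match(ch) else "_x%04X_" % ord(ch)
        for ch in name
    )


print(encode_name("Date Of Birth"))
```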

7.2 Using XML and Stylesheet Data Stores

When you need to work with XML whose structure differs from that of Simple XML, use an XML and Stylesheet data store.

These data stores read and write XML conforming to the DN-XML schema and optionally allow the use of a custom stylesheet to:

  • Transform XML from a custom XML format to DN-XML during data snapshot

  • Transform XML from DN-XML to a custom XML format during data export

For more information about XML stylesheets, see the W3C website found at http://www.w3.org/Style/XSL/ and http://www.w3.org/standards/xml.

7.2.1 Using DN-XML

DN-XML is the intermediate format through which EDQ processes custom XML.

An example of DN-XML is as follows:

<dn:data xmlns:dn="http://www.datanomic.com/2008/dnx">
  <dn:record skip="true">
    <dn:value name="Id" type="string"/>
    <dn:value name="FirstName" type="string"/>
    <dn:value name="LastName" type="string"/>
    <dn:value name="DateOfBirth" type="date"/>
    <dn:value name="Height" type="number"/>
    <dn:value name="Weight" type="number"/>
  </dn:record>
  <dn:record>
    <dn:value name="Id">1</dn:value>
    <dn:value name="FirstName">Fred</dn:value>
    <dn:value name="LastName">Bloggs</dn:value>
    <dn:value name="DateOfBirth">1972-01-31</dn:value>
    <dn:value name="Height">1.85</dn:value>
    <dn:value name="Weight">85</dn:value>
  </dn:record>
  <dn:record>
    <dn:value name="Id">2</dn:value>
    <dn:value name="FirstName">Jane</dn:value>
    <dn:value name="LastName">Smith</dn:value>
    <dn:value name="DateOfBirth">1985-07-16</dn:value>
    <dn:value name="Height">1.65</dn:value>
    <dn:value name="Weight">63</dn:value>
  </dn:record>
</dn:data>

This is the equivalent DN-XML for the example given in Section 7.1, "Using Simple XML Data Stores."

Note that the EDQ attribute names are defined differently in DN-XML compared with Simple XML. Because DN-XML uses attribute content to specify EDQ attribute names, it is possible to create EDQ attributes with spaces and other special characters in their names.

In the previous example, the <dn:record skip="true"> XML element and its contents define the structure of the source, including the field names and their data types. This is analogous to the header row in a comma-separated values file. All other record elements define a row of data in EDQ. The following data types are permitted:

  • string

  • date

  • number

Note:

Date values in DN-XML files should be specified in the XSD date format (ISO 8601). For example, '2008-10-31T15:07:38.6875000-05:00' or without the time component simply as '2008-10-31'.

Within a data record, value elements specify the EDQ attribute values for the record. The name attribute identifies the EDQ attribute, and the text content of the element specifies the value for that attribute. For example, the XML fragment <dn:value name="FirstName">Fred</dn:value> assigns the value 'Fred' to the EDQ attribute 'FirstName'.
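
The relationship between the header record and the data records can be sketched in Python with the standard library. This is an illustration of the DN-XML layout described above, not EDQ's own parser. The names read_dn_xml and CONVERTERS are hypothetical, numbers are read as floats, and the date converter keeps only the calendar date, which is a simplification.

```python
import xml.etree.ElementTree as ET
from datetime import date

DN = "{http://www.datanomic.com/2008/dnx}"  # namespace from the example above

DN_XML = """<dn:data xmlns:dn="http://www.datanomic.com/2008/dnx">
  <dn:record skip="true">
    <dn:value name="Id" type="string"/>
    <dn:value name="DateOfBirth" type="date"/>
    <dn:value name="Weight" type="number"/>
  </dn:record>
  <dn:record>
    <dn:value name="Id">1</dn:value>
    <dn:value name="DateOfBirth">1972-01-31</dn:value>
    <dn:value name="Weight">85</dn:value>
  </dn:record>
</dn:data>"""

# One converter per permitted data type (illustrative choices only).
CONVERTERS = {
    "string": str,
    "number": float,
    # Keeps the first ten characters (YYYY-MM-DD); any time part is dropped.
    "date": lambda text: date.fromisoformat(text[:10]),
}


def read_dn_xml(text):
    """Return (types, rows): field types from the header, then typed data rows."""
    types, rows = {}, []
    for record in ET.fromstring(text).findall(DN + "record"):
        values = record.findall(DN + "value")
        if record.get("skip") == "true":
            # Header record: field names and their data types.
            types = {v.get("name"): v.get("type", "string") for v in values}
            continue
        rows.append({
            v.get("name"): CONVERTERS[types.get(v.get("name"), "string")](v.text or "")
            for v in values
        })
    return types, rows


types, rows = read_dn_xml(DN_XML)
print(types)
print(rows)
```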

DN-XML files can be read into EDQ by creating an XML and Stylesheet data store and specifying the location of the XML source file; the XSLT file option should be left blank:

Illustration: xml_ss_config.png

Similarly, EDQ can write DN-XML files by exporting data to an XML and Stylesheet data store with the XSLT option left blank.

7.2.2 Reading Custom XML Files

XML files in custom formats can be read by EDQ using the XML and Stylesheet data store configured to use a custom XML stylesheet (XSLT) to transform from the custom schema to the DN-XML schema during data snapshotting.

Illustration: dqsad_dt_002.png

Following is an example custom XML file that could be read into EDQ:


<crmdata>
  <contacts>
    <contact id="1">
      <name>
        <firstname>Fred</firstname>
        <surname>Bloggs</surname>
      </name>
      <dob>1972-01-31</dob>
      <properties>
        <property name="height" value="1.85"/>
        <property name="weight" value="85"/>
      </properties>
    </contact>
    <contact id="2">
      <name>
        <firstname>Jane</firstname>
        <surname>Smith</surname>
      </name>
      <dob>1985-07-16</dob>
      <properties>
        <property name="height" value="1.68"/>
        <property name="weight" value="63"/>
      </properties>
    </contact>
  </contacts>
</crmdata>

The following XML stylesheet demonstrates one way that the preceding example custom XML can be transformed into a suitable DN-XML format:

<xsl:stylesheet version="1.0"
  xmlns:dn="http://www.datanomic.com/2008/dnx"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:fn="http://www.w3.org/2005/02/xpath-functions">

  <xsl:output method="xml"/>

  <xsl:template match="/">
    <dn:data>

      <!-- Write out the header record -->
      <dn:record skip="true">
        <dn:value name="Id" type="string"/>
        <dn:value name="FirstName" type="string"/>
        <dn:value name="LastName" type="string"/>
        <dn:value name="DateOfBirth" type="date"/>
        <dn:value name="Height" type="number"/>
        <dn:value name="Weight" type="number"/>
      </dn:record>

      <!-- Get each contact record -->
      <xsl:apply-templates select="/crmdata/contacts/contact"/>

    </dn:data>
  </xsl:template>

  <xsl:template match="contact">

    <!-- Write out a data record -->
    <dn:record>
      <dn:value name="Id"><xsl:value-of select="@id"/></dn:value>
      <dn:value name="FirstName"><xsl:value-of select="name/firstname"/></dn:value>
      <dn:value name="LastName"><xsl:value-of select="name/surname"/></dn:value>
      <dn:value name="DateOfBirth"><xsl:value-of select="dob"/></dn:value>
      <dn:value name="Height">
        <xsl:value-of select="properties/property[@name='height']/@value"/>
      </dn:value>
      <dn:value name="Weight">
        <xsl:value-of select="properties/property[@name='weight']/@value"/>
      </dn:value>
    </dn:record>

  </xsl:template>

</xsl:stylesheet>
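
The path expressions used in the stylesheet can be tried out without an XSLT processor. The following Python sketch uses the standard library's ElementTree, which supports a subset of XPath, to extract the same six fields from one contact. It mirrors the stylesheet's selections for illustration only and does not replace the XSLT that EDQ runs; the function name contact_to_row is hypothetical.

```python
import xml.etree.ElementTree as ET

# A single contact taken from the custom XML example above.
CUSTOM_XML = """<crmdata>
  <contacts>
    <contact id="1">
      <name>
        <firstname>Fred</firstname>
        <surname>Bloggs</surname>
      </name>
      <dob>1972-01-31</dob>
      <properties>
        <property name="height" value="1.85"/>
        <property name="weight" value="85"/>
      </properties>
    </contact>
  </contacts>
</crmdata>"""


def contact_to_row(contact):
    """Select the same nodes as the 'contact' template in the stylesheet."""
    def prop(name):
        node = contact.find("properties/property[@name='%s']" % name)
        return node.get("value") if node is not None else ""

    return {
        "Id": contact.get("id", ""),
        "FirstName": contact.findtext("name/firstname", default=""),
        "LastName": contact.findtext("name/surname", default=""),
        "DateOfBirth": contact.findtext("dob", default=""),
        "Height": prop("height"),
        "Weight": prop("weight"),
    }


root = ET.fromstring(CUSTOM_XML)
rows = [contact_to_row(c) for c in root.findall("contacts/contact")]
print(rows)
```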

7.2.2.1 Configuring the Data Store

To read the data into EDQ, create an XML and Stylesheet data store and specify the location of the XML source file and the XSLT file (stylesheet).

Illustration: data_store_2.png

For efficiency, EDQ reads the source XML file in chunks, breaking up the file on record boundaries. By default, EDQ uses the element immediately below the root as the record element. If the record element is located elsewhere in the source XML file, then you must specify an XPath-style expression that leads from the root to the record element.
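
The difference between the default record element and an explicitly specified one can be seen with a small Python sketch. For a document shaped like the custom example in this section, the element immediately below the root is contacts, not contact, so the default would not select the individual records. The path syntax shown is ElementTree's, which is relative to the root element; the exact form of the expression that EDQ expects is not shown here and is left as an assumption.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<crmdata><contacts>"
    '<contact id="1"/><contact id="2"/>'
    "</contacts></crmdata>"
)

# Default behaviour described above: elements immediately below the root.
default_records = list(doc)

# Explicit path from the root to the record element (ElementTree syntax).
explicit_records = doc.findall("contacts/contact")

print([e.tag for e in default_records])          # ['contacts']
print([e.get("id") for e in explicit_records])   # ['1', '2']
```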

7.2.3 Writing Custom XML Files

XML files in custom formats can be written by EDQ using the XML and Stylesheet data store configured to use a custom XSLT to transform from the DN-XML schema to the custom target schema during data export.

Illustration: dqsad_dt_001.png

Following is an example target custom XML format that needs to be generated by EDQ:


<Report>
  <Person Id="1" FullName="Fred Bloggs"/>
  <Person Id="2" FullName="Jane Smith"/>
</Report>

The following XML stylesheet demonstrates one way in which the DN-XML format can be transformed into the target custom XML format:

<xsl:stylesheet version="1.0"
  xmlns:dn="http://www.datanomic.com/2008/dnx"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:fn="http://www.w3.org/2005/02/xpath-functions">
 
  <xsl:output method="xml"/>
 
  <xsl:template match="/">
    <Report>
      <xsl:apply-templates select="/dn:data/dn:record"/>
    </Report>
  </xsl:template>
 
  <xsl:template match="dn:record">
    <Person>
      <xsl:attribute name="Id">
        <xsl:value-of select="dn:value[@name = 'Id']"/>
      </xsl:attribute>
      <xsl:attribute name="FullName">
        <xsl:value-of select="dn:value[@name = 'FirstName']"/>
        <xsl:text> </xsl:text>
        <xsl:value-of select="dn:value[@name = 'LastName']"/>
      </xsl:attribute>
    </Person>
  </xsl:template>
 
</xsl:stylesheet>
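
To preview what the target file should contain, the same mapping can be sketched in Python with the standard library. This mirrors the selections made by the stylesheet above (Id, and FirstName and LastName joined by a space) and is an illustration only; EDQ itself applies the XSLT. The function name build_report is hypothetical.

```python
import xml.etree.ElementTree as ET

DN = "{http://www.datanomic.com/2008/dnx}"  # DN-XML namespace

DN_XML = """<dn:data xmlns:dn="http://www.datanomic.com/2008/dnx">
  <dn:record>
    <dn:value name="Id">1</dn:value>
    <dn:value name="FirstName">Fred</dn:value>
    <dn:value name="LastName">Bloggs</dn:value>
  </dn:record>
  <dn:record>
    <dn:value name="Id">2</dn:value>
    <dn:value name="FirstName">Jane</dn:value>
    <dn:value name="LastName">Smith</dn:value>
  </dn:record>
</dn:data>"""


def build_report(dn_xml):
    """Build a <Report> element with one <Person> per dn:record."""
    report = ET.Element("Report")
    for record in ET.fromstring(dn_xml).findall(DN + "record"):
        # Index the record's values by EDQ attribute name.
        values = {v.get("name"): (v.text or "") for v in record.findall(DN + "value")}
        ET.SubElement(report, "Person", {
            "Id": values.get("Id", ""),
            "FullName": values.get("FirstName", "") + " " + values.get("LastName", ""),
        })
    return report


report = build_report(DN_XML)
print(ET.tostring(report, encoding="unicode"))
```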

7.2.3.1 Configuring the Data Store

To write the data, create an XML and Stylesheet data store and specify the destination for the custom XML file and the location of the XSLT (stylesheet) file.

Illustration: xml_ss_config_write.png