Setting a default time zone for incoming data

You can specify a default time zone to apply when incoming date values carry no time zone information.

Setting a default time zone helps you avoid the following scenario, which can occur when you read date/time values from a database. The values in the database might represent midnight on various dates. But when you look at the same values in the Dgraph (for example, through Studio), the dates are shown with a 4am time stamp. This time difference may affect your application logic and your EQL statements.

The reason for this is that Integrator ETL parses time values using the current (local) time zone of the JVM by default, unless the values contain an explicit time zone specifier. The Information Discovery component correctly sends the time values to the Dgraph, which stores them internally as UTC values. The Dgraph's query service returns values only in UTC, which is why Studio shows the 4am values.
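The shift can be reproduced outside Integrator ETL. The following Java sketch is only an illustration: the class name, the sample date, and the America/New_York zone (four hours behind UTC in June) are assumptions chosen to match the 4am symptom, not values taken from the product.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class MidnightShiftDemo {
    // Same date layout as the format string used in this topic.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd.MM.yyyy HH:mm:ss");

    // Interprets a zone-less time stamp in the given zone and returns the UTC instant.
    public static Instant toUtc(String text, String zoneId) {
        LocalDateTime local = LocalDateTime.parse(text, FORMAT);
        return local.atZone(ZoneId.of(zoneId)).toInstant();
    }

    public static void main(String[] args) {
        String midnight = "15.06.2012 00:00:00";
        // Hypothetical local zone: midnight there is 04:00 in UTC.
        System.out.println(toUtc(midnight, "America/New_York")); // 2012-06-15T04:00:00Z
        // Interpreted as UTC, the value stays at midnight.
        System.out.println(toUtc(midnight, "UTC"));              // 2012-06-15T00:00:00Z
    }
}
```

The same input string produces two different instants, which is why the choice of zone at parse time determines what Studio later displays.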

In this use case, the important factor is end-to-end consistency in the time stamps. Therefore, the solution is to interpret these time stamps as UTC. There are three ways to do this.

Method 1: Modify the source data

The first method is to modify the source data so that each value includes a time zone, and to change the format string in Integrator ETL to match (for example, from dd.MM.yyyy HH:mm:ss to dd.MM.yyyy HH:mm:ss z).

The advantage of this method is that you do not have to add components to your existing graph. However, this approach is not as appealing as the next two because the data comes from a database and the changes have to be made there.

Method 2: Use a Reformat component

The second method is to treat the time stamp as a string in Integrator ETL (that is, change the metadata definition so the field is a string), and then write a CTL expression (for example, in a Reformat component) that appends an explicit time zone ("UTC") and parses the result into a date value.

A sample of the CTL code would be:
$0.OrderDate = str2date($0.OrderDate + " UTC", "dd.MM.yyyy HH:mm:ss Z");
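For comparison, the same append-and-parse idea can be sketched in plain Java. This assumes that the date patterns behave like those of Java's SimpleDateFormat; the class and method names are hypothetical and are not part of Integrator ETL.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class AppendUtcDemo {
    // Appends an explicit "UTC" zone to a zone-less time stamp and parses it.
    public static Date parseAsUtc(String zoneLess) {
        // A new formatter per call, because SimpleDateFormat is not thread-safe.
        SimpleDateFormat format =
                new SimpleDateFormat("dd.MM.yyyy HH:mm:ss z", Locale.ENGLISH);
        try {
            return format.parse(zoneLess + " UTC");
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable time stamp: " + zoneLess, e);
        }
    }

    public static void main(String[] args) {
        Date parsed = parseAsUtc("15.06.2012 00:00:00");
        // Milliseconds since the epoch; the value does not depend on the JVM's default zone.
        System.out.println(parsed.getTime());
    }
}
```

Because the zone is part of the parsed text, the result is the same no matter which default time zone the JVM is running with.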

Method 3: Configure the JVM

The third method is to change the default time zone in the JVM running the Integrator ETL graph to UTC. This can be accomplished by specifying the following argument (the quotes are important):
"-Duser.timezone=UTC"

Add this argument in: Run > Run Configurations > launch-config > Arguments tab > VM arguments, where launch-config is the name of the configuration you want to change.
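One way to confirm that the argument took effect is to run a small check in a JVM started with the same argument. The sketch below is an assumption-laden illustration rather than a product utility: the class name is hypothetical, and it only reports what the JVM sees and how a zone-less value is parsed.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class DefaultZoneDemo {
    // Parses a zone-less time stamp using whatever the JVM default time zone is.
    public static long parseWithJvmDefault(String text) {
        SimpleDateFormat format =
                new SimpleDateFormat("dd.MM.yyyy HH:mm:ss", Locale.ENGLISH);
        try {
            return format.parse(text).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable time stamp: " + text, e);
        }
    }

    public static void main(String[] args) {
        // With -Duser.timezone=UTC both lines are expected to report UTC.
        System.out.println("user.timezone = " + System.getProperty("user.timezone"));
        System.out.println("default zone  = " + TimeZone.getDefault().getID());
        // Epoch milliseconds for midnight, interpreted in the default zone.
        System.out.println(parseWithJvmDefault("15.06.2012 00:00:00"));
    }
}
```

If the default zone is not reported as UTC, the argument was probably added to a different launch configuration than the one being run.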

Note: If you choose this option, you must also implement the configuration on the Integrator ETL Server. For details about adding JVM configurations when installing Integrator ETL Server into a WebLogic Server container, see "Installing Integrator ETL Server on WebLogic" in the Oracle Endeca Information Discovery Integrator ETL Installation Guide. For details about adding JVM configurations when installing Integrator ETL Server into a Tomcat container, see "Tomcat configuration recommendations" in the Oracle Endeca Information Discovery Integrator ETL Installation Guide.