55.4 Using the Job Specification Language
The Job Specification Language (JSL) enables you to define the steps in a job and their execution order using an XML file. The following example shows how to define a simple job that contains one chunk step and one task step:
<job id="loganalysis" xmlns="http://xmlns.jcp.org/xml/ns/javaee"
version="1.0">
<properties>
<property name="input_file" value="input1.txt"/>
<property name="output_file" value="output2.txt"/>
</properties>
<step id="logprocessor" next="cleanup">
<chunk checkpoint-policy="item" item-count="10">
<reader ref="com.example.pkg.LogItemReader"></reader>
<processor ref="com.example.pkg.LogItemProcessor"></processor>
<writer ref="com.example.pkg.LogItemWriter"></writer>
</chunk>
</step>
<step id="cleanup">
<batchlet ref="com.example.pkg.CleanUp"></batchlet>
<end on="COMPLETED"/>
</step>
</job>
This example defines the loganalysis batch job, which consists of the logprocessor chunk step and the cleanup task step. The logprocessor step transitions to the cleanup step, which terminates the job when completed.
The job element defines two properties, input_file and output_file. Specifying properties in this manner enables you to run a batch job with different configuration parameters without having to recompile its Java batch artifacts. The batch artifacts can access these properties using the context objects from the batch runtime.
The logprocessor step is a chunk step that specifies batch artifacts for the reader (LogItemReader), the processor (LogItemProcessor), and the writer (LogItemWriter). This step creates a checkpoint for every ten items processed.
The cleanup step is a task step that specifies the CleanUp class as its batch artifact. The job terminates when this step completes.
The following sections describe the elements of the Job Specification Language (JSL) in more detail and show the most common attributes and child elements.
55.4.1 The job Element
The job element is always the top-level element in a job definition file. Its main attributes are id and restartable. The job element can contain one properties element and zero or more of each of the following elements: listener, step, flow, and split. For example:
<job id="jobname" restartable="true">
<listeners>
<listener ref="com.example.pkg.ListenerBatchArtifact"/>
</listeners>
<properties>
<property name="propertyName1" value="propertyValue1"/>
<property name="propertyName2" value="propertyValue2"/>
</properties>
<step ...> ... </step>
<step ...> ... </step>
<decision ...> ... </decision>
<flow ...> ... </flow>
<split ...> ... </split>
</job>
The listener element specifies a batch artifact whose methods are invoked before and after the execution of the job. The batch artifact is an implementation of the javax.batch.api.listener.JobListener interface. See The Listener Batch Artifacts for an example of a job listener implementation.
The first step, flow, or split element inside the job element executes first.
55.4.2 The step Element
The step element can be a child of the job and flow elements. Its main attributes are id and next. The step element can contain the following elements.
-
One
chunkelement for chunk-oriented steps or onebatchletelement for task-oriented steps. -
One
propertieselement (optional).This element specifies a set of properties that batch artifacts can access using batch context objects.
-
One
listenerelement (optional); onelistenerselement if more than one listener is specified.This element specifies listener artifacts that intercept various phases of step execution.
For chunk steps, the batch artifacts for these listeners can be implementations of the following interfaces:
StepListener,ItemReadListener,ItemProcessListener,ItemWriteListener,ChunkListener,RetryReadListener,RetryProcessListener,RetryWriteListener,SkipReadListener,SkipProcessListener, andSkipWriteListener.For task steps, the batch artifact for these listeners must be an implementation of the
StepListenerinterface.See The Listener Batch Artifacts for an example of an item processor listener implementation.
-
One
partitionelement (optional).This element is used in partitioned steps which execute in more than one thread.
-
One
endelement if this is the last step in a job.This element sets the batch status to
COMPLETED. -
One
stopelement (optional) to stop a job at this step.This element sets the batch status to
STOPPED. -
One
failelement (optional) to terminate a job at this step.This element sets the batch status to
FAILED. -
One or more
nextelements if thenextattribute is not specified.This element is associated with an exit status and refers to another step, a flow, a split, or a decision element.
The following is an example of a chunk step:
<step id="stepA" next="stepB">
<properties> ... </properties>
<listeners>
<listener ref="MyItemReadListenerImpl"/>
...
</listeners>
<chunk ...> ... </chunk>
<partition> ... </partition>
<end on="COMPLETED" exit-status="MY_COMPLETED_EXIT_STATUS"/>
<stop on="MY_TEMP_ISSUE_EXIST_STATUS" restart="step0"/>
<fail on="MY_ERROR_EXIT_STATUS" exit-status="MY_ERROR_EXIT_STATUS"/>
</step>
The following is an example of a task step:
<step id="stepB" next="stepC"> <batchlet ...> ... </batchlet> <properties> ... </properties> <listener ref="MyStepListenerImpl"/> </step>
55.4.2.1 The chunk Element
The chunk element is a child of the step element for chunk-oriented steps. The attributes of this element are listed in Table 55-2.
Table 55-2 Attributes of the chunk Element
| Attribute Name | Description | Default Value |
|---|---|---|
|
|
Specifies how to commit the results of processing each chunk:
The checkpoint is updated when the results of a chunk are committed. Every chunk is processed in a global Java EE transaction. If the processing of one item in the chunk fails, the transaction is rolled back and no processed items from this chunk are stored. |
|
|
|
Specifies the number of items to process before committing the chunk and taking a checkpoint. |
10 |
|
|
Specifies the number of seconds before committing the chunk and taking a checkpoint when If |
0 (no limit) |
|
|
Specifies if processed items are buffered until it is time to take a checkpoint. If true, a single call to the item writer is made with a list of the buffered items before committing the chunk and taking a checkpoint. |
true |
|
|
Specifies the number of skippable exceptions to skip in this step during chunk processing. Skippable exception classes are specified with the |
No limit |
|
|
Specifies the number of attempts to execute this step if retryable exceptions occur. Retryable exception classes are specified with the |
No limit |
The chunk element can contain the following elements.
-
One
readerelement.This element specifies a batch artifact that implements the
ItemReaderinterface. -
One
processorelement.This element specifies a batch artifact that implements the
ItemProcessorinterface. -
One
writerelement.This element specifies a batch artifact that implements the
ItemWriterinterface. -
One
checkpoint-algorithmelement (optional).This element specifies a batch artifact that implements the
CheckpointAlgorithminterface and provides a custom checkpoint policy. -
One
skippable-exception-classeselement (optional).This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that chunk processing should skip. The
skip-limitattribute from thechunkelement specifies the maximum number of skipped exceptions. -
One
retryable-exception-classeselement (optional).This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that chunk processing will retry. The
retry-limitattribute from thechunkelement specifies the maximum number of attempts. -
One
no-rollback-exception-classeselement (optional).This element specifies a set of exceptions thrown from the reader, writer, and processor batch artifacts that should not cause the batch runtime to roll back the current chunk, but to retry the current operation without a rollback instead.
For exception types not specified in this element, the current chunk is rolled back by default when an exception occurs.
The following is an example of a chunk-oriented step:
<step id="stepC" next="stepD">
<chunk checkpoint-policy="item" item-count="5" time-limit="180"
buffer-items="true" skip-limit="10" retry-limit="3">
<reader ref="pkg.MyItemReaderImpl"></reader>
<processor ref="pkg.MyItemProcessorImpl"></processor>
<writer ref="pkg.MyItemWriterImpl"></writer>
<skippable-exception-classes>
<include class="pkg.MyItemException"/>
<exclude class="pkg.MyItemSeriousSubException"/>
</skippable-exception-classes>
<retryable-exception-classes>
<include class="pkg.MyResourceTempUnavailable"/>
</retryable-exception-classes>
</chunk>
</step>
This example defines a chunk step and specifies its reader, processor, and writer artifacts. The step updates a checkpoint and commits each chunk after processing five items. It skips all MyItemException exceptions and all its subtypes, except for MyItemSeriousSubException, up to a maximum of ten skipped exceptions. The step retries a chunk when a MyResourceTempUnavailable exception occurs, up to a maximum of three attempts.
55.4.2.2 The batchlet Element
The batchlet element is a child of the step element for task-oriented steps. This element only has the ref attribute, which specifies a batch artifact that implements the Batchlet interface. The batch element can contain a properties element.
The following is an example of a task-oriented step:
<step id="stepD" next="stepE">
<batchlet ref="pkg.MyBatchletImpl">
<properties>
<property name="pname" value="pvalue"/>
</properties>
</batchlet>
</step>
This example defines a batch step and specifies its batch artifact.
55.4.2.3 The partition Element
The partition element is a child of the step element. It indicates that a step is partitioned. Most partitioned steps are chunk steps where the processing of each item does not depend on the results of processing previous items. You specify the number of partitions in a step and provide each partition with specific information on which items to process, such as the following.
-
A range of items. For example, partition 1 processes items 1 through 500, and partition 2 processes items 501 through 1000.
-
An input source. For example, partition 1 processes the items in
input1.txtand partition 2 processes the items ininput2.txt.
When the number of partitions, the number of items, and the input sources for a partitioned step are known at development or deployment time, you can use partition properties in the job definition file to specify partition-specific information and access these properties from the step batch artifacts. The runtime creates as many instances of the step batch artifacts (reader, processor, and writer) as partitions, and each artifact instance receives the properties specific to its partition.
In most cases, the number of partitions, the number of items, or the input sources for a partitioned step can only be determined at runtime. Instead of specifying partition-specific properties statically in the job definition file, you provide a batch artifact that can access your data sources at runtime and determine how many partitions are needed and what range of items each partition should process. This batch artifact is an implementation of the PartitionMapper interface. The batch runtime invokes this artifact and then uses the information it provides to instantiate the step batch artifacts (reader, writer, and processor) for each partition and to pass them partition-specific data as parameters.
The rest of this section describes the partition element in detail and shows two examples of job definition files: one that uses partition properties to specify a range of items for each partition, and one that relies on a PartitionMapper implementation to determine partition-specific information.
See The Phone Billing Chunk Step in The phonebilling Example Application for a complete example of a partitioned chunk step.
The partition element can contain the following elements.
-
One
planelement, if themapperelement is not specified.This element defines the number of partitions, the number of threads, and the properties for each partition in the job definition file. The
planelement is useful when this information is known at development or deployment time. -
One
mapperelement, if theplanelement is not specified.This element specifies a batch artifact that provides the number of partitions, the number of threads, and the properties for each partition. The batch artifact is an implementation of the
PartitionMapperinterface. You use this option when the information required for each partition is only known at runtime. -
One
reducerelement (optional).This element specifies a batch artifact that receives control when a partitioned step begins, ends, or rolls back. The batch artifact enables you to merge results from different partitions and perform other related operations. The batch artifact is an implementation of the
PartitionReducerinterface. -
One
collectorelement (optional).This element specifies a batch artifact that sends intermediary results from each partition to a partition analyzer. The batch artifact sends the intermediary results after each checkpoint for chunk steps and at the end of the step for task steps. The batch artifact is an implementation of the
PartitionCollectorinterface. -
One
analyzerelement (optional).This element specifies a batch artifact that analyzes the intermediary results from the partition collector instances. The batch artifact is an implementation of the
PartitionAnalyzerinterface.
The following is an example of a partitioned step using the plan element:
<step id="stepE" next="stepF">
<chunk>
<reader ...></reader>
<processor ...></processor>
<writer ...></writer>
</chunk>
<partition>
<plan partitions="2" threads="2">
<properties partition="0">
<property name="firstItem" value="0"/>
<property name="lastItem" value="500"/>
</properties>
<properties partition="1">
<property name="firstItem" value="501"/>
<property name="lastItem" value="999"/>
</properties>
</plan>
</partition>
<reducer ref="MyPartitionReducerImpl"/>
<collector ref="MyPartitionCollectorImpl"/>
<analyzer ref="MyPartitionAnalyzerImpl"/>
</step>
In this example, the plan element specifies the properties for each partition in the job definition file.
The following example uses a mapper element instead of a plan element. The PartitionMapper implementation dynamically provides the same information as the plan element provides in the job definition file:
<step id="stepE" next="stepF">
<chunk>
<reader ...></reader>
<processor ...></processor>
<writer ...></writer>
</chunk>
<partition>
<mapper ref="MyPartitionMapperImpl"/>
<reducer ref="MyPartitionReducerImpl"/>
<collector ref="MyPartitionCollectorImpl"/>
<analyzer ref="MyPartitionAnalyzerImpl"/>
</partition>
</step>
Refer to The phonebilling Example Application for an example implementation of the PartitionMapper interface.
55.4.3 The flow Element
The flow element can be a child of the job, flow, and split elements. Its attributes are id and next. Flows can transition to flows, steps, splits, and decision elements. The flow element can contain the following elements:
-
One or more
stepelements -
One or more
flowelements (optional) -
One or more
splitelements (optional) -
One or more
decisionelements (optional)
The last step in a flow is the one with no next attribute or next element. Steps and other elements in a flow cannot transition to elements outside the flow.
The following is an example of the flow element:
<flow id="flowA" next="stepE"> <step id="flowAstepA" next="flowAstepB">...</step> <step id="flowAstepB" next="flowAflowC">...</step> <flow id="flowAflowC" next="flowAsplitD">...</flow> <split id="flowAsplitD" next="flowAstepE">...</split> <step id="flowAstepE">...</step> </flow>
This example flow contains three steps, one flow, and one split. The last step does not have the next attribute. The flow transitions to stepE when its last step completes.
55.4.4 The split Element
The split element can be a child of the job and flow elements. Its attributes are id and next. Splits can transition to splits, steps, flows, and decision elements. The split element can only contain one or more flow elements that can only transition to other flow elements in the split.
The following is an example of a split with three flows that execute concurrently:
<split id="splitA" next="stepB"> <flow id="splitAflowA">...</flow> <flow id="splitAflowB">...</flow> <flow id="splitAflowC">...</flow> </split>
55.4.5 The decision Element
The decision element can be a child of the job and flow elements. Its attributes are id and next. Steps, flows, and splits can transition to a decision element. This element specifies a batch artifact that decides the next step, flow, or split to execute based on information from the execution of the previous step, flow, or split. The batch artifact implements the Decider interface. The decision element can contain the following elements.
-
One or more
endelements (optional).This element sets the batch status to
COMPLETED. -
One or more
stopelements (optional).This element sets the batch status to
STOPPED. -
One or more
failelements (optional).This element sets the batch status to
FAILED. -
One or more
nextelements (optional). -
One
propertieselement (optional).
The following is an example of the decider element:
<decision id="decisionA" ref="MyDeciderImpl"> <fail on="FAILED" exit-status="FAILED_AT_DECIDER"/> <end on="COMPLETED" exit-status="COMPLETED_AT_DECIDER"/> <stop on="MY_TEMP_ISSUE_EXIST_STATUS" restart="step2"/> </decision>
