A Practical Introduction to Achieving Determinism (Annex)

Sun Java Real-Time System 2.2 Update 1

This document is the annex to the document A Practical Introduction to Achieving Determinism.

Introduction

This document is an annex that complements the guide A Practical Introduction to Achieving Determinism.

The basic Practical Introduction guide demonstrates and explains some simple examples of using Java RTS to quickly and easily achieve determinism. The "Getting Started" package (GettingStarted.zip on Solaris OS, GettingStarted.tar.gz on Linux) includes a shell script that automatically runs all the predefined scenarios for the example programs. If you simply want to run the script and observe the results, then you do not need to read this annex document.

This annex contains additional information for those of you who want to understand in more detail how the example programs work, how to make changes in the script or in the program code, or how to test your own code for determinism.

The section Description of the Example Programs explains the purpose of the example programs and how they work. If you plan to make any changes to the example programs, you need to understand the program internals in detail.
If you want to run the program without using the script that is provided with the example programs, if you want to make changes to the script, or if you want to make changes to the programs, you need to read the sections Input Parameters for the Example Programs and Output from the Example Programs.
You can perform some experiments with specific code that you want to convert to real-time. Read Testing Determinism With Your Own Code to see how to plug your own code into the example programs in order to test your code for determinism.

Description of the Example Programs

The example programs in this set are Java applications to generate Fibonacci numbers, sleep for a specified time or wait for the next (calculated) period, and optionally allocate memory to stress the garbage collector. The programs create instances of the java.lang.Thread class (JLT) and the javax.realtime.RealtimeThread class (RTT).

Purpose of the Programs

The purpose of the example programs is to use system resources (CPU time, memory) in a controllable way and measure execution time. At the end of the execution, the time measurements and various statistical values are printed to the system output. The measurements show how much each iteration deviates from the previous, mean, or "reference" value because of jitter. In addition, the measurements show whether or not the deviation is bounded within an acceptable limit.

The measurement functions embedded in the examples record the variations in execution time. The last two figures in the output are the most important: the jitter and the standard deviation of the execution time. See the section Output from the Example Programs.

Structure of the Programs

All the example programs are structured in the same way:

A set of threads (either JLT or RTT) run iterations of Fibonacci calculations and/or put a controlled load on heap memory.
One of these threads calculates and records the time spent in performing the calculations. This thread is called the bench thread in this document.
The purpose of the other threads is to stress the Java RTS VM on computation and memory resources (especially garbage collection) in order to show that the behavior of the real-time threads can be controlled and kept deterministic and independent from the other JLTs or Real-Time Garbage Collector (RTGC) activities. Any thread that is not the bench thread is called a stress thread.
The bench thread and the stress threads execute the stress classes.
The static initialization code forces a first load and initialization of the classes needed to run the program by calling the methods to calculate a few Fibonacci numbers and to allocate a small amount of memory. In this way we avoid jitter that would be caused later by loading and initializing these classes. (See the Java RTS Compilation Guide for an explanation of how preinitialization can be used to reduce compilation jitter in the general case.)
The main method performs the following functions:
- Reads the input parameters.
- Initializes the state of the program.
- Forces a garbage collection to "clean up," that is, reclaim unused memory.
- Runs the JLT stress thread, if any.
- Runs the bench and then stress threads.
See the section Input Parameters for the Example Programs for the number and meaning of the input parameters.
Fibonacci numbers are computed by a recursive method. The threads run the Fibonacci calculations a number of times that is specified as an input parameter of the program.
Stressing the garbage collector is accomplished by iterating requests for the allocation of a fixed amount of memory.
For each iteration, the bench thread records the execution time spent in computing the Fibonacci numbers, pausing or sleeping, and allocating memory.
When all of the bench and stress threads have stopped, the main thread of the program computes data (execution time, mean values, deviations, and so forth), using the recorded values, and outputs that data to the standard output. In this way the output process has no effect on the behavior of the bench thread.
When the output is finished, the program stops.

Although the examples force early class initialization and early JIT compilation, jitter still occurs at the beginning of program execution. That is why, for all the scenarios for these particular examples, we consider each execution of the bench thread to be in its steady state only at the beginning of the tenth loop, when we know that JIT compilation has completed.

There are three programs in the example set:

NonDeterministic
This program uses only JLTs (even for the bench thread), and runs either in a standard Java SE VM, or on the Java RTS VM. It shows that determinism is not guaranteed with a vanilla VM using JLTs, even if there is no GC activity.
Deterministic
This program executes at most one bench RTT thread, one stress RTT thread, and one stress JLT concurrently. It shows that very accurate determinism can be achieved with Java RTS Realtime threads.
GCDeterministic
This program performs the same functions as Deterministic. In addition, it stresses memory management by allocating objects and consuming memory, and in the worst case trying to overflow the memory. It shows that under execution conditions that do not overconsume memory, the bench RTT is still deterministic. It also shows that, even in the worst case, the RTGC and the real-time priorities can be used in combination to maintain determinism.

Various relevant Java HotSpot VM options can be used with the runs.

-XX:+PrintCompilation: Print a message when JIT compilation occurs during the run of the bench thread. JIT compilation is a source of jitter that occurs only once for a given method.
-XX:+PrintGC: Print messages at garbage collection. You can use this option to verify that no GC happens during the execution of a bench thread, thus ensuring that there is no jitter due to garbage collection.
The PrintGC parameter is used in all scenarios. Note that output from this parameter will be mixed in with the program output.

In addition, the runs that are executed by the Java RTS VM can use the Java RTS VM options.

Processing Logic

Each of the three example programs is built using almost the same schema. The main method reads the parameters, checks their values, then instantiates all the bench and stress threads from those parameters. Only the bench thread is also linked to a Measurements instance, which performs all the calculations for the output.

The threads are started in the following order: JLT stress thread, JLT or RTT bench thread, and finally RTT stress thread, if any.

The run() method follows the same logic for the bench thread and for each stress thread. The logic consists of three nested loops, controlled by the input arguments nb_outer_iterations, nb_inner_iterations, and nb_stress_class_iterations. These loops control the number of times the stress methods are performed, as described in the following detailed logic:

Outer loop: Iterate <nb_outer_iterations> times and do the following in each loop:

If this is the bench thread, get the wall-clock time.

Inner loop: Iterate <nb_inner_iterations> times and do the following in each loop:

Call a method to calculate <nb_stress_class_iterations> times the Fibonacci values for <nb_stress_class_iterations>.

A second time, call a method to calculate <nb_stress_class_iterations> times the Fibonacci values for <nb_stress_class_iterations>.

Pause or wait for next period:

For the bench thread, wait for the next (calculated) period. The calculation of the period is explained in the Thread Parameters section.
For the stress thread, pause for the <pause_time> milliseconds value provided by the input parameter for this thread.

If the thread is a garbage production thread (<allocation_array_size> not zero), execute a second inner loop: Iterate <nb_inner_iterations> times and do the following in each loop:

Call a garbage production method that will allocate <nb_stress_class_iterations> times an array of int and an array of float that are <allocation_array_size> large.

A second time, call a garbage production method that will allocate <nb_stress_class_iterations> times an array of int and an array of float that are <allocation_array_size> large.

Note: The Deterministic program does not allocate memory, and so this second inner loop will not be present in that program.

If this is the bench thread, get the wall-clock time.

If this is the bench thread, then the wall-clock time values before the Fibonacci calculation and after the garbage production (if any) are recorded in two arrays that will be used by the Measurements instance as a basis for output calculation. If the bench thread is a javax.realtime.RealtimeThread instance, the wall-clock time is read from a javax.realtime.Clock instance. The elapsed time is calculated as the difference between the wall-clock time at the end and at the beginning of each outer loop iteration.

A join() operation is done on all the bench and stress threads to wait for proper and clean termination of the threads. When all threads have finished, the Measurements instance linked to the bench thread computes all the statistical results for the run and prints the output.
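
The loop structure described above can be sketched in plain Java as follows. This is only an illustrative sketch, not the source of the example programs: the class name LoopSketch and the two Runnable hooks are hypothetical, System.nanoTime() stands in for the javax.realtime.Clock used by a real-time bench thread, and Thread.sleep() stands in for both the pause and the wait-for-next-period step.

```java
/** Hypothetical sketch of the run() logic; not the shipped source. */
final class LoopSketch implements Runnable {
    private final int outer;
    private final int inner;
    private final long pauseMillis;
    private final boolean bench;        // only the bench thread records times
    private final Runnable cpuStress;   // stands in for one Fibonacci stress call
    private final Runnable heapStress;  // stands in for garbage production; null if none
    private final long[] startNanos;    // preallocated so that recording never allocates
    private final long[] endNanos;

    LoopSketch(int outer, int inner, long pauseMillis, boolean bench,
               Runnable cpuStress, Runnable heapStress) {
        this.outer = outer;
        this.inner = inner;
        this.pauseMillis = pauseMillis;
        this.bench = bench;
        this.cpuStress = cpuStress;
        this.heapStress = heapStress;
        this.startNanos = new long[outer];
        this.endNanos = new long[outer];
    }

    @Override
    public void run() {
        for (int i = 0; i < outer; i++) {
            if (bench) startNanos[i] = System.nanoTime();
            for (int j = 0; j < inner; j++) {
                cpuStress.run();            // first Fibonacci call
                cpuStress.run();            // second Fibonacci call
            }
            pause();                        // once per outer iteration
            if (heapStress != null) {       // second inner loop, garbage producers only
                for (int j = 0; j < inner; j++) {
                    heapStress.run();
                    heapStress.run();
                }
            }
            if (bench) endNanos[i] = System.nanoTime();
        }
    }

    private void pause() {
        if (pauseMillis <= 0) return;
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Elapsed time of outer iteration i, in microseconds. */
    long elapsedMicros(int i) {
        return (endNanos[i] - startNanos[i]) / 1000;
    }
}
```

An instance would typically be wrapped in a thread (new Thread(sketch).start()), joined, and only then queried through elapsedMicros(), so that no output work happens while the bench thread is running.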

How Much Stress is Produced?

For comparison purposes, you can estimate how much stress is produced in a given run by counting the number of times the Fibonacci numbers are calculated and the amount of memory that was allocated.

The total number of times the Fibonacci numbers are calculated is equal to:

(<nb_outer_iterations> * <nb_inner_iterations> * <nb_stress_class_iterations> * 2)

The factor 2 is included because the Fibonacci method is called twice.

The parameter <nb_stress_class_iterations> also specifies how many Fibonacci numbers to calculate, that is, the limit of the recursion. (This number is the value of "n" in the formula for Fibonacci numbers in the next subsection.)

The total amount of memory, in bytes, that is allocated for the two arrays (int and float) is equal to:

(<nb_outer_iterations> * <nb_inner_iterations> *
<nb_stress_class_iterations> * 2 * <total_array_size>)

The factor 2 is included because the allocation method is called twice.

The arrays are two-dimensional, where the first dimension equals <nb_stress_class_iterations> and the second dimension equals <allocation_array_size>. Therefore, <total_array_size> equals:

(<nb_stress_class_iterations> * <allocation_array_size> * 8)

The factor 8 represents 4 bytes for each int array element plus 4 bytes for each float element.

We must also add the size of the two allocated array objects, which is constant but VM-dependent.
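
As a worked illustration of the two formulas above, the following sketch evaluates them for a given set of parameters. The class and method names are hypothetical. The byte count covers array elements only and leaves out the VM-dependent size of the array objects.

```java
/** Hypothetical helper that evaluates the stress formulas of this section. */
final class StressEstimate {
    private StressEstimate() {}

    /** outer * inner * stressIterations * 2; the Fibonacci method is called twice. */
    static long fibonacciCalls(long outer, long inner, long stressIterations) {
        return outer * inner * stressIterations * 2;
    }

    /** stressIterations * arraySize * 8; 4 bytes per int element plus 4 per float element. */
    static long totalArraySize(long stressIterations, long arraySize) {
        return stressIterations * arraySize * 8;
    }

    /** Total bytes of array elements allocated over the whole run. */
    static long allocatedBytes(long outer, long inner, long stressIterations, long arraySize) {
        return outer * inner * stressIterations * 2
                * totalArraySize(stressIterations, arraySize);
    }
}
```

For example, with 1000 outer iterations, 10 inner iterations, 20 stress class iterations, and an allocation array size of 50, fibonacciCalls returns 400000, totalArraySize returns 8000, and allocatedBytes returns 3200000000, which is why the sketch uses long rather than int arithmetic.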

Calculation of the Fibonacci Numbers

Fibonacci calculation is performed by two simple Fibonacci classes (Fibonacci1 and Fibonacci2), which implement a recursive calculation of Fibonacci numbers: F(n) = F(n-1) + F(n-2) for n greater than 2, otherwise F(n) = 1.
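
The recursive definition above translates directly into Java. The following is a minimal sketch with an illustrative class name; it is not the source of Fibonacci1 or Fibonacci2.

```java
/** Minimal sketch of the recursive definition; the class name is illustrative. */
final class FibonacciSketch {
    private FibonacciSketch() {}

    /** Returns F(n), where F(n) = F(n-1) + F(n-2) for n > 2 and F(n) = 1 otherwise. */
    static long fib(int n) {
        return n > 2 ? fib(n - 1) + fib(n - 2) : 1;
    }
}
```

The naive recursion is deliberately expensive: the number of calls grows exponentially with n, which is what makes it a convenient CPU stress.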

Embedding of Measurement Equipment

Recording of measurement is done inside each outer loop and simply consists of getting the wall-clock time value and putting it into an array cell. This is done at the beginning and the end of each outer loop. The overhead of this recording is thus minimal and constant.

Computing and displaying the output is done outside the bench thread, after all the threads have terminated, in the main() method of the main thread of the program. Therefore, this activity has no effect on the measured values.

Input Parameters for the Example Programs

A script file is provided with the example programs. If you run this script as-is, without any changes, then you can skip this section. You will simply execute the example programs with preset parameter values and then examine the output.

If you want to execute the programs individually, specifying various values for the input parameters, read this section for a detailed description of the parameters.

The input parameters for the example programs are in the following format:

<bench-thread-parameters> <stress-thread-parameters>+ <verbosity-flag>

The parameters have the following meanings:

<bench-thread-parameters>: The run parameters for the bench thread. See the section "Thread Parameters" below for a detailed description.
<stress-thread-parameters>: The run parameters for the stress threads, which can be either RTTs or JLTs. (Since there can be one or two stress threads, the "+" indicates that there can be one or two sets of stress thread parameters.) See the section "Thread Parameters" below for a detailed description of these parameters.
<verbosity-flag>: A String specifying the verbosity level of the program output. The value "verbose" means that detailed output will be produced; any other value, or no value at all, produces a summary. See the section Output from the Example Programs.

The <bench-thread-parameters> and the <stress-thread-parameters> are mandatory. The verbosity flag is optional; if it is not given, the output is a summary.

It is possible to suppress the creation of a stress thread by specifying zero as the value of any one of the first three stress thread parameters, that is, the parameters that determine the number of iterations (outer loop, inner loop, and internal stress class).

Thread Parameters

The list of bench thread parameters and the list of stress thread parameters have the same format, as follows:

<nb_outer_iterations> <nb_inner_iterations> <nb_stress_class_iterations> [<allocation_array_size>] <pause_time> <thread_priority>

The meanings of the parameters are as follows:

<nb_outer_iterations>: Mandatory. An integer value. The number of outer loop iterations. If zero for a stress thread, no stress thread will be created.
<nb_inner_iterations>: Mandatory. An integer value. The number of inner loop iterations. If zero for a stress thread, no stress thread will be created.
<nb_stress_class_iterations>: Mandatory. An integer value. The number of iterations inside the stress class. The stress class executes Fibonacci calculation, garbage production, or your own stress object implementation. The greater the value, the higher the CPU usage (for Fibonacci calculation) or the heap load (for garbage production). If zero for a stress thread, no stress thread will be created.
[<allocation_array_size>]: Mandatory for NonDeterministic and GCDeterministic programs only. Not expected for Deterministic program. An integer value. The value of the second dimension of the two-dimensional int and float arrays allocated inside the GarbageProducer classes. The <nb_stress_class_iterations> parameter is used as the first dimension of the int and float arrays. As an example, if <nb_stress_class_iterations> is 10 and <allocation_array_size> is 100, the arrays instantiated for the GarbageProducers will be int[10][100] and float[10][100], which is 8x10x100=8000 bytes, plus the size of the two allocated array objects. The size of the array objects is fixed, but VM-dependent.
<pause_time>: Mandatory. An integer value, in milliseconds.
- For the bench thread, we use wait-for-next-period logic. The program first calculates the average execution time for a certain number of iterations of the logic, including garbage collection. This is called the "stress cost." To this is added the value of this <pause_time> parameter, which represents additional inactive time, in order to arrive at the calculated period. If you enter zero for the value of the parameter, the program assumes 100 microseconds for the additional inactive time.
- For the stress thread, this is the time the thread will sleep within each of the outer loops.
<thread_priority>: Mandatory. A String value. Can be "max", "norm", or "min". Any other value defaults to "min". The priority value depends on the type of thread:
- RTT: (Deterministic and GCDeterministic programs). The priority is a real-time priority (starting from 11), which is obtained through the methods getMaxPriority(), getNormPriority(), getMinPriority() called on the default scheduler PriorityScheduler instance.
- JLT: (NonDeterministic program). The priority value is obtained via the three priority constants from the java.lang.Thread class, which are Thread.MAX_PRIORITY, Thread.NORM_PRIORITY, and Thread.MIN_PRIORITY.
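
The period calculation described for <pause_time> above can be illustrated with a small sketch. The names are hypothetical, and the measurement of the stress cost itself is not shown; only the final addition and the 100-microsecond default are taken from the description.

```java
/** Hypothetical sketch of the bench thread's period calculation. */
final class PeriodSketch {
    /** Inactive time assumed when the entered pause time is zero: 100 microseconds. */
    static final long DEFAULT_INACTIVE_NANOS = 100_000L;

    private PeriodSketch() {}

    /** Calculated period = stress cost + additional inactive time, in nanoseconds. */
    static long periodNanos(long stressCostNanos, long pauseMillis) {
        long inactiveNanos = pauseMillis == 0
                ? DEFAULT_INACTIVE_NANOS
                : pauseMillis * 1_000_000L;
        return stressCostNanos + inactiveNanos;
    }
}
```

With a stress cost of 3 ms 452308 ns and an entered pause time of 4 milliseconds, periodNanos returns 7452308, that is, 7 ms 452308 ns.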

As an example, the following thread parameters can be specified for the NonDeterministic program or the GCDeterministic program:

1000 10 20 50 30 norm

These parameter values describe a stress or bench thread that runs 1000 outer iterations, each with 10 inner iterations; the stress classes (Fibonacci calculations and garbage production) run 20 iterations per call; the allocation array size is 50; the pause time inside the outer loop is 30 milliseconds; and the thread runs at the normal priority level (either real-time or not, depending on the type of thread).

Note that the array size is not to be specified for the Deterministic program.
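
A set of thread parameters such as the example above could be parsed along the following lines. This is a sketch under stated assumptions, not the parsing code of the example programs: the class ThreadParams and its members are hypothetical, the priority level is compared case-sensitively, and only the java.lang.Thread priority mapping is shown because the real-time mapping requires the javax.realtime API.

```java
/** Hypothetical holder for one set of thread parameters; not the shipped source. */
final class ThreadParams {
    final int outer;
    final int inner;
    final int stressIterations;
    final int arraySize;          // 0 when the program takes no <allocation_array_size>
    final long pauseMillis;
    final String priorityLevel;   // "max", "norm", or "min"

    private ThreadParams(int outer, int inner, int stressIterations,
                         int arraySize, long pauseMillis, String priorityLevel) {
        this.outer = outer;
        this.inner = inner;
        this.stressIterations = stressIterations;
        this.arraySize = arraySize;
        this.pauseMillis = pauseMillis;
        this.priorityLevel = priorityLevel;
    }

    /**
     * Parses six tokens starting at args[offset] when withArraySize is true
     * (NonDeterministic, GCDeterministic), or five tokens otherwise (Deterministic).
     */
    static ThreadParams parse(String[] args, int offset, boolean withArraySize) {
        int i = offset;
        int outer = Integer.parseInt(args[i++]);
        int inner = Integer.parseInt(args[i++]);
        int stress = Integer.parseInt(args[i++]);
        int arraySize = withArraySize ? Integer.parseInt(args[i++]) : 0;
        long pause = Long.parseLong(args[i++]);
        String level = args[i];
        if (!level.equals("max") && !level.equals("norm")) {
            level = "min";        // any other value defaults to "min"
        }
        return new ThreadParams(outer, inner, stress, arraySize, pause, level);
    }

    /** A stress thread is not created when any of the three iteration counts is zero. */
    boolean createsStressThread() {
        return outer != 0 && inner != 0 && stressIterations != 0;
    }

    /** Priority for a java.lang.Thread, taken from the three Thread constants. */
    int javaLangThreadPriority() {
        switch (priorityLevel) {
            case "max":  return Thread.MAX_PRIORITY;
            case "norm": return Thread.NORM_PRIORITY;
            default:     return Thread.MIN_PRIORITY;
        }
    }
}
```

Passing an offset lets the same method read the bench parameters and each set of stress parameters from a single argument array.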

Input for NonDeterministic Program

In this case the threads are JLTs, because the program must be executable on a vanilla VM. In addition there are at most two threads running: the bench thread and the stress thread. Since the program also stresses the GC activity, the <allocation_array_size> parameter is needed in the specification of the thread parameters. The parameters are in the format:

<JLT-bench-thread-parameters> <JLT-stress-thread-parameters> <verbosity-flag>

Here is an example of the parameters:

 1000 20 10 500 100 max 
 1000 20 10 200 10 min
 verbose

(The parameters above form a single line; they have been split for printability.)

All the parameters except the verbosity flag are mandatory. The script file contains an example of the thread parameters for the NonDeterministic program.

Input for Deterministic Program

The Deterministic program does not allocate anything and so the GC does not have to collect anything during the run. Thus the <allocation_array_size> parameter is not provided in any of the three sets of thread parameters for this program. There can be at most one bench RTT thread, one stress RTT thread, and one stress JLT running concurrently in this program. The parameters for this program have the following format:

<RTT-bench-thread-parameters> <RTT-stress-thread-parameters> <JLT-stress-thread-parameters> <verbosity-flag>

Here is an example of the parameters:

 1000 20 10 0 norm 
 1000 20 10 200 norm 
 1000 20 10 10 min 
 summary

(The parameters above form a single line; they have been split for printability.)

All the parameters except the verbosity flag are mandatory. The script file contains an example of the thread parameters for the Deterministic program.

Input for GCDeterministic Program

In addition to the functions of the Deterministic program, GCDeterministic also produces garbage to stress the RTGC. Thus the <allocation_array_size> parameter has to be provided for any of the runs of GCDeterministic. There can be at most one bench RTT thread, one stress RTT thread, and one stress JLT running concurrently in this program. So the parameters for this program have the following format:

<RTT-bench-thread-parameters> <RTT-stress-thread-parameters> <JLT-stress-thread-parameters> <verbosity-flag>

Here is an example of the parameters:

 1000 10 10 20 100 max
 1000 10 10 200 0 norm 
 1000 10 10 800 0 min 
 verbose

(The parameters above form a single line; they have been split for printability.)

All the parameters except the verbosity flag are mandatory. The script file contains an example of the thread parameters for the GCDeterministic program.

Output from the Example Programs

This section describes in detail the output from the example programs. A great deal of output can be generated by these programs. You can request shorter output by using the verbosity flag for the run.

If the verbosity flag is set to verbose, the output will contain the following information:

Reminder of parameters for the run
Output header of the run
Raw execution time measurements
List of noticeable deviations based on previous iteration
Summary of deviations based on previous iteration
Percentage report of deviations
Ordered list of values
Summary of results
End-of-output banner

If the verbosity flag is set to any value other than verbose, just a subset of the output is printed:

Reminder of parameters for the run
Output header of the run
Summary of deviations based on previous iteration
Percentage report of deviations
Summary of results
End-of-output banner

All wall-clock times and elapsed times are expressed in microseconds.

The "reference" execution time value used in the output is defined for a given run as the elapsed time value that occurred the most often. If several different elapsed times occurred the same number of times, the first one found is chosen as the reference. The reference value makes sense when the run is very stable and most of the elapsed time values are identical. For a run with widely varying values, the "mean" execution time value makes more sense. Both the reference value and the mean value are used in the calculations in the output.
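
The definition of the reference value can be illustrated with the following sketch. The class name is hypothetical, and "the first one found" is interpreted here as the tied value that appears earliest in the run, which is an assumption about the actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the reference and mean calculations. */
final class ReferenceValue {
    private ReferenceValue() {}

    /** Most frequent elapsed time; on a tie, the value seen first in the run wins. */
    static long reference(long[] elapsed) {
        if (elapsed.length == 0) {
            throw new IllegalArgumentException("no measurements");
        }
        Map<Long, Integer> counts = new LinkedHashMap<>();   // keeps first-seen order
        for (long value : elapsed) {
            counts.merge(value, 1, Integer::sum);
        }
        long best = elapsed[0];
        int bestCount = 0;
        for (Map.Entry<Long, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {   // strict '>' keeps the earlier value on ties
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    /** Arithmetic mean of the elapsed times. */
    static double mean(long[] elapsed) {
        long sum = 0;
        for (long value : elapsed) {
            sum += value;
        }
        return (double) sum / elapsed.length;
    }
}
```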

The script file that is provided for running the recommended scenarios sets the VM option flag PrintGC (and sometimes also the VM option flag PrintCompilation) to "on." Therefore, GC or compilation activity is also displayed in the output at various places, mixed in with the program output. This is to show clearly where garbage collection or compilation occurs. When reading the example program output, take care to differentiate this kind of output from the output from the programs.

Reminder of Parameters for the Run

As a convenient reminder, in addition to the output from the example programs, the script prints the command that was executed, including all the input parameters.

Example of Output

The following is an example of this part of the output on Solaris OS.

/usr/sbin/psrset -e 1 
/net/amos/mackinac/releases/b43a/binaries/
     solaris-i486/bin/java 
-XX:CompilationList=nhrt.precompile
-XX:PreInitList=itc.preinit
-Xms64m -Xmx64m 
-XX:RTGCCriticalReservedBytes=10m 
-XX:+PrintGC
-classpath /net/amos/mackinac/work/olagneau/
     docs-pass/getting-started-hands-on/
     apps/mypacking/GettingStarted/dist/
     GettingStarted.jar GCDeterministic 
1000 10 10 500 4 max 1000 10 10 300 0 min 
0 10 10 10 0 norm verbose

(The lines above have been shortened for printability.)

Output Header for the Run

The header for each run consists of the following:

Name of the example program
Indication of program initialization, including the computation of the stress cost for the bench thread
Summary of the input parameters provided for each thread, whether JLT or RTT:
- Number of outer loops: integer value
- Number of inner iterations: integer value
- Number of stress class iterations: integer value
- Size of allocated arrays inside GarbageProducer classes, if expected (NonDeterministic and GCDeterministic programs): integer value
- Pause time: integer value, in milliseconds and nanoseconds:
  - For the bench thread, the calculated period, which is the average execution time (stress cost) added to the entered pause time (additional inactive time)
  - For the stress thread, the entered pause time (the forced sleep time inside each loop)
- Priority level and corresponding value used (either JLT or real-time priority): min, norm, or max for the priority level specified, and an integer value for the value used
Indication of the level of detail of this output: summary or verbose
Indication that thread execution has started, including the number of outer iterations for the bench thread
VM memory: total amount, free amount, maximum amount after initial memory clean-up
Indication that thread execution has ended, including the number of outer iterations for the bench thread
Banner to introduce the rest of the output

Example of Output

The following is an example of this part of the output.

--------------- Running GCDeterministic Program ---------------

Note:
    All time measurements are expressed in microsecond units.
    unless otherwise indicated, for example ms (milliseconds) and ns (nanoseconds).

Static initialization finished:
  ==> Stressing Fibonacci and GarbageProducer classes initialized and loaded

Calculating stress cost for bench thread
  Calculated computation time for bench: (3 ms, 452308 ns)
  Calculated period for bench: (7 ms, 452308 ns)
Stress cost time for bench calculated

   Initialization and configuration of program finished 

============================================
   GCDeterministic program parameters
============================================
   The options used for this run are:   
----------------------------------------
  Realtime bench thread parameters:
  =================================
   Number of outer loops: 1000
   Number of inner iterations: 10
   Number of stress classes iterations: 10
   Size of elementary array to allocate: 300
   Entered pause time: 4 milliseconds
   Calculated period (deadline + pause): (7 ms, 452308 ns)
   Priority level: norm
   Effective priority assigned: 30

  Realtime stress thread parameters:
  =================================
   Number of outer loops: 1000
   Number of inner iterations: 10
   Number of stress classes iterations: 10
   Size of elementary array to allocate: 200
   Entered pause time: 4 milliseconds
   Priority level: min
   Effective priority assigned: 11

  Output detail level: detailed
----------------------------------------

Time Measurement Results

The next output is a list of the raw time measurement results for each iteration of the outer loop. This output is printed only if the verbosity flag is set to verbose for the run.

This is a large amount of data that allows you to observe how the behavior of the run evolves. In addition, it is formatted to be easily imported into any spreadsheet or statistical analysis tool, allowing you to perform any analysis on a given run.

The first ten iterations are printed out with their execution time only. The rest of this list is as follows:

Loop_no: The outer loop index of the bench thread.
exec_time: Elapsed time for the bench thread for the execution of the entire outer loop.
start_time: Absolute start time in nanoseconds
end_time: Absolute end time in nanoseconds
mean_exec: Mean elapsed time for iterations from the tenth iteration up to and including this iteration.
delta_previous: Difference between the elapsed time for this iteration and the previous one (exec_time[n] - exec_time[n-1]).
mean_delta_previous: Mean of delta_previous from the tenth iteration up to and including this iteration.
delta_mean_exec: Difference between the elapsed time for this iteration and mean_exec value for this iteration.
delta_ref: Difference between the elapsed time for this iteration and the reference value up to and including this iteration.
min_exec: Minimum elapsed time value that has occurred so far. This data is reported only after the first ten loops, which is when we consider that the values have stabilized.
max_exec: Maximum elapsed time value that has occurred so far. This data is reported only after the first ten loops, which is when we consider that the values have stabilized.
std_dev: The standard deviation for the set of execution times that occurred so far, from the tenth iteration up to and including this iteration.
exec_jitter: Execution time jitter encountered so far, from the tenth iteration up to and including this iteration. This value is defined as (max_exec - min_exec).

Example of Output

The following is an example of this part of the output (reformatted for readability).

================================
Time measurement results:
================================
Loop_no: 0 | exec_time: 5361 | 
Loop_no: 1 | exec_time: 5213 | 
Loop_no: 2 | exec_time: 5245 | 
Loop_no: 3 | exec_time: 5300 | 
Loop_no: 4 | exec_time: 5189 | 
Loop_no: 5 | exec_time: 5186 | 
Loop_no: 6 | exec_time: 5245 | 
Loop_no: 7 | exec_time: 5256 | 
Loop_no: 8 | exec_time: 5209 | 
Loop_no: 9 | exec_time: 5178 | 
start time: 1215790305050079335 | 
end time:   1215790305055257658 | mean_exec: 5178 | 
delta_previous: -31 | mean_delta_previous:31 | 
delta_mean_exec: 0 | delta_ref: 87 | min_exec: 5178 | 
max_exec: 5178 | std_dev: 0.0 | 
exec_jitter: 0 | 

Loop_no: 10 | exec_time: 5235 | 
start time: 1215790305056771280 | 
end time:   1215790305062006112 | mean_exec: 5206 | 
delta_previous: 57 | mean_delta_previous:44 | 
delta_mean_exec: 29 | delta_ref: 144 | min_exec: 5178 | 
max_exec: 5235 | std_dev: 19.79 | 
exec_jitter: 57 | 

..... Lines omitted .....

Loop_no: 999 | exec_time: 4986 | 
start time: 1215790305468070019 | 
end time:   1215790305473056011 | mean_exec: 5140 | 
delta_previous: -79 | mean_delta_previous:61 | 
delta_mean_exec: -154 | delta_ref: -105 | min_exec: 4913 | 
max_exec: 5551 | std_dev: 97.84 | 
exec_jitter: 638 |

Noticeable Deviations Based on Previous Iteration

This output is printed only if the verbosity flag is set to verbose for the run. In addition, if no noticeable deviations were found, nothing is printed in this section of the output.

Not counting the first ten iterations, the execution time is given for each iteration whose elapsed time differs by more than 500 microseconds from that of the preceding iteration. Such a difference signals an unexpected deviation that could be jitter. The shorter the execution time of the outer loop, the more relevant this data. Such deviations may indicate GC activity, jitter, or simply an instability of the program due to some other reason (such as waiting for a resource or I/O).
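
The selection rule can be sketched as follows. The class name is hypothetical, and the index of the first loop to examine is left as a parameter because the exact boundary of the warm-up phase is an assumption here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the "noticeable deviation" filter. */
final class NoticeableDeviations {
    static final long THRESHOLD_MICROS = 500;

    private NoticeableDeviations() {}

    /**
     * Returns the indices of the loops, from firstLoop onward, whose elapsed time
     * differs from that of the preceding loop by more than 500 microseconds.
     */
    static List<Integer> find(long[] execMicros, int firstLoop) {
        List<Integer> loops = new ArrayList<>();
        for (int n = Math.max(firstLoop, 1); n < execMicros.length; n++) {
            long delta = execMicros[n] - execMicros[n - 1];
            if (Math.abs(delta) > THRESHOLD_MICROS) {   // exactly 500 is not reported
                loops.add(n);
            }
        }
        return loops;
    }
}
```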

Example of Output

The following is an example of this part of the output.

===================================================
Noticeable deviations:
===================================================
For loop 49, execution time: 5551 microseconds, deviation: 614 microseconds

Summary of Deviations Based on Previous Iteration

This output summarizes the deviation occurrences for this run:

The first item printed is the number of "noticeable" deviations based on the previous iteration. These are deviations that are greater than 500 microseconds in absolute value, that is, either positive or negative. Because the deviation for a loop is measured relative to the previous loop, the actual number of disturbances is in most cases about half the reported number: a positive deviation is generally followed by a negative one as the elapsed time returns closer to the reference value. This holds for a very stable run, but not when the values vary greatly.
The next part of the output is the longest length of series of identical values. This figure represents the highest number of identical elapsed time results in a row that occurred, and the value of that result. For example, if the value "5066" was found 2 times in a row, and the value "5122" was found 3 times in a row, then this latter figure would be reported.
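
Both parts of this summary can be computed in a single pass over the elapsed times, as the following sketch shows. The class is hypothetical. When two series of identical values have the same length, this sketch keeps the earlier one, which is an assumption.

```java
/** Hypothetical single-pass computation of the deviation summary. */
final class DeviationSummary {
    final int positive;          // deltas greater than +threshold
    final int negative;          // deltas smaller than -threshold
    final int longestRun;        // longest series of identical consecutive values
    final long longestRunValue;  // the value repeated in that series

    DeviationSummary(long[] execMicros, int firstLoop, long thresholdMicros) {
        int pos = 0;
        int neg = 0;
        int best = 0;
        int run = 0;
        long bestValue = 0;
        for (int n = firstLoop; n < execMicros.length; n++) {
            if (n > 0) {
                long delta = execMicros[n] - execMicros[n - 1];
                if (delta > thresholdMicros) {
                    pos++;
                } else if (delta < -thresholdMicros) {
                    neg++;
                }
            }
            // Extend the current series, or start a new one of length 1.
            boolean sameAsPrevious = n > firstLoop && execMicros[n] == execMicros[n - 1];
            run = sameAsPrevious ? run + 1 : 1;
            if (run > best) {
                best = run;
                bestValue = execMicros[n];
            }
        }
        this.positive = pos;
        this.negative = neg;
        this.longestRun = best;
        this.longestRunValue = bestValue;
    }
}
```

For the values 5066, 5066, 5100, 5122, 5122, 5122, 5700, 5100 with a threshold of 500, the sketch reports one positive deviation, one negative deviation, and a longest series of 3 for the value 5122.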

Example of Output

The following is an example of this part of the output.

============================================
Summary of deviations based on previous:
============================================
Nb noticeable positive deviations ( > 500 microseconds) = 1
Nb noticeable negative deviations ( < -500 microseconds) = 0
Nb max identical results = 3 for value 5122

(The lines above have been shortened for printability.)
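
As an illustration, the two figures described in this section could be computed as follows. This is a sketch under assumptions: the class name DeviationSummary and its fields are hypothetical, all iterations are counted (any warm-up exclusion is left out), and when two series have the same length the first one is kept.

```java
// Hypothetical helper, not part of the example programs.
final class DeviationSummary {
    final int positive;            // deviations greater than +threshold
    final int negative;            // deviations lower than -threshold
    final int maxIdentical;        // longest series of identical values in a row
    final long maxIdenticalValue;  // the value that formed that series

    private DeviationSummary(int positive, int negative, int maxIdentical, long value) {
        this.positive = positive;
        this.negative = negative;
        this.maxIdentical = maxIdentical;
        this.maxIdenticalValue = value;
    }

    static DeviationSummary of(long[] elapsedMicros, long thresholdMicros) {
        int pos = 0, neg = 0, run = 0, bestRun = 0;
        long bestValue = 0;
        for (int i = 0; i < elapsedMicros.length; i++) {
            if (i > 0) {
                // Deviation is measured relative to the previous iteration.
                long deviation = elapsedMicros[i] - elapsedMicros[i - 1];
                if (deviation > thresholdMicros) pos++;
                else if (deviation < -thresholdMicros) neg++;
            }
            // Extend the current series or start a new one.
            run = (i > 0 && elapsedMicros[i] == elapsedMicros[i - 1]) ? run + 1 : 1;
            if (run > bestRun) {
                bestRun = run;
                bestValue = elapsedMicros[i];
            }
        }
        return new DeviationSummary(pos, neg, bestRun, bestValue);
    }

    public static void main(String[] args) {
        long[] times = {5066, 5066, 5000, 5122, 5122, 5122, 5700, 5100};
        DeviationSummary s = of(times, 500);
        System.out.println("Nb noticeable positive deviations ( > 500 microseconds) = " + s.positive);
        System.out.println("Nb noticeable negative deviations ( < -500 microseconds) = " + s.negative);
        System.out.println("Nb max identical results = " + s.maxIdentical
                + " for value " + s.maxIdenticalValue);
    }
}
```

In the sample data, the jump to 5700 is counted as one positive deviation and the return to 5100 as one negative deviation, which shows why a single disturbance is usually reported twice.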

Percentage Report of Deviations

This part of the output reports how the deviations are distributed across deviation ranges. Two distributions are reported: one relative to the reference value (the most frequent value) and one relative to the mean value.

This output lists the deviations grouped by interval. For each interval, it gives the number of deviations and the percentage they represent of the total number of loops. The intervals are ordered from 0 microseconds to 1 millisecond: fine-grained up to 100 microseconds (upper bounds 5, 10, 20, 40, 50, 75, 100) and coarser-grained from 100 to 1000 microseconds (upper bounds 200, 300, 400, 500, 750, 1000). The last line counts all deviations above 1 millisecond.

If the run is very deterministic, the reference value is meaningful, so the deviations from the reference value are the most relevant. Otherwise, consider the deviations from the mean value instead.

Example of Output

The following is an example of this part of the output.

==============================================
Percentage report of deviations:
==============================================

---- deviations based on most frequent value 174 ----
Lower values below are exclusive, higher value are inclusive : 
0 to 5 microseconds deviations: 333, 33.30  %
5 to 10 microseconds deviations: 215, 21.50  %
10 to 20 microseconds deviations: 103, 10.30  %
20 to 40 microseconds deviations: 13, 1.30  %
40 to 50 microseconds deviations: 4, 0.40  %
50 to 75 microseconds deviations: 5, 0.50  %
75 to 100 microseconds deviations: 2, 0.20  %
100 to 200 microseconds deviations: 1, 0.10  %
200 to 300 microseconds deviations: 0, 0.00  %
300 to 400 microseconds deviations: 0, 0.00  %
400 to 500 microseconds deviations: 0, 0.00  %
500 to 750 microseconds deviations: 0, 0.00  %
750 to 1000 microseconds deviations: 0, 0.00  %
More than 1000 microseconds deviations: 0, 0.00  %

---- deviations based on mean execution time 177 ----
Lower values below are exclusive, higher value are inclusive : 
0 to 5 microseconds deviations: 851, 85.10  %
5 to 10 microseconds deviations: 84, 8.40  %
10 to 20 microseconds deviations: 33, 3.30  %
20 to 40 microseconds deviations: 12, 1.20  %
40 to 50 microseconds deviations: 3, 0.30  %
50 to 75 microseconds deviations: 4, 0.40  %
75 to 100 microseconds deviations: 2, 0.20  %
100 to 200 microseconds deviations: 1, 0.10  %
200 to 300 microseconds deviations: 0, 0.00  %
300 to 400 microseconds deviations: 0, 0.00  %
400 to 500 microseconds deviations: 0, 0.00  %
500 to 750 microseconds deviations: 0, 0.00  %
750 to 1000 microseconds deviations: 0, 0.00  %
More than 1000 microseconds deviations: 0, 0.00  %
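
The grouping by interval can be sketched as shown below. This is not the code of the example programs. The class name DeviationHistogram is hypothetical, the deviation is assumed to be taken in absolute value, and a deviation of exactly 0 is assumed to fall in no interval because lower bounds are exclusive. That last assumption is consistent with the sample percentages above, which add up to less than 100 percent.

```java
import java.util.Locale;

// Hypothetical helper, not part of the example programs.
final class DeviationHistogram {
    // Inclusive upper bounds of the intervals, in microseconds, as listed in the text.
    static final long[] UPPER_BOUNDS = {5, 10, 20, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000};

    /** Counts deviations from base per interval; the last slot is "more than 1000". */
    static int[] count(long[] elapsedMicros, long base) {
        int[] counts = new int[UPPER_BOUNDS.length + 1];
        for (long t : elapsedMicros) {
            long deviation = Math.abs(t - base);  // assumption: sign is ignored
            if (deviation == 0) continue;         // assumption: lower bound 0 is exclusive
            int slot = UPPER_BOUNDS.length;
            for (int i = 0; i < UPPER_BOUNDS.length; i++) {
                if (deviation <= UPPER_BOUNDS[i]) {
                    slot = i;
                    break;
                }
            }
            counts[slot]++;
        }
        return counts;
    }

    static void print(long[] elapsedMicros, long base) {
        int[] counts = count(elapsedMicros, base);
        long lower = 0;
        for (int i = 0; i < counts.length; i++) {
            String range = i < UPPER_BOUNDS.length
                    ? lower + " to " + UPPER_BOUNDS[i]
                    : "More than " + lower;
            // Percentage is relative to the total number of loops.
            double percent = 100.0 * counts[i] / elapsedMicros.length;
            System.out.println(String.format(Locale.ROOT,
                    "%s microseconds deviations: %d, %.2f  %%", range, counts[i], percent));
            if (i < UPPER_BOUNDS.length) lower = UPPER_BOUNDS[i];
        }
    }

    public static void main(String[] args) {
        print(new long[] {174, 179, 180, 184, 1175, 169}, 174);
    }
}
```

Calling count() once with the most frequent value and once with the mean value as base would yield the two distributions of the report.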

Ordered List of Values

This output is printed only if the verbosity flag is set for the run.

This output lists the elapsed times in decreasing order of the number of occurrences in the loop. Elapsed times with the same number of occurrences are grouped together and, within a group, appear in decreasing order of deviation from the “reference” value. The deviation from the mean execution time is also given.

Example of Output

The following is an example of this part of the output.

=======================================
Ordered list of values:
=======================================
Execution time 5091 occurred 18 times, 
     ref deviation is 0 microseconds,  
     mean deviation is -49 microseconds

Execution time 5122 occurred 13 times, 
     ref deviation is 31 microseconds,  
     mean deviation is -18 microseconds

Execution time 5121 occurred 12 times, 
     ref deviation is 30 microseconds,  
     mean deviation is -19 microseconds

Execution time 5086 occurred 12 times, 
     ref deviation is -5 microseconds,  
     mean deviation is -54 microseconds

..... Lines omitted .....

Execution time 4921 occurred 1 times, 
     ref deviation is -170 microseconds,  
     mean deviation is -219 microseconds

Execution time 4913 occurred 1 times, 
     ref deviation is -178 microseconds,  
     mean deviation is -227 microseconds

(The lines above have been shortened for printability.)
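
The ordering described in this section can be sketched as follows. The class name OrderedValues is hypothetical, and the sketch assumes that "decreasing order of deviation" refers to the signed deviation, as the sample output above suggests (30 is listed before -5, and -170 before -178).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the example programs.
final class OrderedValues {
    /** Returns (value, occurrences) pairs, most frequent first. */
    static List<Map.Entry<Long, Integer>> order(long[] elapsedMicros) {
        Map<Long, Integer> counts = new HashMap<>();
        for (long t : elapsedMicros) {
            counts.merge(t, 1, Integer::sum);
        }
        List<Map.Entry<Long, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            // More occurrences first.
            int byCount = Integer.compare(b.getValue(), a.getValue());
            // Within a group, the larger signed deviation (the larger value) first.
            return byCount != 0 ? byCount : Long.compare(b.getKey(), a.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        long[] times = {5091, 5091, 5091, 5122, 5122, 5086, 5086, 4913};
        List<Map.Entry<Long, Integer>> ordered = order(times);
        long reference = ordered.get(0).getKey();  // most frequent value
        long sum = 0;
        for (long t : times) sum += t;
        long mean = Math.round((double) sum / times.length);
        for (Map.Entry<Long, Integer> e : ordered) {
            System.out.println("Execution time " + e.getKey() + " occurred " + e.getValue()
                    + " times, ref deviation is " + (e.getKey() - reference)
                    + " microseconds, mean deviation is " + (e.getKey() - mean) + " microseconds");
        }
    }
}
```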

Summary of Results

This part of the output provides the most relevant information about the determinism of the run.

Mean execution time: The mean of all the elapsed time values.
Best execution time: The minimum elapsed time value that occurred.
Worst execution time: The maximum elapsed time value that occurred.
Most frequent execution time: The elapsed execution time value that occurred the most frequently, together with the number of occurrences.
Execution time jitter: The final jitter value found for the run. In the example below, it equals the worst execution time minus the best execution time.
Standard deviation: The final standard deviation value found for the run.

Example of Output

The following is an example of this part of the output.

============================================
Summary of results:
============================================
Mean execution time: 5140 microseconds
Best execution time: 4913 microseconds
Worst execution time: 5551 microseconds
Most frequent execution time: 5091, 312 occurrences
Execution time jitter: 638
Standard deviation: 97.84
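
The figures in this summary can be reproduced from a list of elapsed times with a sketch like the following. The class name RunSummary is hypothetical. Two points are assumptions: jitter is computed as worst minus best (which agrees with the example above, where 5551 - 4913 = 638), and the standard deviation is the population form (division by the number of values), which the text does not specify.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the example programs.
final class RunSummary {
    final double mean;
    final long best;
    final long worst;
    final long mostFrequent;
    final int occurrences;
    final long jitter;
    final double stdDev;

    RunSummary(long[] elapsedMicros) {
        long min = Long.MAX_VALUE, max = Long.MIN_VALUE, sum = 0;
        Map<Long, Integer> counts = new HashMap<>();
        for (long t : elapsedMicros) {
            min = Math.min(min, t);
            max = Math.max(max, t);
            sum += t;
            counts.merge(t, 1, Integer::sum);
        }
        double average = (double) sum / elapsedMicros.length;
        double squares = 0;
        for (long t : elapsedMicros) {
            squares += (t - average) * (t - average);
        }
        // If several values share the highest count, which one wins is unspecified here.
        long modeValue = min;
        int modeCount = 0;
        for (Map.Entry<Long, Integer> e : counts.entrySet()) {
            if (e.getValue() > modeCount) {
                modeCount = e.getValue();
                modeValue = e.getKey();
            }
        }
        mean = average;
        best = min;
        worst = max;
        mostFrequent = modeValue;
        occurrences = modeCount;
        jitter = max - min;                                   // assumption: worst minus best
        stdDev = Math.sqrt(squares / elapsedMicros.length);   // assumption: population form
    }

    public static void main(String[] args) {
        RunSummary s = new RunSummary(new long[] {2, 4, 4, 4, 5, 5, 7, 9});
        System.out.println("Mean execution time: " + Math.round(s.mean) + " microseconds");
        System.out.println("Best execution time: " + s.best + " microseconds");
        System.out.println("Worst execution time: " + s.worst + " microseconds");
        System.out.println("Most frequent execution time: " + s.mostFrequent + ", "
                + s.occurrences + " occurrences");
        System.out.println("Execution time jitter: " + s.jitter);
        System.out.println("Standard deviation: " + s.stdDev);
    }
}
```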

[Contents]

Testing Determinism With Your Own Code

Read this section if you want to execute your own code and test it for determinism. First, be sure to read Description of the Example Programs.

The example programs use "stress classes" to put stress on the system resources. There are two stress classes to calculate Fibonacci numbers (Fibonacci1 and Fibonacci2), and two stress classes to allocate memory that will be garbage-collected (GarbageProducer1 and GarbageProducer2).

The structure of the example programs allows you to easily plug in your own code as an additional stress class. In this way you can observe the deterministic behavior of the programs as they execute your code. This provides you with a good benchmark for testing parts of your code with Java RTS.

This section describes in detail how to plug in your code and have it executed by the example programs.

For the purpose of plugging in external code, a simple hook has been put in the source code:

A dedicated interface named StressingObject is available in the source directory. It declares two methods:
- void setStressClassIterations(int nb_stress_class_iterations)
  This method sets the number of iterations inside the stressing object, just as the Fibonacci1 and GarbageProducer classes do.
- void stress()
  This is the stress method itself, just as computeFibs() is the stress method for the two Fibonacci classes and produceGarbage() is the one for the two GarbageProducer classes.
The Fibonacci1 class implements the StressingObject interface. You do not have to modify this class.
Each program in the set declares the method
void stress(String stressingClassName, int nb_inner_iterations, int nb_stress_class_iterations).
This method dynamically instantiates one StressingObject of the passed class, sets the nb_stress_class_iterations for this stressing object, and then calls the stress() method of the stressing object for a number of times equal to nb_inner_iterations. You do not have to modify anything in this method.
Each run() method of the programs in the set contains the following commented-out call right after the Fibonacci calculation call:
stress("Fibonacci1",nb_inner_iterations, nb_stress_class_iterations)
To activate your own stressing class, uncomment this line of code and replace "Fibonacci1" with the name of your class that implements the StressingObject interface. You have to do this once in the NonDeterministic program, twice in the Deterministic program, and twice in the GCDeterministic program. You can suppress the Fibonacci stress by commenting out the computeFibonacci line at the beginning of the outer loop, in the run() method.
Each static class initialization also includes a commented-out call to stress("Fibonacci1",10,10). You can uncomment this call, substituting the name of your class, to preinitialize your own stressing class before the effective run of the program.
You can also comment out the garbage-producing calls to produceGarbage() if you want to see only the effect of the stress produced by your code.

Therefore, to plug in your own code, you create a class that wraps your code and implements the StressingObject interface. You also have to uncomment the stress() calls in the static initialization part and in the run() methods of the programs. You then recompile, and you have finished: the programs now use the stress provided by your own code.

Note, however, that the stress methods of the programs instantiate only one StressingObject. This is different from the Fibonacci and GarbageProducer stressing methods, which instantiate two Fibonacci objects and two garbage-producer objects, respectively.
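
The following sketch illustrates the hook described in this section. Only the StressingObject interface, its two method signatures, and the signature of the stress(String, int, int) method come from the description above. The wrapper class MyCodeStress, the method myApplicationCode(), the holder class StressHookSketch, and the use of reflection through Class.forName() are assumptions made for illustration; the programs in the package might instantiate the class differently.

```java
// Interface as described in this section (shipped in the source directory).
interface StressingObject {
    void setStressClassIterations(int nb_stress_class_iterations);
    void stress();
}

// Hypothetical wrapper around your own code.
final class MyCodeStress implements StressingObject {
    static int stressCalls = 0;  // demo only: counts how many times stress() ran
    private int iterations = 1;
    private long sink;           // keeps the result so the work is not optimized away

    public void setStressClassIterations(int nb_stress_class_iterations) {
        iterations = nb_stress_class_iterations;
    }

    public void stress() {
        for (int i = 0; i < iterations; i++) {
            sink += myApplicationCode(i);
        }
        stressCalls++;
    }

    // Placeholder for the code that you want to test for determinism.
    private static long myApplicationCode(int seed) {
        return (long) seed * seed + 1;
    }
}

// Sketch of the stress(String, int, int) method that each program declares.
final class StressHookSketch {
    static void stress(String stressingClassName, int nb_inner_iterations,
                       int nb_stress_class_iterations) throws ReflectiveOperationException {
        // Assumption: the class is loaded by name and has a no-argument constructor.
        StressingObject stressingObject = (StressingObject)
                Class.forName(stressingClassName).getDeclaredConstructor().newInstance();
        stressingObject.setStressClassIterations(nb_stress_class_iterations);
        for (int i = 0; i < nb_inner_iterations; i++) {
            stressingObject.stress();
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // In the default package the name is simply "MyCodeStress".
        stress(MyCodeStress.class.getName(), 10, 10);
        System.out.println("stress() was called " + MyCodeStress.stressCalls + " times");
    }
}
```

Because only one StressingObject is instantiated per call, any state that your wrapper keeps between calls to stress() persists across the inner iterations.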

[Contents]