This document is the annex to the document
A Practical Introduction to Achieving Determinism.
Introduction
This document is an annex that complements the guide
A Practical Introduction to Achieving Determinism.
The basic Practical Introduction guide demonstrates and explains some simple examples
of using Java RTS to quickly and easily achieve determinism. The "Getting Started"
package (GettingStarted.zip on Solaris OS,
GettingStarted.tar.gz on Linux)
includes a shell script that automatically runs all the predefined scenarios
for the example programs. If you simply want to run the script and observe
the results, then you do not need to read this annex document.
This annex contains additional information for those of you who want to
understand in more detail how the example programs work, how to
make changes in the script or in the program code, or how to test your own code
for determinism.
The section Description of the Example Programs
explains the purpose of the example programs and how they work.
If you plan to make any changes to the example programs, you need to understand
the program internals in detail.
If you want to run the program without using the script that is provided with the
example programs, if you want to make changes to the script,
or if you want to make changes to the programs, you need to read
the sections Input Parameters for the Example Programs
and Output from the Example Programs.
You can perform some experiments with specific code that you
want to convert to real-time. Read Testing Determinism With Your Own Code
to see how to plug your own code into the example programs in order
to test your code for determinism.
Description of the Example Programs
The example programs in this set are Java applications that generate Fibonacci
numbers, sleep for a specified time or wait for the next (calculated) period,
and optionally allocate memory to stress the garbage collector. The programs create
instances of the java.lang.Thread class (JLT) and
the javax.realtime.RealtimeThread class (RTT).
Purpose of the Programs
The purpose of the example programs
is to use system resources (CPU time, memory) in a
controllable way and measure execution time. At the end of the
execution, the time measurements and various statistical values are
printed to the system output. The measurements show how much each
iteration deviates from the previous, mean, or "reference" value
because of jitter. In addition, the
measurements show whether or not the deviation is bounded
within an acceptable limit.
The embedded measurement functions in the examples measure the
variations in the execution time, providing a wealth of information.
The last two figures in the output are the most important:
jitter and standard deviation on execution time.
See the section Output from the Example Programs.
Structure of the Programs
All the example programs are structured in the same way:
-
A set of threads (either JLT or
RTT) run iterations of Fibonacci
calculations and/or put a controlled load on heap memory.
-
One of these threads calculates and records the time
spent in performing the calculations. This thread is called the
bench thread in this document.
-
The purpose of the other threads is to stress the Java RTS VM on computation
and memory resources (especially garbage collection) in order to show
that the behavior of the real-time
threads can be controlled and kept deterministic and independent from
the other JLTs
or Real-Time Garbage Collector (RTGC)
activities. Any thread that is not the bench thread is called a stress
thread.
- The bench thread and the stress threads execute the stress classes.
-
The static initialization code forces a first
load and initialization of the classes needed to run the program by
calling the methods
to calculate a few Fibonacci numbers and to allocate a small amount
of memory. In this way we avoid jitter that would be caused later
by loading and initializing these classes.
(See the Java RTS Compilation Guide
for an explanation of how preinitialization can be
used to reduce compilation jitter in the general case.)
-
The main method performs the following functions:
- Reads the input parameters.
- Initializes the state of the program.
- Forces a garbage collection to "clean up," that is,
reclaim unused memory.
- Runs the JLT stress thread, if any.
- Runs the bench and then stress threads.
See the section Input Parameters for the Example Programs for the
number and meaning of the input parameters.
-
Fibonacci numbers are
computed by a recursive method. The threads run the
Fibonacci calculations a number of times that is specified as an input
parameter of the program.
Stressing the garbage collector is accomplished by iterating requests for
the allocation of a fixed amount of memory.
-
For each iteration, the bench thread records the execution time
spent in computing the Fibonacci numbers, pausing or sleeping,
and allocating memory.
-
When all of the bench and stress threads have stopped,
the main thread of the program computes data (execution time, mean
values, deviations, and so forth), using the recorded values, and outputs
that data to the standard output. In this way the
output process has no effect on the behavior of the bench thread.
-
When the output is finished, the program stops.
Although the examples force early class initialization and
early JIT compilation, jitter still occurs at the beginning of program execution.
That is why, for all the scenarios for these particular examples,
we consider each execution of the bench thread to be in its
steady state only at the beginning of the tenth loop, when
we know that JIT compilation has completed.
There are three programs in the example set:
NonDeterministic
This program uses only JLTs (even for the bench thread), and runs either in a
standard Java SE VM, or on the Java RTS VM. It shows that
determinism is not guaranteed with a vanilla VM using
JLTs, even if there is no GC activity.
Deterministic
This program executes at most one bench RTT thread, one stress RTT thread, and
one stress JLT concurrently.
It shows that a high degree of determinism can be achieved with Java RTS real-time threads.
GCDeterministic
This program performs the same functions as Deterministic.
In addition, it stresses memory management by allocating objects and
consuming memory, and in the worst case trying to overflow the memory.
It shows that under execution conditions that do not
overconsume memory, the bench RTT is still deterministic. It also
shows that, even in the worst case, the RTGC and the real-time
priorities can be used in combination to maintain determinism.
Various relevant Java HotSpot VM options can be used with the runs.
-XX:+PrintCompilation:
Print a message when JIT compilation occurs during the run of the bench thread.
JIT compilation is a source of jitter that occurs only once for
a given method.
-XX:+PrintGC:
Print messages at garbage collection. You can use this option to verify
that no GC happens during the execution of a bench thread, thus
ensuring that there is no jitter due to garbage collection.
The PrintGC parameter is used in all scenarios.
Note that output from this parameter will be mixed in with the program output.
In addition, the runs that are executed by the Java RTS VM can use the Java RTS VM options.
Processing Logic
Each of the three example programs is built
using almost the same schema. The main
method reads the parameters, validates their values, then instantiates
all the bench and stress threads from those parameters. Only the bench
thread is linked to a Measurements
instance, which performs all the calculations for the output.
The threads are started in the following order:
JLT stress thread, JLT or RTT bench thread,
and finally RTT stress thread, if any.
The run()
method follows the same logic for the bench thread and for each stress thread.
This logic consists of three nested loops, controlled by the
input arguments nb_outer_iterations, nb_inner_iterations, and
nb_stress_class_iterations.
These loops control the number of times the stress methods
are performed, as described in the following detailed logic:
Outer loop: Iterate <nb_outer_iterations> times and
do the following in each loop:
- If this is the bench thread, get the wall-clock time.
- Inner loop: Iterate <nb_inner_iterations> times
  and do the following in each loop:
  - Call a method to calculate <nb_stress_class_iterations>
    times the Fibonacci values for <nb_stress_class_iterations>.
  - A second time, call a method to calculate
    <nb_stress_class_iterations> times the Fibonacci values
    for <nb_stress_class_iterations>.
- Pause or wait for next period:
  - For the bench thread, wait for the next (calculated) period. The calculation
    of the period is explained in the Thread Parameters section.
  - For the stress thread, pause for the <pause_time> milliseconds value
    provided by the input parameter for this thread.
- If the thread is a garbage production thread
  (<allocation_array_size> not zero), execute a second inner loop:
  Iterate <nb_inner_iterations> times and do the following in each loop:
  - Call a garbage production method that will allocate
    <nb_stress_class_iterations> times an array of int and an
    array of float that are <allocation_array_size> large.
  - A second time, call a garbage production method that
    will allocate <nb_stress_class_iterations> times an array of int and an
    array of float that are <allocation_array_size> large.
  Note: The Deterministic program does not allocate memory, and so
  this second inner loop will not be present in that program.
- If this is the bench thread, get the wall-clock time.
- If this is the bench thread, then
  the wall-clock time values before the Fibonacci calculation and after
  the garbage production (if any) are recorded in two arrays that will be
  used by the Measurements
  instance as a basis for output calculation. The wall-clock time used
  is a javax.realtime.Clock instance if the bench thread
  is a javax.realtime.RealtimeThread instance.
The elapsed time will be calculated as the difference between
the wall-clock time at the end and at the beginning of the outer loop.
A join() operation is done on all the bench and stress threads to wait for
proper and clean termination of the threads. When all threads have
finished, the Measurements
instance linked to the bench thread computes all the statistical
results for the run and prints the output.
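The loop structure described above can be sketched as follows. This is only an illustration, not the code shipped in the Getting Started package: the class name BenchLoopSketch and its run method are hypothetical, System.nanoTime() stands in for the javax.realtime.Clock so that the sketch runs on any Java SE VM, Thread.sleep() stands in for the wait-for-next-period logic, and the garbage production loop is omitted.

```java
/** Hypothetical sketch of the loop structure; not the shipped example code. */
final class BenchLoopSketch {

    /**
     * Runs the outer and inner loops and returns the elapsed time of each
     * completed outer iteration, in microseconds.
     */
    static long[] run(int nbOuter, int nbInner, Runnable stressCall, long pauseMillis) {
        long[] startNanos = new long[nbOuter];
        long[] endNanos = new long[nbOuter];
        int completed = 0;
        for (int outer = 0; outer < nbOuter; outer++) {
            // Assumption: System.nanoTime() replaces the real-time clock.
            startNanos[outer] = System.nanoTime();
            for (int inner = 0; inner < nbInner; inner++) {
                stressCall.run(); // first call to the stress method
                stressCall.run(); // second call to the stress method
            }
            try {
                Thread.sleep(pauseMillis); // stand-in for pause / wait for next period
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            endNanos[outer] = System.nanoTime();
            completed++;
        }
        // Elapsed times are computed only after the loops have finished, so
        // this arithmetic does not disturb the measured iterations.
        long[] elapsedMicros = new long[completed];
        for (int i = 0; i < completed; i++) {
            elapsedMicros[i] = (endNanos[i] - startNanos[i]) / 1_000;
        }
        return elapsedMicros;
    }
}
```

Only the two time stamps are taken inside the loop; everything else is deferred, which mirrors the way the example programs keep the recording overhead minimal and constant.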
How Much Stress is Produced?
For comparison purposes, you can estimate how much stress is
produced in a given run by counting
the number of times the Fibonacci numbers are calculated and the amount
of memory that was allocated.
The total number of times the Fibonacci
numbers are calculated is equal to:
(<nb_outer_iterations> *
<nb_inner_iterations> * <nb_stress_class_iterations> * 2)
The factor 2 is included because the Fibonacci method is
called twice.
The parameter <nb_stress_class_iterations> also
specifies how many Fibonacci
numbers to calculate, that is, the limit of the recursion. (This number
is the value of "n" in the formula for Fibonacci numbers in the next
subsection.)
The total amount of memory, in bytes,
that is allocated for the two arrays (int and float) is
equal to:
(<nb_outer_iterations> *
<nb_inner_iterations> *
<nb_stress_class_iterations> * 2 * <total_array_size>)
The factor 2 is included because the allocation method is
called twice.
The arrays are two-dimensional, where the first dimension
equals
<nb_stress_class_iterations> and the second dimension equals
<allocation_array_size>. Therefore, <total_array_size>
equals:
(<nb_stress_class_iterations> *
<allocation_array_size> * 8)
The factor 8 represents 4 bytes for each int array
element plus 4 bytes for each float element.
We must also add the size of the two allocated array objects,
which is constant but VM-dependent.
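As a rough aid, the two formulas above can be written as a small helper. The class and method names are hypothetical, and the result deliberately ignores the VM-dependent size of the array objects themselves.

```java
/** Hypothetical helper that applies the stress formulas given in this section. */
final class StressEstimate {

    /** Total number of Fibonacci calculations for one thread. */
    static long fibonacciCalls(long nbOuter, long nbInner, long nbStressClass) {
        return nbOuter * nbInner * nbStressClass * 2; // factor 2: method called twice
    }

    /** <total_array_size>: bytes of int and float elements for one allocation. */
    static long totalArraySizeBytes(long nbStressClass, long allocationArraySize) {
        return nbStressClass * allocationArraySize * 8; // 4 bytes int + 4 bytes float
    }

    /** Total bytes allocated for array elements, excluding array object overhead. */
    static long allocatedBytes(long nbOuter, long nbInner, long nbStressClass,
                               long allocationArraySize) {
        return nbOuter * nbInner * nbStressClass * 2
                * totalArraySizeBytes(nbStressClass, allocationArraySize);
    }
}
```

For example, with 1000 outer iterations, 10 inner iterations, 20 stress class iterations, and an allocation array size of 50, fibonacciCalls returns 400000 and allocatedBytes returns 3200000000 (about 3.2 GB allocated over the whole run).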
Calculation of the Fibonacci Numbers
Fibonacci calculation is
performed by two simple Fibonacci classes
(Fibonacci1 and Fibonacci2), which implement a recursive calculation of
Fibonacci numbers: F(n) = F(n-1) + F(n-2) for n greater than 2,
otherwise F(n) = 1.
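A minimal sketch of such a recursive calculation is shown below. It follows the formula above, but the class and method names are hypothetical and are not taken from the Fibonacci1 and Fibonacci2 sources.

```java
/** Hypothetical sketch of a recursive Fibonacci stress class. */
final class FibonacciSketch {

    /** F(n) = F(n-1) + F(n-2) for n greater than 2, otherwise F(n) = 1. */
    static long fib(int n) {
        return (n > 2) ? fib(n - 1) + fib(n - 2) : 1;
    }

    /**
     * Calculates F(iterations) the given number of times, in the way the
     * <nb_stress_class_iterations> parameter is described in this document.
     * The checksum is returned so that the work is not trivially discarded.
     */
    static long stress(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            checksum += fib(iterations);
        }
        return checksum;
    }
}
```

Because the recursion is not memoized, the cost grows exponentially with n, which is what makes a small change in <nb_stress_class_iterations> produce a large change in CPU load.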
Embedding of Measurement Equipment
Measurements are recorded inside each outer loop. Recording simply
consists of getting the wall-clock time value and storing it in an
array cell, at the beginning and at the end of each outer loop.
The overhead of this recording is thus minimal and constant.
Computing and displaying the output is
done outside the bench thread, after all the threads have terminated, in
the main()
method of the main thread of the program. Therefore, this activity does
not affect the measured values in any way.
Input Parameters for the Example Programs
A script file is provided with the example programs. If you
run this script as-is, without any changes, then you can skip this section.
You will simply execute the example programs with
preset parameter values and then examine the output.
If you want to execute the programs individually, specifying
various values
for the input parameters, read this section for a detailed description
of the parameters.
The input parameters for the example programs are in the
following format:
<bench-thread-parameters>
<stress-thread-parameters>+ <verbosity-flag>
The parameters have the following meanings:
-
<bench-thread-parameters>: The run parameters for
the bench thread. See the section "Thread Parameters" below for a
detailed description.
-
<stress-thread-parameters>: The run parameters for
the stress threads, which can be either RTTs or
JLTs. (Since there can be one or two stress threads,
the "+" indicates that there can be one or two sets
of stress thread parameters.)
See the section "Thread Parameters" below for a
detailed description of these parameters.
-
<verbosity-flag>: A String specifying the verbosity level
of the program output. The value "verbose" means that detailed
output will be produced;
any other value, or no value at all, produces a summary. See the section
Output from the Example Programs.
The <bench-thread-parameters>
and the <stress-thread-parameters> are mandatory. The verbosity
flag is optional; if it is not given, the output is a summary.
It is possible to suppress the creation of a stress thread
by specifying zero as the value of any one of the first three stress
thread parameters, that is, the parameters that determine the number of
iterations (inner loop, outer loop, and internal stress class).
Thread Parameters
The list of bench thread parameters and the list of stress
thread parameters have the same format, as follows:
<nb_outer_iterations>
<nb_inner_iterations>
<nb_stress_class_iterations> [<allocation_array_size>]
<pause_time> <thread_priority>
The meanings of the parameters are as follows:
-
<nb_outer_iterations>: Mandatory. An integer value.
The number of outer loop iterations. If zero for a stress thread, no
stress thread will be created.
-
<nb_inner_iterations>: Mandatory. An integer
value. The number of inner loop iterations. If zero for a stress
thread, no stress thread will be created.
-
<nb_stress_class_iterations>: Mandatory. An
integer value.
The number of iterations inside the stress class. The stress class
executes Fibonacci calculation, garbage production, or
your own stress object implementation. The greater the
value, the higher the CPU usage (for Fibonacci calculation) or the heap
load (for garbage production). If zero for a stress thread, no stress
thread will be created.
-
[<allocation_array_size>]: Mandatory for
NonDeterministic and
GCDeterministic programs only. Not expected for Deterministic program.
An integer value. The value of
the second dimension of the two-dimensional int and float
arrays allocated inside the GarbageProducer classes. The
<nb_stress_class_iterations> parameter is used as the first
dimension of the int and float
arrays. As an example, if <nb_stress_class_iterations> is
10 and <allocation_array_size> is 100, the arrays
instantiated for the GarbageProducers will be int[10][100]
and float[10][100],
which is 8x10x100=8000 bytes, plus the size of the two allocated array
objects. The size of the array objects is fixed, but VM-dependent.
-
<pause_time>: Mandatory. An integer value, in milliseconds.
- For the bench thread, we use wait-for-next-period logic.
The program first calculates the average execution time
for a certain number of iterations of the logic, including garbage collection.
This is called the "stress cost." To this is added the value of this <pause_time>
parameter, which represents additional inactive time, in order to arrive at
the calculated period. If you enter zero for the value of the parameter,
the program assumes 100 microseconds for the additional inactive time.
- For the stress thread, this is the time
the thread will sleep within each of the outer loops.
-
<thread_priority>: Mandatory. A String value. Can be
"max", "norm", or "min". Any other value defaults to
"min". The priority value depends on the type of thread:
- RTT: (Deterministic and
GCDeterministic programs). The priority is a real-time priority
(starting
from 11), which is obtained through the methods
getMaxPriority(),
getNormPriority(), getMinPriority() called on the default
scheduler PriorityScheduler instance.
- JLT: (NonDeterministic program). The
priority value is obtained via the three priority constants from the
java.lang.Thread class, which are
Thread.MAX_PRIORITY,
Thread.NORM_PRIORITY, and Thread.MIN_PRIORITY.
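The period calculation described for <pause_time> can be sketched as follows. The class and method names are hypothetical; only the rule itself (stress cost plus pause time, with 100 microseconds assumed when the pause time is zero) comes from the description above.

```java
/** Hypothetical sketch of the bench thread period calculation. */
final class PeriodSketch {

    /** Additional inactive time assumed when <pause_time> is zero. */
    static final long DEFAULT_INACTIVE_NANOS = 100_000L; // 100 microseconds

    /** Period = stress cost + additional inactive time, in nanoseconds. */
    static long periodNanos(long stressCostNanos, long pauseTimeMillis) {
        long inactiveNanos = (pauseTimeMillis == 0)
                ? DEFAULT_INACTIVE_NANOS
                : pauseTimeMillis * 1_000_000L;
        return stressCostNanos + inactiveNanos;
    }

    /** Formats a duration in the "(x ms, y ns)" style used in the program output. */
    static String format(long nanos) {
        return "(" + (nanos / 1_000_000L) + " ms, " + (nanos % 1_000_000L) + " ns)";
    }
}
```

With a stress cost of 3 ms 452308 ns and a pause time of 4 milliseconds, format(periodNanos(3_452_308L, 4)) gives "(7 ms, 452308 ns)", which agrees with the calculated period shown in the output header example in this document.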
As an example, the following thread parameters can be
specified
for the NonDeterministic program or the GCDeterministic program:
1000 10 20 50 30 norm
These parameter values describe a stress or bench
thread that runs 1000 outer iterations, each consisting of 10
inner iterations, with 20 iterations inside the stress classes
(Fibonacci calculations and garbage production). The allocation
array size is 50, the pause time inside the outer loop is 30
milliseconds, and the thread runs at the normal priority level
(either real-time or not, depending on the type of thread).
Note that the array size is not to be specified for the
Deterministic program.
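To make the parameter layout concrete, here is a sketch of how one set of thread parameters could be parsed. It is not the parsing code of the example programs: the class, field, and method names are hypothetical, and only the JLT priority mapping is shown, because the real-time priorities require the javax.realtime API.

```java
/** Hypothetical sketch of parsing one set of thread parameters. */
final class ThreadParamsSketch {
    final int nbOuterIterations;
    final int nbInnerIterations;
    final int nbStressClassIterations;
    final int allocationArraySize; // 0 when the program does not take this parameter
    final long pauseTimeMillis;
    final String priorityLevel;    // "max", "norm", or "min"

    private ThreadParamsSketch(int outer, int inner, int stress, int arraySize,
                               long pauseMillis, String level) {
        nbOuterIterations = outer;
        nbInnerIterations = inner;
        nbStressClassIterations = stress;
        allocationArraySize = arraySize;
        pauseTimeMillis = pauseMillis;
        priorityLevel = level;
    }

    /** Parses one set of parameters starting at args[offset]. */
    static ThreadParamsSketch parse(String[] args, int offset, boolean hasArraySize) {
        int i = offset;
        int outer = Integer.parseInt(args[i++]);
        int inner = Integer.parseInt(args[i++]);
        int stress = Integer.parseInt(args[i++]);
        int arraySize = hasArraySize ? Integer.parseInt(args[i++]) : 0;
        long pauseMillis = Long.parseLong(args[i++]);
        String level = args[i];
        // Any value other than "max" or "norm" defaults to "min".
        if (!"max".equals(level) && !"norm".equals(level)) {
            level = "min";
        }
        return new ThreadParamsSketch(outer, inner, stress, arraySize, pauseMillis, level);
    }

    /** Priority for a java.lang.Thread (NonDeterministic program). */
    int jltPriority() {
        if ("max".equals(priorityLevel)) return Thread.MAX_PRIORITY;
        if ("norm".equals(priorityLevel)) return Thread.NORM_PRIORITY;
        return Thread.MIN_PRIORITY;
    }

    /** A stress thread is not created if any of the three iteration counts is zero. */
    boolean createsStressThread() {
        return nbOuterIterations != 0 && nbInnerIterations != 0
                && nbStressClassIterations != 0;
    }
}
```

For the Deterministic program, hasArraySize would be false, because the array size is not specified for that program.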
Input for NonDeterministic Program
In this case the threads are
JLTs, because the program must be executable on a vanilla VM.
In addition there are at most two threads running: the bench thread
and the stress thread. Since the program also stresses the GC
activity, the <allocation_array_size> parameter is needed in the
specification of the thread parameters. The parameters are in the
format:
<JLT-bench-thread-parameters>
<JLT-stress-thread-parameters> <verbosity-flag>
Here is an example of the parameters:
1000 20 10 500 100 max
1000 20 10 200 10 min
verbose
(The line above has been shortened for printability.)
All the parameters except the verbosity flag are mandatory.
The script file contains an example of the thread parameters for the
NonDeterministic program.
Input for Deterministic Program
The Deterministic program does
not allocate anything and so the GC does not have to collect anything
during the run. Thus the <allocation_array_size> parameter is not
provided in any of the three sets of thread parameters for this program. There
can be at most one bench RTT thread, one stress RTT thread, and
one stress JLT running concurrently in this program. The
parameters for this program have the following format:
<RTT-bench-thread-parameters>
<RTT-stress-thread-parameters>
<JLT-stress-thread-parameters> <verbosity-flag>
Here is an example of the parameters:
1000 20 10 0 norm
1000 20 10 200 norm
1000 20 10 10 min
summary
(The line above has been shortened for printability.)
All the parameters except the
verbosity flag are mandatory. The script file contains an example of
the thread parameters for the
Deterministic program.
Input for GCDeterministic Program
In addition to the functions of the Deterministic program,
GCDeterministic also produces garbage to stress the RTGC. Thus the
<allocation_array_size> parameter must be provided in every set of
thread parameters for GCDeterministic. There can be at most one bench RTT
thread, one stress RTT thread, and
one stress JLT running concurrently in this program. So
the parameters for this program have the following format:
<RTT-bench-thread-parameters>
<RTT-stress-thread-parameters>
<JLT-stress-thread-parameters> <verbosity-flag>
Here is an example of the parameters:
1000 10 10 20 100 max
1000 10 10 200 0 norm
1000 10 10 800 0 min
verbose
(The line above has been shortened for printability.)
All the parameters except the verbosity flag are mandatory.
The script file contains an example of the thread parameters for the
GCDeterministic program.
Output from the Example Programs
This section describes in detail the output from the example
programs. A great deal of output can be generated by these programs.
You can request shorter output by using the verbosity flag for the run.
If the verbosity flag is set to verbose, the output
will contain the following information:
- Reminder of parameters for the run
- Output header of the run
- Raw execution time measurements
- List of noticeable deviations based on previous iteration
- Summary of deviations based on previous iteration
- Percentage report of deviations
- Ordered list of values
- Summary of results
- End-of-output banner
If the verbosity flag is set to any value other than verbose,
just a subset of the output is printed:
- Reminder of parameters for the run
- Output header of the run
- Summary of deviations based on previous iteration
- Percentage report of deviations
- Summary of results
- End-of-output banner
All wall-clock times and elapsed times are expressed in microseconds.
The "reference" execution time value used in the output
is defined for a given run as the elapsed time value that occurred
the most often. If several different elapsed times
occurred the same number of times, the first one found is chosen
as the reference. The reference value makes sense when the run is very stable and
most of the elapsed time values are identical. For a run with widely varying
values, the "mean" execution time value makes more sense.
Both the reference value and the mean value are used in the
calculations in the output.
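The selection of the reference value can be sketched as follows. This is an assumption about how "the first one found" is resolved: here, ties go to the value that appears first in the recorded elapsed times. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of choosing the "reference" execution time. */
final class ReferenceValueSketch {

    /** Returns the most frequent elapsed time; on ties, the first one encountered. */
    static long mostFrequent(long[] elapsedMicros) {
        if (elapsedMicros.length == 0) {
            throw new IllegalArgumentException("no measurements");
        }
        // LinkedHashMap keeps the values in the order they were first seen.
        Map<Long, Integer> counts = new LinkedHashMap<>();
        for (long value : elapsedMicros) {
            counts.merge(value, 1, Integer::sum);
        }
        long reference = elapsedMicros[0];
        int bestCount = 0;
        for (Map.Entry<Long, Integer> entry : counts.entrySet()) {
            // Strict comparison: a later value with the same count does not win.
            if (entry.getValue() > bestCount) {
                bestCount = entry.getValue();
                reference = entry.getKey();
            }
        }
        return reference;
    }
}
```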
The script file that is provided for running the recommended
scenarios sets the VM option flag PrintGC (and sometimes also the VM option flag
PrintCompilation) to "on." Therefore, GC or compilation
activity is also displayed in the output at various places, mixed in
with the program output. This is to show clearly where garbage collection or
compilation occurs. When reading the example program output, take
care to differentiate this kind of output from the output from the
programs.
Reminder of Parameters for the Run
As a convenient reminder, in addition to the output from the example
programs, the script prints the command that was
executed, including all the input parameters.
Example of Output
The following is an example of this part of the output on Solaris OS.
/usr/sbin/psrset -e 1
/net/amos/mackinac/releases/b43a/binaries/
solaris-i486/bin/java
-XX:CompilationList=nhrt.precompile
-XX:PreInitList=itc.preinit
-Xms64m -Xmx64m
-XX:RTGCCriticalReservedBytes=10m
-XX:+PrintGC
-classpath /net/amos/mackinac/work/olagneau/
docs-pass/getting-started-hands-on/
apps/mypacking/GettingStarted/dist/
GettingStarted.jar GCDeterministic
1000 10 10 500 4 max 1000 10 10 300 0 min
0 10 10 10 0 norm verbose
(The lines above have been shortened for printability.)
Output Header for the Run
The header for each run consists of the following:
Name of the example program
Indication of program initialization, including the computation of the
stress cost for the bench thread
Summary of the input parameters provided for
each thread, whether JLT or RTT:
Number of outer loops: integer value
Number of inner iterations: integer value
Number of stress class iterations: integer value
Size of allocated arrays inside GarbageProducer
classes, if expected (NonDeterministic and GCDeterministic programs): integer
value
Pause time: integer value, in milliseconds and nanoseconds:
For the bench thread, the calculated period, which is the
average execution time (stress cost) added to the entered pause time (additional inactive time)
For the stress thread, the entered pause time (the forced sleep time inside each loop)
Priority level and corresponding value used (either
JLT or real-time priority): min, norm, or max
for the priority level specified, and an integer value for the value used
Indication of the level of detail of this output: summary
or verbose
Indication that thread execution has started, including
the number of outer iterations for the bench thread
VM memory: total amount, free amount, maximum amount
after initial memory clean-up
Indication that thread execution has ended, including the
number of outer iterations for the bench thread
Banner to introduce the rest of the output
Example of Output
The following is an example of this part of the output.
--------------- Running GCDeterministic Program ---------------
Note:
All time measurements are expressed in microsecond units.
unless otherwise indicated, for example ms (milliseconds) and ns (nanoseconds).
Static initialization finished:
==> Stressing Fibonacci and GarbageProducer classes initialized and loaded
Calculating stress cost for bench thread
Calculated computation time for bench: (3 ms, 452308 ns)
Calculated period for bench: (7 ms, 452308 ns)
Stress cost time for bench calculated
Initialization and configuration of program finished
============================================
GCDeterministic program parameters
============================================
The options used for this run are:
----------------------------------------
Realtime bench thread parameters:
=================================
Number of outer loops: 1000
Number of inner iterations: 10
Number of stress classes iterations: 10
Size of elementary array to allocate: 300
Entered pause time: 4 milliseconds
Calculated period (deadline + pause): (7 ms, 452308 ns)
Priority level: norm
Effective priority assigned: 30
Realtime stress thread parameters:
=================================
Number of outer loops: 1000
Number of inner iterations: 10
Number of stress classes iterations: 10
Size of elementary array to allocate: 200
Entered pause time: 4 milliseconds
Priority level: min
Effective priority assigned: 11
Output detail level: detailed
----------------------------------------
Time Measurement Results
The next output is a list of the raw time measurement results for
each iteration of the outer loop.
This output is printed only if the verbosity flag is set for the run.
This is a large amount of data that lets you
observe how the behavior of the run evolves. In addition, it is
formatted to be easily imported into any spreadsheet or
statistical analysis tool, allowing you to perform any analysis on a given
run.
The first
ten iterations are printed out with their execution time only. The
rest of this list is as follows:
Loop_no: The outer loop index of the bench thread.
exec_time: Elapsed time
for the bench thread for the execution of the entire outer loop.
start_time: Absolute start time in nanoseconds
end_time: Absolute end time in nanoseconds
mean_exec: Mean elapsed
time for iterations from the tenth iteration up to and including
this iteration.
delta_previous: Difference
between the elapsed time for this iteration and the previous one
(exec_time[n] - exec_time[n-1]).
mean_delta_previous:
Mean of delta_previous from the tenth iteration up to and
including this iteration.
-
delta_mean_exec: Difference
between the elapsed time for this iteration and mean_exec
value for this iteration.
-
delta_ref: Difference
between the elapsed time for this iteration and the reference
value up to and including this iteration.
-
min_exec: Minimum
elapsed time value that has occurred so far. This data is
reported only after the first ten loops, which is when we consider
that the values have stabilized.
-
max_exec: Maximum
elapsed time value that has occurred so far. This data is
reported only after the first ten loops, which is when we consider
that the values have stabilized.
-
std_dev: The
standard deviation for the set of execution times that occurred so
far, from the tenth iteration up to and including this iteration.
-
exec_jitter:
Execution time jitter encountered so far, from the tenth iteration
up to and including this iteration. This value is defined as
(max_exec - min_exec).
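Some of the cumulative fields above can be reproduced from the raw exec_time values with a sketch like the following. The class and method names are hypothetical. In particular, the standard deviation formula used here (population form) is an assumption, so the std_dev figures printed by the example programs may be computed differently; mean_exec is returned as a fractional value, whereas the programs print whole microseconds.

```java
/** Hypothetical sketch of the cumulative statistics, starting at the tenth iteration. */
final class RunStatsSketch {

    /** Zero-based index of the tenth iteration, where steady state is assumed. */
    static final int FIRST_STEADY_INDEX = 9;

    /**
     * Returns {mean_exec, min_exec, max_exec, std_dev, exec_jitter} for the
     * iterations from the tenth one up to and including index upTo.
     */
    static double[] summarize(long[] execMicros, int upTo) {
        if (upTo < FIRST_STEADY_INDEX || upTo >= execMicros.length) {
            throw new IllegalArgumentException("upTo out of range: " + upTo);
        }
        int count = upTo - FIRST_STEADY_INDEX + 1;
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        double sum = 0;
        for (int i = FIRST_STEADY_INDEX; i <= upTo; i++) {
            min = Math.min(min, execMicros[i]);
            max = Math.max(max, execMicros[i]);
            sum += execMicros[i];
        }
        double mean = sum / count;
        double squaredDiffs = 0;
        for (int i = FIRST_STEADY_INDEX; i <= upTo; i++) {
            double diff = execMicros[i] - mean;
            squaredDiffs += diff * diff;
        }
        // Assumption: population standard deviation (divide by count).
        double stdDev = Math.sqrt(squaredDiffs / count);
        return new double[] {mean, min, max, stdDev, max - min};
    }
}
```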
Example of Output
The following is an example of this part of the output (reformatted for more readability).
================================
Time measurement results:
================================
Loop_no: 0 | exec_time: 5361 |
Loop_no: 1 | exec_time: 5213 |
Loop_no: 2 | exec_time: 5245 |
Loop_no: 3 | exec_time: 5300 |
Loop_no: 4 | exec_time: 5189 |
Loop_no: 5 | exec_time: 5186 |
Loop_no: 6 | exec_time: 5245 |
Loop_no: 7 | exec_time: 5256 |
Loop_no: 8 | exec_time: 5209 |
Loop_no: 9 | exec_time: 5178 |
start time: 1215790305050079335 |
end time: 1215790305055257658 | mean_exec: 5178 |
delta_previous: -31 | mean_delta_previous:31 |
delta_mean_exec: 0 | delta_ref: 87 | min_exec: 5178 |
max_exec: 5178 | std_dev: 0.0 |
exec_jitter: 0 |
Loop_no: 10 | exec_time: 5235 |
start time: 1215790305056771280 |
end time: 1215790305062006112 | mean_exec: 5206 |
delta_previous: 57 | mean_delta_previous:44 |
delta_mean_exec: 29 | delta_ref: 144 | min_exec: 5178 |
max_exec: 5235 | std_dev: 19.79 |
exec_jitter: 57 |
..... Lines omitted .....
Loop_no: 999 | exec_time: 4986 |
start time: 1215790305468070019 |
end time: 1215790305473056011 | mean_exec: 5140 |
delta_previous: -79 | mean_delta_previous:61 |
delta_mean_exec: -154 | delta_ref: -105 | min_exec: 4913 |
max_exec: 5551 | std_dev: 97.84 |
exec_jitter: 638 |
Noticeable Deviations Based on Previous Iteration
This output is printed only if the verbosity flag is set for
the run. In addition, if no noticeable deviations were found,
nothing is printed in this section of the output.
Not counting the first ten
iterations, the execution time is given for each iteration whose
elapsed time differs by more than 500 microseconds
from that of the preceding iteration. Such a difference is a sign of an
unexpected deviation that could be jitter. The shorter the execution time
of the outer loop, the more relevant this data. Such deviations may indicate an
occurrence of GC activity, a jitter occurrence, or simply an
instability of the program due to some other reason (such as
waiting for a resource or I/O).
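The detection rule can be sketched as follows. The class and method names are hypothetical, and the sketch assumes that "not counting the first ten iterations" means that loops 0 through 9 are never reported.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of finding noticeable deviations from the previous iteration. */
final class NoticeableDeviationsSketch {

    static final long THRESHOLD_MICROS = 500;
    static final int SKIPPED_ITERATIONS = 10; // assumption: loops 0..9 are ignored

    /** Returns the loop numbers whose elapsed time deviates noticeably from the previous loop. */
    static List<Integer> find(long[] execMicros) {
        List<Integer> loops = new ArrayList<>();
        for (int n = SKIPPED_ITERATIONS; n < execMicros.length; n++) {
            long delta = execMicros[n] - execMicros[n - 1];
            // Both positive and negative deviations count.
            if (Math.abs(delta) > THRESHOLD_MICROS) {
                loops.add(n);
            }
        }
        return loops;
    }
}
```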
Example of Output
The following is an example of this part of the output.
===================================================
Noticeable deviations:
===================================================
For loop 49, execution time: 5551 microseconds, deviation: 614 microseconds
Summary of Deviations Based on Previous Iteration
This output summarizes the deviation occurrences for this run:
-
First is printed the number of
"noticeable" deviations based on the previous iteration. These
are deviations that are greater than 500 microseconds in absolute
value, that is, either positive or negative. Since the
deviation for a loop is measured relative to the previous loop, the
actual number of disturbances is in most cases half the
reported number, because a positive deviation is generally followed
by a negative deviation as the execution time returns closer to
the reference value. This holds for a very stable run, but not
when the values vary greatly.
-
The next part of the output is the longest
series of identical values. This figure represents the
highest number of identical elapsed time results that occurred
in a row, and the value of that result. For example, if the value "5066"
was found 2 times in a row, and the value "5122" was
found 3 times in a row, then the latter would be reported.
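The "identical results" figure can be sketched as follows. The class and method names are hypothetical, and the sketch assumes that when two series have the same length, the earlier one is reported.

```java
/** Hypothetical sketch of finding the longest series of identical elapsed times. */
final class IdenticalRunSketch {

    /** Returns {length, value} of the longest run of identical consecutive values. */
    static long[] longestIdenticalRun(long[] values) {
        long bestLength = 0;
        long bestValue = 0;
        int runLength = 0;
        for (int i = 0; i < values.length; i++) {
            // Extend the current run, or start a new one when the value changes.
            runLength = (i > 0 && values[i] == values[i - 1]) ? runLength + 1 : 1;
            if (runLength > bestLength) {
                bestLength = runLength;
                bestValue = values[i];
            }
        }
        return new long[] {bestLength, bestValue};
    }
}
```

Applied to the values 5066, 5066, 5122, 5122, 5122, 5000, the method returns a length of 3 for the value 5122, as in the example above.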
Example of Output
The following is an example of this part of the output.
============================================
Summary of deviations based on previous:
============================================
Nb noticeable positive deviations ( > 500 microseconds) = 1
Nb noticeable negative deviations ( < -500 microseconds) = 0
Nb max identical results = 3 for value 5122
(The lines above have been shortened for printability.)
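The two figures in this summary can be sketched as follows. This is only an illustration under stated assumptions: the class DeviationSummary and its methods are hypothetical names, and the sketch counts over every iteration it is given, because the text does not say whether the initial iterations are excluded here.

```java
// Hypothetical helper, not part of the example programs.
final class DeviationSummary {
    static final long THRESHOLD_MICROS = 500;

    /** Returns {positiveCount, negativeCount} of deviations from the
     *  previous iteration that exceed the threshold in absolute value. */
    static int[] countNoticeable(long[] elapsedMicros) {
        int positive = 0;
        int negative = 0;
        for (int i = 1; i < elapsedMicros.length; i++) {
            long deviation = elapsedMicros[i] - elapsedMicros[i - 1];
            if (deviation > THRESHOLD_MICROS) {
                positive++;
            } else if (deviation < -THRESHOLD_MICROS) {
                negative++;
            }
        }
        return new int[] {positive, negative};
    }

    /** Returns {runLength, value} for the longest run of identical
     *  consecutive elapsed times (the first such run wins on a tie). */
    static long[] longestIdenticalRun(long[] elapsedMicros) {
        if (elapsedMicros.length == 0) {
            return new long[] {0, 0};
        }
        long bestLength = 1;
        long bestValue = elapsedMicros[0];
        long currentLength = 1;
        for (int i = 1; i < elapsedMicros.length; i++) {
            currentLength = (elapsedMicros[i] == elapsedMicros[i - 1]) ? currentLength + 1 : 1;
            if (currentLength > bestLength) {
                bestLength = currentLength;
                bestValue = elapsedMicros[i];
            }
        }
        return new long[] {bestLength, bestValue};
    }

    public static void main(String[] args) {
        long[] samples = {5066, 5066, 5100, 5122, 5122, 5122, 5700, 5090};
        int[] counts = countNoticeable(samples);
        long[] run = longestIdenticalRun(samples);
        System.out.println("Nb noticeable positive deviations = " + counts[0]);
        System.out.println("Nb noticeable negative deviations = " + counts[1]);
        System.out.println("Nb max identical results = " + run[0] + " for value " + run[1]);
    }
}
```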
Percentage Report of Deviations
This part of the output reports the ranges within which the
deviations fall, relative both to the reference (most frequent) value
and to the mean value. Two sets of deviations are therefore reported.
Together they show how the deviations are distributed.
This output lists the deviations grouped by interval. For each
interval, it gives the number of deviations and their percentage of the
total number of loops. The intervals are ordered by time slice, from 0
microseconds to 1 millisecond: the time slices are fine-grained
up to 100 microseconds (5, 10, 20, 40, 50, 75, 100) and then
coarser-grained from 100 to 1000 microseconds (200, 300, 400, 500,
750, 1000). The last entry covers all deviations above 1
millisecond.
If the run is very deterministic, the reference value is
meaningful, so the deviations from this reference are the
most relevant. Otherwise, consider the deviations from the
mean value instead.
Example of Output
The following is an example of this part of the output.
==============================================
Percentage report of deviations:
==============================================
---- deviations based on most frequent value 174 ----
Lower values below are exclusive, higher value are inclusive :
0 to 5 microseconds deviations: 333, 33.30 %
5 to 10 microseconds deviations: 215, 21.50 %
10 to 20 microseconds deviations: 103, 10.30 %
20 to 40 microseconds deviations: 13, 1.30 %
40 to 50 microseconds deviations: 4, 0.40 %
50 to 75 microseconds deviations: 5, 0.50 %
75 to 100 microseconds deviations: 2, 0.20 %
100 to 200 microseconds deviations: 1, 0.10 %
200 to 300 microseconds deviations: 0, 0.00 %
300 to 400 microseconds deviations: 0, 0.00 %
400 to 500 microseconds deviations: 0, 0.00 %
500 to 750 microseconds deviations: 0, 0.00 %
750 to 1000 microseconds deviations: 0, 0.00 %
More than 1000 microseconds deviations: 0, 0.00 %
---- deviations based on mean execution time 177 ----
Lower values below are exclusive, higher value are inclusive :
0 to 5 microseconds deviations: 851, 85.10 %
5 to 10 microseconds deviations: 84, 8.40 %
10 to 20 microseconds deviations: 33, 3.30 %
20 to 40 microseconds deviations: 12, 1.20 %
40 to 50 microseconds deviations: 3, 0.30 %
50 to 75 microseconds deviations: 4, 0.40 %
75 to 100 microseconds deviations: 2, 0.20 %
100 to 200 microseconds deviations: 1, 0.10 %
200 to 300 microseconds deviations: 0, 0.00 %
300 to 400 microseconds deviations: 0, 0.00 %
400 to 500 microseconds deviations: 0, 0.00 %
500 to 750 microseconds deviations: 0, 0.00 %
750 to 1000 microseconds deviations: 0, 0.00 %
More than 1000 microseconds deviations: 0, 0.00 %
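The grouping by time slice can be sketched as follows. This is a minimal illustration and not the code of the example programs: the class DeviationHistogram and its methods are hypothetical, the deviation is assumed to be taken in absolute value, and the header "lower values are exclusive" is read literally, so a deviation of exactly 0 falls into no slice.

```java
import java.util.Locale;

// Hypothetical helper, not part of the example programs.
final class DeviationHistogram {
    // Upper bounds of the time slices, in microseconds, as listed in the text.
    static final long[] UPPER_BOUNDS =
            {5, 10, 20, 40, 50, 75, 100, 200, 300, 400, 500, 750, 1000};

    /** Counts deviations from base per slice; the last slot is "more than 1000". */
    static int[] bucketCounts(long[] elapsedMicros, long base) {
        int[] counts = new int[UPPER_BOUNDS.length + 1];
        for (long value : elapsedMicros) {
            long deviation = Math.abs(value - base);  // assumption: absolute value
            if (deviation == 0) {
                continue;  // the lower bound 0 is exclusive
            }
            int bucket = 0;
            while (bucket < UPPER_BOUNDS.length && deviation > UPPER_BOUNDS[bucket]) {
                bucket++;
            }
            counts[bucket]++;
        }
        return counts;
    }

    /** Formats one line in the style of the report shown above. */
    static String line(int bucket, int count, int totalLoops) {
        double percent = 100.0 * count / totalLoops;
        if (bucket == UPPER_BOUNDS.length) {
            return String.format(Locale.ROOT, "More than %d microseconds deviations: %d, %.2f %%",
                    UPPER_BOUNDS[bucket - 1], count, percent);
        }
        long lower = (bucket == 0) ? 0 : UPPER_BOUNDS[bucket - 1];
        return String.format(Locale.ROOT, "%d to %d microseconds deviations: %d, %.2f %%",
                lower, UPPER_BOUNDS[bucket], count, percent);
    }

    public static void main(String[] args) {
        long[] samples = {174, 176, 179, 180, 184, 1200, 174};
        int[] counts = bucketCounts(samples, 174);  // 174 plays the role of the reference value
        for (int b = 0; b < counts.length; b++) {
            System.out.println(line(b, counts[b], samples.length));
        }
    }
}
```

Calling bucketCounts() once with the most frequent value and once with the mean as the base would give the two sets of deviations that the report contains.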
Ordered List of Values
This output is printed only if the verbosity flag is set for
the run.
This output lists the elapsed times in decreasing order of
the number of occurrences in the loop. Elapsed times with the same
number of occurrences are grouped together and, within a group, are
listed in decreasing order of deviation from the "reference" value.
The deviation from the mean execution time is also given.
Example of Output
The following is an example of this part of the output.
=======================================
Ordered list of values:
=======================================
Execution time 5091 occurred 18 times,
ref deviation is 0 microseconds,
mean deviation is -49 microseconds
Execution time 5122 occurred 13 times,
ref deviation is 31 microseconds,
mean deviation is -18 microseconds
Execution time 5121 occurred 12 times,
ref deviation is 30 microseconds,
mean deviation is -19 microseconds
Execution time 5086 occurred 12 times,
ref deviation is -5 microseconds,
mean deviation is -54 microseconds
..... Lines omitted .....
Execution time 4921 occurred 1 times,
ref deviation is -170 microseconds,
mean deviation is -219 microseconds
Execution time 4913 occurred 1 times,
ref deviation is -178 microseconds,
mean deviation is -227 microseconds
(The lines above have been shortened for printability.)
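The ordering described above can be sketched as follows. This is an illustration only: the class OrderedValues is a hypothetical name, the reference value is assumed to be the most frequent elapsed time, and the mean is assumed to be truncated to whole microseconds.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the example programs.
final class OrderedValues {
    /** Returns (elapsed time, occurrences) pairs, most frequent first.
     *  Within the same number of occurrences, larger elapsed times come
     *  first, which is the same as decreasing deviation from a fixed
     *  reference value. */
    static List<Map.Entry<Long, Integer>> order(long[] elapsedMicros) {
        Map<Long, Integer> counts = new HashMap<>();
        for (long value : elapsedMicros) {
            counts.merge(value, 1, Integer::sum);
        }
        List<Map.Entry<Long, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : Long.compare(b.getKey(), a.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        long[] samples = {5086, 5091, 5122, 5091, 4913, 5086, 5122, 5091};
        long sum = 0;
        for (long value : samples) {
            sum += value;
        }
        long mean = sum / samples.length;  // assumption: truncated to whole microseconds
        List<Map.Entry<Long, Integer>> ordered = order(samples);
        long reference = ordered.get(0).getKey();  // assumption: most frequent value
        for (Map.Entry<Long, Integer> e : ordered) {
            System.out.println("Execution time " + e.getKey() + " occurred " + e.getValue()
                    + " times, ref deviation is " + (e.getKey() - reference)
                    + " microseconds, mean deviation is " + (e.getKey() - mean) + " microseconds");
        }
    }
}
```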
Summary of Results
This part of the output provides the most relevant information
about the determinism of the run.
- Mean execution time: The mean of all the elapsed time values.
- Best execution time: The minimum elapsed time value that occurred.
- Worst execution time: The maximum elapsed time value that occurred.
- Most frequent execution time: The elapsed execution time value that
occurred the most frequently, together with the number of occurrences.
- Execution time jitter: The final jitter value found for the run.
- Standard deviation: The final standard deviation value found for
the run.
Example of Output
The following is an example of this part of the output.
============================================
Summary of results:
============================================
Mean execution time: 5140 microseconds
Best execution time: 4913 microseconds
Worst execution time: 5551 microseconds
Most frequent execution time: 5091, 312 occurrences
Execution time jitter: 638
Standard deviation: 97.84
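The figures in this summary can be sketched as follows. This is a minimal illustration, not the code of the example programs. Two points are assumptions: the jitter is computed as the worst minus the best execution time, which matches the example above (5551 - 4913 = 638) but is not stated in the text, and the standard deviation is the population form (dividing by the number of values). The class RunSummary and its method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the example programs.
final class RunSummary {
    static double mean(long[] values) {
        long sum = 0;
        for (long v : values) {
            sum += v;
        }
        return (double) sum / values.length;
    }

    static long best(long[] values) {
        long min = values[0];
        for (long v : values) {
            min = Math.min(min, v);
        }
        return min;
    }

    static long worst(long[] values) {
        long max = values[0];
        for (long v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    /** Assumption: jitter is the spread between worst and best. */
    static long jitter(long[] values) {
        return worst(values) - best(values);
    }

    /** Assumption: population standard deviation. */
    static double standardDeviation(long[] values) {
        double m = mean(values);
        double sumOfSquares = 0;
        for (long v : values) {
            sumOfSquares += (v - m) * (v - m);
        }
        return Math.sqrt(sumOfSquares / values.length);
    }

    /** Most frequent value; the smallest value wins on a tie. */
    static long mostFrequent(long[] values) {
        Map<Long, Integer> counts = new HashMap<>();
        long bestValue = values[0];
        int bestCount = 0;
        for (long v : values) {
            int count = counts.merge(v, 1, Integer::sum);
            if (count > bestCount || (count == bestCount && v < bestValue)) {
                bestValue = v;
                bestCount = count;
            }
        }
        return bestValue;
    }

    public static void main(String[] args) {
        long[] samples = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("Mean execution time: " + mean(samples));
        System.out.println("Best execution time: " + best(samples));
        System.out.println("Worst execution time: " + worst(samples));
        System.out.println("Most frequent execution time: " + mostFrequent(samples));
        System.out.println("Execution time jitter: " + jitter(samples));
        System.out.println("Standard deviation: " + standardDeviation(samples));
    }
}
```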
Testing Determinism With Your Own Code
Read this section if you want to execute your own code and test it
for determinism. First, be sure to read Description of
the Example Programs.
The example programs use "stress classes" to put stress on the
system resources. There are two stress classes to calculate Fibonacci
numbers (Fibonacci1 and Fibonacci2), and two stress
classes to allocate memory that will be
garbage-collected (GarbageProducer1 and GarbageProducer2).
The structure of the example programs allows you to easily
plug in your own code as an additional stress class. In this way you can
observe the deterministic behavior of the programs as they execute your
code. This provides
you with a good benchmark for testing parts of your code with Java RTS.
This section describes in detail how to plug in your
code and have it executed by the example programs.
For the purpose of plugging in external code, a simple hook
has been put in the source code:
- A dedicated interface named StressingObject
is available in the source directory. It declares two methods:
- The Fibonacci1 class implements the StressingObject
interface. You do not have to modify this class.
- Each program in the set declares the method
void stress(String stressingClassName, int nb_inner_iterations, int nb_stress_class_iterations).
This method dynamically instantiates one StressingObject
of the passed class, sets nb_stress_class_iterations
for this stressing object, and then calls the stress() method
of the stressing object nb_inner_iterations times.
You do not have to modify anything in this method.
The run() method of each program in the set contains the following
commented-out call right after the Fibonacci calculation call:
stress("Fibonacci1",nb_inner_iterations, nb_stress_class_iterations)
For your own stressing class to be active, just uncomment this
line of code and replace "Fibonacci1" by the name of your class that implements
the StressingObject interface. You will have
to do this once in the NonDeterministic program, twice in the Deterministic
program, and twice in the GCDeterministic program. You can suppress
the Fibonacci stress by commenting out the computeFibonacci
line at the beginning of the outer loop, in the run() method.
The static initialization of each class also includes a commented-out
call to stress("Fibonacci1",10,10).
You can uncomment this call, again substituting the name of your class,
to preinitialize your own stressing class before the effective run of
the program.
- You can also comment out the garbage-producing calls
to produceGarbage() if you want to see only the
effect of the stress produced by your code.
In summary, to plug in your own code,
you create a class that wraps your
code and implements the StressingObject
interface. You then uncomment the stress() calls in the static
initialization part and in the run() methods of the programs,
and recompile. The programs will
now use the stress provided by your own code.
Note, however, that the stress methods of the programs
instantiate only one StressingObject. This
is different from the Fibonacci
and GarbageProducer stressing methods,
which instantiate two Fibonacci objects and two garbage-producer
objects, respectively.
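The plug-in mechanism described in this section can be sketched as follows. This is a hypothetical reconstruction for illustration only: the real StressingObject interface is in the source directory of the example programs, and its second method may have a different name from the setNbIterations() assumed here. MyCodeStress, doMyWork(), and StressHarness are likewise invented names, and this stress() helper returns the instantiated object (the method in the programs is declared void) so that the effect can be inspected.

```java
// Assumed shape of the interface; check the real one in the source directory.
interface StressingObject {
    void setNbIterations(int nbStressClassIterations);  // hypothetical name
    void stress();
}

// Hypothetical wrapper class around your own code.
class MyCodeStress implements StressingObject {
    private int nbIterations = 1;
    long checksum;  // keeps the result of the work observable

    public void setNbIterations(int nbStressClassIterations) {
        nbIterations = nbStressClassIterations;
    }

    public void stress() {
        for (int i = 0; i < nbIterations; i++) {
            checksum += doMyWork(i);
        }
    }

    // Placeholder for the code you want to test for determinism.
    private long doMyWork(int i) {
        return (long) i * i;
    }
}

final class StressHarness {
    /** Instantiates one stressing object by class name, configures it,
     *  and calls its stress() method nbInnerIterations times. */
    static StressingObject stress(String stressingClassName,
                                  int nbInnerIterations,
                                  int nbStressClassIterations) {
        try {
            StressingObject target = (StressingObject)
                    Class.forName(stressingClassName).getDeclaredConstructor().newInstance();
            target.setNbIterations(nbStressClassIterations);
            for (int i = 0; i < nbInnerIterations; i++) {
                target.stress();
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + stressingClassName, e);
        }
    }

    public static void main(String[] args) {
        StressingObject used = stress(MyCodeStress.class.getName(), 10, 10);
        System.out.println("checksum = " + ((MyCodeStress) used).checksum);
    }
}
```

Because the class is looked up by name, the wrapper needs a no-argument constructor, and the name passed to stress() must be fully qualified if the wrapper is in a package.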
Copyright © 2007, 2010, Oracle Corporation and/or its affiliates