Introduction

Sun Java Real-Time System 2.0_01 (Java RTS) supports two garbage collectors:
The other garbage collectors featured in the Java SE version of the HotSpot virtual machine are not supported. The default garbage collector is the new real-time garbage collector. This RTGC can be configured to execute in a deterministic manner. The RTGC might exhibit lower throughput than non-real-time collectors, particularly on uniprocessors. For applications that are more sensitive to the collector's throughput than to the pause times induced by the collector's execution, the RTGC can be turned off with the -XX:-UseRTGC option. In this case, the non-real-time, serial collector is used, and all threads except NoHeapRealtimeThreads might suffer from pause times caused by garbage collection.

This document will help you understand how the RTGC works, focusing on the most important configuration parameters and use cases. See also the Practical Introduction to Achieving Determinism document for a detailed description of example programs that allow you to begin using Java RTS, including the RTGC, quickly and easily.

RTGC Configuration and Tuning Levels

The RTGC's activity can be configured with a set of special parameters. Some minimum configuration is necessary. Further tuning can be performed by trying different values of various parameters, in various combinations. The parameters that you might configure depend upon your level of expertise with the RTGC: basic, advanced, or expert. All the parameters are listed in the tables at the end of this document, in the section Command-Line Options.

Java RTS Thread Types and Their Priorities

The function of the RTGC is based on the criticality of the application threads. Critical threads are those that must execute within well-defined time limits so that their response times are predictable (that is, deterministic), whereas non-critical threads do not have these constraints.
Java RTS real-time threads can be deemed critical or non-critical, whereas non-real-time threads (java.lang.Thread instances) are non-critical by definition. The thread types are summarized as follows:
In all cases, it is the programmer's responsibility to set the correct priority to reflect the level of criticality of any thread in the Java RTS application. Note that this is the base priority of the thread. Threads that share locks with more critical threads can be automatically boosted from time to time, via priority inheritance, to a higher priority, and the RTGC will take this higher priority into consideration. The figure below shows the priority levels for the RTGC and for the different thread types.

Figure 1 Priority Levels for RTGC and Threads
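The criticality rule described above can be sketched as a small check. This is an illustrative sketch only: the priority value and the comparison against RTGCCriticalPriority are assumptions made for the example, not Java RTS internals.

```java
// Illustrative sketch: a thread counts as critical when its active priority
// (base priority, possibly boosted via priority inheritance) is above the
// RTGC critical priority level. The numeric value is hypothetical.
public class CriticalityCheck {
    // Hypothetical value standing in for -XX:RTGCCriticalPriority.
    static final int RTGC_CRITICAL_PRIORITY = 38;

    static boolean isCritical(int basePriority, int boostedPriority) {
        // The RTGC considers the boosted priority, not just the base priority.
        int activePriority = Math.max(basePriority, boostedPriority);
        return activePriority > RTGC_CRITICAL_PRIORITY;
    }

    public static void main(String[] args) {
        System.out.println(isCritical(35, 35)); // non-critical real-time thread
        System.out.println(isCritical(35, 40)); // boosted by priority inheritance
        System.out.println(isCritical(45, 45)); // critical real-time thread
    }
}
```

Note how a thread whose base priority is non-critical is still treated as critical while it holds a lock that boosts it above the critical level.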
By default, the "normal" priority of the RTGC is above that of non-real-time threads (JLTs), but at the lowest priority for real-time threads. When the Java RTS VM decides that memory is getting low, the RTGC is boosted to its "critical" priority, which is higher than the priority of the non-critical threads, but still lower than that of the critical threads.

Note: The figure also shows the priority level of the NHRT threads, but these threads do not depend on garbage collection activity since they do not allocate memory from the heap.

Real-Time Garbage Collection Principles

Any real-time garbage collector is expected to recycle memory without disrupting the temporal determinism of time-critical tasks. In addition, a real-time GC must recycle memory fast enough so that these tasks are not blocked if the memory is exhausted when they attempt to allocate more. This means that a real-time GC must be granted enough processor cycles to complete its work on time. On the other hand, granting it too many processor cycles reduces the application's overall throughput. Therefore, if application throughput is a concern, users must properly configure real-time GCs.

Work-based real-time collectors execute when allocations are performed, regardless of the priority of the threads that are executing. Their configuration defines how much time is granted to the GC for each allocation. Time-based real-time collectors execute at specified times, and at top priority. Their configuration defines when and how often the GC should run. For instance, these real-time GCs can be scheduled as periodic threads. However, most of the available real-time GCs do not allow threads to preempt the GC, so the real-time application could be frozen for a considerable amount of time. The better the real-time GC, the smaller this time can get. Unfortunately, this approach does not scale well on multiprocessors.
Another frequent drawback of real-time GCs is that, to ensure determinism, users must analyze the memory behavior of all the threads of their applications. Adding a new low-priority thread that allocates a lot of memory might force the user to change all the configuration parameters of the real-time GC. It might even result in an application for which the GC cannot ensure determinism, requiring the addition of more memory or CPU resources. For instance, this prevents the use of these real-time GCs for a server on which servlets are dynamically added and removed, unless the server has enough memory and CPUs. Sun's Java RTS and its Real-Time Garbage Collector (RTGC) avoid these two pitfalls.

Sun's Real-Time Garbage Collector

The important point about the RTGC provided with Java RTS is that it is fully concurrent, and thus can be preempted at any time. There is no need to run the RTGC at the highest priority, and there is no stop-the-world phase where all the application's threads are suspended during the GC execution. On a multiprocessor, one CPU can be doing some GC work while an application thread is making progress on another CPU.

Thus, the RTGC offered by Java RTS is very flexible. While other real-time GCs usually either have to be executed as incremental, periodic, high-priority activities or else induce an allocation-time overhead, the Java RTS RTGC can execute according to many different scheduling policies. In addition, instead of trying to ensure determinism for all the threads of the application, the RTGC considers that the criticality of an application's threads is based on their priority, and it ensures hard real-time behavior only for real-time threads at the critical level, while trying to offer soft real-time behavior for real-time threads below the critical level. This reduces the total overhead of the RTGC and ensures that determinism is not impacted by the addition of new low-priority application threads.
In addition, this makes the configuration easier because there is no need to study the allocation behavior of an application in its entirety in order to configure the RTGC. Determinism is ensured by looking only at the critical tasks. By setting only two parameters, namely a memory threshold and a priority, you can ensure that threads running at critical priority will not be disrupted by garbage collection pauses. The big advantage of this approach is that these parameters are independent of the non-critical part of the application. You do not need to reconfigure the RTGC when you add a new non-critical component or when the machine load changes.

The RTGC tries to recycle memory fast enough for the non-critical real-time threads, but without offering any guarantees for them. If the non-critical load increases, the RTGC might fail to recycle memory fast enough for all the threads. However, this will not disrupt the critical threads, as long as the memory threshold is correctly set for the application. Only the non-critical real-time threads will be blocked and temporarily suffer from some jitter, due to memory recycling.

The RTGC has an auto-tuning mechanism that tries to find the best balance between determinism and throughput. Expert users can configure a few parameters to control this auto-tuning in order to improve this balance.

Real-Time Garbage Collection Scheduling

The most important property for the RTGC is the priority at which it runs. Our real-time garbage collector dynamically changes its priority to try to balance throughput and determinism. For the parallel (multiprocessor) version of the RTGC, the number of threads supporting RTGC activity is also important. See the section Improving the Determinism by Using Multiprocessors. The RTGC can function in three different modes during a garbage-collection cycle, based on the remaining free memory available:
The figure below shows how the RTGC is scheduled (with one CPU), based on free memory thresholds.

Figure 2 RTGC Scheduling on One CPU
The RTGC starts its next cycle at its initial (normal) priority when free memory goes below the startup memory threshold. This threshold is calculated from a set of parameters (which the expert user can tune). One of the parameters used in the calculation of this threshold is NormalMinFreeBytes.

The Java RTS VM calculates a critical memory threshold, based on another set of parameters (which the expert user can tune). One of the parameters used in the calculation of this threshold is CriticalMinFreeBytes.

If the free memory falls below the critical reserved memory threshold (specified by the RTGCCriticalReservedBytes parameter), the non-critical threads are blocked on allocation, waiting for the RTGC to recycle memory. This guarantees that the critical RT threads, and only the critical RT threads, will be able to allocate memory from the reserved amount. This mode is called the deterministic mode, as it ensures determinism for the critical threads.
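The deterministic-mode allocation rule above can be sketched as a simple admission check. The decision logic mirrors the documented behavior; the threshold value is hypothetical, not taken from a real Java RTS configuration.

```java
// Sketch of deterministic-mode allocation gating: once free memory falls
// below the critical reserved threshold, only critical threads may allocate.
public class AllocationGate {
    // Hypothetical value standing in for -XX:RTGCCriticalReservedBytes.
    static final long CRITICAL_RESERVED_BYTES = 1_000_000;

    // Returns true if the requesting thread may allocate immediately; false
    // means a non-critical thread must wait for the RTGC to recycle memory.
    static boolean mayAllocate(long freeBytes, boolean isCriticalThread) {
        if (freeBytes >= CRITICAL_RESERVED_BYTES) {
            return true;          // enough free memory: every thread allocates
        }
        return isCriticalThread;  // the reserve is for critical threads only
    }

    public static void main(String[] args) {
        System.out.println(mayAllocate(5_000_000, false)); // true
        System.out.println(mayAllocate(500_000, false));   // false: blocked
        System.out.println(mayAllocate(500_000, true));    // true: uses reserve
    }
}
```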
If the free memory remains below the critical reserved memory threshold, the non-critical threads remain blocked until the RTGC has recycled enough memory.

The figure also shows the priority level of the NHRT threads, but these threads are not involved in garbage collection since they do not allocate memory from the heap.

Tuning the RTGC

The principle behind the RTGC is the balance between determinism and throughput. With a finite amount of system resources (CPU time and memory in particular), Java RTS must ensure determinism for the critical threads, while ensuring that the other threads are also able to execute in a timely manner. Therefore, the RTGC must recycle enough memory for the allocation requests, while not consuming all the CPU cycles. This delicate balance is configured with several parameters. Since the RTGC continuously tunes its own operation, only two of these parameters need to concern the basic user:
The remaining parameters should be used in advanced tuning by the advanced or expert user.

Note: The default behavior of Java RTS is to allow the RTGC to use all the CPU power, because determinism is our primary goal. However, with a bad configuration or with applications that have a large percentage of reachable objects, the RTGC might run continuously to try to ensure determinism. In such a case, the expert user can configure the RTGCWaitDuration parameter. This parameter specifies a wait period between two consecutive RTGC cycles. It defaults to zero, but in exceptional circumstances you can change it to force a small delay between two GC cycles. This can help prevent the system from being blocked due to continuous garbage collection.

All the parameters are listed in the tables at the end of this document in the section Command-Line Options.

Normal Mode: Recycling Memory Without Disrupting Any Real-Time Threads

The RTGC auto-tunes the parameters that control the "normal" mode of its functionality. Therefore, the basic user does not need to tune these parameters. The advanced user might need to tune the value of the RTGCNormalPriority parameter. Only expert users should attempt to tune the parameters used to determine when the RTGC starts its next cycle.

In the RTSJ, the impact of the GC on real-time threads is implementation-dependent. With Java RTS, the threads running at the highest priorities can completely avoid garbage collection pauses. However, the RTGC must start soon enough to ensure that it completes before memory is exhausted. This works as long as the allocation behavior is relatively stable, because the RTGC must be started soon enough to handle the worst-case allocation. In normal mode, the RTGC runs at a lower priority than the real-time threads. Unfortunately, this does not scale well to larger applications. As a general rule, the more real-time threads are running (and allocating memory), the longer an RTGC cycle lasts.
The amount of memory allocated during a cycle can quickly increase. To ensure determinism, we must take into account the worst-case allocation behavior, which could possibly cause the RTGC to run continuously. And even this might not be sufficient to guarantee that memory would be recycled fast enough. However, real-world applications normally contain a mix of critical and non-critical tasks. For some of the non-critical tasks, timing is important but not critical. You might be willing to miss a few deadlines when there is an allocation burst. The gain is that this allows the RTGC to start much later, improving the global throughput.

With Java RTS, we can configure the behavior of the non-critical threads independently from the behavior of critical ones. We consider that the normal behavior for real-time threads is to run at a higher priority than the RTGC, while the java.lang.Threads run at a lower priority than the RTGC. The RTGC starts running at RTGCNormalPriority (which defaults to the minimal real-time priority). The startup memory threshold determines when the RTGC starts. The idea is to start it early enough so that it completes its cycle before reaching the critical memory threshold, which would raise the RTGC priority to its critical level, creating jitter for the non-critical real-time threads. The auto-tuning mechanism takes into account the allocation performed during the last RTGC cycles to try to start the RTGC just in time, thus maximizing the throughput while avoiding pause times, assuming that the allocation behavior is stable.
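The "start early enough" reasoning above amounts to simple arithmetic: the startup threshold must cover the memory that will be allocated while a full RTGC cycle runs, on top of the critical threshold. The following back-of-envelope sketch uses entirely hypothetical numbers; it is not the VM's actual calculation (which is the auto-tuning mechanism described later).

```java
// Back-of-envelope estimate: the RTGC must start while free memory still
// covers one cycle's worth of allocation plus the critical threshold,
// otherwise the cycle ends after the critical level has been crossed.
public class StartupEstimate {
    static long startupThreshold(long allocRateBytesPerSec,
                                 double cycleSeconds,
                                 long criticalThresholdBytes) {
        long allocatedDuringCycle = (long) (allocRateBytesPerSec * cycleSeconds);
        return criticalThresholdBytes + allocatedDuringCycle;
    }

    public static void main(String[] args) {
        // Hypothetical load: 10 MB/s allocation, a 2-second GC cycle,
        // and a 16 MB critical threshold.
        System.out.println(startupThreshold(10_000_000, 2.0, 16_000_000)); // 36000000
    }
}
```

The more threads allocate (a higher rate, or a longer cycle), the earlier the RTGC must start, which is exactly why worst-case provisioning alone hurts throughput.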
Critical Mode: Boosting the RTGC to Critical Priority

When the RTGC is boosted to its critical priority, non-critical threads can be preempted by RTGC threads, but the RTGC can be preempted by critical threads. Therefore, this can cause non-critical threads to pause for a long time, except in a multiprocessor environment. See the section Improving the Determinism by Using Multiprocessors. You can set the critical priority with the RTGCCriticalPriority parameter.
Deterministic Mode: Ensuring Determinism for Time-Critical Threads

The real-time threads that run at a priority higher than RTGCCriticalPriority are called the critical real-time threads. The RTGC must ensure that their worst-case pause time is very low (at worst a few hundred microseconds). To guarantee this determinism, you must specify the RTGCCriticalReservedBytes parameter. You can also tune the critical priority level using the RTGCCriticalPriority parameter.

When the free memory goes below the limit specified by RTGCCriticalReservedBytes, Java RTS blocks the non-critical threads from allocating memory to prevent them from disrupting the critical threads. Note that the critical threads continue running at a priority higher than the RTGC and can preempt it at any time. If RTGCCriticalReservedBytes is too low, a critical thread might block when the memory is full, waiting for the RTGC to free some memory. If RTGCCriticalReservedBytes is too high, the RTGC will run more often, preventing the lower-priority threads from running and hence reducing their global throughput.

Improving the Determinism by Using Multiprocessors

The RTGC is fully concurrent; that is, application threads can run concurrently with the RTGC. Therefore, we can improve the determinism of non-critical threads on multiprocessors by specifying how many processors, or threads, the RTGC can use. This is both simpler and safer than depending on parameter tuning alone. The drawback of parameter tuning alone is that when the RTGC parameters are underestimated, all non-time-critical threads might be suspended from time to time. If they are overestimated, the java.lang.Threads will be suspended very often. At worst, they will not make any progress, because the RTGC could consume all the CPU not used by the real-time threads. To specify the maximum number of CPUs that the RTGC can use, set the RTGCMaxCPUs option. This is equivalent to setting the maximum number of parallel threads for the RTGC.
Thus, when the RTGC runs, there should still be some CPU cycles not used by the RTGC and the critical threads. (It is assumed that critical threads use only a small part of the CPU power.) Hence, non-critical real-time threads should still be able to make some progress. Even java.lang.Threads could get some CPU cycles.

When the real-time load is low, RTGCMaxCPUs threads executing the RTGC at RTGCNormalPriority will be enough to cope with allocation performed on the remaining CPUs. However, if the allocation rate increases or if more real-time threads are running, the RTGC at RTGCNormalPriority, even if running continuously, might not recycle memory fast enough. When this happens, the RTGC priority is boosted, but it is still limited to using only RTGCMaxCPUs threads. Even if memory goes below RTGCCriticalReservedBytes, no more than RTGCMaxCPUs threads will be used.

On multiprocessors, as on uniprocessors, the VM performs auto-tuning to try to maximize throughput. It will first try to complete the RTGC on time by using RTGCMaxCPUs threads running at RTGCNormalPriority. Expert users can balance determinism and throughput by using the NormalSlideFactor and NormalSafetyMargin parameters. However, an underestimation has a limited impact because, even if the RTGC is boosted to RTGCCriticalPriority, the non-critical threads still get some CPU power. Expert users should focus on the critical memory threshold, which defines when the RTGC threads start running at RTGCCriticalPriority. This is done through CriticalSlideFactor, CriticalSafetyMargin, and CriticalMinFreeBytes. (See the section Description of the Auto-Tuning Mechanism.) The big advantage of this configuration is that overestimating the threshold (for example, small slide factors and big safety margins) is not dangerous. Even if this configuration makes the RTGC run continuously on RTGCMaxCPUs CPUs, there will always be a few CPUs available for the lower-priority threads.
The impact on determinism depends on the ratio between RTGCMaxCPUs and the total number of CPUs. The following example shows a simple configuration to easily achieve determinism (though without optimizing throughput) by dedicating multiple CPUs to the RTGC. Since the RTGC usually needs about 25% of the CPU power in order to guarantee determinism, we dedicate two CPUs to the RTGC, assuming an 8-CPU system. We also force garbage collection to run almost continuously by specifying the following options: -Xms1G -Xmx1G -XX:NormalMinFreeBytes=1G -XX:RTGCMaxCPUs=2

When making estimations, consider the following:
As a summary, this expert mode is a parameterized way to offer more determinism for the real-time non-critical threads. In addition, it does not block the machine, avoiding the case where the RTGC runs continuously on all the CPUs, at very high priority. However, there is no free lunch: this automatic gain in determinism can quickly lead to throughput issues.

Description of the Auto-Tuning Mechanism

This section describes in detail the auto-tuning mechanism that Java RTS uses to determine the optimal conditions to start the RTGC or to boost its priority to its critical level. These conditions are referred to as the startup memory threshold and the critical memory threshold, respectively. Expert users might want to try to tune these threshold values, using specific parameters.

Note: This same mechanism is used to determine both the startup memory threshold and the critical memory threshold; the only difference is that the parameter names begin with Normal and Critical, respectively. Therefore, we drop Normal and Critical from the parameter names used in the calculations below.

The VM does not try to use the worst-case allocation behavior to ensure the RTGC will complete on time, because this would run the RTGC too often after the first period of allocation bursts. Hence, the auto-tuning mechanism uses a depreciation mechanism to compute a "sliding value" at the end of each GC cycle. This sliding value is the higher of its depreciated previous value and the memory that was allocated during the current GC cycle. A safety margin is then applied to this sliding value to compute the next memory threshold. The RTGC will start (or be boosted to its critical priority) when the free memory goes below that threshold. The RTGC uses the amount of memory that was allocated previously in order to estimate the next threshold. It is expected that this value will "slide" to a stable amount.
(This is why we call it a "sliding value.") The following formulas are used to calculate the startup memory threshold and the critical memory threshold:

    sliding_value = max(sliding_value * (100 - SlideFactor) / 100, allocated_during_cycle)
    threshold = max(sliding_value * (100 + SafetyMargin) / 100, MinFreeBytes)

This calculation is performed at the end of each RTGC cycle. In the formulas, sliding_value on the right-hand side is the value calculated during the previous cycle, and allocated_during_cycle is the amount of memory allocated during the cycle that just completed.
The first formula above uses the sliding value calculated during the previous GC cycle. For the initial cycle of the RTGC, the sliding value is zero, as there was no previous GC cycle. With the slide factor applied, the sliding value is still zero. Therefore, the amount of memory allocated during this first cycle becomes the new sliding value. For subsequent GC cycles, the sliding value calculated during the previous GC cycle is depreciated by the slide factor percentage. This result is compared with the amount of memory allocated during the current GC cycle, and the higher of the two becomes the new sliding value, which represents the amount of memory we predict will be allocated during the next cycle. Then, in the second formula above, this new sliding value,
with the safety margin percentage applied, is compared with the user-estimated free memory threshold (MinFreeBytes); the higher of the two becomes the new memory threshold.

The expert user can configure SlideFactor, SafetyMargin, and MinFreeBytes. The slide factor represents the percentage of memory allocated during the previous cycles that will not be considered in the calculation of the prediction of the allocation needs during the next garbage-collection cycle. In other words, this can be considered as the speed with which allocation bursts are forgotten. As an example, let's say that you have set the SlideFactor parameter to 40. In this case, at the end of each GC cycle, the sliding value is reduced by 40% before being compared to the memory allocated during the cycle. If a large amount of memory is allocated during the cycle, only 60% of that amount will be used to determine when to start the next cycle or boost the priority. If the slide factor is increased to 80%, then only 20% of the allocation is used in the calculation, and the effect of the allocation burst is more quickly "forgotten." Therefore, by increasing the SlideFactor parameter, you ensure that the RTGC will not continue consuming a lot of CPU cycles after a period of allocation bursts.

As mentioned above, the sliding value is initially zero. It also goes back to zero if no allocations are performed during a few GC cycles. Note that, if the sliding value is zero for the calculation of the memory thresholds (startup or critical), the RTGC could start or be boosted too late and cause jitter for the first few cycles (the time needed for the application to reach a steady state). If you are concerned with this "initial learning phase," you can specify a MinFreeBytes threshold, which should roughly correspond to the average behavior of the application.

The SafetyMargin parameter is a percentage of the calculated sliding value that is added to the value before we compare it to the free memory threshold. In this way, the RTGC ignores small variations in the level of memory allocations.
This represents an estimation of the variation of the allocation rate that can be supported without creating unacceptable jitter.

Significance for Normal Mode

Decreasing NormalSafetyMargin or NormalMinFreeBytes improves throughput by starting the RTGC later. Unfortunately, the RTGC might start too late, particularly at application startup or when the allocation rate increases. In this case the RTGC would be boosted to its critical priority, preempting the non-critical real-time threads. Increasing NormalSafetyMargin or NormalMinFreeBytes avoids this situation. Note that if NormalSafetyMargin is too big, the RTGC might run continuously, and threads at a priority lower than RTGCNormalPriority might be prevented from running. This might also happen if the real-time threads often have allocation bursts and the NormalSlideFactor is too low. Increasing NormalSlideFactor allows the RTGC to forget the allocation bursts more quickly, improving the throughput but increasing the likelihood of jitter during the next allocation burst.

Significance for Critical Mode

For critical mode, these tuning options are mainly useful in a multiprocessing environment. See the section Improving the Determinism by Using Multiprocessors.

Summary of the RTGC Scheduling

The VM auto-tunes the memory thresholds to try to start or boost the RTGC as late as possible (to maximize throughput) while trying not to have to increase its priority or block the non-critical threads (which impacts their determinism). The RTGC can go through three different phases.
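The threshold calculation described in Description of the Auto-Tuning Mechanism can be sketched as a small simulation. All parameter values below (SlideFactor, SafetyMargin, MinFreeBytes) are hypothetical, chosen only to make the arithmetic visible.

```java
// Simulation of the end-of-cycle sliding-value update and threshold
// calculation. Parameter values are hypothetical examples.
public class SlidingValue {
    static final int SLIDE_FACTOR = 40;      // percent of the old value forgotten
    static final int SAFETY_MARGIN = 20;     // percent added before comparison
    static final long MIN_FREE_BYTES = 2_000_000; // user-estimated floor

    // Keep the larger of the depreciated previous value and the memory
    // actually allocated during the cycle that just finished.
    static long nextSlidingValue(long previous, long allocatedDuringCycle) {
        long depreciated = previous * (100 - SLIDE_FACTOR) / 100;
        return Math.max(depreciated, allocatedDuringCycle);
    }

    // The RTGC starts (or is boosted) when free memory drops below this.
    static long threshold(long slidingValue) {
        return Math.max(slidingValue * (100 + SAFETY_MARGIN) / 100, MIN_FREE_BYTES);
    }

    public static void main(String[] args) {
        long sliding = 0;                               // no previous cycle yet
        sliding = nextSlidingValue(sliding, 5_000_000); // allocation burst
        System.out.println(sliding);                    // 5000000
        sliding = nextSlidingValue(sliding, 1_000_000); // burst is depreciated
        System.out.println(sliding);                    // 3000000
        System.out.println(threshold(sliding));         // 3600000
    }
}
```

The second cycle shows the burst being "forgotten": 40% of the 5 MB burst is dropped, and the 3 MB remainder still dominates the 1 MB actually allocated.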
Finding the Right Configuration Parameters

As mentioned above, you must configure the RTGCCriticalReservedBytes parameter. This is the only parameter that you are required to configure. You might also want to configure the RTGCCriticalPriority parameter. For the rest, Java RTS includes several auto-tuning mechanisms to try to lighten the configuration burden, as detailed in Description of the Auto-Tuning Mechanism. Java RTS provides a garbage collector API and a few additional MBeans to gather additional information. See Using MBeans to Monitor and Configure the RTGC. However, the simplest way to figure out what is happening is to use the -XX:+PrintGC and -XX:+RTGCPrintStatistics options (and optionally -Xloggc:<filename> to redirect the output to a file).

With respect to non-critical real-time threads, the most important number is the priority at which the GC ends. All the threads below that priority have suffered from jitter during that GC cycle. This is not an error. It comes from the balance between throughput and determinism. Expert users can try to improve the determinism by ensuring that the worst case is taken into account for a longer time (by reducing the slide factors in the auto-tuning mechanism) and/or that the RTGC can cope with quicker changes in the allocation rates (by increasing the safety margins or the minimum threshold). To optimize the throughput, you can try to tune the RTGC so that the minimum free memory is nearly equal to the memory threshold that would have caused the RTGC to enter the next mode. If RTGCMaxCPUs is set, you should also determine whether the RTGCCriticalReservedBytes boundary was reached by checking the GC output; if the boundary was reached, the GC prints the number of blocked threads.

Additional Note on Pause Times

The RTGC is truly concurrent.
The only time it prevents a thread from running is when it has to look at that particular thread's Java stack, where the local variables and operands are stored. Hence, the main potential source of pause time for a given thread is the scanning of its own stack. Since a thread is not impacted by the scanning of the other threads' stacks, the pause time for a thread is smaller than with non-concurrent GCs. To reduce this pause time even further, threads should execute in compiled mode. See the Java RTS Compilation Guide for further details on how to compile the methods executed by time-critical code. In general, the pause time primarily depends on the stack's depth. Pause times are very small for threads that do not perform extensive recursive method calls.

Do not forget that there are a lot of other jitter sources that might look like RTGC pause times. Java RTS provides solutions to avoid them. For instance, the Java RTS Compilation Guide covers the issues related to late class loading and initialization. Remember also that only real-time priorities provide real-time guarantees. For priorities lower than real-time, the threads are scheduled with a time-sharing policy to reduce the risks of starvation in the non-real-time part of the application. Hence, a thread at a priority lower than real-time can be preempted at any time to let lower-priority threads make some progress.

Using MBeans to Monitor and Configure the RTGC

Version 2.0 of Java RTS introduced a garbage collector API, usable from the application code, which allows you to dynamically change the RTGC parameters either locally from your Java application or remotely through a management bean. The release includes an initial version of the new API and MBeans. Feedback can be sent to the Java RTS team. For additional information, look in the Javadoc for the FullyConcurrentGarbageCollector class.
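As a starting point for monitoring, the standard java.lang.management API already exposes the garbage collector MBeans on any JVM; on Java RTS the real-time collector appears in this list. This sketch uses only standard-platform calls, not the Java RTS-specific API mentioned above.

```java
// Listing garbage collector MBeans via the standard platform management API.
// On Java RTS, the RTGC's MBean would appear among the returned beans.
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

public class ListGcBeans {
    public static void main(String[] args) {
        List<GarbageCollectorMXBean> beans =
                ManagementFactory.getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean gc : beans) {
            // Name, cumulative collection count, and cumulative time spent
            System.out.println(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```

The same beans are what JConsole displays; the Java RTS-specific attributes described below extend this standard view.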
The new MBeans are registered in the realtime.rtgc namespace and can be viewed with, for example, the JConsole tool. We have also extended the LastGCInfo attribute of the garbage collector MBean. To see a description of the attributes and operations of these new MBeans, refer to the corresponding Javadocs:
Command-Line Options

Caution: The "-XX" options are not part of the official Java API and can vary from one release to another.

RTGC Basic Options
RTGC Advanced Options
RTGC Expert Options