CHAPTER 48

RAM Test (ramtest)


ramtest Description

ramtest is designed to stress the memory modules (RAM) rather than the whole memory subsystem. The test is optimized to achieve high memory bandwidth on UltraSPARC III (USIII) and UltraSPARC II (USII) class CPUs. ramtest has an integrated ECC error monitor that reports the ECC errors found during the test run.

ramtest has been enhanced for UltraSPARC T1 (high-end processor with chip multithreading [CMT]) based systems to use background patterns as the base patterns for read/write operations within the selected march test. This technique has proven very effective in l2sram tests, which use similar pattern generation techniques. You can, however, still use the previous methods.



Note - Any subtest or march that is not intended for a particular platform is treated as an invalid option, and the test issues a FATAL message that identifies the invalid option.



You can now specify any march any number of times and in any order. The order of the entire command-line interface sequence is maintained for execution. This allows for a reduction in run time and provides an option for running an effective and stressful march earlier.




Caution - This is an exclusive mode test. This test cannot be run in parallel with any other tests or applications.




ramtest Options

To reach the following dialog box, right-click on the test name in the System Map and select Test Parameter Options. If you do not see this test in the System Map, you might need to expand the collapsed groups. Refer to the SunVTS User's Guide for more details.


FIGURE 48-1 ramtest Test Parameter Options Dialog Box

Screenshot of the ramtest Test Parameter Options dialog box


The following table details the ramtest options:


TABLE 48-1 ramtest Options

ramtest Options

Description

Reserve

The Reserve option specifies the percentage of physical memory that is assumed to be in use by the OS or other processes. If you see excessive swapping while running ramtest, increase this percentage. The default is 20%, which means that ramtest allocates 80% of physical memory for testing. Swapping decreases stress on memory and increases it on the system itself. For memory testing purposes, minimize swapping by tuning the reserve option.

If the allocation or locking (when Memory Locking is enabled) does not succeed, the amount of memory is reduced and the allocation process is repeated. Once the allocation succeeds, the amount of memory allocated is displayed in the messages.

Stride

By default, this option is set to Random. It can also be set to Column or Row. With Random, either Row or Column is randomly selected for each pass. The stride value defines which memory locations are addressed consecutively in certain subtests, in a hardware-dependent manner. All testable memory is still tested. Using different strides checks coupling among different sets of memory cells; therefore, Random is the recommended value for this option unless both Column and Row are explicitly used in different instances. For FA-type uses, stride can also be set to UserDefined, in which case the test strides the number of banks specified in the userstride option.

Stride can also be set to Custom, in which case the stride values are randomly selected from the strides specified in the stridemask value.

User-Defined Banks to Stride

Set the number of banks that the test should stride. The value is currently limited to between 1 and 16. Row striding is not possible while using this option.

Stride Mask

Specifies the strides used. Each thread selects one of the stride values from stridemask by selecting one of the bits in the mask.

The bits in the stridemask value represent the least significant bit of the stride. Thus a value of 0x4000 calls for a stride of 16384 (using Bit 14 of the address). Multiple bits can be set, mixing row and column strides. See the Memory Controller section of the PRM for the CPU of the test system for information on how the memory reference address is divided between rows and columns in the DRAM.

The value can be specified as a decimal (NNN), hexadecimal (0xNNN), or octal (0NNN) value. The maximum value is 0x400000 (4194304). The default value is 0xC600 which represents strides using Bits 15, 14, 10, and 9.

Default values specific to UltraSPARC T1 and UltraSPARC IIIi based systems are as follows:

  • ram.h:#define DSTRIDEMASK 0xC600
  • ram.h:#define DSTRIDEMASKFIESTA 0x1C040
  • ram.h:#define DSTRIDEMASKNIAGARA1 0x180C00

Memory Locking

By default, memory locking is Disabled. To turn it on, set lock to Enabled. This test uses ISM to lock the memory into the core, which allows 4 MB virtual pages and avoids swapping. Running without locking adds more randomness to the addressing sequence. If memory locking with ISM fails, the test allocates memory on the heap and tries to lock that memory instead.

ECC Error Monitor

ECC Monitor is Enabled by default. The ECC error monitor runs as a separate thread in the test. When an ECC error is detected, the message is displayed in the test output. Turn off the monitor by setting this option to Disabled.

ECC Error Threshold

The number of ECC errors after which the test stops (if the ECC monitor is running). When the threshold is reached, the test exits with a nonzero exit code. If set to zero, the test still reports all the errors but does not stop. The default threshold is 2.

Number of Passes

Specifies the number of passes within a single instance. Increase the number of passes when lock is enabled. This saves the time spent locking the memory every time a new process or instance is spawned by the SunVTS kernel. This pass count has no relation to the system passes in the SunVTS infrastructure, so ramtest appears to take longer to complete system passes.

NTA March Test

Specifies the number of loops of the NTA March (30N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass and also increases the time taken to complete a pass. The NTA March test attacks coupling and stuck-at faults. NTA March is efficient at finding single-, double-, and some triple-bit errors. Depending on the stride option, coupling faults can be found between cells in the adjacent columns or rows that are targeted.

Test time is higher when row striding is selected because more page faults are generated. For efficiency, total memory is divided among the available CPUs.

LA March Test

Specifies the number of loops of the LA March (22N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass and also increases the time taken to complete a pass. The LA March test attacks coupling and stuck-at faults.

LR March Test

Specifies the number of loops of the LR March (14N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass and also increases the time taken to complete a pass. The LR March test attacks coupling and stuck-at faults.

SS March Test

Specifies the number of loops of the SS March (22N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass and also increases the time taken to complete a pass. The SS March test attacks simple static faults.
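The Stride Mask encoding described in TABLE 48-1 can be checked by hand with a short script. The following Python sketch is illustrative only and is not part of ramtest: the function names are hypothetical, and the only facts taken from this chapter are that each set bit selects a stride of 2 raised to that bit position and that the value may be written in decimal, hexadecimal, or octal form.

```python
def parse_mask(text: str) -> int:
    """Parse decimal (NNN), hexadecimal (0xNNN), or octal (0NNN) text."""
    text = text.strip().lower()
    if text.startswith("0x"):
        return int(text, 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


def decode_stridemask(mask: int) -> list:
    """Return (bit position, stride value) for every bit set in mask."""
    return [(bit, 1 << bit)
            for bit in range(mask.bit_length())
            if mask & (1 << bit)]


# The three default masks quoted from ram.h in TABLE 48-1.
for name, text in (("DSTRIDEMASK", "0xC600"),
                   ("DSTRIDEMASKFIESTA", "0x1C040"),
                   ("DSTRIDEMASKNIAGARA1", "0x180C00")):
    pairs = decode_stridemask(parse_mask(text))
    bits = ", ".join(str(bit) for bit, _ in pairs)
    strides = ", ".join(str(stride) for _, stride in pairs)
    print(f"{name} {text}: bits {bits}; strides {strides}")
```

Decoding a mask this way before a run makes it easier to compare the selected address bits against the row and column layout documented for the memory controller.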



ramtest Test Modes


TABLE 48-2 ramtest Supported Test Modes

Test Mode

Description

Exclusive

Stresses memory modules and generates an enormous amount of memory traffic.
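To put that memory traffic in perspective, the march lengths listed in TABLE 48-1 (30N, 22N, 14N, and 22N) can be turned into a rough operation count. The Python sketch below assumes the usual march-test reading of this notation, in which kN means k read or write operations for each of the N tested memory words. The 8-byte word size and the loop counts are illustrative assumptions, not ramtest defaults, and the function name is hypothetical.

```python
# Operations per tested word for one loop of each march, from TABLE 48-1.
MARCH_OPS_PER_WORD = {"ntaloops": 30, "laloops": 22, "lrloops": 14, "ssloops": 22}


def operations_per_pass(tested_bytes: int, loops: dict, word_bytes: int = 8) -> int:
    """Estimate read/write operations in one pass.

    word_bytes is an assumed access width, not a documented ramtest value.
    """
    words = tested_bytes // word_bytes
    per_word = sum(MARCH_OPS_PER_WORD[name] * count for name, count in loops.items())
    return words * per_word


# Hypothetical run: 1 GiB under test, one loop of every march per pass.
one_gib = 1024 ** 3
loops = {"ntaloops": 1, "laloops": 1, "lrloops": 1, "ssloops": 1}
print(operations_per_pass(one_gib, loops))
```

The same estimate shows why raising the loop count of one subtest lengthens every pass: the per-word cost grows by that march's length for each added loop.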



ramtest Command-Line Syntax

/opt/SUNWvts/bin/sparcv9/ramtest standard-arguments [ -o
[ bgpattern=Disabled | Solid | Checkerboard | RowStripe | ColumnStripe | Random | Randexcl ]
[ reserve=Integer between 0 and 90 ]
[ stride=Row | Column | Random | UserDefined | Custom ]
[ userstride=1 - 16 ]
[ stridemask=0x40 - 0x400000 ]
[ lock=Enabled | Disabled ]
[ dratio=Integer between 0 and 100 ]
[ eccmonitor=Enabled | Disabled ]
[ threshold=Integer i; 0 <= i <= MAX-INT ]
[ pass=32-bit integer ]
[ ntaloops=32-bit integer ]
[ laloops=32-bit integer ]
[ lrloops=32-bit integer ]
[ ssloops=32-bit integer ] ]
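Because an out-of-range value is easy to miss in a long option list, it can help to check values against the ranges in the syntax above before launching the test. The following Python sketch is a hypothetical helper, not a SunVTS tool. It assumes that the sub-options after -o are joined with commas and that values are spelled exactly as in the syntax listing; both points should be confirmed against the SunVTS User's Guide.

```python
# Allowed values copied from the ramtest command-line syntax.
ENUM_OPTIONS = {
    "bgpattern": {"Disabled", "Solid", "Checkerboard", "RowStripe",
                  "ColumnStripe", "Random", "Randexcl"},
    "stride": {"Row", "Column", "Random", "UserDefined", "Custom"},
    "lock": {"Enabled", "Disabled"},
    "eccmonitor": {"Enabled", "Disabled"},
}
RANGE_OPTIONS = {
    "reserve": (0, 90),
    "userstride": (1, 16),
    "stridemask": (0x40, 0x400000),
    "dratio": (0, 100),
}


def format_option(name, value):
    """Validate one option and return it as name=value."""
    if name in ENUM_OPTIONS:
        if value not in ENUM_OPTIONS[name]:
            raise ValueError(f"{name}: {value!r} is not a listed value")
    elif name in RANGE_OPTIONS:
        low, high = RANGE_OPTIONS[name]
        if not isinstance(value, int) or not low <= value <= high:
            raise ValueError(f"{name}: {value!r} is outside {low}..{high}")
    else:
        raise ValueError(f"{name}: option is not covered by this sketch")
    return f"{name}={value}"


def build_option_string(options):
    """Join validated options; the comma separator is an assumption."""
    return ",".join(format_option(name, value) for name, value in options.items())


print(build_option_string({"stride": "Custom", "stridemask": 0xC600, "reserve": 20}))
```

The sketch deliberately leaves out threshold, pass, and the loop counts, whose only documented constraint is that they are integers.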

 


TABLE 48-3 ramtest Command-Line Syntax

Argument

Description

bgpattern

Selects the background pattern that serves as the base pattern for read/write operations within the selected march test. This option was added for UltraSPARC T1 (high-end processor with chip multithreading [CMT]) based systems. The technique has proven very effective in l2sram tests, which use similar pattern generation techniques. Valid values are Disabled, Solid, Checkerboard, RowStripe, ColumnStripe, Random, and Randexcl. You can, however, still use the previous methods.

reserve

Specifies the amount of memory that will not be allocated for testing. reserve represents a percentage of the total physical memory in the system. When the test starts, it probes the total memory present in the system, then tries to allocate (100 - reserve)% of memory. If the allocation or locking does not succeed, the amount of memory is reduced before the retry. Before starting the test, the amount of memory allocated for testing is displayed.

The default value for the reserve option is 20. For UltraSPARC IIIi platforms, the default value is 25.

On low memory systems, keep the reserve value higher to avoid excessive swapping.

For 32-bit booted systems, the reserve value represents the percentage of 4 GB rather than the percentage of total physical memory.

stride

By default, stride is set to Random. It can also be set to Column or Row. With Random, either Row or Column is randomly selected for each pass. The stride value defines which memory locations are addressed consecutively in certain subtests, in a hardware-dependent manner. All testable memory is still tested. Using different strides checks coupling among different sets of memory cells; therefore, Random is the recommended value for this option unless both Column and Row are explicitly used in different instances. For FA-type uses, stride can also be set to UserDefined, in which case the test strides the number of banks specified in the userstride option.

stride can also be set to Custom, in which case the stride values are randomly selected from the strides specified in the stridemask value.

userstride

Use this option to set the number of banks that the test should stride. During FA, a good choice can be the interleave on the suspect bank. The value is limited to between 1 and 16. Row striding is not possible while using this option.

stridemask

When stride=custom is selected, this value specifies the strides used. Each thread selects one of the stride values from stridemask by selecting one of the bits in the mask.

The bits in the stridemask value represent the least significant bit of the stride. Thus a value of 0x4000 calls for a stride of 16384 (using Bit 14 of the address). Multiple bits can be set, mixing row and column strides.

The value can be specified as a decimal (NNN), hexadecimal (0xNNN), or octal (0NNN) value. The maximum value is 0x400000 (4194304). The default value is 0xC600 which represents strides using Bits 15, 14, 10, and 9.

lock

By default, memory locking is disabled. To turn it on, set lock to Enabled. The test uses ISM to lock the memory into the core, which gives 4 MB virtual pages and avoids swapping. Running without locking adds more randomness to the addressing sequence.

If memory locking with ISM fails, the test allocates on heap and tries to lock the memory allocated on heap.

On low memory systems, this option can be enabled to avoid excessive swapping.

Solaris 10 users should perform the following steps:

1. Issue the following command:

% prctl $$

If resource controls project.max-shm-memory and project.max-shm-ids are listed in the output, proceed to the next step, otherwise follow the instructions given for Solaris 9.

2. Retrieve the default project with the following command:

% projects -d root

This command outputs the default project name, project1 in this example, for the Super User.

3. Set the resource control project.max-shm-memory with the following command:

% prctl -t privileged -r -n \
project.max-shm-memory -v 9223372036854775807 \
-i project project1

For further information please refer to the Solaris Tunable Parameters Reference Manual applicable to your Solaris release.

eccmonitor

The ECC monitor is enabled by default. The ECC error monitor runs as a separate thread in the test. When an ECC error is detected, the message is displayed in the test output. The monitor can be turned off by setting this option to Disabled.

threshold

The number of ECC errors after which the test stops (if the ECC monitor is running). When the threshold is reached, the test exits with a nonzero exit code. If set to zero, the test still reports all the errors but does not stop. The default threshold is 2.

pass

Specifies the number of passes within a single instance. Increase pass if lock is enabled. This saves the time spent locking the memory each time a new process or instance is spawned by the SunVTS kernel. This pass count has no relation to the system passes in the SunVTS infrastructure, so ramtest appears to take longer to complete system passes.

ntaloops

Specifies the number of loops of the NTA March (30N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass and also increases the time taken to complete a pass. The NTA March test attacks stuck-at faults, two-cell coupling faults, and some three-cell coupling faults.

laloops

Specifies the number of loops of the LA March (22N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass and also increases the time taken to complete a pass. The LA March test attacks coupling and stuck-at faults.

lrloops

Specifies the number of loops of the LR March (14N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass and also increases the time taken to complete a pass. The LR March test attacks coupling and stuck-at faults.

dratio

(Descramble ratio) Tunes the algorithm used to generate data patterns in ramtest. A descramble ratio of 100 means that all the generated data patterns are descrambled. If the descramble ratio is 0, the test generates data patterns tuned toward bus noise. The default value is 50, which means that half the data patterns are descrambled.

ssloops

Specifies the number of loops of the SS March (22N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass and also increases the time taken to complete a pass. The SS March test attacks simple static faults.

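As a worked example of the reserve arithmetic described in TABLE 48-3, the Python sketch below computes the amount of memory the test first tries to allocate. It is an interpretation under stated assumptions, not ramtest source code: the function name is hypothetical, 4 GB is taken as 4 x 2^30 bytes, and the 32-bit case is read as (100 - reserve)% of 4 GB. The size of the reduction applied after a failed allocation is not documented here, so it is not modeled.

```python
GIB = 1024 ** 3


def initial_allocation_bytes(physical_bytes: int,
                             reserve_percent: int = 20,
                             booted_32bit: bool = False) -> int:
    """Return the first allocation target implied by the reserve option."""
    if not 0 <= reserve_percent <= 90:
        raise ValueError("reserve must be between 0 and 90")
    # On 32-bit booted systems the percentage applies to 4 GB (assumed to
    # mean 4 GiB) instead of total physical memory.
    base = 4 * GIB if booted_32bit else physical_bytes
    return base * (100 - reserve_percent) // 100


# Hypothetical 16 GiB system with the default reserve of 20.
print(initial_allocation_bytes(16 * GIB))
# Same system with the UltraSPARC IIIi default reserve of 25.
print(initial_allocation_bytes(16 * GIB, reserve_percent=25))
# Same system booted 32-bit: the percentage is taken of 4 GB.
print(initial_allocation_bytes(16 * GIB, booted_32bit=True))
```

Comparing this figure with the allocation size that ramtest reports at startup shows whether the test had to reduce its request before the allocation or locking succeeded.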




Note - 32-bit tests are located in the bin subdirectory, /opt/SUNWvts/bin/testname.





Note - On the Solaris 10 OS, ECC errors are logged in fault management architecture (FMA) error logs by the FMA subsystem of the OS.