C H A P T E R  16

RAM Test (ramtest)


ramtest Description

ramtest is designed to stress the memory modules (RAM) rather than the whole memory subsystem. The test is optimized to achieve high memory bandwidth on UltraSPARC III (USIII) and UltraSPARC II (USII) class CPUs. ramtest has an integrated ECC error monitor that reports the ECC errors found during the test run.

On x86 systems, Exclusive mode testing puts high stress on the memory and the system interconnect.



Note - ramtest is supported on x86 platforms with the Solaris OS.





Caution - This is an Exclusive mode test. Do not run any other application during this test.




ramtest Options

To reach the following dialog box, right-click on the test name in the System Map and select Test Parameter Options. If you do not see this test in the System Map, you might need to expand the collapsed groups. Refer to the SunVTS User's Guide for more details.


FIGURE 16-1 ramtest Test Parameter Options Dialog Box


The following table details the ramtest options:


TABLE 16-1 ramtest Options

ramtest Options

Description

Reserve

The Reserve option represents the percentage of physical memory that is assumed to be in use by the OS or other processes. If you see excessive swapping while running ramtest, increase this percentage. The default is 20%, which means that ramtest allocates 80% of physical memory for testing. Swapping decreases stress on memory and increases it on the system itself. For memory testing purposes, minimize swapping by tuning the Reserve option.

If the allocation or locking (when Memory Locking is enabled) does not succeed, the amount of memory is reduced and the allocation is retried. Once the allocation succeeds, the amount of memory allocated is displayed in the messages.

Stride

By default, this option is set to Random. It can also be set to Column or Row. With Random, either Row or Column is randomly selected for each pass. The stride value defines which memory locations are addressed consecutively in certain subtests, in a hardware-dependent manner. All testable memory is still tested. Using different strides checks coupling among different sets of memory cells; therefore, Random is the recommended value for this option unless both Column and Row are being used explicitly in different instances. For failure analysis (FA), Stride can also be set to UserDefined, in which case the test strides the number of banks specified in the userstride option.

Stride can also be set to Custom, in which case the stride values are randomly selected from the strides specified in the stridemask value.

User-Defined Banks to Stride

Set the number of banks that the test should stride. The value is currently limited to between 1 and 16. Row striding is not possible while using this option.

Stride Mask

Specifies the strides used. Each thread selects one of the stride values from stridemask by selecting one of the bits in the mask.

The bits in the stridemask value represent the least significant bit of the stride. Thus a value of 0x4000 calls for a stride of 16384 (using Bit 14 of the address). Multiple bits can be set, mixing row and column strides. See the Memory Controller section of the PRM for the CPU of the test system for information on how the memory reference address is divided between rows and columns in the DRAM.

The value can be specified as a decimal (NNN), hexadecimal (0xNNN), or octal (0NNN) value. The maximum value is 0x400000 (4194304). The default value is 0xC600 which represents strides using Bits 15, 14, 10, and 9.

Memory Locking

By default, memory locking is Disabled. To turn it on, set lock to Enabled. This test uses ISM to lock the memory in core, which allows 4 MB virtual pages and avoids swapping. Running without locking adds more randomness to the addressing sequence.

ECC Error Monitor

The ECC Monitor is Enabled by default. The ECC error monitor runs as a separate thread in the test. When an ECC error is detected, a message is displayed in the test output. Turn off the monitor by setting this option to Disabled.
The ECC Monitor option is not supported on x86 platforms. An appropriate warning is displayed and the test proceeds based on the other options.

ECC Error Threshold

The number of ECC errors after which the test stops (if the ECC monitor is running). When the threshold is reached, the test exits with a nonzero exit code. If set to zero, the test still reports all the errors but does not stop. The default threshold is 2.
The ECC Threshold option is not supported on x86 platforms. An appropriate warning is displayed and the test proceeds based on the other options.

Number of Passes

Specifies the number of passes within a single instance. Increase the number of passes when lock is enabled. This saves the time spent locking the memory every time a new process or instance is spawned by the SunVTS kernel. This pass count has no relation to the system passes in the SunVTS infrastructure, so ramtest will appear to take longer to complete system passes.

NTA March Test

Specifies the number of loops of the NTA March (30N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass. It also increases the time taken to complete a pass. The NTA March test attacks coupling and stuck-at faults. NTA March is efficient at finding single-, double-, and some triple-bit errors. Depending on the stride option, coupling faults can be found between cells in adjacent columns or rows, whichever is targeted.

Test time is higher when row striding is selected because more page faults are generated. For efficiency, the total memory is divided among the available CPUs.

LA March Test

Specifies the number of loops of the LA March (22N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass. It also increases the time taken to complete a pass. The LA March test attacks coupling and stuck-at faults.

LR March Test

Specifies the number of loops of the LR March (14N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass. It also increases the time taken to complete a pass. The LR March test attacks coupling and stuck-at faults.

SS March Test

Specifies the number of loops of the SS March (22N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass. It also increases the time taken to complete a pass. The SS March test attacks simple static faults.
The SS March option is not supported on x86 platforms. An appropriate warning is displayed and the test proceeds based on the other options.



ramtest Test Modes


TABLE 16-2 ramtest Supported Test Modes

Test Mode

Description

Exclusive

Stresses memory modules and generates an enormous amount of memory traffic.



ramtest Command-Line Syntax

/opt/SUNWvts/bin/sparcv9/ramtest standard-arguments [ -o
[ reserve=Integer between 0 and 90 ]
[ stride=Row | Column | Random | UserDefined | Custom ]
[ userstride=1 - 16 ]
[ stridemask=0x40 - 0x400000 ]
[ lock=Enabled | Disabled ]
[ dratio=Integer between 0 and 100 ]
[ eccmonitor=Enabled | Disabled ]
[ threshold=Integer i; 0 <= i <= MAX-INT ]
[ pass=32-bit integer ]
[ ntaloops=32-bit integer ]
[ laloops=32-bit integer ]
[ lrloops=32-bit integer ]
[ ssloops=32-bit integer ] ]
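
As an illustration of the syntax above, the following sketch assembles a test-specific option list and prints the resulting command line rather than running it. The join_opts helper and the chosen values are hypothetical, the standard arguments are omitted, and the comma separator between sub-options is an assumption based on the usual SunVTS -o convention; verify it against your SunVTS release.

```sh
# join_opts is a hypothetical helper: it joins its arguments with commas.
join_opts() {
    out=""
    for kv in "$@"; do
        if [ -z "$out" ]; then
            out=$kv
        else
            out="$out,$kv"
        fi
    done
    echo "$out"
}

# Example values only; choose them for the system under test.
OPTS=$(join_opts reserve=25 stride=Custom stridemask=0xC600 lock=Enabled pass=4)

# Print the command for review instead of executing it.
echo "/opt/SUNWvts/bin/sparcv9/ramtest -o $OPTS"
```

Printing the command first makes it easy to check the option names against the syntax before starting an Exclusive mode test.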

 


TABLE 16-3 ramtest Command-Line Syntax

Argument

Description

reserve

Specifies the amount of memory that will not be allocated for testing. reserve represents a percentage of the total physical memory in the system. When the test starts, it probes the total memory present in the system, then tries to allocate (100 - reserve)% of memory. If the allocation or locking does not succeed, the amount of memory is reduced before the retry. Before starting the test, the amount of memory allocated for testing is displayed.

The default value for the reserve option is 20. For UltraSPARC IIIi platforms, the default value is 25.

On low memory systems, keep the reserve value higher to avoid excessive swapping.

For 32-bit booted systems, the reserve value represents the percentage of 4 GB rather than the percentage of total physical memory.
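
The (100 - reserve)% rule can be worked through with a small calculation. This is a minimal sketch: test_mem_mb is a hypothetical helper, not part of SunVTS, and it only estimates the first allocation attempt. The amount actually tested can be lower if allocation or locking fails and the test retries with less memory.

```sh
# test_mem_mb PHYS_MB RESERVE
# Prints the memory (in MB, rounded down) that ramtest would first try to
# allocate, given the physical memory size and the reserve percentage.
test_mem_mb() {
    phys_mb=$1
    reserve=$2
    echo $(( phys_mb * (100 - reserve) / 100 ))
}

test_mem_mb 8192 20   # default reserve: 80% of 8192 MB
test_mem_mb 8192 25   # UltraSPARC IIIi default reserve: 75% of 8192 MB
test_mem_mb 4096 20   # 32-bit boot: percentage is taken of 4 GB
```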

stride

By default, stride is set to Random. It can also be set to Column or Row. With Random, either Row or Column is randomly selected for each pass. The value of stride defines which memory locations are addressed consecutively in certain subtests, in a hardware-dependent manner. All testable memory is still tested. Using a different stride checks coupling among a different set of memory cells; therefore, Random is the recommended value for this option unless both Column and Row are being used explicitly in different instances. For failure analysis (FA), stride can also be set to UserDefined, in which case the test strides the number of banks specified in the userstride option.

stride can also be set to Custom, in which case the stride values are randomly selected from the strides specified in the stridemask value.

userstride

Use this option to set the number of banks the test should stride. During FA, the interleave on the suspect bank can be a good choice. The value is limited to between 1 and 16. Row striding is not possible while using this option.

stridemask

When stride=custom is selected, this value specifies the strides used. Each thread selects one of the stride values from stridemask by selecting one of the bits in the mask.

The bits in the stridemask value represent the least significant bit of the stride. Thus a value of 0x4000 calls for a stride of 16384 (using Bit 14 of the address). Multiple bits can be set, mixing row and column strides.

The value can be specified as a decimal (NNN), hexadecimal (0xNNN), or octal (0NNN) value. The maximum value is 0x400000 (4194304). The default value is 0xC600 which represents strides using Bits 15, 14, 10, and 9.
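
The mapping from mask bits to strides can be checked with a short shell function. This is only an illustrative sketch: decode_stridemask is a hypothetical helper, not a SunVTS command. It relies on shell arithmetic to accept the same decimal, 0x hexadecimal, and leading-0 octal notations described above.

```sh
# decode_stridemask MASK
# Prints one line per bit set in MASK, lowest bit first, with the stride
# (2^bit) that the bit selects.
decode_stridemask() {
    mask=$(( $1 ))
    bit=0
    while [ "$mask" -ne 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            echo "bit $bit -> stride $(( 1 << bit ))"
        fi
        mask=$(( mask >> 1 ))
        bit=$(( bit + 1 ))
    done
}

decode_stridemask 0x4000   # single stride using Bit 14
decode_stridemask 0xC600   # default mask: Bits 9, 10, 14, and 15
```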

lock

By default, memory locking is disabled. To turn it on, set lock to enabled. The test uses ISM to lock the memory in core, which gives 4 MB virtual pages and avoids swapping. Running without locking adds more randomness to the addressing sequence.

On low memory systems, this option can be enabled to avoid excessive swapping.

Solaris 10 users should perform the following steps:

1. Issue the following command:

% prctl $$

If the resource controls project.max-shm-memory and project.max-shm-ids are listed in the output, proceed to the next step. Otherwise, follow the instructions given for Solaris 9.

2. Retrieve the default project with the following command:

% projects -d root

This command outputs the default project name, project1 in this example, for the Super User.

3. Set the resource control project.max-shm-memory with the following command:

% prctl -t privileged -r -n \
project.max-shm-memory -v 9223372036854775807 \
-i project project1

For further information please refer to the Solaris Tunable Parameters Reference Manual applicable to your Solaris release.
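
The three steps above can be sketched as a small script. The helper names has_shm_controls and raise_shm_limit are hypothetical. The sketch assumes that the prctl output lists resource control names as plain text and that projects -d root prints only the project name, as described in step 2. raise_shm_limit is defined but not called; run it as superuser on Solaris 10 only.

```sh
# has_shm_controls reads `prctl $$` output on stdin and succeeds only if
# both resource controls named in step 1 are listed.
has_shm_controls() {
    input=$(cat)
    echo "$input" | grep 'project\.max-shm-memory' >/dev/null &&
        echo "$input" | grep 'project\.max-shm-ids' >/dev/null
}

# raise_shm_limit performs steps 1 to 3 in sequence.
raise_shm_limit() {
    if prctl $$ | has_shm_controls; then
        proj=$(projects -d root)      # step 2: default project name
        # step 3: set project.max-shm-memory for that project
        prctl -t privileged -r -n project.max-shm-memory \
            -v 9223372036854775807 -i project "$proj"
    else
        echo "Resource controls not listed; follow the Solaris 9 instructions." >&2
        return 1
    fi
}
```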

eccmonitor

The ECC Monitor is enabled by default. The ECC error monitor runs as a separate thread in the test. When an ECC error is detected, a message is displayed in the test output. The monitor can be turned off by setting this option to disabled.
The ECC Monitor option is not supported on x86 platforms. An appropriate warning is displayed and the test proceeds based on the other options.

threshold

Specifies the number of ECC errors after which the test stops (if the ECC monitor is running). When the threshold is reached, the test exits with a nonzero exit code. If set to zero, the test still reports all the errors but does not stop. The default threshold is 2.
The ECC Threshold option is not supported on x86 platforms. An appropriate warning is displayed and the test proceeds based on the other options.
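
The stop rule described for threshold, including the special meaning of zero, can be expressed as a short predicate. This is a sketch of the documented behavior only, not the test's actual implementation; should_stop is a hypothetical name.

```sh
# should_stop ERRORS THRESHOLD
# Succeeds (exit status 0) when the test would stop: the threshold is
# nonzero and the ECC error count has reached it. A threshold of 0 means
# errors are reported but never stop the test.
should_stop() {
    errors=$1
    threshold=$2
    [ "$threshold" -ne 0 ] && [ "$errors" -ge "$threshold" ]
}

if should_stop 2 2; then
    echo "threshold reached: test exits with a nonzero exit code"
fi
```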

pass

Specifies the number of passes within a single instance. Increase pass if lock is enabled. This saves the time spent locking the memory each time a new process or instance is spawned by the SunVTS kernel. This pass count has no relation to the system passes in the SunVTS infrastructure, so ramtest will appear to take longer to complete system passes.

ntaloops

Specifies the number of loops of the NTA March (30N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass. It also increases the time taken to complete a pass. The NTA March test attacks stuck-at faults, two-cell coupling faults, and some three-cell coupling faults.

laloops

Specifies the number of loops of the LA March (22N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass. It also increases the time taken to complete a pass. The LA March test attacks coupling and stuck-at faults.

lrloops

Specifies the number of loops of the LR March (14N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass. It also increases the time taken to complete a pass. The LR March test attacks coupling and stuck-at faults.

dratio

The descramble ratio tunes the algorithm used to generate data patterns in ramtest. A descramble ratio of 100 means that all the data patterns generated are descrambled. If the descramble ratio is 0, the test generates data patterns tuned toward bus noise. The default value is 50, which means that half the data patterns are descrambled.

ssloops

Specifies the number of loops of the SS March (22N) test per pass. Increasing the number of loops of any subtest increases the relative time spent on that subtest in each pass. It also increases the time taken to complete a pass. The SS March test attacks simple static faults.
The SS March option is not supported on x86 platforms. An appropriate warning is displayed and the test proceeds based on the other options.




Note - 32-bit tests are located in the bin subdirectory, /opt/SUNWvts/bin/testname.





Note - ECC errors returned by ramtest are detected by the operating system and are logged in the /var/adm/messages file. Review this file for more detailed information regarding errors.





Note - 64-bit tests are located in the /bin/64 directory, or the relative path in which you installed SunVTS. If a test is not present in this directory, then it might be available as a 32-bit test only.