APPENDIX B

Frequently Asked Questions (FAQ) and General Recommendations on SunVTS Usage

This appendix answers frequently asked questions about SunVTS and gives general recommendations on its use.


Introduction

SunVTS is a powerful and versatile tool used by a wide spectrum of users. Although the tool is designed to perform effectively by default, the inherent complexity of hardware testing and the wide range of usage models mean that some guidelines can help you use it more efficiently. The questions in this appendix capture those most frequently asked by the user community and cover some typical usage models.

Because the tool supports all SPARC and x86 platforms that Sun markets, the range of hardware platforms and configurations is very large. It is not possible to provide recommendations for each platform and configuration separately. Recommendations are therefore made for some typical reference platforms and configurations. Match your system to the nearest of these configurations and extrapolate from it.


Frequently Asked Questions (FAQ)

When Should I Use the SunVTS Tool?

The SunVTS tool is primarily targeted at hardware stress testing. Some typical uses of the tool are described below.

The tool can be used as a super application that stresses the system as a whole. Run the tool in the “System Exerciser” mode and, depending on the stress that you want to put on the system, select “High” or “Low”. The tool puts application-level stress on all parts of the system hardware. This is useful when you want to check system stability while all the components are stressed at the same time. Some such situations are:

SunVTS can be a good tool for checking the stability of the system when all parts of the system are under heavy load. This setup can also be used in four corner testing (testing with margined environmental conditions and mechanical vibrations).

Before putting a new system online or into production, SunVTS can be run on the system to provide functional validation.

To measure the emission output from a system, you need to make sure that all parts of the system are running at full cycle. SunVTS can be an effective tool in that situation.

During environmental margin testing, all parts of the system must be kept running at full speed so that any fallouts can be observed.

The tool fits the testing needs of system manufacturing floors. It can be completely automated and can be used all the way from board functional testing to system functional testing.

SunVTS allows tests to be run on the whole system or targeted at a specific component. After a component is repaired or installed, the tool can be run in the “Component Stress” mode to specifically target the functionality of the new device. In this mode the tests are more targeted towards the component under stress, and the tool ensures that the tests do not have to share any system resources with other tests.

What Do the Different Modes and the High/Low Within the Modes Mean? When Should I Use Them?

The tool provides three different modes: Online, System Exerciser, and Component Stress. The three modes are meant to serve three different usage models. Below are the usage models for the three modes:

Online: This mode is meant to be used when you need to test the system in the presence of other applications. Because the tests run alongside other applications, they put less stress on the system resources and have limited coverage.

System Exerciser: This mode provides overall system hardware validation (mentioned in the previous question). All the selected tests run simultaneously, exercising all parts of the system in parallel. Quit other applications before starting the tests in this mode, because system resources are highly stressed.

Component Stress: This mode is meant to perform targeted testing of the chosen component. The selected tests are run sequentially, which gives each test full control of system resources so that it can perform the most stressful testing of the component under focus.

In each of these modes the tests can be run at level “high” or “low”. These levels give you the flexibility to choose the amount of stress and coverage that you want. The stress and coverage applied at each level vary from one test to another. Typically, tests at the “low” level consume fewer resources, finish more quickly, and in many cases have less coverage. Use the low level when the goal is a quick sanity check or less stressful background activity. For example, if the goal is an overall system validation, but you know that the CPU is good and you do not want to test it heavily, you can set the “Processor” testing to “low” and leave the rest of the tests at “high”. Similarly, if you have little time but want a quick sanity check of a component, use the “low” level. When the goal is stressful, exhaustive testing, choose the “high” level.

How Long Should I Run the Tests?

SunVTS gives you the flexibility to run tests based on “Pass” or “Time”. A pass encapsulates the minimum amount of testing necessary before the tool can declare a component to be good. As a general guideline, run at least one pass of the tests that you are interested in.

The time taken by one pass of a test varies with several factors, such as the size of the hardware configuration, the speed of the system, the architecture of the system, and the load on the system. It is therefore not possible to accurately predict the time required to complete at least one pass.

Generally, the minimum time that you need to run the tests includes the following:

If time is not a significant constraint and the goal is to do the maximum testing that the tool can provide, the recommendation is to run the tests for at least 5 passes in both System Exerciser and Component Stress modes.

The Test Did Not Fail. Is Everything Ok with My Hardware?

Not necessarily. In most cases, faults in the hardware are detected and managed by the system (the hardware plus the operating system) and not by the test. SunVTS and its tests are all user-level applications running on top of the operating system. Any error or failure that is detected and managed by the system may not be visible to the tests. However, the occurrence of such errors or failures is logged by the system. You should therefore always check the following logs after a testing session. They give a clear picture of any error or failure that might have happened.

SunVTS test error log: This file contains the failures that are detected and reported by the SunVTS tests. Wherever possible, the message includes possible causes of the error and recommended actions.

These messages are reported by the syslog daemon (syslogd). They are logged in the file /var/adm/messages. These messages are not necessarily errors, but do tell you about any mishaps that could have occurred while tests were running.

Most of the errors that happen on the system are detected, managed, and reported by the system (hardware plus operating system), using Proactive Self Healing technology in Solaris. Because the fault is managed underneath the SunVTS tests, the tests themselves do not see these errors. Solaris provides a utility called “fmdump” that displays the errors and faults that the system detected (see the fmdump man page for more details). After a testing session, look at the output of the following commands to check the errors and faults that happened during the session:

fmdump -eV
fmdump
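The post-session checks described above can be gathered into one step. The following is a minimal sketch, assuming a POSIX shell. The function name collect_post_run_report, the MESSAGES_FILE variable, and the report file names are illustrative and are not part of SunVTS or Solaris. Only the two fmdump invocations and the /var/adm/messages path come from this appendix.

```shell
#!/bin/sh
# Sketch: save the system-side logs to review after a SunVTS session.
# MESSAGES_FILE and the report layout are illustrative assumptions.

MESSAGES_FILE="${MESSAGES_FILE:-/var/adm/messages}"

collect_post_run_report() {
    report_dir="$1"
    mkdir -p "$report_dir" || return 1

    # Fault and error logs from the Solaris fault manager.
    # Skipped on hosts where fmdump is not installed.
    if command -v fmdump >/dev/null 2>&1; then
        fmdump     > "$report_dir/fmdump-faults.txt" 2>&1
        fmdump -eV > "$report_dir/fmdump-errors.txt" 2>&1
    else
        echo "fmdump not found; skipping fault manager logs" \
            > "$report_dir/fmdump-skipped.txt"
    fi

    # syslogd messages: keep a copy for manual review.
    if [ -r "$MESSAGES_FILE" ]; then
        cp "$MESSAGES_FILE" "$report_dir/messages.txt"
    else
        echo "cannot read $MESSAGES_FILE" \
            > "$report_dir/messages-skipped.txt"
    fi

    echo "report written to $report_dir"
}
```

For example, calling collect_post_run_report /tmp/vts-report after a session leaves copies of the logs in that directory for review. The sketch only copies the logs. It does not interpret them, so the entries still need to be read and judged by hand.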

Your Tool Caused My System to Panic. Are You Doing Anything that a Normal Application Would Not Do?

As mentioned in “The Test Did Not Fail. Is Everything Ok with My Hardware?”, SunVTS and its tests all run at the user level, on top of the operating system. They do not use any special test-specific kernel driver, so they can be considered the same as any other user-level application whose job is to test the system hardware. If the system panics when SunVTS is run, that indicates a hardware or system software (operating system plus drivers) problem. The SunVTS tests merely exposed the problem (which is their goal). The actions that led to the panic could potentially be performed by any other application as well, so the panic should be investigated and a root cause analysis done. It is highly unlikely to be a SunVTS issue.

SunVTS Logs Are Showing a Failure, and the Cause Is Being Attributed to the System Running Out of Resources While Running Tests. Is This a Real Error in My System?

Most probably not. Because SunVTS tests stress the system hardware components very heavily, they also tend to be heavy users of system resources such as CPU, memory, system interconnects, and network. This is especially true in the “System Exerciser” mode, when all the tests run simultaneously. Under certain configurations, some tests may have to exit because they lack the resources they need for effective testing. In this case, there is nothing wrong with the system. The recommendation is to disable some of the other tests and run the failed test again.

In some rare cases the lack of system resources can indicate a problem in the system. In that case the test fails consistently when you run it again.
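The distinction between a one-off resource failure and a consistent one can be checked by repeating the run. The following is a minimal sketch, assuming a POSIX shell. The function name fails_consistently is hypothetical, and the command it repeats is a placeholder for whatever you use to re-run the failing test, because this appendix does not describe how to launch a single test from a shell prompt.

```shell
#!/bin/sh
# Sketch: run a command several times and report whether it fails every time.
# The command passed in is a placeholder, not a SunVTS interface.

fails_consistently() {
    attempts="$1"
    shift

    # At least one attempt is needed for the result to mean anything.
    if [ "$attempts" -lt 1 ]; then
        echo "attempts must be at least 1" >&2
        return 2
    fi

    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Count a run as failed when the command exits non-zero.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done

    echo "$failures of $attempts runs failed"

    # Exit status 0 only when every run failed.
    [ "$failures" -eq "$attempts" ]
}
```

For example, fails_consistently 3 ./rerun_test.sh repeats a hypothetical wrapper script three times. A zero exit status means every run failed, which points to a problem worth investigating. Mixed results are more consistent with temporary resource pressure.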

I See that the New SunVTS 7.0 Version Is Very Different from the Version I Have Been Using. What Are the Main Differences?

SunVTS 7.0 is the new generation of the tool and differs significantly from the previous generation. The new version is considerably easier to use and requires less training. The look and feel of the tool has changed completely. This User’s Guide should be helpful. Some major differences include the following:

The Motif-based GUI of the previous generation of SunVTS has been replaced with a new web-based GUI (BUI).

From this version, you can monitor and manage SunVTS sessions on multiple hosts through one BUI. This feature is supported only on the BUI.

Tests in this new SunVTS version have a different meaning and look. SunVTS 7.0 introduces a new software layer between the physical tests (of the previous version) and the user. The new “Tests” are combinations of the earlier physical tests, run in a predetermined way. All the complexity of the option settings of the physical tests is hidden under the software layer of the new Tests. The new Tests are few in number and have very few high-level options. The physical tests are identical between the current and previous generations of the tool, and they can still be run manually from a shell prompt.

I Have Been Using the Previous Generation of SunVTS. I Used to Create Option Files for My Options and Load Them. Can I Use the Same Option Files With the New SunVTS 7.0 Version of the Tool?

No, the older option files are no longer supported in SunVTS 7.0. In SunVTS 7.0, option setting is significantly easier and happens at a higher level. You no longer need to decide and set options for each physical test. These options are set appropriately by the tool. You only need to provide higher-level options such as how much time, how many passes of the test, and what level of testing. These higher-level options can be captured in a file called the “session file”, which can be stored and loaded.

What Is the Meaning of Improved Diagnostics Effectiveness?

The goal of SunVTS 7.0 diagnostics testing is to be effective by default. SunVTS 7.0 diagnostics improves the effectiveness of testing by these methods:

In SunVTS 7.0, the tool is able to test the hardware appropriately from simple, higher-level inputs that you provide. You do not need to know which combination of tests and associated options is run.

Who Do I Contact if I Have More Questions?

Send email to sunvts-ext@sun.com with your questions and suggestions.