7 Finding and Fixing VTCS Problems

This section is about what to do when things go wrong. You have already done your daily tasks as described in "Using the VTCS Dashboard" and the as-needed tasks in "Restoring the CDS from a Backup Copy," and things are still not going well. This is where you find out how to get VTCS working again, starting with the simple problems you are most likely to run across in "Fixing Common Problems."

Note:

Recovering the CDS is primarily an HSC task, but it also has a VSM side. For more information, see "Using PITCOPY to back up the CDS."

Fixing Common Problems

"Common," in this context, just means things that are likely to go wrong despite your best efforts. The way you find out about trouble often doubles back to taking another look at your VTCS Dashboard, and the fixes often reside in your as-needed tasks.

The problems that follow, beginning with VTV mount performance, are ones that you can generally diagnose and fix on your own. If you have made a reasonable effort and things still are not working, it is time to call Customer Support. Some tools, such as traces, are not discussed here because you should use them only under the direction of Oracle service.

Poor VTV Mount Performance

If VTV mounts occur very slowly or not at all, check the following:

  • Are mounts failing on a single VTD? This usually occurs because a host requests a mount of an MVC–resident VTV that VSM cannot recall. If so, do the following:

    • Enter a Display Queue DETail command to check the queued recalls. If a recall is queued waiting for an MVC, it may be in use by another VTCS process, which you can check with Display Active DETail.

    • If the MVC is not in use, next enter an HSC DISPLAY VOLUME command. Is the MVC actually in the ACS? If not, you must reenter the MVC to complete the recall.

    • Next, are RTDs available to mount the MVC to recall the VTV? Enter Display RTD to check RTD availability. If no RTDs are available, use Display on all hosts to check active and queued processes.

      If necessary, use Cancel to cancel processes and free an RTD so the recall can complete. With Cancel, VTCS tries to stop processes without affecting system resources or information; therefore, the cancellation may not occur immediately. For example, VTCS may wait for hardware timeout periods before terminating a process using a specific RTD.

      Note:

      If you cancel a parent request, you stop the parent and all child requests. If you cancel a child request, the parent request continues processing.

      Caution:

      If you cancel a task associated with migration scheduler (either with the MIGrate parameter or by specific process ID), this task terminates but migration scheduler starts another migration task at its next timer interval. You can, however, use migrate-to-threshold to stop automigration by specifying a value greater than the current DBU.

      Tip:

      Setting the MGMTclas statement IMMEDmig parameter to either KEEP or DELETE preferences migration processing (and RTD use for migration) and may increase I/O to the RTDs.

      Also note that you can change the CONFIG MAXMIG and MINMIG parameter settings to rebalance automatic migration tasks with other tasks (such as recall and reclaim) for the RTDs you have defined for each VTSS.

  • Are the mounts failing on multiple VTDs? If so, check the following:

    • Check VTD status with Display VTD.

    • Enter Display Active. If there are no active processes, ensure that VTCS, HSC, all VTSSs, and all communications are functioning normally.

    • Ensure that you have sufficient VTSS space.

    • Check to see if your system is running out of available MVCs or usable MVC space.

    • Raising the low AMT tends to keep more VTVs resident in VTSS space, which may help prevent virtual mounts from failing.

  • If a VTV mount fails, even if VTDs are online, use the MVS VARY command to vary VTDs online, use the MVS UNLOAD command to clear the VTDs, then use the HSC MOUNT and DISMOUNT commands to retry the operation.
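The single-VTD checks above follow a fixed order, which the following Python sketch lays out as a decision function. It is illustrative only: the function name and the three boolean inputs are hypothetical stand-ins for what you read off the Display Active DETail, HSC DISPLAY VOLUME, and Display RTD output, and the sketch does not call VTCS or HSC.

```python
def next_step_for_queued_recall(mvc_in_use: bool,
                                mvc_in_acs: bool,
                                rtd_available: bool) -> str:
    """Suggest the next check for a recall that is queued waiting for an MVC.

    All three flags are hypothetical inputs filled in by hand from the
    corresponding command output; nothing here queries VTCS.
    """
    if mvc_in_use:
        # Display Active DETail shows another VTCS process using the MVC.
        return "MVC is in use by another VTCS process"
    if not mvc_in_acs:
        # HSC DISPLAY VOLUME shows the MVC is not in the ACS.
        return "reenter the MVC"
    if not rtd_available:
        # Display RTD shows no RTD free to mount the MVC.
        return "use Display on all hosts, then Cancel to free an RTD if necessary"
    return "no blocking condition found by these checks"


if __name__ == "__main__":
    print(next_step_for_queued_recall(False, False, True))
```

The order of the tests mirrors the order of the bullets, so an MVC that is simply busy is reported before anyone considers reentering it or cancelling work.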

Poor Migration Performance

If VTV migration occurs very slowly, check the following:

  • Start with Display MIGrate, which shows you, in broad strokes, how well or poorly your various migration tasks are doing. You may be able to rearrange the furniture (for example, raise the MAXMIG/MINMIG values) to get things moving.

  • Ensure that your supply of RTDs and MVCs is in good shape as described in "Checking Virtual Tape Status (Daily)." If you want to get down to bits and bytes, also use Display Queue DETail to check the status of queued processes. If many processes are waiting for RTDs, and you are sharing RTDs with MVS, you may want to vary transports offline to MVS and online to VSM.

Note:

In the JES3 environment, VTV mounts may fail if you have not created and installed the correct User Exit modifications.

Migration Failures

There is only one thing worse than poor migration performance, and that is no migration at all. Fortunately, VTCS provides detailed information about migration failures as described in the following sections:

Message Enhancements

To provide greater detail about migration failures, message SLS6700E is replaced by the following messages:

  • SLS6853E Migration failed Storage Class:stor-clas-name ACS:acs-id VTSS:vtss-name - MVCPool poolname is not defined

  • SLS6854E Migration failed Storage Class:stor-clas-name ACS:acs-id VTSS:vtss-name - no MVCs found for specified media

  • SLS6855E Migration failed Storage Class:stor-clas-name ACS:acs-id VTSS:vtss-name - no MVCs found for specified media/SC/ACS

  • SLS6856E Migration failed Storage Class:stor-clas-name ACS:acs-id VTSS:vtss-name - no usable MVCs found for specified media/SC/ACS

  • SLS6857E Migration failed Storage Class:stor-clas-name ACS:acs-id VTSS:vtss-name - no RTDs for requested media and ACS

  • SLS6858E Migration failed Storage Class:stor-clas-name ACS:acs-id VTSS:vtss-name - all RTDs for requested media and ACS are offline

  • SLS6859E Migration failed Storage Class:stor-clas-name ACS:acs-id VTSS:vtss-name - unknown reason (X'xx')

In addition, message SLS6860I is always issued after any of the preceding messages to provide details of the Storage Class. If applicable, SLS6860I also reports any errors with regard to satisfying the migration requirements:

  • If the MVC Pool is undefined.

  • If the MVC Pool contains none of the specified media.

  • If the MVC Pool contains no free MVCs of the specified media.

  • If the VTSS/ACS has no suitable RTD defined to write the migration MVC.

  • If all suitable RTDs are offline.

The result is that you are now getting more detailed information, and more specific recommendations for fixes when migration failures do occur.
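If you collect these messages from a log, the fixed layout of SLS6853E through SLS6859E makes them easy to pick apart. The following Python sketch assumes the message text appears exactly as in the templates above (single spaces, `Storage Class:`, `ACS:`, `VTSS:` prefixes); the function name, the dictionary, and the sample values are hypothetical and are not part of VTCS.

```python
import re

# Reasons keyed by message ID, paraphrased from the message templates above.
MIGRATION_FAILURE_REASONS = {
    "SLS6853E": "MVC Pool is not defined",
    "SLS6854E": "no MVCs found for specified media",
    "SLS6855E": "no MVCs found for specified media/SC/ACS",
    "SLS6856E": "no usable MVCs found for specified media/SC/ACS",
    "SLS6857E": "no RTDs for requested media and ACS",
    "SLS6858E": "all RTDs for requested media and ACS are offline",
    "SLS6859E": "unknown reason",
}

# Assumed layout: "<id> Migration failed Storage Class:x ACS:y VTSS:z - text"
_PATTERN = re.compile(
    r"(?P<msgid>SLS685[3-9]E) Migration failed "
    r"Storage Class:(?P<storclas>\S+) ACS:(?P<acs>\S+) VTSS:(?P<vtss>\S+) - "
)


def parse_migration_failure(line: str):
    """Return the message fields as a dict, or None if the line does not match."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["reason"] = MIGRATION_FAILURE_REASONS[fields["msgid"]]
    return fields


if __name__ == "__main__":
    # Hypothetical sample line built from the SLS6857E template.
    sample = ("SLS6857E Migration failed Storage Class:SC1 ACS:00 "
              "VTSS:VTSS01 - no RTDs for requested media and ACS")
    print(parse_migration_failure(sample))
```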

Display STORCLas

Display is enhanced with the STORCLas parameter, whose output is:

  • The characteristics of the Storage Class (ACS, MVC Pool, and Media).

  • VTVs awaiting migration to the Storage Class from any VTSS.

  • Requirements of the MVCs to be used for migration.

  • The device type(s) of the RTDs needed to write to the migration MVCs.

  • Any errors with regard to satisfying the migration requirements.

Once again, VTCS provides information about a critical element (Storage Classes) in the migration scenario.

Enhanced MVC Pool Validation

Validation of MVC Pools is enhanced to check for common set-up errors:

  • Has at least one valid MVC Pool been defined? If not, message SLS6845E is issued. VTCS functionality is severely degraded because no migrations can occur. If you receive this message, you must define appropriate MVC Pools. See the next bullet.

  • Does the default MVC Pool (DEFAULTPOOL) exist? DEFAULTPOOL is used when migrating to a Storage Class that does not specify a Named MVC Pool and in error situations with Storage Class !ERROR. If DEFAULTPOOL does not exist, message SLS6846W is issued.

    To indicate that migrations to a Storage Class should use a particular MVC Pool, code MVCPool(pool-name) on the STORCLAS statement. If MVCPool(pool-name) is not coded, VTCS treats the STORCLAS as though MVCPool(DEFAULTPOOL) were coded.

Enhanced Storage Class Validation

To continue in this theme, validation of Storage Classes is enhanced to check for common set-up errors:

  • If you specify a Named MVC Pool on a Storage Class (STORCLAS NAME(stor-clas-name) MVCPOOL(poolname)), VTCS checks that the Named MVC Pool is defined. Therefore, if you code STORCLAS NAME(stor-clas-name) MVCPOOL(poolname), ensure that the Named MVC Pool exists. If not, VTCS issues message SLS6848W. If you get this message, define the Named MVC Pool, change your Storage Class definition, or both.

  • Similarly, if you do not specify a Named MVC Pool on a Storage Class (STORCLAS NAME(stor-clas-name)), VTCS checks that the DEFAULTPOOL is defined. Therefore, if you code STORCLAS NAME(stor-clas-name), ensure that there is at least one MVCPOOL statement that does not create a Named MVC Pool. If not, VTCS issues message SLS6846W. If you get this message, code at least one MVCPOOL statement that does not create a Named MVC Pool, change your Storage Class definition, or both.

  • If you specify an MVC media type on a Storage Class (STORCLAS NAME(stor-clas-name) MEDIA(media-type)), VTCS checks that the MVC Pool contains media of type media-type (if a Named MVC Pool is not specified, DEFAULTPOOL is implied). If not, VTCS issues message SLS6849W. Ensure that the media type exists in the corresponding pool, change your Storage Class definition, or both.

  • If you specify an ACS and media type on a Storage Class (STORCLAS NAME(stor-clas-name) ACS(acs-id) MEDIA(media-type)), VTCS checks that there are RTDs in the specified ACS compatible with the specified media type. If not, VTCS issues message SLS6851W. Ensure that the required RTD type exists in the specified ACS, change your Storage Class definition, or both.

  • If you specify a media type without a specific ACS on a Storage Class (STORCLAS NAME(stor-clas-name) MEDIA(media-type)), VTCS checks that there are RTDs in the configuration compatible with the specified media type. If not, VTCS issues message SLS6851W. Ensure that the required RTD type(s) exist in the configuration, change your Storage Class definition, or both.
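The checks in this list can be read as one small validation routine. The Python sketch below models them for a single Storage Class so you can predict which warning to expect. It is an illustration, not how VTCS implements validation: the `StorageClass` class, the two dictionaries, and the media names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class StorageClass:
    name: str
    mvcpool: Optional[str] = None  # None: DEFAULTPOOL is implied
    media: Optional[str] = None
    acs: Optional[str] = None


def validate_storage_class(sc: StorageClass,
                           pools: Dict[str, Set[str]],
                           rtd_media_by_acs: Dict[str, Set[str]]) -> List[str]:
    """Return the warning IDs the text above associates with each failed check.

    pools maps a pool name to the media types it contains;
    rtd_media_by_acs maps an ACS ID to the media types its RTDs can handle.
    """
    warnings: List[str] = []
    pool_name = sc.mvcpool or "DEFAULTPOOL"
    if pool_name not in pools:
        # Named pool missing: SLS6848W; default pool missing: SLS6846W.
        warnings.append("SLS6848W" if sc.mvcpool else "SLS6846W")
    elif sc.media is not None and sc.media not in pools[pool_name]:
        warnings.append("SLS6849W")
    if sc.media is not None:
        if sc.acs is not None:
            compatible = rtd_media_by_acs.get(sc.acs, set())
        else:
            # No ACS coded: any RTD in the configuration will do.
            compatible = set().union(*rtd_media_by_acs.values())
        if sc.media not in compatible:
            warnings.append("SLS6851W")
    return warnings


if __name__ == "__main__":
    pools = {"DEFAULTPOOL": {"MEDIA_A"}, "POOL1": {"MEDIA_B"}}
    rtds = {"00": {"MEDIA_A"}, "01": {"MEDIA_B"}}
    print(validate_storage_class(StorageClass("SC4", media="MEDIA_A", acs="01"),
                                 pools, rtds))
```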

RTD/MVC Failures

At first, you may not know if you are looking at a media or drive failure. That is, if VTCS detects read/write errors on an MVC, VTCS swaps the MVC to another RTD. If VTCS detects no further read/write errors on the MVC, VTCS assumes that the first RTD is in error.

Message SLS6662A indicates that an RTD is in maintenance mode, and this status is also reported on Display RTD output. An RTD in maintenance mode is typically in error and requires assistance from your hardware operations or service personnel. Note that an RTD in recovery mode is initializing (when varied online, for example), and typically is not in error.

If a failed RTD cannot be quickly repaired or if the failed RTD is attached to a remote ACS, you may want to remove the RTD from your configuration to prevent attempts to allocate that RTD. Remove the RTD statement for the RTD and rerun CONFIG.

Caution:

In a dual-ACS configuration (two ACSs connected to a single VTSS), ensure that you do not allow all RTDs in either ACS to be unavailable to the VTSS for an extended period. If no RTDs are available in that ACS, migrates to or recalls from that ACS cannot occur, and the VTSS space can fill up. In addition, this condition can also cause stalled migrations to RTDs in the other ACS.

In a dual-ACS configuration, therefore, if you must make all RTDs in an ACS unavailable for an extended period, remove the RTDs from the configuration as described above.

Is it a Bad MVC?

Suppose you ran through the checklist for RTD problems above and found nothing wrong, and you also did everything you reasonably could to make more MVC space available. You then compared the volsers on the MVC Summary Report to an HSC Volume Report and confirmed that the MVCs actually are in the ACS. (If any MVCs are not listed on the HSC Volume Report, reenter or replace them.)

At this point it really does look like a media problem. You can see what kind of media problem by looking at the MVC and VTV reports described in "Checking Virtual Tape Status (Daily)." That section talks about some of the fixes for the most straightforward MVC anomalies. The following is an exhaustive list of the MVC statuses you do not want to see on your MVC and VTV reports, and what to do about them:

BROKEN

This is a generic error that indicates the MVC, drive, or combination of the two has a problem. VTCS attempts to de-preference MVCs with this state. In general, to clear this state:

  • If the MVC caused the problem, use a DRAIN(EJECT) command to remove the MVC from service.

  • If the RTD caused the problem, use the MVCMAINT utility to reset the MVC state.

Note also that one or more of the following messages is issued for BROKEN status: SLS6686, SLS6687, SLS6688, SLS6690. For detailed recovery procedures for these messages, see VTCS Messages and Codes.

DATA CHECK

A data check condition has been reported against this MVC. VTCS attempts to de-preference MVCs with this state. To clear this state:

  • If all VTVs on the MVC are duplexed, use MVCDRain on the MVC without the Eject option. This recovers all VTVs and removes the MVC from service.

  • If not all VTVs on the MVC are duplexed, run a VTCS AUDIT against the MVC. The audit will probably fail. After the audit, run MVCDRain without the Eject option. This recalls the VTVs before the data-check area in ascending block-id order and the VTVs after the data-check area in descending block-id order. Processing the VTVs in this sequence ensures that VTCS recovers as many VTVs as possible from the media. You then need to recreate the data for any VTVs still on the MVC.

After clearing data checks, remove and replace MVCs with data check errors as described in "Permanently Removing MVCs." This procedure also tells how to remove an MVC from VTCS use and return it to Nearline operations.
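The recall sequence described above for an MVC whose VTVs are not all duplexed (ascending block-id order before the data-check area, descending order after it) can be illustrated with a short Python sketch. It is a simplified model, not VTCS code: the function name is hypothetical, the data-check area is reduced to a single block ID, each VTV is represented only by its starting block ID, and a VTV that starts exactly at the data-check block is left out as an assumption.

```python
from typing import List, Tuple


def recall_order(vtvs: List[Tuple[str, int]], data_check_block: int) -> List[str]:
    """Order VTV volsers for recall around a data check.

    vtvs holds (volser, starting block ID) pairs. VTVs before the data check
    come first in ascending block-ID order, then VTVs after it in descending
    block-ID order, so the drive approaches the bad area from both sides.
    """
    before = sorted((v for v in vtvs if v[1] < data_check_block),
                    key=lambda v: v[1])
    after = sorted((v for v in vtvs if v[1] > data_check_block),
                   key=lambda v: v[1], reverse=True)
    return [volser for volser, _ in before + after]


if __name__ == "__main__":
    # Hypothetical volsers and block IDs.
    print(recall_order([("V1", 10), ("V2", 20), ("V3", 40), ("V4", 50)], 30))
```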

DRAINING

The MVC is either currently being drained or has been the subject of a failed MVCDRain.

IN ERROR

An error occurred while the MVC was mounted.

INITIALIZED

The MVC has been initialized.

LOST - FAILED TO MOUNT

VTCS attempted to mount an MVC and the mount did not complete within a 15-minute time-out period. VTCS is attempting to recover from a situation that may be caused by hardware problems, HSC problems, or by the MVC being removed from the ACS. VTCS attempts to de-preference MVCs with this state.

If VTCS does perform a subsequent successful mount of an MVC with LOST(ON) state, VTCS sets the state to LOST(OFF).

Determine the cause of the error and fix it. You can also use the VTCS MVCMAINT utility to set LOST(OFF) for the following events:

  • LOST(ON) was set due to LSM failures or drive errors that have been resolved.

  • LOST(ON) was set because the MVC was outside the ACS and has been reentered.

MARKED FULL

The MVC is full and is not a candidate for future migrations.

MOUNTED

The MVC is mounted on an RTD.

NOT-INITIALIZED

The MVC has been defined through the CONFIG utility, but has never been used.

READ ONLY

The MVC has been marked read-only because of one of the following conditions:

  • The MVC is the target of an export or consolidation process. The read-only state protects the MVC from further updates.

  • The MVC media is set to file protect. Correct the error and use the MVCMAINT utility to set READONLY(OFF).

  • The MVC does not have the appropriate SAF rules set to enable VTCS to update the MVC. Correct the error (for more information, see "Defining A Security System User ID for HSC, SMC, and VTCS" in Installing ELS) and use the MVCMAINT utility to set READONLY(OFF).

BEING AUDITED

The MVC is either currently being audited or has been the subject of a failed audit. If the audit failed, VTCS does not use the MVC for migration. To clear this condition, rerun the AUDIT utility against this MVC.

LOGICALLY EJECTED

The MVC has either been the subject of an MVCDRain Eject or the MVC was ejected for update by a RACROUTE call. The MVC is not used again for migration or recall. To clear this condition, use MVCDRain against the MVC without the Eject option.

RETIRED

The MVC is retired. VTCS recalls from, but does not migrate to, the MVC. Replace the MVC as soon as possible.

WARRANTY HAS EXPIRED

The MVC's warranty has expired. VTCS continues to use the MVC. You should start making plans to replace the MVC when it reaches Retired state.

INVALID MIR

VTCS has received status from an RTD to indicate the MIR (media information record) for a 9x40 media is invalid. An invalid MIR does not prevent access to data but may cause significant performance problems while accessing records on the tape. The MVC is not capable of high-speed searches on areas of the tape that do not have a valid MIR entry.

VTCS attempts to de-preference MVCs with this condition. For recalls, if the VTV resides on multiple MVCs, VTCS selects MVCs with valid MIRs ahead of MVCs with invalid MIRs. VTCS avoids using MVCs with invalid MIRs for migration, unless the migration is at the beginning of the tape. Migrating from the beginning of tape corrects the MIR.

VTCS detects the invalid MIR condition at either mount time or dismount time. If detected at mount time and the operation can be completed with another MVC, VTCS dismounts the first MVC and selects the alternate MVC. Note that VTCS has only a limited ability to switch to an alternate MVC. That is, it is mainly used for migrate and virtual mount.

For MVCs with invalid MIRs, determine the cause of the error, which may be caused by media or drive problems, and fix the error.

To recover an MVC with an invalid MIR, run the INVENTRY utility. For example, to recover MVC707, enter:

INVENTRY MVCID(MVC707) 
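For quick reference, the statuses above that call for operator action can be condensed into a lookup table. The Python sketch below does that. The dictionary and function names are hypothetical, the action strings are short paraphrases of the descriptions above rather than command syntax, and statuses with no stated recovery action fall through to a default.

```python
# Status -> summary of the action described above (paraphrased, not syntax).
MVC_STATUS_ACTIONS = {
    "BROKEN": "drain with eject if the MVC is at fault; "
              "reset the state with MVCMAINT if the RTD is at fault",
    "DATA CHECK": "audit first unless all VTVs are duplexed, drain without "
                  "eject, then replace the MVC",
    "LOST - FAILED TO MOUNT": "fix the cause, then set LOST(OFF) with MVCMAINT "
                              "if VTCS has not already done so",
    "READ ONLY": "correct file protect or SAF rules, then set READONLY(OFF) "
                 "with MVCMAINT",
    "BEING AUDITED": "if the audit failed, rerun AUDIT against the MVC",
    "LOGICALLY EJECTED": "run MVCDRain without the Eject option",
    "RETIRED": "replace the MVC as soon as possible",
    "WARRANTY HAS EXPIRED": "plan to replace the MVC",
    "INVALID MIR": "fix the media or drive cause, then run INVENTRY",
}

DEFAULT_ACTION = "no specific recovery action is described for this status"


def action_for(status: str) -> str:
    """Look up a status as printed on a report, ignoring case and extra spaces."""
    key = " ".join(status.upper().split())
    return MVC_STATUS_ACTIONS.get(key, DEFAULT_ACTION)


if __name__ == "__main__":
    print(action_for("retired"))
```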

Recovering an MVC with a Data Check

This is a very specific instance of the general "bad MVC" woes, and you know it is required when you see an MVC data check error on your MVC and VTV reports.

To recover an MVC with a Data Check:

  1. Run an MVC audit against the MVC.

    The audit attempts to read the VTV metadata sequentially from the MVC. The audit fails when it encounters the data check, which leaves the MVC in an auditing state. This prevents VTCS from selecting this MVC for output.

  2. Run an MVCDRain Eject for the MVC.

    This causes all the available VTVs to be recalled to a VTSS and then remigrated to a new error-free MVC. This logically removes the MVC from the MVC pool.

    Note:

    • Due to the error status of the MVC, VTCS recalls VTVs from alternate MVCs if possible.

    • If VTVs must be recalled from the MVC in error (no other copies available), then:

      • VTVs before the data check area are recalled in ascending block ID order.

      • VTVs after the data check area are recalled in descending block ID order.

  3. Determine if any VTVs could not be recovered from the MVC.

    Run an MVC Detail report for the MVC. If any VTVs are still reported as being on the MVC, then these VTVs are not recoverable; you must use other methods to recover your data.

  4. Manage the defective MVC by doing one of the following:

    Replace the defective MVC with an initialized tape volume with the same internal and external labels:

    1. Enter the HSC EJECT command for the defective MVC.

    2. Enter the HSC ENTER command for the replacement MVC.

    3. Initialize the tape as required.

    4. Enter HSC AUDIT for the new MVC.

    5. Run an MVCDRAIN (no EJECT) to return the MVC to the MVC pool.

    Remove the MVC from the system:

    1. Enter the HSC EJECT command for the defective MVC.

    2. Edit the MVC pool definitions to remove the defective MVC from the pool.

    3. Enter a VT MVCDEF on all active hosts to activate the new MVC pool definitions.

Using the RTV Utility

The RTV utility is another item you are probably only going to use after talking with Oracle service, because RTV is designed to read VTV data directly from an MVC without any assistance from VTCS, for example, in the case that you really have lost the CDS.

RTV is a standalone utility. It reads a VTV from an MVC, decompresses the VTV, then writes the data to a single output tape (real tape volume) so the data can be read by user applications. Because RTV is standalone, you can run it when VSM is down but the MVS system is up.

What the RTV Utility Can Recover

The RTV utility can recover:

  • All or specified VTVs from a specified MVC. If you do not know the location of the most current version of a VTV on the MVC, specify only the VTV volser, and RTV converts the most current version of the VTV it finds on this MVC.

  • A VTV at a specified block ID on a specified MVC. The LISTONLY parameter listing supplies a Block ID value that you can use as input to the RTV utility to convert a VTV to a Nearline volume. Specifying the volser and Block ID speeds positioning time.

  • A VTV specified by logical data set number on a specified MVC. Specifying the volser and logical data set number will have a much longer positioning time compared to specifying volser and Block ID. Using volser and Block ID is the preferred method to access a single VTV.

Note:

If more than one VTV is specified, or if no block-id or FILEnum parameter is specified, the entire MVC is read and the MVC contents are displayed as part of the output. Reading the entire MVC is necessary to ensure that only the most current copy of a VTV is decompressed.
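The rule in this note is a simple predicate, shown here as a Python sketch so you can tell in advance whether a given request will scan the whole MVC. The function and argument names are hypothetical; only the rule itself comes from the note.

```python
from typing import Optional, Sequence


def rtv_reads_entire_mvc(vtvs: Sequence[str],
                         block_id: Optional[str] = None,
                         filenum: Optional[int] = None) -> bool:
    """True when, per the note above, RTV must read the entire MVC."""
    more_than_one_vtv = len(vtvs) > 1
    no_position_given = block_id is None and filenum is None
    return more_than_one_vtv or no_position_given


if __name__ == "__main__":
    # A single VTV with a block ID can be positioned to directly.
    print(rtv_reads_entire_mvc(["VTV200"], block_id="8EA484AB"))
```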

General Usage Guidelines  

  • The output volume that contains the converted VTV(s) must be at least the size of your maximum VTV size (400 MB, 800 MB, 2 GB, or 4 GB) to ensure that it can contain an individual VTV.

  • The VTCS MVC and VTV reports provide information to specify which copy of a VTV you want RTV to recover. Ensure that you have a current copy of these reports before you run the RTV utility. In addition, to help identify the VTVs you want to convert, you can use the LISTONLY parameter to produce a list of the VTVs on an MVC.

    Because multiple copies of the same VTV can exist on the same or different MVCs, carefully study your VTV and MVC reports and LISTONLY listings to ensure that you are using the correct MVC to convert the most current copy of a VTV.

  • The RTV utility does not update the system catalog or TMC with information about the converted volumes; you must do this manually.

Security Considerations

  • Either you must have read access to both the VTVs you want to convert and the MVC that contains these VTVs, or your system's security application must not be running. Otherwise, the conversion fails.

  • Ensure that you APF authorize the RTV utility load library.

  • RTV makes no attempts to bypass any TMS protection. All RTV tape mounts are subject to full TMS control.

Note:

Because the RTV utility must be capable of rewriting the tape standard labels on the output unit and positioning over label information on the input unit, Dynamic Allocation is used to invoke bypass label processing (BLP) on the tape volumes. This requires that the library that contains the SWSRTV executable code be APF authorized.

JCL Examples

The following JCL examples show how to use the RTV utility.

Listing the VTVs on an MVC

The following shows example JCL to list the VTVs on MVC MVC001.

//JOBVREC  JOB (account),programmer
//RUNRTV   EXEC PGM=SWSRTV,PARM='MIXED'
//STEPLIB  DD DSN=hlq.SEALINK,DISP=SHR
//SLSPRINT DD SYSOUT=A
//SLSIN    DD *
  RTV MVC(MVC001) INUNIT(/1AB4) LISTONLY
/*
//

Converting a Single VTV by Specifying Its Volser

The following shows example JCL to run the RTV utility to convert VTV VTV200 on MVC MVC001, which is mounted on a 3490E transport. The output (converted VTV VTV200) goes to the output volume mounted on transport 280, and RTV copies the VTV VOLID from the VTV to the output volume.

//JOBVREC  JOB (account),programmer
//RUNRTV   EXEC PGM=SWSRTV,PARM='MIXED'
//STEPLIB  DD DSN=hlq.SEALINK,DISP=SHR
//SLSPRINT DD SYSOUT=A
//SLSIN    DD *
  RTV MVC(MVC001) INUNIT(3490E) VTV(VTV200) CPYVOLID OUTUNIT(280)
/*
//

Converting a Single VTV by Specifying Its Volser and Block ID

The following shows example JCL to run the RTV utility to convert VTV VTV200 at block ID x'8EA484AB' on MVC MVC001, which is mounted on a 3490E transport. The output (converted VTV VTV200) goes to the output volume mounted on transport 480.

//JOBVREC  JOB (account),programmer
//RUNRTV   EXEC PGM=SWSRTV,PARM='MIXED'
//STEPLIB  DD DSN=hlq.SEALINK,DISP=SHR
//SLSPRINT DD SYSOUT=A
//SLSIN    DD *
  RTV MVC(MVC001) INUNIT(3490E) VTV(VTV200) BLOCK(8EA484AB) OUTUNIT(480)
/*
//
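If you prepare many RTV jobs, you may find it convenient to build the SLSIN control statement from a few values. The Python sketch below assembles only the parameters that appear in the three examples above (MVC, INUNIT, VTV, BLOCK, CPYVOLID, OUTUNIT, LISTONLY), in the order the examples use. The helper itself is hypothetical and does no validation against the full RTV syntax, so check the result against the utility reference before submitting a job.

```python
from typing import Optional


def rtv_statement(mvc: str,
                  inunit: str,
                  vtv: Optional[str] = None,
                  block: Optional[str] = None,
                  outunit: Optional[str] = None,
                  cpyvolid: bool = False,
                  listonly: bool = False) -> str:
    """Build an RTV control statement from the parameters shown in the examples."""
    parts = ["RTV", f"MVC({mvc})", f"INUNIT({inunit})"]
    if listonly:
        # Listing example: no conversion parameters follow LISTONLY.
        parts.append("LISTONLY")
        return " ".join(parts)
    if vtv is not None:
        parts.append(f"VTV({vtv})")
    if block is not None:
        parts.append(f"BLOCK({block})")
    if cpyvolid:
        parts.append("CPYVOLID")
    if outunit is not None:
        parts.append(f"OUTUNIT({outunit})")
    return " ".join(parts)


if __name__ == "__main__":
    print(rtv_statement("MVC001", "3490E", vtv="VTV200",
                        block="8EA484AB", outunit="480"))
```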