Cobol Converter

The role of this tool is to convert Cobol programs running on the source platform (z/OS, IBM Cobol dialect) into Cobol programs running on the target platform (UNIX or Linux, Micro Focus Cobol dialect) while maintaining the same application behavior. The conversion is performed in the context of other components translated or generated by the other Oracle Tuxedo Application Rehosting Workbench tools.

The purpose of this document is to describe precisely all the features of the Cobol Converter.

Overview of the Cobol Converter

Scope

The Refine Cobol converter handles the following transformations in a single pass:

•

Cobol dialectal correction (from z/OS Cobol to Micro Focus Cobol).

•

Adaptation to target platform UNIX interfaces (e.g., replacement of SEQUENTIAL files by more efficient LINE SEQUENTIAL files or handling of printer control characters).

•

Embedded SQL conversion (from DB2 to Oracle DBMS), including the interface with the host program (SQLCODE, host variables…).

•

Any code adaptation resulting from supported reengineering options such as file-to-RDBMS conversion or component renaming.

•

Normalization of the EXEC CICS statements and the programs to make them suitable for the run-time EXEC CICS preprocessor.

The resulting programs can be compiled and run on the target platform with the same behavior as on the source platform, except in some cases detailed in Restrictions and limitations.

Inputs

The Cobol converter takes as input:

•

The abstract syntax trees of the Cobol programs to convert (one or more), stored in the POB files produced by the Rehosting Workbench Cataloger;

•

A number of configuration files:

•

(mandatory) the system description file, which describes where to locate the Cobol programs on the migration platform file system.

•

(mandatory) the conversion configuration file, which specifies the general parameters of the conversion and gives the location of the specific configuration sub-files.

•

(optional) specific configuration sub-files such as the variable-renaming table, the component-renaming table, the file-to-RDBMS conversion table, etc.; see Main conversion configuration file.

Outputs

It produces as output:

•

An execution log.

•

Converted Cobol components in their textual representation: programs and copy files.

•

Additional files such as dependence files for make.

Conversion phases

The Cobol conversion process is logically divided into two phases:

•

Individual conversion: this phase acts "locally" and separately on each program. It converts the AST in its internal form, and then prints out the converted form of both the main source file (in the target directory) and the copy files (in a private subdirectory of the target directory). It also runs the post-translator on each of these files, if requested.

•

Copy file reconciliation: the copy files have been converted separately for each program, but the objective is a single set of copy files for all programs. The reconciliation process builds this set by "factorizing" all privately-converted copy files and storing them in a common file base, taking care to not mix different versions of the same copy file (these different versions may come from context-dependent conversions).

The individual conversion phase can run concurrently on several programs but, since the copy-reconciliation phase updates the global copy file base, it must run as a single process, possibly incrementally. This dictates the possible execution modes of the Cobol converter; see Command-line syntax for more details.

Restrictions and limitations

By definition, the Rehosting Workbench Cobol Converter accepts only those programs accepted by the Rehosting Workbench Cobol Cataloger, but imposes no further restriction on entry.

The resulting programs can be compiled and run on the target platform with the same behavior as on the source platform, except for the following potential pitfalls for which we take no responsibility:

•

Some target-compiler options, such as IBMCOMP, must be set as mandated in this document (see Compiler options).

•

Problems with data files which would be incorrectly migrated because incorrect information was supplied to the data-migration tools.

•

Differences of behavior in "incorrect" pieces of code, such as accessing an array with an out-of-bounds index value, or pieces of code which work "by chance", such as code assuming that two records which are adjacent in the Working-Storage Section are also adjacent in memory.

•

Issues related to the semantic nature of some Cobol variables or requiring a deep, semantic understanding of the program. Some of these issues are described later in this chapter.

Use of COMP-5 type on Linux platforms

The Oracle Tuxedo Application Rehosting Workbench Cobol converter translates "portable" binary integer types (BINARY, COMP, COMP-4) to the native binary type COMP-5. This ensures compatibility with sub-programs written in C, such as those in the transaction processing framework (see the CICS section of the Oracle Tuxedo Application Rehosting Workbench Reference Guide), and improves execution performance. This may cause problems when the target platform does not have the same "endianness" as the source platform, in particular on Linux and Intel platforms (the Intel processor line is little-endian whereas the zSeries processor is big-endian; most other processors, such as IBM pSeries and HP PA-RISC, are also big-endian). In this case, the order of bytes in a binary variable is reversed with respect to the source platform. This can lead to different behavior when such a binary variable is redefined by a character (PIC X) variable and this redefinition is used to access the individual bytes in the binary variable. For example:

Listing 6‑1 Binary field manipulation example

    WORKING-STORAGE SECTION.
    01 FILLER.
       02 BINVAR PIC S9(9) COMP.
       02 CHARVAR REDEFINES BINVAR PIC X(4).
    PROCEDURE DIVISION.
        ...
        MOVE ... TO BINVAR
        IF CHARVAR(1:1) = ... THEN ...

On a big-endian machine such as the z/OS hardware, CHARVAR(1:1) contains the most significant (higher-order) byte of BINVAR. On a little-endian machine, however, with the same code, CHARVAR(1:1) contains the least significant (lower-order) byte of BINVAR; this is a definite change of behavior and will probably lead to different observable results. The Rehosting Workbench Cobol Converter is unable to detect and fix all occurrences of this situation (the example above is "obvious", but there exist many far more complex cases); these must be handled manually.
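The byte-order effect described above can be reproduced outside Cobol. The following Python sketch is only an illustration: the value `0x01020304` is a hypothetical content for BINVAR, and `struct` is used to model the 4-byte storage of a PIC S9(9) binary item on each kind of machine.

```python
import struct

# Hypothetical value moved into BINVAR (PIC S9(9) COMP, 4 bytes).
value = 0x01020304

# Storage layout on a big-endian machine (e.g. z/OS hardware).
big = struct.pack(">i", value)
# Storage layout on a little-endian machine (e.g. Intel-based Linux, COMP-5).
little = struct.pack("<i", value)

# CHARVAR(1:1) is simply the first byte in storage order.
first_byte_big = big[0]
first_byte_little = little[0]

print(f"big-endian    CHARVAR(1:1) = {first_byte_big:#04x}")
print(f"little-endian CHARVAR(1:1) = {first_byte_little:#04x}")
```

The same MOVE therefore exposes the high-order byte through the redefinition on one platform and the low-order byte on the other.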

Use of COMP-5 type and the TRUNC compiler option

As mentioned in the previous paragraph, the Rehosting Workbench Cobol converter translates portable binary integer types (BINARY, COMP, COMP-4) to the native binary type COMP-5. In addition to endianness problems, this may cause another kind of difference of behavior for applications which were compiled with the (default) TRUNC(STD) option on the source platform – this option corresponds to the TRUNC option of Micro Focus Cobol. Indeed, both on the source and on the target platforms, the portable binary types obey this option whereas the native type does not. In our opinion, the probability of observing a real difference of behavior is very low, because in general, binary-integer variables are used to hold "control" values (loop counters, array indices, etc.) rather than applicative values. In any case, if differences of behavior are observed, it is up to the Rehosting Workbench user to deal with them, either by accepting them or by manually correcting them, for instance by returning a few selected variables to their original binary type.
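To make the truncation difference concrete, here is a deliberately simplified Python model. It assumes a PIC S9(4) binary item stored in 2 bytes and ignores ON SIZE ERROR handling; the function names are hypothetical and do not correspond to anything in the compilers or in the Rehosting Workbench.

```python
def store_trunc_std(value: int, digits: int) -> int:
    """Model a store into a portable binary item under TRUNC(STD):
    high-order decimal digits beyond the PICTURE are dropped."""
    sign = -1 if value < 0 else 1
    return sign * (abs(value) % 10 ** digits)


def store_native(value: int, size_bytes: int) -> int:
    """Model a store into a native (COMP-5) item: only the binary
    field width limits the value (two's-complement wrap-around)."""
    bits = 8 * size_bytes
    wrapped = value & ((1 << bits) - 1)
    if wrapped >= (1 << (bits - 1)):
        return wrapped - (1 << bits)
    return wrapped


# Hypothetical case: ADD 1 TO an item declared PIC S9(4) that holds 9999.
counter = 9999 + 1
portable_result = store_trunc_std(counter, 4)
native_result = store_native(counter, 2)

print(portable_result)
print(native_result)
```

In this model the portable type wraps to 0 while the native type keeps 10000, which is the kind of difference that only shows up when a binary variable exceeds the range implied by its PICTURE.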

EBCDIC-to-ASCII conversion issues

For reasons of efficiency and compatibility with native utility programs on the target platform—for instance, simply browsing through a data file on the terminal—one of the fundamental design choices for the migration performed by the Rehosting Workbench is to convert textual (alphabetic) data from the native character set of the source platform (EBCDIC, or one of its variants) into the native character set of the target platform (ASCII, or one of its variants). This common-sense decision however has important consequences on the migration process:

•

The migration of the data itself must be handled with great care. In particular, the actual EBCDIC-to-ASCII conversion table used for your specific project must take into account the particular non-standard characters you use on your screens (e.g. accented letters, the Euro or pound sign, etc.), together with their encoding in the source character set, and make sure that they are appropriately transcoded into the corresponding characters in the target character set. See the Oracle Tuxedo Application Rehosting Workbench Process Guide and the Rehosting Workbench Data Migration Tools documentation in this guide and in the Oracle Tuxedo Application Rehosting Workbench Reference Guide.

•

An important issue in the data migration process is that the EBCDIC-to-ASCII conversion must not apply to non-textual data such as binary, packed-BCD or floating-point data. This requires that each data structure (file, opaque SQL column, etc.) be described by a detailed and precise Cobol record exhibiting all these non-text fields. See the Rehosting Workbench Process Guide and the Rehosting Workbench Data Migration Tools documentation.

•

The same EBCDIC-to-ASCII conversion is applied to the source file of the components, so that they appear visually correct on the target platform. This is important for their correct maintenance. This also means that it is applied to the contents of literal character strings in the programs or JCLs.

In most cases, if you comply correctly with these directives, the resulting application will run smoothly. There is one issue, however, for which no efficient solution can be found: the collating sequences of the EBCDIC and ASCII character sets are not quite the same, and this may lead to different behavior in sorting and string comparisons. In most cases, there is no problem, because you sort or compare "homogeneous" data such as names (alphabetic) or dates (numeric); only special characters such as accented letters may sort a bit differently but still satisfactorily. However, when you sort or compare data which contains mixed letters and digits, you may find differences of behavior, because letters sort before digits in EBCDIC and after digits in ASCII. One typical example is when you compute a key for some type of data (account number, etc.) using both digits and letters. The Cobol converter cannot handle such issues because these are dynamic issues related to the contents of Cobol variables, not static issues related to their declarations.
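The collating difference is easy to observe with a small experiment. The Python sketch below sorts a few hypothetical mixed letter/digit keys by their byte values in ASCII and in EBCDIC; code page `cp037` is used here only as an assumed EBCDIC variant, and your project may use a different one.

```python
# Hypothetical keys mixing letters and digits (e.g. account identifiers).
keys = ["A1", "1A", "B2", "2B"]

# Byte-wise ordering on the target platform (ASCII).
ascii_order = sorted(keys, key=lambda k: k.encode("ascii"))

# Byte-wise ordering on the source platform (EBCDIC, code page 037 assumed).
ebcdic_order = sorted(keys, key=lambda k: k.encode("cp037"))

print("ASCII :", ascii_order)
print("EBCDIC:", ebcdic_order)
```

Keys starting with a digit come first in the ASCII ordering and last in the EBCDIC ordering, so any logic that depends on the relative order of such keys can behave differently after migration.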

Literal constants: characters or numbers?

As mentioned above, string or character literals in Cobol programs, including hexadecimal string literals, are subject to EBCDIC-to-ASCII conversion. This is legitimate when these literals denote texts or pieces of text. Sometimes, however, such constant values denote (numeric) codes such as file status codes, condition codes, CICS-related values, etc. In this case, it is generally not appropriate to apply EBCDIC-to-ASCII conversion to these values. However, the Cobol converter, like any automatic tool, cannot reliably "guess" the semantic nature of a Cobol variable or literal, so it cannot handle these exceptions by itself; they will have to be handled manually using post-translation (see post-translation-file clause).
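The ambiguity can be illustrated with a single byte. In the Python sketch below, the hexadecimal literal X'F1' is a hypothetical example and `cp037` is an assumed EBCDIC code page; the point is only that the same byte has two legitimate readings, and the conversion is correct for one of them and destructive for the other.

```python
# Hypothetical hexadecimal literal X'F1' found in a source program.
literal = bytes([0xF1])

# Reading 1: the literal is text. EBCDIC '1' becomes ASCII '1' (X'31').
as_text = literal.decode("cp037").encode("ascii")

# Reading 2: the literal is a numeric code whose byte value must be kept.
as_code = literal[0]

# What the numeric code becomes if it is converted as if it were text.
converted_code = as_text[0]

print(as_text, as_code, converted_code)
```

If the program compares this literal with a binary code returned by some interface, the converted value 49 no longer matches the expected value 241, which is why such literals need a manual post-translation rule.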

Note:

CICS-related values and codes defined as character literals in standard copy files cause no trouble, because the Rehosting Workbench and Oracle Tuxedo Application Runtime for CICS come with pre-translated, validated versions of these copy files. Only user-defined constants may cause trouble.

Use of floating-point variables

Source floating-point types (COMP-1 and COMP-2) are "translated" to the same types on the target platform. Given this, the Micro Focus Cobol compiler and run-time system offer the possibility to use floating-point data (COMP-1 and COMP-2 variables) in either the IBM hexadecimal format or the native (IEEE 754) format. If the NONATIVEFLOATINGPOINT option is set at compile time (which is true by default), then the floating-point format is selected at run-time, depending on the MAINFRAME_FLOATING_POINT environment variable and/or the mainframe_floating_point tunable:

•

MAINFRAME_FLOATING_POINT environment variable set, or mainframe_floating_point tunable set to true: the IBM format will be used.

•

MAINFRAME_FLOATING_POINT environment variable unset, and mainframe_floating_point tunable unset or set to false: the native format will be used.

In the first case, the Micro Focus run-time system will ensure that you will observe no difference of behavior. However, this is at the expense of run-time efficiency, because the handling of this format is done entirely in software, whereas the native format is directly supported by the processor. Furthermore, this format is not directly compatible with the Oracle floating-point data types (BINARY_FLOAT and BINARY_DOUBLE) and cannot be converted to other numeric types by the Oracle engine; in fact, the only thing you can do with it is store it in opaque columns (RAW(4) and RAW(8), respectively), which forbids using such values in SQL code.

In consequence, we recommend that, at migration time or later, you consider using the native IEEE 754 floating-point format, which is more efficient, more portable (it is defined by an international standard) and more compatible, if only with Oracle. Of course, because:

1.

The representations of single- and double-precision floating-point values are not the same in this format as in the source IBM format,

2.

The source and target compilers may make different choices regarding arithmetic expressions using floating-point variables (order of computation, precision of intermediate variables, rounding mode, etc.),

3.

The textual, printable representation of the same floating-point value may be different on both platforms (use of scientific notation, number of digits before and after the decimal point, etc.),

you will probably observe differences of behavior between the original and migrated applications. However, in our opinion, these differences of behavior are largely acceptable, if you keep in mind that floating-point arithmetic is only an approximation of the mathematical exactness: you will simply get a different approximation on the target machine than on the source one…
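As an illustration of the first point, the following Python sketch decodes a single-precision value in the IBM hexadecimal layout (1 sign bit, 7-bit excess-64 exponent in base 16, 24-bit fraction) and re-encodes it in IEEE 754. The function name and the sample bit pattern are illustrative choices, not part of any Rehosting Workbench or Micro Focus API.

```python
import struct


def ibm_hex_single_to_float(word: int) -> float:
    """Decode a 32-bit IBM hexadecimal floating-point word."""
    sign = -1.0 if (word >> 31) & 1 else 1.0
    exponent = (word >> 24) & 0x7F            # excess-64, base 16
    fraction = (word & 0xFFFFFF) / float(1 << 24)
    return sign * fraction * 16.0 ** (exponent - 64)


# Sample COMP-1 bit pattern as it could be stored on the source platform.
ibm_bits = 0x42640000
value = ibm_hex_single_to_float(ibm_bits)

# Bit pattern of the same value in IEEE 754 single precision.
ieee_bits = struct.unpack(">I", struct.pack(">f", value))[0]

print(value)
print(f"IBM  bits: {ibm_bits:#010x}")
print(f"IEEE bits: {ieee_bits:#010x}")
```

The two bit patterns denote the same number but differ, so floating-point fields cannot simply be copied byte for byte when the native format is chosen on the target.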

Note:

To help you deal with this issue, we performed various experiments using a variety of floating-point values and computations, and we found that:

•

On the source platform, COMP-1 and COMP-2 types have the same representation range, from about 10^-79 to 10^76, whereas on the target platforms (which all natively support the IEEE 754 format), the range for COMP-1 is from about 10^-45 to 10^38 and the range for COMP-2 is from about 10^-323 to 10^308. So the tradeoff between range and precision differs between the two platforms.

•

When the same computations are performed on ranges available on both the source and target platforms, the relative error between the observed results (as printed by DISPLAY) is always less than 10^-6 when using COMP-1 variables and less than 10^-14 when using COMP-2 variables. This is not a definitive proof that everything works fine, but it is at least an encouraging indication.

Given these results, it seems that one can always reproduce the same behavior on the target as on the source, up to insignificant approximations, possibly by replacing some COMP-1 variables by COMP-2 ones.

Note:

If you decide to go with the native IEEE 754 format, we recommend that you set the NATIVEFLOATINGPOINT compiler option, which forces the use of this format at compile-time, regardless of run-time options and tunables. Thus, you will save the run-time format tests.

REWRITE operations on LINE SEQUENTIAL files

By default, data files which are SEQUENTIAL on the source platform are translated into LINE SEQUENTIAL files on the target platform, to be more "usable". In general, this is a good choice and such files are well supported by the Micro Focus Cobol system. However, there is a catch: since such files are inherently of variable record size, a REWRITE operation may cause unpredictable results and differences of behavior (see the Micro Focus documentation). If you are not sure that REWRITE operations on a given SEQUENTIAL file would always succeed once that file is turned into a LINE SEQUENTIAL one, we advise you to keep it purely SEQUENTIAL; this can be done by inserting its description in the configuration sub-file referenced by the pure-seq-map-file clause below.

To ease the handling of this problem, in a future version, the Rehosting Workbench cataloger will produce the list of SEQUENTIAL logical files which incur a REWRITE operation.

Pointer manipulation

Pointer size changes: beware of redefinitions

On the source platform, a variable of type POINTER occupies 4 bytes in memory (32 bits); on all the supported target platforms, based on 64-bit operating systems, such a variable occupies 8 bytes. This may lead to various kinds of differences of behavior for which we take no responsibility:

•

Technical redefinitions: if a POINTER variable is directly redefined by a PIC X(4) or PIC S9(9) COMP variable used to manipulate the representation of the pointer values, the redefining variable and the code dealing with it will have to be manually rewritten. However, we strongly discourage such machine-dependent "hacks".

•

Structure alignments: if a POINTER variable is part of a structure containing variants (redefinitions), and if the different variants (sub-structures) are designed so that one particular field of one variant must be aligned with (have the same location as) some other field in some other variant, then this property must be maintained after the POINTER variable changes size: compensation fillers must be inserted, etc. Again, this must be handled manually. Note that such intended alignments must be maintained across redefinitions, but also across MOVEs to other structures.

•

Structure size: if a POINTER variable is part of a structure which is moved to some unstructured PIC X(…) variable which was big enough to hold the structure before the POINTER variable changes size, then you must make sure that it is still the case after the change.
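The alignment and size issues above come down to simple offset arithmetic. The Python sketch below models a structure with two variants; all field names (PTR-FIELD, CODE-A, CODE-B) and sizes are hypothetical, and fields are assumed to be laid out contiguously with no slack bytes.

```python
def field_offsets(fields):
    """Return ({name: offset}, total_size) for contiguously laid-out fields."""
    offsets, position = {}, 0
    for name, size in fields:
        offsets[name] = position
        position += size
    return offsets, position


def layout(pointer_size):
    """Offsets of two variants that are meant to line up CODE-A and CODE-B."""
    variant_a = [("PTR-FIELD", pointer_size), ("CODE-A", 2)]
    variant_b = [("FILLER", 4), ("CODE-B", 2)]  # redefines variant_a
    return field_offsets(variant_a), field_offsets(variant_b)


for pointer_size in (4, 8):  # source platform, then 64-bit target platform
    (a_offsets, a_size), (b_offsets, _) = layout(pointer_size)
    aligned = a_offsets["CODE-A"] == b_offsets["CODE-B"]
    print(pointer_size, a_offsets["CODE-A"], b_offsets["CODE-B"], aligned, a_size)
```

With 4-byte pointers both codes sit at offset 4 and the first variant is 6 bytes long; with 8-byte pointers CODE-A moves to offset 8 and the variant grows to 10 bytes, so the filler in the second variant and any PIC X(6) receiving area would both have to be enlarged by hand.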

Linkage-section arguments with NULL address

On both the source and target platforms, a program parameter (defined in the Linkage Section and listed in the USING clause of the procedure division) which is not actually passed by the caller, either because an explicit OMITTED item is passed instead or because the caller passes fewer arguments than the callee expects, appears to have a NULL address in the callee. So it is quite legal, and in fact recommended, to check whether the ADDRESS OF some parameter is NULL before accessing the value of this parameter.

•

However, when the callee fails to check the parameter address and the actual address is NULL, the source and target platforms may behave differently. For instance:

•

On z/OS and AIX, NULL is address 0 and this is considered as a legal address, so when the parameter is accessed, you get whatever is stored at that address (possibly with unpredictable results).

•

On Linux however, although NULL is also address 0, this is not considered as a legal address, so when the parameter is accessed, the program crashes.

It is not possible to automatically handle this situation and the associated differences of behavior, because even if the converter could insert address checks, what should it do when the test fails? Furthermore, the set of subprograms and parameters which are really affected by this problem is a very small minority of all subprograms and parameters and it would be ugly to insert such address checks for all of them. This will have to be handled manually, possibly using post-translation.

•

There is one exception, though, which may alleviate the problem for a large majority of the offending cases: the Oracle Tuxedo Application Runtime for CICS will ensure that all programs called from it (first program in a transaction, EXEC CICS XCTL, EXEC CICS LINK, etc.) will receive a valid COMMAREA: either the one passed from the caller or a dummy-but-legal one.

•

See also the discussion of the STICKY-LINKAGE compiler option below.

Representation of the NULL pointer value

The representation of the NULL pointer value may vary from one platform to another, in particular between the source and target platforms – if only because they don't have the same size, like every other pointer value. In consequence, every program which assumes a specific representation for this value, for instance by "casting" it to or from some binary integer value, may have a different behavior from one platform to another. The Cobol converter cannot handle this issue by itself, automatically, and it will have to be handled manually. Anyway, we strongly discourage such machine-dependent "hacks".

Description of the input components, prerequisites

The input components are all the Cobol programs in the asset, after they have been parsed by the cataloger. In fact, the Cobol Converter loads the POB files for the programs, not their source files. In addition to the restrictions imposed by the cataloger (no nested programs, etc.; see Cataloger), the following rules must be respected before attempting the Cobol conversion:

•

All the anomalies reported by the cataloger must be fixed. Otherwise, there is a risk that the conversion is incorrect, or even that the Converter fails (crashes). In fact, the Converter will refuse to convert any program that contains a FATAL error. But even ERRORs or simple WARNINGs may cause trouble, so it is strongly advised to fix all anomalies. See force-translation clause, however.

•

The source format for all Cobol source files (main programs and copy files) must be fixed format, with the numbering area (columns 1-6) and the comment area C (columns 73-80) physically removed. This must be done before cataloging. Note that the information thus removed can be re-attached to the converted Cobol files using the post-processor AddComment.

•

The data migration process must have been run before Cobol conversion is started, because the latter depends on the former, for instance to decide which files will be migrated into relational database tables; see the Process Guide for more details. This dependency is concretized by the fact that the file migration tools generate some of the configuration files read by the Cobol converter.

Description of the configuration files

System description file

The system description file describes the location, type and possible dependencies of all the source files in the asset to process. As such, it is the key by which the cataloger, but also all of the Rehosting Workbench tools, including the Cobol Converter, can access these source files and the corresponding components.

Note:

Because of the need to have Cobol source files with the numbering area and comment area C removed, option Cobol-left-margin must be set to 1 (one) and option Cobol-right-margin must be set to 66; these are the default values.

Main conversion configuration file

This file is given to the Cobol converter using the -c or -config mandatory command-line option. It defines various "scalar" parameters influencing the conversion and points to subordinate files containing "large" configuration data, such as renaming files.

Note:

Many of the parameters configurable in this file can also be set on the command line; in this case, the command-line value overrides the configuration-file value.

Tip:

Although not mandatory, it is advisable to store this file in the same parameter directory as the system description file.

General syntax

The content of the main conversion configuration file is a free-format, unordered list of clauses, each beginning with a keyword and ending with a period. Some clauses take one or more arguments; others are boolean clauses with no argument. The keywords are case-insensitive symbols; the arguments are integers, symbols or (case-sensitive) strings. Whitespace (spaces, new lines, etc.) is not significant. Comments can be written in the configuration file in two ways:

•

Start with a sharp sign ("#") and extend to the end of the same line.

•

Start with the "/*" delimiter and extend to the matching "*/" delimiter; in this form, comments can be spread over several lines and be nested.
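To make the syntax rules above concrete, here is a small Python sketch of a reader for this clause format. It is not the parser used by the Cobol converter; it assumes that strings are written between double quotes, and the sample file content (paths and values) is purely hypothetical.

```python
import re


def strip_comments(text: str) -> str:
    """Remove '#' line comments and nested '/* ... */' comments,
    leaving double-quoted strings untouched (quoting is an assumption)."""
    out, i, depth, in_string = [], 0, 0, False
    while i < len(text):
        ch, two = text[i], text[i:i + 2]
        if depth == 0 and ch == '"':
            in_string = not in_string
            out.append(ch)
            i += 1
        elif in_string:
            out.append(ch)
            i += 1
        elif two == "/*":
            depth += 1
            i += 2
        elif two == "*/" and depth > 0:
            depth -= 1
            if depth == 0:
                out.append(" ")  # keep neighbouring tokens separated
            i += 2
        elif depth > 0:
            i += 1
        elif ch == "#":
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# A token is a quoted string, a symbol or integer, or one of ':' and '.'.
TOKEN = re.compile(r'"[^"]*"|[A-Za-z0-9_-]+|[:.]')


def parse_clauses(text: str) -> dict:
    """Return {keyword: [arguments]}; keywords are lower-cased."""
    clauses, current = {}, []
    for token in TOKEN.findall(strip_comments(text)):
        if token == ".":          # a period closes the current clause
            if current:
                keyword = current[0].lower()
                clauses[keyword] = [t.strip('"') for t in current[1:] if t != ":"]
            current = []
        else:
            current.append(token)
    return clauses


sample = '''
# Hypothetical configuration; paths and values are illustrative only.
target-dir : "../trf" .
/* outer comment /* nested comment */ still a comment */
Verbosity-Level : 1 .
keep-same-file-names.
'''
config = parse_clauses(sample)
print(config)
```

The sketch shows the three properties stated above: clause order does not matter because the result is a keyword-indexed table, keywords are matched without regard to case, and comments of both forms (including nested ones) are discarded before the clauses are read.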

target-dir clause

Syntax

Target-dir : dir-path .

This clause specifies the location of the directory that will contain the complete hierarchy of target files, for both programs and copy files. If there is a source program A/B/name.ext in the root directory of the asset (as specified in the system description file), then the corresponding target program will be located as A/B/name.ext in this target directory (possibly with a different file extension, see below). The same mechanism is used for copy files, except that the target path will be Master-copy/A/B/name.ext (or possibly a different file extension). The Master-copy directory is related to the copy reconciliation process, see Command-line syntax.

•

The dir-path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

•

The actual target directory will be created automatically, if necessary, when the Cobol Converter is run.
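The path rules of the target-dir clause can be summarized by the following Python sketch. The function names and the sample directories are hypothetical; only the Master-copy directory name and the "relative to the system description file" rule come from the description above, and file-extension changes are left out.

```python
from pathlib import PurePosixPath


def resolve_target_dir(system_desc_dir: str, dir_path: str) -> PurePosixPath:
    """Resolve dir-path against the directory of the system description file.
    Joining with an absolute path discards the left-hand side, so absolute
    dir-paths are returned unchanged."""
    return PurePosixPath(system_desc_dir) / dir_path


def target_file(target_dir: PurePosixPath, source_rel: str, is_copy: bool) -> PurePosixPath:
    """Programs keep their relative path; copy files go under Master-copy."""
    base = target_dir / "Master-copy" if is_copy else target_dir
    return base / source_rel


# Hypothetical locations.
root = resolve_target_dir("/project/param", "../trf")
program = target_file(root, "A/B/name.ext", is_copy=False)
copy_file = target_file(root, "A/B/name.ext", is_copy=True)

print(program)
print(copy_file)
```

Note that the sketch does not normalize the `..` component; it only shows how the relative structure A/B/name.ext of the source asset is preserved under the target directory.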

keep-same-file-names, target-program-extension and target-copy-extension clauses

Syntax

keep-same-file-names.

target-program-extension : extension . (or) tpe : extension .

target-copy-extension : extension . (or) tce : extension .

These clauses direct how the file extensions for the converted programs (main source files) and copy files are determined:

•

If the keep-same-file-names clause is given, the converted programs and copy files will have the same file extensions as the original files in the source asset (as cataloged). The other clauses, if given, will be ignored.

•

Otherwise:

•

If the target-program-extension clause is given, then the converted programs will have the given file extension,

•

If the target-copy-extension clause is given, then the converted copy files will have the given file extension.

•

By default, the converted programs will have the file extension cbl and the converted copy files will have file extension cpy.
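The precedence between these three clauses can be written as a short decision function. This Python sketch is only a restatement of the rules above; the function name and the dictionary representation of the configuration are assumptions, and the abbreviated keywords (tpe, tce) are assumed to have been normalized to their long forms beforehand.

```python
def target_extension(kind: str, source_ext: str, config: dict) -> str:
    """Return the extension of a converted file.

    kind is 'program' or 'copy'; config maps clause keywords to their value.
    """
    # keep-same-file-names wins over the two extension clauses.
    if "keep-same-file-names" in config:
        return source_ext
    if kind == "program":
        return config.get("target-program-extension", "cbl")
    return config.get("target-copy-extension", "cpy")


# Hypothetical configurations.
print(target_extension("program", "cob", {}))
print(target_extension("copy", "cob", {"target-copy-extension": "cp"}))
print(target_extension("copy", "cob", {"keep-same-file-names": True,
                                       "target-copy-extension": "cp"}))
```

The third call shows that an extension clause is ignored as soon as keep-same-file-names is present.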

verbosity-level clause

Syntax

verbosity-level : level .

This clause specifies the amount of detail which the Cobol converter writes to the execution log. The default value of 2 is fairly verbose; higher values are even more verbose, and value 1 displays only important (error) messages.

deferred-copy-reconcil clause

Syntax

deferred-copy-reconcil.

or

deferred-crp.

or

dcrp.

This clause specifies that the copy-reconciliation process (crp) is to be deferred until after the conversion is completed; this allows Cobol conversion to run in multiple concurrent processes. By default, in the absence of this clause, the copy-reconciliation process is executed incrementally immediately after each program is converted, which mandates single-process execution. See the copy-reconciliation process below for more details.

force-translation clause

Syntax

force-translation.

This clause directs the Cobol converter to (try to) convert even those programs that contain FATAL errors, although without any guarantee: the converter may produce incorrect results or even crash. By default, the converter refuses to work on such a program and skips to the next one.

rename-copy-map-file clause

Syntax

rename-copy-map-file : file-path .

This clause specifies the location of the subordinate configuration file containing information to rename copy files, see the copy-renaming configuration file below. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

rename-call-map-file clause

Syntax

rename-call-map-file : file-path .

This clause specifies the location of the subordinate configuration file containing information to rename sub-programs and their calls, see the call-renaming configuration file. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

post-translation-file clause

Syntax

post-translation-file : file-path .

This clause specifies the location of the subordinate configuration file containing the description of manual transformations to apply after the Rehosting Workbench Converter, see the post-translation configuration file. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

on-size-error-call clause

Syntax

on-size-error-call: proc-name .

This clause specifies the name, as a symbol, of the procedure to call to cause a definite termination of the program. This name is used to force termination in situations in which the IBM compiler would force termination but not the Micro Focus compiler, such as size errors in arithmetic statements. The default name is .ABORT.

hexa-map-file clause

Syntax

hexa-map-file : file-path .

This clause specifies the location of the subordinate configuration file containing the EBCDIC-to-ASCII transformation to apply to characters in hexadecimal form, see the hexadecimal conversion configuration file. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

conv-ctrl-file clause and alt-key-file clause

These two clauses go together.

Syntax

conv-ctrl-file : file-path . (or) conv-ctrl-list-file : file-path .

alt-key-file : file-path .

These clauses specify the location of the two subordinate configuration files containing information regarding file-to-Oracle conversion. These files are generated by the Rehosting Workbench File-to-Oracle conversion tool, as respectively the Conv-ctrl-file or the Conv-ctrl-list-file and the Alt-key file. See file-to-RDBMS configuration files.

Exactly one of the first two clauses must be given: either the conv-ctrl-file clause or the conv-ctrl-list-file clause, but not both.

The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

RDBMS-conversion-file clause

Syntax

RDBMS-conversion-file : file-path .

This clause specifies the location of the top-level subordinate configuration file containing information about relational DBMS conversion (from DB2 to Oracle). See the RDBMS-conversion configuration files for more details. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

keywords-file clause

Syntax

keywords-file : file-path .

This clause specifies the location of the subordinate configuration file containing information to rename Cobol identifiers which happen to be keywords or reserved words in the target Cobol dialect, see the keywords file for more details. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

accept-date and accept-day clauses

Syntax

accept-date: date-entry-name .

accept-day: day-entry-name .

These clauses specify sub-program names to replace ACCEPT … FROM DATE and ACCEPT … FROM DAY statements. For instance, the statement:

ACCEPT MY-DATE FROM DATE

would be replaced by:

CALL "DATE-ENTRY-NAME" USING MY-DATE BY VALUE LENGTH OF MY-DATE

This allows more control and more flexibility over how programs acquire their current date. For instance, during regression tests, it is necessary to run migrated programs with the same current date as when the source programs were run; these sub-programs (to be supplied by the Rehosting Workbench users, according to their requirements) will allow this.

If either of these clauses is not specified, the corresponding statements are not transformed. You can use the current_day, current_month, and current_year parameters of the Micro Focus Cobol run-time system to control the date returned by the ACCEPT statements; see the Micro Focus documentation.

sql-stored-procedures-file clause

Syntax

sql-stored-procedures-file: file-path .

This clause specifies the location of the subordinate configuration file containing the list of DB2 stored procedures called directly from Cobol, see the stored-procedure file for more details. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

remove-sql-qualifier clause

Syntax

remove-sql-qualifier.

This clause enables the transformation rule which removes the schema qualifier from every SQL identifier which has one. The resulting program will hence rely on implicit schema qualification.

Tip:

This is generally not needed, and possibly not even desirable, but it is useful if you want to run the program concurrently in multiple environments (connecting to multiple databases or schemas), for instance in multiple test corridors.

activate-cics-rules clause

Syntax

activate-cics-rules.

This clause forces the Cobol converter to apply, to every program processed in the current execution, the rules which normalize the EXEC CICS statements and prepare the program for use with the Oracle Tuxedo Application Runtime for CICS environment, including the CICS preprocessor.

Notes:

•

There exists a command-line option of the same name (see cobol-convert command) which has the same effect as this clause, and which is more flexible to use. So we believe that the configuration-file clause will be seldom used, except perhaps in projects in which the TP and batch parts of the asset are well identified and strictly separated in the migration project.

•

Whether this clause is given or not, the above rules are applied anyway to every program which contains one or more EXEC CICS statements. So this clause (or the equivalent command-line argument) is effective only for subprograms which are used in a CICS environment (implicit COMMAREA, etc.) but do not perform CICS operations themselves.

pure-seq-map-file clause

Syntax

pure-seq-map-file: file-path .

or

purely-sequential-map-file: file-path .

This clause specifies the location of the subordinate configuration file containing the list of SEQUENTIAL logical files which are to be kept (record) SEQUENTIAL rather than converted to LINE SEQUENTIAL. See purely-sequential configuration file for more details. The file-path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

dont-print-what-string clause

Syntax

dont-print-what-string .

When present, this clause specifies that the what-string containing conversion timestamp and converter version information, which the converter normally inserts at the beginning of every converted file, is not to be printed out in this execution. This will be seldom used, unless you really want to hide the fact that your application is migrated using the Rehosting Workbench!

remove-empty-copies clause

Syntax

remove-empty-copies .

or

rec .

When present, this clause specifies that COPY directives referencing copy files which no longer contain useful Cobol code after conversion are to be commented out; by default, these directives remain active. This applies for instance to copy files defining whole FD paragraphs for files which migrate into database tables.

sql-return-codes-file clause

Syntax

sql-return-codes-file: file-path .

This clause specifies the location of the subordinate configuration file containing additional pairs of equivalent DB2 and Oracle SQLCODE values. See the sql-return-codes configuration file for more details. The file-path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

copy-renaming configuration file

This file is associated with the rename-copy-map-file clause. Its contents are in CSV format, with the semicolon character used as separator. Each line is in the form:

original-copy-name;original-library-name;new-copy-name.

All names are Cobol-like, case-insensitive symbols. The meaning of such a line is that, when the directive:

COPY ORIGINAL-COPY-NAME OF ORIGINAL-LIBRARY-NAME { REPLACING … }

is encountered in a program, it is replaced by:

COPY NEW-COPY-NAME { REPLACING … }

Note that library names are not used on the target platform because they are inconvenient; it is much better to use search paths (see the COBCPY environment variable). When the original-library-name field is empty, the rule is to replace unqualified directives of the form:

COPY ORIGINAL-COPY-NAME { REPLACING … }

The same renaming applies to the copy file itself: when the Converter prints out, in the target private copy directory, the copy file referenced by this directive (see below for more information about copy reconciliation), it is printed with the new name.

When the rename-copy-map-file clause is not present, or when this file is empty, no copy renaming takes place. It is an error when the file cannot be found or read, or when the same original-copy-name;original-library-name combination is associated with different new-copy-names in different lines of the file. In this case, the converter stops with an error message and does not convert any programs. Note however that it does not check whether two different copy files in the same directory are renamed to the same target file. In principle, this would be handled gracefully by the copy reconciliation process, but without guarantee.
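The checks described above can be sketched in a few lines of Python. This is only an illustration, not the converter's own code: the function name load_copy_renaming is hypothetical, and the sketch assumes that the period ending the line form shown above is prose punctuation rather than part of the file syntax, and that blank lines are ignored.

```python
import csv
import io


def load_copy_renaming(text):
    """Parse copy-renaming lines into {(copy, library): new_copy}.

    Names are case-insensitive, so they are upper-cased here. An empty
    library field stands for unqualified COPY directives.
    """
    mapping = {}
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if not row:
            continue  # blank line: ignored (assumption)
        if len(row) != 3:
            raise ValueError(f"expected 3 fields, got {row!r}")
        copy_name, library, new_name = (field.strip().upper() for field in row)
        key = (copy_name, library)
        # The same original pair mapped to two different new names is an error.
        if mapping.get(key, new_name) != new_name:
            raise ValueError(f"conflicting new names for {key}")
        mapping[key] = new_name
    return mapping
```

For instance, load_copy_renaming("OLDCPY;MYLIB;NEWCPY") yields a one-entry map, whereas a file which associates OLDCPY of MYLIB with two different new names raises an error, mirroring the converter's refusal to convert any program in that case.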

call-renaming configuration file

This file is associated with the rename-call-map-file clause described above. Its contents are in CSV format, with the semicolon character as separator. Each line is in the form:

original-call-name;new-call-name.

All names are Cobol-like, case-insensitive symbols. The meaning of such a line is that, when the statement:

CALL "ORIGINAL-CALL-NAME" { USING … }

is encountered in a program, it is replaced by:

CALL "NEW-CALL-NAME" { USING … }

The converter also attempts to rename literal strings which are "associated" with variables used in dynamic calls using direct constructs (VALUE, MOVE, etc.). For obvious reasons, it cannot handle truly dynamic calls in which the callee name is "computed" using complex manipulations (STRING, etc.), transported through opaque containers or obtained from outside the caller program (e.g., read from a file or passed as parameter); such situations must be handled manually.

The same renaming applies to called sub-programs and their entry points: when the converter prints out in the target directory, a program whose base name matches one of the original names listed in the renaming file, it is printed with the corresponding new name. Similarly, for ENTRY statements whose argument is a string matching one of the original names, the string is transformed into the new name.

When the rename-call-map-file clause is not present, or when this file is empty, no call renaming takes place. It is an error when the file cannot be found or read, or when the same original-call-name is associated with different new-call-names in different lines of the file. In this case, the converter stops with an error message and does not convert any programs. Note however that it does not check whether two different subprograms in the same directory are renamed to the same target file.
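As a rough illustration of the effect of the call-renaming map on literal CALL statements, the following Python sketch rewrites one source line at a time. It is a simplification under stated assumptions: the real converter works on the program AST rather than on text, the function name rename_calls is hypothetical, and only double-quoted literals are considered.

```python
import re


def rename_calls(line, call_map):
    """Rename literal callees in CALL "NAME" statements on one source line.

    call_map maps upper-cased original call names to new call names.
    """
    def replace(match):
        old = match.group(2)
        new = call_map.get(old.upper(), old)  # unlisted names stay as they are
        return f"{match.group(1)}{new}{match.group(3)}"

    return re.sub(r'(\bCALL\s+")([^"]+)(")', replace, line, flags=re.IGNORECASE)
```

Dynamic calls whose callee name is computed at run time are out of reach of such a textual rule, which is exactly the limitation stated above.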

post-translation configuration file

This file is associated with the post-translation-file clause. Its contents are a sequence of rules with the following syntax:

rule rule_name

filter [

(+|-)program_name_regexp

]

transform [

source_lines_block

]

into [

target_lines_block

]

The semantics of such a rule are simple: when a block of lines matching source_lines_block¹ is encountered in a program whose (base) name matches any of the "positive" program_name_regexp's but none of the "negative" ones, the block is replaced by target_lines_block. rule_name is used in the comment associated with the application of the transformation. See the appendix on the post-translator below for more details.

The post-translation file may contain as many rules as desired, in any order (although the behavior of the post-translator is not defined when two source_lines_blocks overlap, or when a source_lines_block and a target_lines_block overlap).

Tip:

In the syntax above, it is very important that the square brackets closing the filter, transform, and into clauses, are in column 1, at the very beginning of the line; otherwise, they will be interpreted as part of the block.

Comments start with a sharp sign ("#") and extend to the end of the line; you can insert them anywhere between the rules, between the four clauses in a rule and after the rule name; if you insert such a comment inside a square-bracketed filter or transform or into block, it will be considered as part of the block contents rather than as a comment.
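The rule semantics (filter on the program name, then block replacement with the space-insensitive matching defined in the footnote) can be sketched as follows. This is a minimal illustration, not the post-translator itself: the function names are hypothetical, and the sketch assumes that a program_name_regexp is searched anywhere in the base name, which the description above does not specify.

```python
import re


def program_selected(base_name, filters):
    """filters: strings such as "+^PGM" or "-^PGM9" (sign, then regexp)."""
    positive = [f[1:] for f in filters if f.startswith("+")]
    negative = [f[1:] for f in filters if f.startswith("-")]
    return (any(re.search(p, base_name) for p in positive)
            and not any(re.search(n, base_name) for n in negative))


def normalize(line):
    """Reduce each sequence of spaces to a single space."""
    return re.sub(r" +", " ", line)


def apply_rule(lines, source_block, target_block):
    """Replace every occurrence of source_block in lines by target_block."""
    wanted = [normalize(line) for line in source_block]
    out, i = [], 0
    while i < len(lines):
        window = [normalize(line) for line in lines[i:i + len(wanted)]]
        if wanted and window == wanted:
            out.extend(target_block)  # the matched block is replaced as a whole
            i += len(wanted)
        else:
            out.append(lines[i])
            i += 1
    return out
```

Because matched blocks are consumed from top to bottom, this sketch makes one arbitrary choice for overlapping source blocks; the actual behavior in that case is left undefined, as noted above.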

hexadecimal conversion configuration file

This file is associated with the hexa-map-file clause above. Its contents are an EBCDIC-to-ASCII conversion table to apply to characters in hexadecimal form (characters in textual form are supposed to be converted at the same time as the source file itself). The syntax is simply a CSV file with a semicolon as separator. Each line is in the form:

source-hexa-code;target-hexa-code

with each hexa-code written as usual, with two hexadecimal characters (0-9, A-F). The semantics of this conversion table are that a hexadecimal code in the source file which matches a source-hexa-code is replaced by the corresponding target-hexa-code, whereas one which does not match any source-hexa-code in this table is left as is, unconverted. Such conversion also works on embedded-SQL code. Note that the converter makes no attempt to check the intrinsic consistency of the conversion table (e.g. the fact that no source-hexa-code or no target-hexa-code appears twice), nor the fact that it really describes some EBCDIC-to-ASCII conversion.

Tip:

It is strongly suggested that this table be derived from the global, project-specific conversion table used to convert data and source files, see Oracle Tuxedo Application Runtime Process Guide for more details. Failure to do so may lead to differences of behavior on the target platform, for which we take no responsibility.
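The following Python sketch illustrates how such a table could be applied to the digits of a hexadecimal literal. It is only a sketch: the function names are hypothetical, and it assumes that a multi-byte literal is converted one two-character code at a time, which is an interpretation rather than a documented fact.

```python
def load_hexa_map(text):
    """Parse source-hexa-code;target-hexa-code lines into a dictionary."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank line: ignored (assumption)
        source, target = line.split(";")
        table[source.strip().upper()] = target.strip().upper()
    return table


def convert_hexa_digits(digits, table):
    """Convert the digits of a hexadecimal literal, two characters at a time.

    Codes which are absent from the table are left as is, unconverted.
    """
    pairs = [digits[i:i + 2].upper() for i in range(0, len(digits), 2)]
    return "".join(table.get(pair, pair) for pair in pairs)
```

With a table containing C1;41 (letter A) and 40;20 (space), the digits C140 become 4120, while a code such as FF, absent from the table, passes through unchanged.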

file-to-RDBMS configuration files

These files are associated with the conv-ctrl-file clause and alt-key-file clause. They contain information about file-to-RDBMS conversion, e.g. to define which logical files (FDs) are converted into RDBMS tables (actually, because the physical files they are associated with are converted to these DB tables). Since these files are automatically generated by the Rehosting Workbench File-to-Oracle conversion tool and should not be modified by hand, their contents are not further specified here.

RDBMS-conversion configuration files

These files are associated with the RDBMS-conversion-file clause above. The information they contain is accessed in a two-level way:

•

The top-level file is named in the RDBMS-conversion-file clause proper. Its contents are a CSV table, with each line in the form:

schema-name;file-path.

File-path is the path to the file containing RDBMS-conversion information pertaining to SQL schema schema-name. As usual, it can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file. This file must be created by the Rehosting Workbench user.

•

For each schema in the application, the file containing RDBMS-conversion information pertaining to this schema (default date and time format, renaming map, etc.) is an XML file generated by the File-to-Oracle tool, which should not be modified by hand. In consequence, its content is not further specified here.
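The path rule which recurs throughout this chapter (absolute paths are used as is, relative paths are taken from the directory containing the system description file) can be sketched in Python as follows, here applied to the top-level RDBMS-conversion file. The function names are hypothetical, and the sketch assumes that the period ending the line form shown above is not part of the file syntax.

```python
import os


def resolve_path(file_path, system_desc_file):
    """Return file_path unchanged if absolute, otherwise relative to the
    directory containing the system description file."""
    if os.path.isabs(file_path):
        return file_path
    return os.path.normpath(
        os.path.join(os.path.dirname(system_desc_file), file_path))


def load_schema_files(text, system_desc_file):
    """Parse schema-name;file-path lines into {schema_name: resolved_path}."""
    result = {}
    for line in text.splitlines():
        if line.strip():
            schema, path = (field.strip() for field in line.split(";", 1))
            result[schema] = resolve_path(path, system_desc_file)
    return result
```

Resolving against the system description file rather than the current working directory is what makes it possible to run the same command from different working directories.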

keywords file

This file is associated with the keywords-file clause. Its contents are a CSV table using the semicolon as separator, each line being in the form:

old-name;new-name.

The effect of such a line is to rename every Cobol identifier (variable name, paragraph name, etc.) named old-name in every program into new-name. This is required for names which happen to be keywords or reserved words in the target Cobol dialect, such as TEST, but it may also be useful to rename plain identifiers for reengineering purposes.

stored-procedure file

This file is associated with the sql-stored-procedures-file clause. Its contents are a list of subprogram names, one per line. When one of these names appears in a Cobol CALL statement, the latter is replaced by an SQL CALL statement. In addition, declarations of the parameters of the CALL, if any, are adapted so that they can be used in SQL statements.

purely-sequential configuration file

This file is associated with the pure-seq-map-file clause. Its contents are a CSV table using the semicolon as separator, each line being in the form:

program-name;FD-name

with both names being symbols. The effect of such a line is to prevent this particular logical file (the given FD in the given program), assumed to be (record) SEQUENTIAL on the source platform, from being converted to LINE SEQUENTIAL on the target platform; rather, it is kept unchanged as a record SEQUENTIAL file. This makes it much less amenable to manipulation using standard target-platform utilities, but on the other hand, it will support unrestricted REWRITE operations (see section REWRITE operations on LINE SEQUENTIAL files above). This might also be useful for files exchanged with a z/OS platform in binary form.

sql-return-codes configuration file

This file is associated with the sql-return-codes-file clause. Its contents are a CSV table using the semicolon as separator, each line being in the form:

DB2-sqlcode-value;Oracle-sqlcode-value

with both values being positive or negative integers. The effect of such a line is to add the pair of values to the translation table used to map "remarkable" DB2 SQLCODE values to their equivalent Oracle SQLCODE values. This translation table is initialized as if read from the following file:

Listing 6‑2 DB2 to Oracle SQL return code mapping

+100;+1403

-810;-1422

-803;-1

-530;-2291

-516;-1002

-501;-1001

-407;-1451

-305;-1405

-180;-1820

-181;-1821

-811;-2112

-204;-942

Of course, value 0 (zero) is mapped to itself.

Note:

The Cobol converter does not currently check the consistency of this translation table; for instance, it does not complain if the same DB2 value is mapped to more than one distinct Oracle value.
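The way the translation table is built can be sketched in Python as follows: it starts from the pairs of Listing 6-2 (plus zero mapped to itself) and is then extended with the pairs read from the sql-return-codes file. The names used are hypothetical, and letting a later pair silently replace an earlier one for the same DB2 value is an assumption made for the sketch; the actual behavior in that case is not specified.

```python
DEFAULT_SQLCODE_MAP = {
    100: 1403, -810: -1422, -803: -1, -530: -2291, -516: -1002,
    -501: -1001, -407: -1451, -305: -1405, -180: -1820, -181: -1821,
    -811: -2112, -204: -942,
    0: 0,  # value 0 (zero) is mapped to itself
}


def load_sqlcode_map(extra_text=""):
    """Return the default table extended with DB2;Oracle pairs from extra_text."""
    table = dict(DEFAULT_SQLCODE_MAP)
    for line in extra_text.splitlines():
        if line.strip():
            db2, oracle = line.split(";")
            # int() accepts signed forms such as "+100" and "-803".
            table[int(db2)] = int(oracle)  # later pairs win (assumption)
    return table
```

For example, load_sqlcode_map("-911;-60") adds one pair to the twelve default ones; the pair itself is purely illustrative and should be checked against the DB2 and Oracle documentation before use.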

Description of output files

Converted programs and copy files

Naming scheme

As mentioned above, the main purpose of the Rehosting Workbench Cobol Converter is to produce the converted Cobol components, in the form of their source files. There is a direct, one-to-one correspondence between the hierarchy of main program files inside the source root directory and the hierarchy of main program files inside the target root directory; the only possible differences, as far as file names are concerned, come from the CALL-renaming map and the choice of the target program-file extension, see rename-call-map-file clause and keep-same-file-names, target-program-extension and target-copy-extension clauses. The same comments apply for the target copy files, with the following observations:

•

The hierarchy of target copy files is located in the Master-copy sub-directory of the target root directory.

•

The names of the target copy files may differ from those of the source files because of the COPY-renaming map and the choice of the target copy-file extension.

•

The correspondence between source and target copy files may not be strictly one-to-one. It may be one-to-many when the transformations applied by the converter on the contents of some copy file depend on the context in which this file is included. This is handled by the copy reconciliation process, see below for more details.

If file ORIGCOPY(.s-ext) is translated into multiple versions, these versions are named ORIGCOPY(.t-ext), ORIGCOPY_V1(.t-ext), ORIGCOPY_V2(.t-ext), etc.
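The naming scheme for multiple versions can be written down as a one-line rule. In this small Python sketch, the function name is hypothetical and version 0 denotes the first, unsuffixed version:

```python
def versioned_copy_name(base, version, target_ext=""):
    """Name of the given version of a target copy file (version 0 has no suffix)."""
    stem = base if version == 0 else f"{base}_V{version}"
    return f"{stem}.{target_ext}" if target_ext else stem
```

With a target copy-file extension of cpy, the successive versions of ORIGCOPY are thus ORIGCOPY.cpy, ORIGCOPY_V1.cpy, ORIGCOPY_V2.cpy, and so on.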

Transformation comments

In principle, Cobol conversion is a "light" process, because Cobol on the target platform is not that different from Cobol on the source platform. This is why this process is called conversion rather than translation. Indeed, a converted file, either main file or copy file, generally differs from its corresponding source file only in very few places; the bulk of the contents is not affected in any way and is reproduced in the target file exactly as it is in the source file. The differing places, however, are identified by specific transformation comments.

Modified code

In places at which some transformation actually took place, the converter inserts transformation comments describing the effects of the transformation. The affected code is composed of:

•

A header line giving the transformation-rule name and version; the header line starts with a recognizable prefix, namely "*{", in which the opening curly bracket symbolizes the start of the transformation.

•

The original code, commented out.

•

An intermediate separator, namely a line composed of "*--".

•

The new code which replaces the original one.

•

A terminating line, namely "*}", in which the closing curly bracket symbolizes the end of the transformation.

Listing 6‑3 Transformation comment example

*{ tr-binary-to-comp-5 1.2

* 77 MY-VAR PIC S9(9) COMP.

*--

77 MY-VAR PIC S9(9) COMP-5.

*}

Added code

Some rules not only transform existing code but also insert some completely new code in some remote places, for instance, the declaration of an intermediate variable in the Working-Storage Section. In this case, the affected area in the program is composed of:

•

A header line giving the transformation-rule name and version; the header line starts with the prefix "*+{", in which the opening curly bracket symbolizes the start of the transformation and the plus sign indicates that this is an insertion rather than a transformation.

•

The inserted code.

•

The terminating line "*+}".

Deleted code

When a rule simply deletes some code rather than transforming it, the affected area in the program has the same organization as for modified code, except that the "new code" area is empty.

Moved code

Some rules move code from one place in the program to another, for instance, when a file is migrated into a relational DB table, the corresponding FD is deleted and the data records it contains are moved to the Working-Storage Section. In this case, the code at the original location is shown as deleted and the code at the new location is shown as inserted.

Other comment rules

•

It is possible that some transformation acts on the code produced by a previous transformation. In this case, the transformation comments are properly nested.

•

The format of the transformation comments is designed both:

•

To be informative and help the program maintenance team to understand the transformations which were applied during migration by the Cobol Converter; studying these transformations is a quick way for the developer to understand the differences between the source and target Cobol dialects, and become proficient with the latter.

•

To be automatically deletable when they cease to be useful and become a nuisance.
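To illustrate how regular the comment format is, here is a minimal Python sketch of a clean-up pass which removes the transformation comments and keeps only the new code. It is not a tool shipped with the Rehosting Workbench: the function name is hypothetical, and the sketch assumes that the marker lines can be recognized after stripping surrounding spaces, and that nested blocks appear only inside the "new code" part of the enclosing block.

```python
def strip_transformation_comments(lines):
    """Drop the markers and the commented-out original code; keep the new code."""
    out = []
    stack = []  # one entry per open block: "old" (before *--) or "new"
    for line in lines:
        text = line.strip()
        if text.startswith("*+{"):
            stack.append("new")      # added code: everything inside is kept
        elif text.startswith("*{"):
            stack.append("old")      # modified code: original part comes first
        elif text == "*--" and stack and stack[-1] == "old":
            stack[-1] = "new"        # separator: the new code starts here
        elif text in ("*}", "*+}") and stack:
            stack.pop()              # end of the innermost block
        elif all(state == "new" for state in stack):
            out.append(line)
    return out
```

Applied to the five lines of Listing 6-3, this sketch returns the single line declaring MY-VAR as COMP-5.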

Layout

When the Cobol Converter applies a transformation rule to a piece of code, it attempts to keep the same layout for the new code, by minimizing how elements of the code which exist in both the original and new versions are moved around. In addition, when the converter inserts a new element, for instance a statement or a variable declaration, it tries to align the new element with similar ones before or after it. When, by following these guidelines, a transformed or new line of code becomes too long for the fixed format, the converter cuts the line at the right-most "nice" cutting point (preferably between two words) and wraps the rest on the next line, flush with the right margin, to indicate that these wrapped elements are logically part of the previous line.

•

We recognize however that the layout issue is a subjective matter and that the result of a transformation might not appear exactly as you would have made it yourself. Consider though that:

•

The only obligation of the Converter is to produce code which is correct; aesthetic considerations are not part of the contract.

•

The converter only applies rules which have to be deterministic; it does not have the power to estimate the aesthetic value of this or that layout.

•

Some other people might actually be completely satisfied with the resulting layout!

Miscellaneous issues

When the Converter prints out a copy file invoked in the main program file with a REPLACING clause, it takes great care to undo the effect of the replacements. However, when the transformations performed by the converter apply to pieces of text generated by some replacement, it may be very hard for the converter to compute the "inverse" of the transformation and create the transformed replacement clause – for instance, when the COPY clause replaces "something" by "nothing", the converter may have a hard time finding the "nothing" to replace back by "something" when the area is affected by a transformation. In such cases, some manual correction might be necessary; use the post-translation feature to apply it in a repeatable way.

Compiler options

To guarantee identical behavior between the source programs and the target ones produced by the Cobol converter, up to the limitations described above, the target programs must be compiled with a certain set of compiler options in effect. Indeed, some of the MicroFocus Cobol compiler options do change the behavior of the executed code. The transformations applied by the Rehosting Workbench Cobol converter are hence tailored to the option set described below. No support will be provided for programs compiled with a different, or at least conflicting, option set. For more information, please see the Micro Focus Cobol documentation, in particular the Compiler Directives book.

The current versions of Oracle Tuxedo Application Rehosting Workbench and Oracle Tuxedo Application Runtime for CICS have been validated with the following option list:

Listing 6‑4 Validated Cobol compiler option list

SOURCEFORMAT"FIXED" P64 DEFAULTBYTE"00" ALIGN"8" NOTRUNCCALLNAME

NOTRUNCCOPY NOCOPYLBR COPYEXT"cpy,cbl" RWHARDPAGE PERFORM-TYPE"OSVS"

NOOUTDD INDD NOTRUNC HOSTARITHMETIC NOSPZERO INTLEVEL"4"

HOST-NUMMOVE"2" HOST-NUMCOMPARE"2" SIGN"EBCDIC" ASSIGN"EXTERNAL"

NOBOUND REPORT-LINE"256" IBMCOMP

No guarantee will be given for programs compiled with an option list which contradicts the above one.

Note:

The P64 option is not necessary on Micro Focus installations set up to compile to 64-bit mode by default.

Detailed processing

Overview

When the Cobol Converter starts, it reads and checks the various configuration files, starting with the main one. If any inconsistency is detected at this stage, one or more error messages are printed and the converter exits. Otherwise, the converter uses both command-line options and configuration-file options to set its internal parameters, including the list of (source) programs to process. Then it proceeds to process each of these programs in turn; for each of them:

1.

In keeping with the make-like, incremental behavior of the Converter, it checks whether the target program already exists and is up-to-date with respect to the POB file for its corresponding source program. If so, the converter skips to the next program. Otherwise, it continues with the next step.

2.

The POB file for the program is loaded. The converter then checks whether it contains FATAL errors. If so, and unless the force-translation flag is set on the command-line or in the configuration file, it prints out a warning message and skips to the next program. Otherwise, it continues with the next step.

3.

The various transformation rules are then applied on the program AST, in several passes. Each transformation modifies the AST and then updates the program layout accordingly (textual appearance).

4.

The (text of the) resulting AST is then printed out in the target program file. When the beginning of a copy file is encountered, the COPY clause is written to the caller file and the subsequent output is diverted to a new output file for this copy file, in a private directory; if a file with this name already exists in the directory, it is probably because the same copy file is included more than once in the program, and the new file carries a new version number (the existing version is not overwritten). If the copy file was invoked with a REPLACING clause, the effects of the replacements are undone before the file is printed out (see the caveat in Miscellaneous issues regarding interferences between transformations and replacements). When the end of the copy file is reached, output reverts to the caller file. This allows nested copy files to be handled correctly.

5.

If the post-translation file is specified in the configuration file, it is exercised by the post-translator on the main target program file and on all target copy files in the private directory.

6.

Lastly, if the deferred-copy-reconcil clause is not given, either on the command-line or in the configuration file, the copy reconciliation process is applied to the target copy files in the private directory.

The converter can be executed by several concurrent processes at the same time, provided that the deferred-copy-reconcil clause is given either on the command-line or in the configuration file; otherwise, the copy-reconciliation phase of these concurrent processes may run into access conflicts over the "data-base" of final, reconciled copy files, which could lead to corrupted results.
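The first two steps of the per-program processing can be sketched in Python as follows. This is an illustration under stated assumptions, not the converter's implementation: the function names are hypothetical, the up-to-date test is assumed to compare file modification times as make does (so that a program whose target is already up to date is skipped), and the detection of FATAL errors is represented by a caller-supplied function.

```python
import os


def is_up_to_date(target_path, pob_path):
    """True when the target exists and is at least as recent as the POB file."""
    return (os.path.exists(target_path)
            and os.path.getmtime(target_path) >= os.path.getmtime(pob_path))


def select_programs(programs, has_fatal_errors, force_translation=False):
    """Return the names of the programs which must actually be converted.

    programs: iterable of (name, target_path, pob_path) tuples.
    has_fatal_errors: function name -> bool, standing in for the POB check.
    """
    selected = []
    for name, target_path, pob_path in programs:
        if is_up_to_date(target_path, pob_path):
            continue  # step 1: nothing to redo for this program
        if has_fatal_errors(name) and not force_translation:
            continue  # step 2: skipped (the converter prints a warning)
        selected.append(name)
    return selected
```

The remaining steps (transformation, printing, post-translation and copy reconciliation) then apply only to the selected programs.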

Command-line syntax

Refine launcher interface

The Cobol Converter is designed to be run through the refine command, which is the generic Oracle Tuxedo Application Rehosting Workbench launcher and is also used to launch all "big" Oracle Tuxedo Application Rehosting Workbench tools. This launcher handles various aspects of the operation of these tools, such as execution log management and incremental/repetitive operation.

cobol-convert command

Synopsis

$REFINEDIR/refine cobol-convert [ launcher-options… ] \

( -s | -system-desc-file ) system-desc-path \

( -c | -config ) main-config-file-path \

[ other-specific-flags… ] \

( source-file-path | ( -f | -file | -file-list-file ) file-of-files )…

Options

The mandatory options are:

( -s | -system-desc-file ) system-desc-path

Specifies the location of the System description file. As usual for Unix/Linux commands, the given path can be absolute or relative to the current working directory. Note that many other paths used by many of the Rehosting Workbench tools are then derived from the location of this file, including that of the main configuration file (see next option); this makes it easy to run the same command from different working directories.

( -c | -config ) main-config-file-path

Specifies the location of the Main conversion configuration file. The given path can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.

The generic options which define which source programs to process are:

source-file-path

Adds to the work-list the program source file designated by this path. The path must be given as relative to the root directory of the system, $SYSROOT, even if the current working directory is different.

( -f | -file | -file-list-file ) file-of-files

Adds to the work-list the program source files listed in the file designated by this path. The file-of-files itself may be located anywhere, and its path is either absolute or relative to the current working directory. The program source files listed in this file, though, must be given relative to the root directory of the system.

You can give as many individual programs and/or files-of-files as you wish. The work-list is built when the command line is analyzed by the Cobol Converter, see the detailed description above.

The optional specific flags or options are:

-dcrp or -deferred-copy-reconcil

Has the same effect as the deferred-copy-reconcil clause of the configuration file, namely to not run the copy reconciliation process incrementally after converting each program. Only with this clause or flag can the Cobol converter run in multiple concurrent processes.

-tpe extension or -target-program-extension extension

This option has the same effect as the configuration-file clause of the same name, and overrides it when given.

-tce extension or -target-copy-extension extension

This option has the same effect as the configuration-file clause of the same name, and overrides it when given.

-keep or -keep-same-file-names

This option has the same effect as the configuration-file clause of the same name, and overrides it when given.

-force or -force-translation

This option has the same effect as the configuration-file clause of the same name, and overrides it when given.

-cics or -activate-cics-rules

This option has the same effect as the configuration-file clause of the same name, and overrides it when given.
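As an illustration of how the options combine, the following Python sketch assembles command lines for several concurrent converter processes, each with the -dcrp flag as required for concurrent execution. The function names and the round-robin split of the work-list are assumptions made for the example; only the flags documented above are used.

```python
def build_command(refine_dir, system_desc, config, sources, deferred=False):
    """Assemble one cobol-convert command line as a list of arguments."""
    command = [f"{refine_dir}/refine", "cobol-convert",
               "-s", system_desc, "-c", config]
    if deferred:
        command.append("-dcrp")  # mandatory when running concurrent processes
    return command + list(sources)


def split_work(sources, workers):
    """Deal the source files round-robin into one work-list per worker."""
    return [sources[i::workers] for i in range(workers)]


def build_concurrent_commands(refine_dir, system_desc, config, sources, workers):
    """One command per non-empty work-list, all with deferred reconciliation."""
    return [build_command(refine_dir, system_desc, config, part, deferred=True)
            for part in split_work(sources, workers) if part]
```

Each resulting argument list can be handed to a process launcher such as subprocess.Popen; remember that the source file paths must be given relative to the root directory of the system.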

Repetitive and incremental operation

Even with the powerful computing platforms easily available nowadays, processing a complete asset using the Rehosting Workbench remains a computing-intensive, long-running, memory-consuming task. The Workbench tools are hence designed to be easily stopped and restarted and, thanks to a make-like mechanism, not to repeat any piece of work which has already been done. This allows efficient operation in all phases of a migration project.

Initial processing: repetitive operation

In the initial phase, when starting with a completely fresh asset and up to the end of the first conversion-translation-generation cycle of a stable asset, the make-like mechanism is used to allow repetitive operation, as follows:

1.

When the Cobol Converter starts, it begins by studying the current state of the asset (source files and target files such as the target program files) and determining what work remains to do to reach a complete and consistent set of results. It then undertakes this work, producing more and more result files.

2.

As the volume of processed files grows, the Rehosting Workbench process consumes more and more memory.

3.

Regularly, the tool checks whether the available physical memory drops below the threshold set by the minimum-free-ram-percent option in the system description file.

•

If the work to be performed is completed before running out of memory, the process stops for good.

•

Otherwise, the process stops but restarts immediately, after memory is freed. Going back to step 1 above, there is less work to do, so that the process eventually terminates.

This mode is particularly well suited for tools or commands which operate globally on the whole asset such as the Cataloger, but it is also useful for component-wise tools such as the Cobol Converter. This is the normal mode of operation for the Rehosting Workbench tools and there is nothing specific to choose it.

Changes in the asset: incremental operation

The Cobol converter knows the dependencies between the various components (main program files) and associated result files (POB files, target program files). Using this information, it is able to react incrementally when some change occurs in the asset. For example, when a Cobol source file is added, modified or removed, the Cataloger re-parses the affected programs, and then the Cobol converter re-converts only those. Again, this is the normal mode of operation for the Rehosting Workbench tools and there is nothing specific to choose it.

1

In this context, matching simply means that the two blocks of lines must be identical when you reduce each sequence of spaces in both of them to a single space. This is basically "identical" with a little flexibility added.