COBOL Converter

The role of this tool is to convert COBOL programs running on the source platform (z/OS, IBM COBOL dialect) into COBOL programs running on the target platform (UNIX or Linux, Micro Focus COBOL dialect or COBOL-IT) while preserving the behavior of the application. The conversion is performed in the context of other components translated or generated by the other Oracle Tuxedo Application Rehosting Workbench tools.
The purpose of this document is to describe precisely all the features of the COBOL Converter.
This chapter contains the following topics:
Overview of the COBOL Converter
Scope
The Refine COBOL converter handles the following transformations in a single pass:
The resulting programs can be compiled and run on the target platform with the same behavior as on the source platform, except in some cases detailed in Scope.
Inputs
The COBOL converter takes as input:
Outputs
It produces as output:
Conversion Phases
The COBOL conversion process is logically divided into two phases:
The individual conversion phase can run concurrently on several programs but, since the copy-reconciliation phase updates the global copy file base, it must run as a single process, possibly incrementally. This dictates the possible execution modes of the COBOL converter; see Command-Line Syntax for more details.
Restrictions and Limitations
By definition, the Rehosting Workbench COBOL Converter accepts only those programs accepted by the Rehosting Workbench COBOL Cataloger, but imposes no further restriction on entry.
The resulting programs can be compiled and run on the target platform with the same behavior as on the source platform, except for the following potential pitfalls for which we take no responsibility:
Use of COMP-5 Type on Linux Platforms
The Oracle Tuxedo Application Rehosting Workbench COBOL converter translates "portable" binary integer types (BINARY, COMP, COMP-4) to the native binary type COMP-5. This ensures compatibility with sub-programs written in C, such as those in the transaction processing framework (see the CICS section of the Oracle Tuxedo Application Rehosting Workbench Reference Guide), and improves execution performance. It may cause problems when the target platform does not have the same "endianness" as the source platform, in particular on Linux and Intel platforms (the Intel processor line is little-endian whereas the zSeries processor is big-endian; most other processors, such as IBM pSeries and HP-RISC, are also big-endian). In this case, the order of bytes in a binary variable is reversed with respect to the source platform. This can lead to different behavior when such a binary variable is redefined by a character (PIC X) variable and this redefinition is used to access the individual bytes of the binary variable. For example:
Listing 10‑1 Binary Field Manipulation Example
       WORKING-STORAGE SECTION.
       01  FILLER.
           02  BINVAR  PIC S9(9) COMP.
           02  CHARVAR REDEFINES BINVAR PIC X(4).
       PROCEDURE DIVISION.
           ...
           MOVE ... TO BINVAR
           IF CHARVAR(1:1) = ... THEN ...
 
On a big-endian machine such as the z/OS hardware, CHARVAR(1:1) contains the most significant (higher-order) byte of BINVAR. On a little-endian machine, with the same code, CHARVAR(1:1) contains the least significant (lower-order) byte of BINVAR; this is a change of behavior and will probably lead to different observable results. The Rehosting Workbench COBOL Converter is unable to detect and fix all occurrences of this situation (the example above is "obvious", but there exist many far more complex cases); these must be handled manually.
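The byte reversal described above can be reproduced outside COBOL. The following is a minimal Python sketch, not part of the Rehosting Workbench; the function name first_byte and the sample value are illustrative assumptions. It packs the same 32-bit integer in both byte orders and returns the byte that a PIC X(4) redefinition would see at position (1:1).

```python
import struct

def first_byte(value: int, byteorder: str) -> int:
    """Return the first storage byte of a signed 32-bit integer,
    i.e. what CHARVAR(1:1) would contain for the given byte order."""
    fmt = ">i" if byteorder == "big" else "<i"
    return struct.pack(fmt, value)[0]

value = 0x01020304
# Big-endian (z/OS): first byte is the most significant one.
print(hex(first_byte(value, "big")))
# Little-endian (Intel): first byte is the least significant one.
print(hex(first_byte(value, "little")))
```

Any test on CHARVAR(1:1) that was written with the big-endian layout in mind therefore examines a different byte on an Intel target.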
Use of COMP-5 Type and the TRUNC Compiler Option
As mentioned in the previous paragraph, the Rehosting Workbench COBOL converter translates portable binary integer types (BINARY, COMP, COMP-4) to the native binary type COMP-5. In addition to endianness problems, this may cause another kind of difference of behavior for applications which were compiled with the (default) TRUNC(STD) option on the source platform – this option corresponds to the TRUNC option of Micro Focus COBOL or the BINARY-TRUNCATE option of COBOL-IT. Indeed, both on the source and on the target platforms, the portable binary types obey this option whereas the native type does not. In general, the probability of observing a real difference of behavior is very low, because in general, binary-integer variables are used to hold "control" values (loop counters, array indices, etc.) rather than applicative values. In any case, if differences of behavior are observed, it is up to the Rehosting Workbench user to deal with them, either by accepting them or by manually correcting them, for instance by returning a few selected variables to their original binary type.
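As a rough illustration of the difference, the sketch below models the two storage rules in Python. It is a simplified model under stated assumptions, not the compilers' actual code: store_trunc_std keeps only the decimal digits allowed by the PICTURE clause, and store_native keeps whatever fits in the two's-complement field. Both function names are hypothetical.

```python
def store_trunc_std(value: int, digits: int) -> int:
    """Model TRUNC(STD): keep only the low-order decimal digits
    allowed by the PICTURE clause (e.g. 4 digits for PIC S9(4))."""
    sign = -1 if value < 0 else 1
    return sign * (abs(value) % 10 ** digits)

def store_native(value: int, size_bytes: int) -> int:
    """Model a native binary field: the value wraps to the
    two's-complement capacity of the field, not to decimal digits."""
    bits = 8 * size_bytes
    wrapped = value & ((1 << bits) - 1)
    if wrapped >= 1 << (bits - 1):
        wrapped -= 1 << bits
    return wrapped

# Moving 12345 into a PIC S9(4) field stored on 2 bytes:
print(store_trunc_std(12345, 4))  # portable binary type under TRUNC(STD)
print(store_native(12345, 2))     # native binary type
```

A difference appears only when a value exceeds the decimal capacity of the PICTURE, which is why loop counters and indices are rarely affected.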
EBCDIC-to-ASCII Conversion Issues
For reasons of efficiency and compatibility with native utility programs on the target platform—for instance, simply browsing through a data file on the terminal—one of the fundamental design choices for the migration performed by the Rehosting Workbench is to convert textual (alphabetic) data from the native character set of the source platform (EBCDIC, or one of its variants) into the native character set of the target platform (ASCII, or one of its variants). This common-sense decision however has important consequences on the migration process:
In most cases, if you comply correctly with these directives, the resulting application will run smoothly. There is one issue however, for which no efficient solution can be found: the collating sequences of the EBCDIC and ASCII character sets are not quite the same, and this may lead to different behavior in sorting and string comparisons. In most cases, there is no problem, because you sort or compare "homogeneous" data such as names (alphabetic) or dates (numeric); only special characters such as accented letters may sort a bit differently but still satisfactorily. However, in cases when you sort or compare data which contains mixed letters and digits, you may find differences of behavior, because letters sort before digits in EBCDIC and after digits in ASCII. One typical example is when you compute a key for some type of data (account number, etc.) using both digits and letters. The COBOL converter cannot handle such issues because these are dynamic issues related to the contents of COBOL variables, not static issues related to their declarations.
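The collating difference for mixed keys can be observed with a short Python sketch. It assumes code page 037 (Python codec cp037) as a representative EBCDIC variant; the actual code page of a given asset may differ, and the sample keys are made up for illustration.

```python
# Hypothetical keys mixing letters and digits.
keys = ["A1", "1A", "B2", "2B"]

# Sort by the byte values each character set assigns to the keys.
ascii_order = sorted(keys, key=lambda k: k.encode("ascii"))
ebcdic_order = sorted(keys, key=lambda k: k.encode("cp037"))

print(ascii_order)   # digits sort before letters
print(ebcdic_order)  # letters sort before digits
```

The same records thus come out of a sort, or compare in an IF statement, in a different order on the two platforms even though each key was converted correctly.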
Literal Constants: Characters or Numbers?
As mentioned above, string or character literals in COBOL programs, including hexadecimal string literals, are subject to EBCDIC-to-ASCII conversion. This is legitimate when these literals denote texts or pieces of text. Sometimes however, such constant values denote (numeric) codes such as file status codes, condition codes, CICS-related values, etc. In this case, it is generally not appropriate to apply EBCDIC-to-ASCII conversion to these values. However, the COBOL converter, like any automatic tool, cannot reliably "guess" the semantic nature of a COBOL variable or literal, so it cannot handle these exceptions by itself; this will have to be done manually using post-translation (see post-translation-file Clause).
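To show what the conversion does to a hexadecimal literal, here is a small Python sketch. It is only an approximation: the converter uses the table from the hexa-map-file, whereas this sketch assumes the standard cp037 codec, and the function name convert_hex_literal is hypothetical.

```python
def convert_hex_literal(hex_digits: str, ebcdic_codec: str = "cp037") -> str:
    """Re-encode the bytes of a hexadecimal literal from EBCDIC to
    ASCII/Latin-1 and return the new hexadecimal digits."""
    text = bytes.fromhex(hex_digits).decode(ebcdic_codec)
    return text.encode("latin-1").hex().upper()

print(convert_hex_literal("C1C2C3"))  # the EBCDIC text "ABC"
print(convert_hex_literal("F0F0"))    # the EBCDIC text "00"
```

The function has no way to tell whether the bytes were meant as text; a literal that holds a binary code rather than text is changed in exactly the same way, which is why such cases need a manual post-translation rule.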
Use of Floating-Point Variables
Source floating-point types (COMP-1 and COMP-2) are "translated" to the same types on the target platform. The Micro Focus COBOL compiler and run-time system offer the possibility to use floating-point data (COMP-1 and COMP-2 variables) in either the IBM hexadecimal format or the native (IEEE 754) format. If the NONATIVEFLOATINGPOINT option is set at compile time (which is true by default), then the floating-point format is selected at run-time, depending on the MAINFRAME_FLOATING_POINT environment variable and/or the mainframe_floating_point tunable:
MAINFRAME_FLOATING_POINT environment variable set, or mainframe_floating_point tunable set to true: the IBM format will be used.
MAINFRAME_FLOATING_POINT environment variable unset, and mainframe_floating_point tunable unset or set to false: the native format will be used.
In the first case, the Micro Focus run-time system will ensure that you will observe no difference of behavior. However, this is at the expense of run-time efficiency, because the handling of this format is done entirely in software, whereas the native format is directly supported by the processor. Furthermore, this format is not directly compatible with the Oracle floating-point data types (BINARY_FLOAT and BINARY_DOUBLE) and cannot be converted to other numeric types by the Oracle engine; in fact, the only thing you can do with it is store it in opaque columns (RAW(4) and RAW(8), respectively), which forbids using such values in SQL code.
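To make the difference between the two formats concrete, the following Python sketch decodes a 4-byte IBM hexadecimal value (sign bit, 7-bit excess-64 exponent in base 16, 24-bit fraction) and compares it with the IEEE 754 encoding of the same number. It is an illustration only; the function name ibm_single_to_float is an assumption and this is not how the Micro Focus run-time system is implemented.

```python
import struct

def ibm_single_to_float(raw: bytes) -> float:
    """Decode a 4-byte IBM hexadecimal floating-point value (COMP-1)."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    word = int.from_bytes(raw, "big")
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 24) & 0x7F   # excess-64, power of 16
    fraction = word & 0xFFFFFF       # 24-bit fraction, value in [0, 1)
    return sign * (fraction / 2 ** 24) * 16.0 ** (exponent - 64)

ibm_bytes = bytes.fromhex("42640000")
print(ibm_single_to_float(ibm_bytes))          # value held by the IBM bytes
print(struct.pack(">f", 100.0).hex().upper())  # IEEE 754 bytes for 100.0
```

The same number has different byte patterns in the two formats, so data files containing floating-point fields must be converted consistently with the format chosen for the programs.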
In consequence, we recommend that, at migration time or later, you consider using the native IEEE 754 floating-point format, which is more efficient, more portable (it is defined by an international standard) and more compatible, if only with Oracle. Of course, because:
1.
2.
3.
you will probably observe differences of behavior between the original and migrated applications. However, in our opinion, these differences of behavior are largely acceptable, if you keep in mind that floating-point arithmetic is only an approximation of the mathematical exactness: you will simply get a different approximation on the target machine than on the source one…
Note:
On the source platform, COMP-1 and COMP-2 types have the same representation range, from about 10^-79 to 10^76, whereas on the target platforms (which all natively support the IEEE 754 format), the range for COMP-1 is from about 10^-45 to 10^38 and the range for COMP-2 is from about 10^-323 to 10^308. So the tradeoff between range and precision differs between the two platforms.
When the same computations are performed on ranges available on both the source and target platforms, the relative error between the observed results (as printed by DISPLAY) is always less than 10^-6 when using COMP-1 variables and less than 10^-14 when using COMP-2 variables. This is not a definitive proof that everything works fine, but it is at least an encouraging indication.
Given these results, it seems that one can always reproduce the same behavior on the target as on the source, up to insignificant approximations, possibly by replacing some COMP-1 variables by COMP-2 ones.
Note:
If you decide to go with the native IEEE 754 format, we recommend that you set the NATIVEFLOATINGPOINT compiler option, which forces the use of this format at compile-time, regardless of run-time options and tunables. Thus, you will save the run-time format tests.
REWRITE Operations on LINE SEQUENTIAL Files
By default, data files which are SEQUENTIAL on the source platform are translated into LINE SEQUENTIAL files on the target platform, to be more "usable". In general, this is a good choice and such files are well supported by the target COBOL system. However, there is a catch: since such files are inherently of variable record size, a REWRITE operation may cause unpredictable results and differences of behavior (see the Micro Focus and COBOL-IT documentation). If you are not sure that REWRITE operations on a given SEQUENTIAL file would always succeed if that file is turned into a LINE SEQUENTIAL one, we advise you to keep it purely SEQUENTIAL; this can be done by inserting its description in the configuration sub-file referenced by the pure-seq-map-file clause below.
To ease the handling of this problem, in a future version, the Rehosting Workbench cataloger will produce the list of SEQUENTIAL logical files which incur a REWRITE operation.
Pointer Manipulation
Pointer Size Changes: Beware of Redefinitions
On the source platform, a variable of type POINTER occupies 4 bytes in memory (32 bits); on all the supported target platforms, based on 64-bit operating systems, such a variable occupies 8 bytes. This may lead to various kinds of differences of behavior for which we take no responsibility:
Technical redefinitions: if a POINTER variable is directly redefined by a PIC X(4) or PIC S9(9) COMP variable used to manipulate the representation of the pointer values, the redefining variable and the code dealing with it will have to be manually rewritten. However, we strongly discourage such machine-dependent "hacks".
Structure alignments: if a POINTER variable is part of a structure containing variants (redefinitions), and if the different variants (sub-structures) are designed so that one particular field of one variant must be aligned with (have the same location as) some other field in some other variant, then this property must be maintained after the POINTER variable changes size: compensation fillers must be inserted, etc. Again, this must be handled manually. Note that such intended alignments must be maintained across redefinitions, but also across MOVEs to other structures.
Structure size: if a POINTER variable is part of a structure which is moved to some unstructured PIC X(…) variable which was big enough to hold the structure before the POINTER variable changes size, then you must make sure that it is still the case after the change.
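The structure-size pitfall can be checked with simple arithmetic. The Python sketch below uses a hypothetical record layout (the field names and sizes are invented for illustration) and assumes no SYNCHRONIZED clause, so that fields are laid out contiguously; it computes field offsets and the total size for 4-byte and 8-byte pointers.

```python
# Hypothetical layout: size None marks a POINTER field.
FIELDS = [("REC-KEY", 10), ("REC-NEXT", None), ("REC-NAME", 20)]

def field_offsets(fields, pointer_size):
    """Return {field name: byte offset} for a contiguous layout."""
    offsets, position = {}, 0
    for name, size in fields:
        offsets[name] = position
        position += pointer_size if size is None else size
    return offsets

def structure_size(fields, pointer_size):
    """Return the total size in bytes of the structure."""
    return sum(pointer_size if size is None else size for _, size in fields)

for pointer_size in (4, 8):  # source platform, then 64-bit target
    print(pointer_size, structure_size(FIELDS, pointer_size),
          field_offsets(FIELDS, pointer_size))
```

In this layout, a PIC X(34) receiving area that held the whole structure on the source platform is 4 bytes too short on the target, and every field after the pointer moves by 4 bytes, which is what breaks alignments across redefinitions.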
Linkage-Section Arguments with NULL Address
On both the source and target platforms, a program parameter (defined in the Linkage Section and listed in the USING clause of the procedure division) which is not actually passed by the caller, either because an explicit OMITTED item is passed instead or because the caller passes fewer arguments than the callee expects, appears to have a NULL address in the callee. So it is quite legal, and in fact recommended, to check whether the ADDRESS OF some parameter is NULL before accessing the value of this parameter.
However, when the callee fails to check the parameter address and the actual address is NULL, the source and target platforms may behave differently. For instance:
On z/OS and AIX, NULL is address 0 and this is considered as a legal address, so when the parameter is accessed, you get whatever is stored at that address (possibly with unpredictable results).
On Linux however, although NULL is also address 0, this is not considered as a legal address, so when the parameter is accessed, the program crashes.
It is not possible to automatically handle this situation and the associated differences of behavior, because even if the converter could insert address checks, what should it do when the test fails? Furthermore, the set of subprograms and parameters which are really affected by this problem is a very small minority of all subprograms and parameters and it would be ugly to insert such address checks for all of them. This will have to be handled manually, possibly using post-translation.
See also the discussion of the STICKY-LINKAGE compiler option below.
Representation of the NULL Pointer Value
The representation of the NULL pointer value may vary from one platform to another, in particular between the source and target platforms – if only because they don't have the same size, like every other pointer value. In consequence, every program which assumes a specific representation for this value, for instance by "casting" it to or from some binary integer value, may have a different behavior from one platform to another. The COBOL converter cannot handle this issue by itself, automatically, and it will have to be handled manually. Anyway, we strongly discourage such machine-dependent "hacks".
Description of the Input Components, Prerequisites
The input components are all the COBOL programs in the asset, after they have been parsed by the cataloger. In fact, the COBOL Converter loads the POB files for the programs, not their source files. In addition to the restrictions imposed by the cataloger (no nested programs, etc.; see Cataloger), the following rules must be respected before attempting the COBOL conversion:
Description of the Configuration Files
System Description File
The system description file describes the location, type and possible dependencies of all the source files in the asset to process. As such, it is the key by which the cataloger, but also all of the Rehosting Workbench tools, including the COBOL Converter, can access these source files and the corresponding components.
Note:
Because of the need to have COBOL source files with the numbering area and comment area C removed, option Cobol-left-margin must be set to 1 (one) and option Cobol-right-margin must be set to 66; these are the default values.
Main Conversion Configuration File
This file is given to the COBOL converter using the -c or -config mandatory command-line option. It defines various "scalar" parameters influencing the conversion and points to subordinate files containing "large" configuration data, such as renaming files.
General Syntax
The contents of the main conversion configuration file are a free-format, unordered list of clauses, each beginning with a keyword and ending with a period. Some clauses take one or more arguments, others are boolean clauses with no argument. The keywords are case-insensitive symbols; the arguments are integers, symbols or (case-sensitive) strings. Spaces, new lines, etc., are not significant. Comments can be written in the configuration file in two ways:
target-dir Clause
Syntax
Target-dir : dir-path .
This clause specifies the location of the directory that will contain the complete hierarchy of target files, for both programs and copy files. If there is a source program A/B/name.ext in the root directory of the asset (as specified in the system description file), then the corresponding target program will be located as A/B/name.ext in this target directory (possibly with a different file extension, see below). The same mechanism is used for copy files, except that the target path will be Master-copy/A/B/name.ext (or possibly a different file extension). The Master-copy directory is related to the copy reconciliation process, see Command-Line Syntax.
Sql-rules Clause
Syntax
Sql-rules : target-sql-syntax.
This clause specifies the target SQL syntax. Its value can be oracle or none. If the value is none, the SQL code in the source files is not translated; it is transferred unchanged to the target components. The default value of this clause is oracle, so it is not necessary to set sql-rules to oracle explicitly in the configuration file.
keep-same-file-names, target-program-extension and target-copy-extension Clauses
Syntax
keep-same-file-names.
target-program-extension : extension . (or) tpe : extension .
target-copy-extension : extension . (or) tce : extension .
These clauses direct how the file extensions for the converted programs (main source files) and copy files are determined:
If the keep-same-file-names clause is given, the converted programs and copy files will have the same file extensions as the original files in the source asset (as cataloged). The other clauses, if given, will be ignored.
If the target-program-extension clause is given, then the converted programs will have the given file extension.
If the target-copy-extension clause is given, then the converted copy files will have the given file extension.
By default, the converted programs will have the file extension cbl and the converted copy files will have file extension cpy.
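The naming rules of the target-dir and extension clauses can be summarized by the following Python sketch. It is an interpretation of the rules stated above, not the converter's code; the function name target_path and its parameters are hypothetical, and the returned path is relative to the directory given by target-dir.

```python
from pathlib import PurePosixPath

def target_path(source_rel: str, is_copy: bool,
                keep_same_file_names: bool = False,
                program_ext: str = "cbl", copy_ext: str = "cpy") -> str:
    """Map a source path (relative to the asset root) to the path of the
    converted file, relative to the target directory."""
    path = PurePosixPath(source_rel)
    if not keep_same_file_names:
        # keep-same-file-names overrides the two extension clauses.
        extension = copy_ext if is_copy else program_ext
        path = path.with_suffix("." + extension)
    if is_copy:
        # Copy files are placed under the Master-copy directory.
        path = PurePosixPath("Master-copy") / path
    return str(path)

print(target_path("A/B/name.cob", is_copy=False))
print(target_path("A/B/name.cop", is_copy=True))
print(target_path("A/B/name.cob", is_copy=False, keep_same_file_names=True))
```

Passing program_ext or copy_ext corresponds to giving the target-program-extension or target-copy-extension clause; leaving them at their defaults corresponds to omitting the clauses.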
Verbosity-Level Clause
Syntax
verbosity-level : level .
This clause specifies the amount of detail which the COBOL converter writes to the execution log. The default value of 2 is fairly verbose; higher values are even more verbose, and value 1 only displays important (error) messages.
deferred-copy-reconcil Clause
Syntax
deferred-copy-reconcil.
or
deferred-crp.
or
dcrp.
This clause specifies that the copy-reconciliation process (crp) is to be deferred until after the conversion is completed; this allows COBOL conversion to run in multiple concurrent processes. By default, in the absence of this clause, the copy-reconciliation process is executed incrementally immediately after each program is converted, which mandates single-process execution. See the copy-reconciliation process below for more details.
force-translation Clause
Syntax
force-translation.
This clause directs the COBOL converter to (try to) convert even those programs that contain FATAL errors although without any guarantees: the converter may produce incorrect results or even crash. By default, in this case, the converter refuses to work on this program and skips to the next one.
rename-copy-map-file Clause
Syntax
rename-copy-map-file : file-path .
This clause specifies the location of the subordinate configuration file containing information to rename copy files, see the copy-renaming Configuration File below. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
rename-call-map-file Clause
Syntax
rename-call-map-file : file-path .
This clause specifies the location of the subordinate configuration file containing information to rename sub-programs and their calls, see the Call-Renaming Configuration File. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
post-translation-file Clause
Syntax
post-translation-file : file-path .
This clause specifies the location of the subordinate configuration file containing the description of manual transformations to apply after the Rehosting Workbench Converter, see the Post-Translation Configuration File. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
on-size-error-call Clause
Syntax
on-size-error-call: proc-name .
This clause specifies the name, as a symbol, of the procedure to call to cause a definite termination of the program. This name is used to force termination in situations in which the IBM compiler would force termination but not the target compiler, such as size errors in arithmetic statements. The default name is .ABORT.
hexa-map-file Clause
Syntax
hexa-map-file : file-path .
This clause specifies the location of the subordinate configuration file containing the EBCDIC-to-ASCII transformation to apply to characters in hexadecimal form, see the Hexadecimal Conversion Configuration File. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
conv-ctrl-file Clause and alt-key-file Clause
These two clauses go together.
Syntax
conv-ctrl-file : file-path . (or) conv-ctrl-list-file : file-path .
alt-key-file : file-path .
These clauses specify the location of the two subordinate configuration files containing information regarding file-to-Oracle conversion. These files are generated by the Rehosting Workbench File-to-Oracle conversion tool, as respectively the Conv-ctrl-file or the Conv-ctrl-list-file and the Alt-key file. See File-to-RDBMS Configuration Files.
Only one of the first two clauses may be given: either the conv-ctrl-file clause or the conv-ctrl-list-file clause, but not both.
The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
RDBMS-conversion-file Clause
Syntax
RDBMS-conversion-file : file-path .
This clause specifies the location of the top-level subordinate configuration file containing information about relational DBMS conversion (from DB2 to Oracle). See the RDBMS-conversion Configuration Files for more details. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
keywords-file Clause
Syntax
keywords-file : file-path .
This clause specifies the location of the subordinate configuration file containing information to rename COBOL identifiers which happen to be keywords or reserved words in the target COBOL dialect, see the keywords File for more details. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
accept-date and accept-day Clauses
Syntax
accept-date: date-entry-name .
accept-day: day-entry-name .
These clauses specify sub-program names to replace ACCEPT … FROM DATE and ACCEPT … FROM DAY statements. For instance, the statement:
ACCEPT MY-DATE FROM DATE
would be replaced by:
CALL "DATE-ENTRY-NAME" USING MY-DATE BY VALUE LENGTH OF MY-DATE
This allows more control and more flexibility on how programs acquire their current date. For instance, during regression tests, it is necessary to run migrated programs with the same current date as when the source programs were run; these sub-programs (to be supplied by the Rehosting Workbench users, according to their requirements) will allow this.
If either of these clauses is not specified, the corresponding statements are not transformed. You can use the target COBOL parameters (for example the current_day, current_month, and current_year parameters of the Micro Focus COBOL run-time system) to control the date returned by the ACCEPT statements; see the Micro Focus and COBOL-IT documentation.
sql-stored-procedures-file Clause
Syntax
sql-stored-procedures-file: file-path .
This clause specifies the location of the subordinate configuration file containing the list of DB2 stored procedures called directly from COBOL, see the stored-procedure File for more details. The file path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
remove-sql-qualifier Clause
Syntax
remove-sql-qualifier.
This clause enables the transformation rule which removes the schema qualifier from every SQL identifier which has one. The resulting program will hence rely on implicit schema qualification.
activate-cics-rules Clause
Syntax
activate-cics-rules.
This clause forces the COBOL converter to apply to any program processed in the current execution the rules which normalize the EXEC CICS statements and prepare the program for use with the Oracle Tuxedo Application Runtime for CICS environment, including the CICS preprocessor.
Notes:
There exists a command-line option of the same name (see cobol-convert Command) which has the same effect as this clause, and which is more flexible to use. So we believe that the configuration-file clause will be seldom used, except perhaps in projects in which the TP and batch parts of the asset are well identified and strictly separated in the migration project.
Whether this clause is given or not, the above rules will be applied anyway to every program which contains one or more EXEC CICS statements. So this clause (or the equivalent command-line argument) will be effective only for subprograms which are used in a CICS environment (implicit COMMAREA, etc.) but do not perform CICS operations themselves.
pure-seq-map-file Clause
Syntax
pure-seq-map-file: file-path .
or
purely-sequential-map-file: file-path .
This clause specifies the location of the subordinate configuration file containing the list of SEQUENTIAL logical files which are to be kept (record) SEQUENTIAL rather than converted to LINE SEQUENTIAL. See purely-sequential Configuration File for more details. The file-path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
dont-print-what-string Clause
Syntax
dont-print-what-string .
When present, this clause specifies that the what-string containing conversion timestamp and converter version information, which the converter normally inserts at the beginning of every converted file, is not to be printed out in this execution. This will be seldom used, unless you really want to hide the fact that your application is migrated using the Rehosting Workbench!
remove-empty-copies Clause
Syntax
remove-empty-copies . (or) rec .
When present, this clause specifies that COPY directives referencing copy files which no longer contain useful COBOL code after conversion are to be commented out; by default, these directives remain active. This applies for instance to copy files defining whole FD paragraphs for files which migrate into database tables.
sql-return-codes-file Clause
Syntax
sql-return-codes-file: file-path .
This clause specifies the location of the subordinate configuration file containing additional pairs of equivalent DB2 & Oracle SQLCODE values. See the sql-return-codes Configuration File for more details. The file-path is given as a string. It can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
copy-renaming Configuration File
This file is associated with the rename-copy-map-file Clause. Its contents are in CSV format, with the semicolon character used as separator. Each line is in the form:
original-copy-name;original-library-name;new-copy-name.
All names are COBOL-like, case-insensitive symbols. The meaning of such a line is that, when the directive:
COPY ORIGINAL-COPY-NAME OF ORIGINAL-LIBRARY-NAME { REPLACING … }
is encountered in a program, it is replaced by:
COPY NEW-COPY-NAME { REPLACING … }
(Note that library names are not used on the target platform because they are inconvenient; it is much better to use search paths, see the COBCPY environment variable.) When the original-library-name field is empty, the rule is to replace unqualified directives of the form:
COPY ORIGINAL-COPY-NAME { REPLACING … }
The same renaming applies to the copy file itself: when the Converter prints out, in the target private copy directory, the copy file referenced by this directive (see below for more information about copy reconciliation), it is printed with the new name.
When the rename-copy-map-file Clause is not present, or when this file is empty, no copy renaming takes place. It is an error when the file cannot be found or read, or when the same original-copy-name;original-library-name combination is associated with different new-copy-names in different lines of the file. In this case, the converter stops with an error message and does not convert any programs. Note however that it does not check whether two different copy files in the same directory are renamed to the same target file. In principle, this would be handled gracefully by the copy reconciliation process, but without guarantee.
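The file format and the consistency rule above can be sketched in Python as follows. This is an illustrative reader, not the converter's implementation; the function names are hypothetical, and it assumes that each line holds exactly three semicolon-separated fields with no trailing period.

```python
import csv
import io

def load_copy_renaming(text: str) -> dict:
    """Parse copy-renaming lines into {(copy, library): new copy name}.
    Names are case-insensitive, so they are normalized to upper case."""
    mapping = {}
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"expected 3 fields, got {row!r}")
        copy_name, library, new_name = (field.strip().upper() for field in row)
        key = (copy_name, library)
        if mapping.get(key, new_name) != new_name:
            # Same copy/library pair mapped to two different new names.
            raise ValueError(f"conflicting renaming for {key}")
        mapping[key] = new_name
    return mapping

def renamed_copy(mapping: dict, copy_name: str, library: str = "") -> str:
    """Return the target copy name, or the original name if no rule applies."""
    return mapping.get((copy_name.upper(), library.upper()), copy_name)

SAMPLE = "OLDCPY;LIB1;NEWCPY\ndates;;DATES2\n"
rules = load_copy_renaming(SAMPLE)
print(renamed_copy(rules, "OLDCPY", "LIB1"))
print(renamed_copy(rules, "DATES"))
```

Like the converter, this sketch does not check whether two different copy files end up renamed to the same target name.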
Call-Renaming Configuration File
This file is associated with the rename-call-map-file Clause described above. Its contents are in CSV format, with the semicolon character as separator. Each line is in the form:
original-call-name;new-call-name.
All names are COBOL-like, case-insensitive symbols. The meaning of such a line is that, when the statement:
CALL "ORIGINAL-CALL-NAME" { USING … }
is encountered in a program, it is replaced by:
CALL "NEW-CALL-NAME" { USING … }
The converter also attempts to rename literal strings which are "associated" with variables used in dynamic calls using direct constructs (VALUE, MOVE, etc.). For obvious reasons, it cannot handle truly dynamic calls in which the callee name is "computed" using complex manipulations (STRING, etc.), transported through opaque containers or obtained from outside the caller program (e.g., read from a file or passed as parameter); such situations must be handled manually.
The same renaming applies to called sub-programs and their entry points: when the converter prints out in the target directory, a program whose base name matches one of the original names listed in the renaming file, it is printed with the corresponding new name. Similarly, for ENTRY statements whose argument is a string matching one of the original names, the string is transformed into the new name.
When the rename-call-map-file Clause clause is not present, or when this file is empty, no call renaming takes place. It is an error when the file cannot be found or read, or when the same original-call-name is associated with different new-call-names in different lines of the file. In this case, the converter stops with an error message and does not convert any programs. Note however that it does not check whether two different subprograms in the same directory are renamed to the same target file.
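The consistency rule above can be illustrated with a short Python sketch. This is not the converter's implementation; the function names load_rename_map and find_target_collisions are hypothetical, and the choice to normalize names to upper case and to skip blank lines are assumptions made for the illustration. The second function shows the check that the converter does not perform (two originals renamed to the same target), which you may want to run yourself before a conversion.

```python
import csv
import io


def load_rename_map(text):
    """Parse 'original;new' lines into a dict keyed by upper-cased original name.

    Raises ValueError when one original name is mapped to two different new
    names, mirroring the error condition described in the text.
    """
    mapping = {}
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if not row or not any(cell.strip() for cell in row):
            continue  # assumption: blank lines are tolerated
        if len(row) != 2:
            raise ValueError(f"expected 2 fields, got {row!r}")
        # Names are case-insensitive, so compare them in upper case.
        original, new = (cell.strip().upper() for cell in row)
        previous = mapping.setdefault(original, new)
        if previous != new:
            raise ValueError(f"{original} is mapped to both {previous} and {new}")
    return mapping


def find_target_collisions(mapping):
    """Return the new names shared by several original names."""
    by_target = {}
    for original, new in mapping.items():
        by_target.setdefault(new, []).append(original)
    return {new: sorted(names) for new, names in by_target.items() if len(names) > 1}
```

The same two checks apply, with one extra key field, to the copy-renaming file.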
Post-Translation Configuration File
This file is associated with the post-translation-file Clause. Its contents are a sequence of rules with the following syntax:
rule rule_name
filter [
(+|-)program_name_regexp
]
transform [
source_lines_block
]
into [
target_lines_block
]
The semantics of such a rule are simple: if, in a program whose (base) name matches any of the "positive" program_name_regexp's but none of the "negative" ones, a block of lines matching source_lines_block is encountered, it is replaced by target_lines_block. rule_name is used in the comment associated with the application of the transformation. See the post-translator appendix below for more details.
The post-translation file may contain as many rules as desired, in any order (although the behavior of the post-translator is not defined when two source_lines_blocks overlap, or when a source_lines_block and a target_lines_block overlap).
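The filter semantics (any positive match, no negative match) can be sketched in Python as follows. The function name rule_applies is hypothetical, and the sketch assumes Python regular-expression syntax with "search" (substring) matching; the regular-expression dialect and anchoring actually used by the post-translator are not specified in this section. With no positive pattern at all, this sketch selects no program, which is also an assumption.

```python
import re


def rule_applies(program_name, filters):
    """Decide whether a rule applies to a program base name.

    filters is a list of strings such as '+^PAY' or '-TEST$', where the
    leading sign marks a positive or a negative pattern.
    """
    positives = [f[1:] for f in filters if f.startswith("+")]
    negatives = [f[1:] for f in filters if f.startswith("-")]
    matches_positive = any(re.search(p, program_name) for p in positives)
    matches_negative = any(re.search(n, program_name) for n in negatives)
    return matches_positive and not matches_negative
```

For example, with the filters '+^PAY' and '-TEST$', a program named PAYROLL is selected, whereas PAYTEST and INVOICE are not.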
Tip:
Comments start with a sharp sign ("#") and extend to the end of the line; you can insert them anywhere between the rules, between the four clauses in a rule and after the rule name; if you insert such a comment inside a square-bracketed filter or transform or into block, it will be considered as part of the block contents rather than as a comment.
Hexadecimal Conversion Configuration File
This file is associated with the hexa-map-file Clause above. Its contents are an EBCDIC-to-ASCII conversion table to apply to characters in hexadecimal form (characters in textual form are supposed to be converted at the same time as the source file itself). The syntax is simply a CSV file with a semicolon as separator. Each line is in the form:
source-hexa-code;target-hexa-code,
Each hexa-code being written as usual, with two hexadecimal characters (0-9, A-F). The semantics of this conversion table are that if some hexadecimal literal in the source file does not match any source code in this table, it is left as is, unconverted. Such conversion works also on embedded-SQL code. Note that the converter makes no attempt to check the intrinsic consistency of the conversion table (e.g. the fact that no source-hexa-code or no target-hexa-code appears twice), nor the fact that it really describes some EBCDIC-to-ASCII conversion.
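As an illustration only, the following Python sketch applies such a table to the hexadecimal literals of one source line. The function names are hypothetical, and the sketch assumes that the lookup is done byte by byte (two hexadecimal digits at a time) and that literals are written X'..' or X".."; the converter's actual matching granularity is not detailed here. The three sample entries are the standard EBCDIC and ASCII codes for "A", "0" and space.

```python
import re

# Matches X'...' or X"..." with the same quote character on both sides.
HEX_LITERAL = re.compile(r"X(['\"])([0-9A-Fa-f]*)\1")


def load_hexa_map(text):
    """Read 'source-hexa-code;target-hexa-code' lines into a dict."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        source, target = line.split(";")
        table[source.strip().upper()] = target.strip().upper()
    return table


def convert_hex_literals(cobol_line, table):
    """Convert each byte of each hexadecimal literal; unknown bytes stay as is."""
    def convert(match):
        quote, digits = match.group(1), match.group(2).upper()
        pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
        converted = "".join(table.get(pair, pair) for pair in pairs)
        return f"X{quote}{converted}{quote}"

    return HEX_LITERAL.sub(convert, cobol_line)


table = load_hexa_map("C1;41\nF0;30\n40;20\n")
print(convert_hex_literals("MOVE X'C1F0FF' TO WS-A", table))
```

In this run, C1 and F0 are converted while FF, which has no entry in the table, is left unconverted.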
Tip:
How to Generate the hexa-map File
Oracle Tuxedo Application Rehosting Workbench makes available a script which generates the hexa-map file based on the CONVERTMW copy file; see Using the COBOL CONVERTMW.cpy file in the Codeset Conversion chapter.
The script generating the hexa-map file is located in the directory:
REFINEDIR/scripts/
The script name is:
convert-hexa-copy-to-map.sh
Syntax
REFINEDIR/scripts/convert-hexa-copy-to-map.sh convertmw_copy_file
convertmw_copy_file: location of the CONVERTMW.cpy file
The script automatically generates the tr-hexa.map file inside the current directory (the PARAM directory is a better choice). The name of this generated file has to be used as the file-path value of the hexa-map-file attribute.
On normal completion, the script returns a code of 0. Otherwise, see the displayed messages and the content of the tr-hexa.map file.
Error Messages
WBART-1001:
Example: COPY file <filename> not found. Check argument 1.
Explanation: Argument 1 must contain the CONVERTMW COBOL copy file name.
WBART-1003:
Example: bad status returned by awk
Explanation: see messages written into tr-hexa.map file
Messages could be:
too many FILLER in TRANSCODE-[SOURCE | CIBLE]
The filler number id does not contains enough hexa element: num instead of 64
not enough FILLER in TRANSCODE-SOURCE and/or TRANSCODE-CIBLE
File-to-RDBMS Configuration Files
These files are associated with the conv-ctrl-file Clause and alt-key-file Clause. They contain information about file-to-RDBMS conversion, e.g. to define which logical files (FDs) are converted into RDBMS tables (actually, because the physical files they are associated with are converted to these DB tables). Since these files are automatically generated by the Rehosting Workbench File-to-Oracle conversion tool and should not be modified by hand, their contents are not further specified here.
RDBMS-conversion Configuration Files
These files are associated with the RDBMS-conversion-file Clause above. The information they contain is accessed in a two-level way:
The top-level file is named in the RDBMS-conversion-file Clause proper. Its contents are a CSV table, with each line in the form:
schema-name;file-path.
File-path is the path to the file containing RDBMS-conversion information pertaining to SQL schema schema-name. As usual, it can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file. This file must be created by the Rehosting Workbench user.
keywords File
This file is associated with the keywords-file Clause. Its contents are a CSV table using the semicolon as separator, each line being in the form:
old-name;new-name.
The effect of such a line is to rename every COBOL identifier (variable name, paragraph name, etc.) named old-name in every program into new-name. This is required for names which happen to be keywords or reserved words in the target COBOL dialect, such as TEST, but it may also be useful to rename plain identifiers for reengineering purposes.
stored-procedure File
This file is associated with the sql-stored-procedures-file Clause. Its contents are a list of subprogram names, one per line. When one of these names appears in a COBOL CALL statement, the latter is replaced by an SQL CALL statement. In addition, declarations of the parameters of the CALL, if any, are adapted so that they can be used in SQL statements.
purely-sequential Configuration File
This file is associated with the pure-seq-map-file Clause. Its contents are a CSV table using the semicolon as separator, each line being in the form:
program-name;FD-name
with both names being symbols. The effect of such a line is to prevent this particular logical file (the given FD in the given program), assumed to be (record) SEQUENTIAL on the source platform, from being converted to LINE SEQUENTIAL on the target platform; rather, it is kept unchanged as a record SEQUENTIAL file. This makes it much less amenable to manipulation using standard target-platform utilities, but on the other hand, it will support unrestricted REWRITE operations (see section REWRITE operations on LINE SEQUENTIAL files above). This might also be useful for files exchanged with a z/OS platform in binary form.
sql-return-codes Configuration File
This file is associated with the sql-return-codes-file Clause. Its contents are a CSV table using the semicolon as separator, each line being in the form:
DB2-sqlcode-value;Oracle-sqlcode-value
with both values being positive or negative integers. The effect of such a line is to add the pair of values to the translation table used to map "remarkable" DB2 SQLCODE values to their equivalent Oracle SQLCODE values. This translation table is initialized as if read from the following file:
Listing 10‑2 DB2 to Oracle SQL Return Code Mapping
+100;+1403
-810;-1422
-803;-1
-530;-2291
-516;-1002
-501;-1001
-407;-1451
-305;-1405
-180;-1820
-181;-1821
-811;-2112
-204;-942
 
Of course, value 0 (zero) is mapped to itself.
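The way user-supplied lines extend the built-in table can be pictured with the following Python sketch. The names DEFAULT_SQLCODE_MAP and build_sqlcode_map are hypothetical; the default pairs are copied from Listing 10-2. The sketch assumes that a user line for an already-known DB2 value replaces the default pair, which this section does not state explicitly.

```python
# Default pairs from Listing 10-2, plus the identity mapping for zero.
DEFAULT_SQLCODE_MAP = {
    100: 1403, -810: -1422, -803: -1, -530: -2291, -516: -1002, -501: -1001,
    -407: -1451, -305: -1405, -180: -1820, -181: -1821, -811: -2112, -204: -942,
    0: 0,
}


def build_sqlcode_map(extra_lines=""):
    """Return the default table extended with 'DB2-value;Oracle-value' lines."""
    table = dict(DEFAULT_SQLCODE_MAP)
    for line in extra_lines.splitlines():
        line = line.strip()
        if not line:
            continue
        db2_value, oracle_value = line.split(";")
        # int() accepts an explicit sign, so "+100" and "-810" both parse.
        table[int(db2_value)] = int(oracle_value)
    return table
```

For instance, build_sqlcode_map("-911;-60") returns the twelve default pairs, the zero mapping and one additional pair; the pair -911;-60 is only an illustrative value chosen for this example.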
Description of Output Files
Converted Programs and Copy Files
Naming Scheme
As mentioned above, the main purpose of the Rehosting Workbench COBOL Converter is to produce the converted COBOL components, in the form of their source files. There is a direct, one-to-one correspondence between the hierarchy of main program files inside the source root directory and the hierarchy of main program files inside the target root directory; the only possible differences, as far as file names are concerned, come from the CALL-renaming map and the choice of the target program-file extension, see rename-call-map-file Clause and keep-same-file-names, target-program-extension and target-copy-extension Clauses. The same comments apply for the target copy files, with the following observations:
The hierarchy of target copy files is located in the Master-copy sub-directory of the target root directory.
The names of the target copy files may differ from those of the source files because of the COPY-renaming map and the choice of the target copy-file extension.
If file ORIGCOPY(.s-ext) is translated into multiple versions, these versions are named ORIGCOPY(.t-ext), ORIGCOPY_V1(.t-ext), ORIGCOPY_V2(.t-ext), etc.
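The version-naming scheme can be written down as a one-line rule; the Python function below is a hypothetical helper given only to make the pattern explicit, with the target extension passed as an optional argument.

```python
def versioned_copy_names(base, count, target_ext=""):
    """Return the names used for `count` versions of one copy file.

    The first version keeps the base name; later ones get _V1, _V2, etc.
    """
    suffix = f".{target_ext}" if target_ext else ""
    return [
        f"{base}{suffix}" if index == 0 else f"{base}_V{index}{suffix}"
        for index in range(count)
    ]
```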
Transformation Comments
In principle, COBOL conversion is a "light" process, because COBOL on the target platform is not that different from COBOL on the source platform. This is why this process is called conversion rather than translation. Indeed, a converted file, either main file or copy file, generally differs from its corresponding source file only in very few places; the bulk of the contents is not affected in any way and is reproduced in the target file exactly as it is in the source file. The differing places, however, are identified by specific transformation comments.
Modified Code
In places at which some transformation actually took place, the converter inserts transformation comments describing the effects of the transformation. The affected code is composed of:
Listing 10‑3 Transformation Comment Example
*{ tr-binary-to-comp-5 1.2
* 77 MY-VAR PIC S9(9) COMP.
*--
77 MY-VAR PIC S9(9) COMP-5.
*}
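Because these comments follow a regular layout, they are easy to exploit with simple tools, for instance to list which rules were applied to a program. The Python sketch below parses modified-code blocks shaped like Listing 10-3. The function name is hypothetical, and the sketch assumes that lines are compared after stripping leading and trailing blanks (in a real fixed-format file the asterisk sits in the indicator column); it ignores the other kinds of transformation comments.

```python
def parse_transformation_blocks(lines):
    """Extract rule name, version, old code and new code from each block."""
    blocks = []
    current = None
    in_new_code = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("*{"):
            # Header line: "*{ rule-name version"
            rule, _, version = line[2:].strip().partition(" ")
            current = {"rule": rule, "version": version.strip(), "old": [], "new": []}
            in_new_code = False
        elif current is None:
            continue  # ordinary, untransformed line
        elif line == "*--":
            in_new_code = True  # separator between old and new code
        elif line == "*}":
            blocks.append(current)
            current = None
        elif in_new_code:
            current["new"].append(line)
        else:
            # Old code is kept as comment lines; drop the leading asterisk.
            current["old"].append(line.lstrip("*").strip())
    return blocks
```

Applied to the five lines of Listing 10-3, the function returns a single block for rule tr-binary-to-comp-5, version 1.2, with one old line (the COMP declaration) and one new line (the COMP-5 declaration).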
 
Added Code
Some rules not only transform existing code but also insert some completely new code in some remote places, for instance, the declaration of an intermediate variable in the Working-Storage Section. In this case, the affected area in the program is composed of:
A header line giving the transformation-rule name and version; the header line starts with the prefix "*+{", in which the opening curly bracket symbolizes the start of the transformation and the plus sign indicates that this is an insertion rather than a transformation.
Deleted Code
When a rule simply deletes some code rather than transforming it, the affected area in the program has the same organization as for modified code, except that the "new code" area is empty:
Moved Code
Some rules move code from one place in the program to another, for instance, when a file is migrated into a relational DB table, the corresponding FD is deleted and the data records it contains are moved to the Working-Storage Section. In this case, the code at the original location is shown as deleted and the code at the new location is shown as inserted.
Other Comment Rules
Layout
When the COBOL Converter applies a transformation rule to a piece of code, it attempts to keep the same layout for the new code, by minimizing how elements of the code which exist in both the original and new versions are moved around. In addition, when the converter inserts a new element, for instance a statement or a variable declaration, it tries to align the new element with similar ones before or after it. When, by following these guidelines, a transformed or new line of code becomes too long for the fixed format, the converter cuts the line at the right-most "nice" cutting point (preferably between two words) and wraps the rest on the next line, flush with the right margin, to indicate that these wrapped elements are logically part of the previous line.
Miscellaneous Issues
When the Converter prints out a copy file invoked in the main program file with a REPLACING clause, it takes great care to undo the effect of the replacements. However, when the transformations performed by the converter apply to pieces of text generated by some replacement, it may be very hard for the converter to compute the "inverse" of the transformation and create the transformed replacement clause – for instance, when the COPY clause replaces "something" by "nothing", the converter may have a hard time finding the "nothing" to replace back by "something" when the area is affected by a transformation. In such cases, some manual correction might be necessary; use the post-translation feature to apply it in a repeatable way.
Compiler Options
To guarantee identical behavior between the source programs and the target ones produced by the COBOL converter, up to the limitations described above, the target programs must be compiled with a certain set of compiler options in effect. Indeed, some of the target COBOL compiler options do change the behavior of the executed code. The transformations applied by the Rehosting Workbench COBOL converter are hence tailored to the option set described below. No support is provided for programs compiled with a different, or at least conflicting, option set. For more information, please see the Micro Focus COBOL documentation, in particular the Compiler Directives book and the COBOL-IT Compiler Suite Entreprise Edition - Reference Manual.
MicroFocus
Mandatory Options
The mandatory compiler options are listed below. For each of them, we indicate whether it is set by default, we give a brief description and we justify why we require it.
DIALECT"MF" (default)
Sets the most “native” and efficient mode of operation. Since the aim of the Rehosting Workbench is not simply to emulate the source mainframe, but to forget about it and lead you towards the benefits of the target platform, this is the best choice. It will enable you to use the modern features of MF COBOL such as Unicode support and object-oriented programming.
CHARSET"ASCII" (default)
Sets the default character set and collating sequence to those supported natively on the target platform. This is an obvious choice, too.
SOURCEFORMAT"FIXED" (default)
Directs the compiler to stick with the “old” standard, fixed-format, column-based format. This may appear contradictory with our aim of modernity, but unfortunately the Oracle Pro*COBOL compiler, even the latest 11g version, is still not quite compatible with the MicroFocus free format, and we have to require fixed format to guarantee correct behavior.
ALIGN"8" (default)
Defines the alignment for top-level structures (01 and 77-level). Required to make sure that structures retain the same layout as on the source platform, and yields the best performance, at a slight expense in memory size.
COMP5-BYTE-ORDER"NATIVE" (default)
Uses native byte ordering for COMP-5 variables. Necessary for compatibility with the C language and the Oracle Tuxedo Application Runtime.
P64 (to set explicitly, except on MicroFocus installations set up to compile to 64-bit mode by default)
Compiles for 64-bit platforms. All the target platforms supported by the Rehosting Workbench are 64-bit.
SIGN"EBCDIC" (to set explicitly)
Uses the EBCDIC convention rather than the ASCII convention for representing overpunched sign on DISPLAY numeric values. This is the same convention as on the source platform and hence leads to fewer differences of behavior when the last digit of such a numeric value is redefined by a character variable.
DEFAULTBYTE"00" (to set explicitly, except if the previous option is given)
Specifies the value with which to initialize all otherwise-undefined variables in the Working-Storage Section. Not strictly necessary, since on the source platform, the Working-Storage Section is officially not implicitly initialized, but we found that it leads to fewer differences of behavior when a numeric variable is redefined by a character variable.
RWHARDPAGE (to set explicitly)
Causes the Report Writer control module to execute a form feed after the last item has been printed on a page, instead of the usual multiple blank lines. All Unix printer systems correctly handle Form Feed characters.
INDD or INDD"SYSIN80L" (to set explicitly)
Causes “default” ACCEPT statements to read from the specified logical file instead of from the Unix standard input file. This is the same behavior as on the source platform and is appropriate with ART-translated KSH scripts, which treat SYSIN as any other file for COBOL programs.
OUTDD or OUTDD"SYSOUT80L" (to set explicitly)
Causes “default” DISPLAY statements to write to the specified logical file instead of to the Unix standard output file. This is the same behavior as on the source platform and is appropriate with the Rehosting Workbench-translated KSH scripts, which treat SYSOUT as any other file for COBOL programs.
HOSTARITHMETIC, HOST-NUMMOVE"2", HOST-NUMCOMPARE"2", ARITHMETIC"ENTCOBOL", CHECKDIV"ENTCOBOL", FP-ROUNDING"ENTCOBOL", REMAINDER"2" (to set explicitly)
All these options control various aspects of the treatment of numeric variables and expressions. Their settings maximize the compatibility with the source platform.
IBMCOMP (to set explicitly)
Turns on word-storage mode for the layout of the structures, the same mode as on the source platform. It also enables the SYNC[HRONIZED] clause to have an effect for “machine-native” types (binary integers, binary floats, pointers, etc.), so as to yield the most efficient performance, at a slight increase in memory consumption.
ODOSLIDE (to set explicitly)
Moves data items that follow a variable-length table in the same record as the table's length changes. This affects data items that appear after a variable-length table in the same record; that is, after an item with an OCCURS DEPENDING clause, but not subordinate to it. With ODOSLIDE, these items always immediately follow the table, whatever its current size; this means their addresses change as the table's size changes. With NOODOSLIDE, these items have fixed addresses, and begin after the end of the space allocated for the table at its maximum length. Setting ODOSLIDE leads to the same behavior as on the source platform.
PERFORM-TYPE"ENTCOBOL" (to set explicitly)
Enables the same behavior as on the source platform regarding nested PERFORM statements with overlapping ranges. The default option PERFORM-TYPE"MF" is semantically cleaner and allows more efficient execution, but may lead to differences of behavior which are hard to detect and diagnose; hence, unless you know that your PERFORM ranges are “well-behaved” and never overlap, we can’t recommend the default setting.
RDW (to set explicitly)
Enables you to find out the length of a record that has just been read from a variable-length sequential file, by providing a “hidden” length variable just before the first record of the FD (see more details in the MicroFocus documentation). This “trick” is available on the source platform, although not explicitly advertised, and this option reproduces the same behavior.
RECMODE"ENTCOBOL" (to set explicitly)
Directs the MicroFocus compiler to use the same algorithm as the source compiler to determine whether a file has fixed-length or variable-length format, depending on the length of the various records in the file definition.
ASSIGN"EXTERNAL" (to set explicitly)
Directs the MicroFocus compiler to use, by default, the EXTERNAL file-assignment method. In this method, the SELECT names are used as keys to search the actual file names in environment variables of the form DD_NAME. This is the mode chosen for the Rehosting Workbench, because it allows the file assignments to be specified outside the programs, namely in the calling KSH scripts. Not only is this closer in philosophy to the source behavior, but in our opinion this is the most flexible method.
SYSPUNCH"80" (to set explicitly)
Defines the record length for the SYSPUNCH logical file. Default setting (132) is not the same as on the source platform.
ZEROLENGTHFALSE (to set explicitly)
When ZEROLENGTHFALSE is set, all comparisons between zero-length group items, and between zero-length items and figurative constants, return false; when it is not set, they all return true. To reproduce the same behavior as on the source platform, it must be set.
NOADV (default)
Do not use special printer-control characters on text files. Target-platform printing utilities will simply print a file with the same layout as it appears on the screen.
NOTRUNCCALLNAME (default)
Does not truncate names of subprograms referenced in CALL statements. This is necessary for the Rehosting Workbench-migrated assets, because data access routines generated by the Rehosting Workbench have long names. In addition, in future evolutions of the asset, you will want to get rid of the short-names limitations imposed on the source platform.
NOTRUNCCOPY (default)
Does not truncate names of copy files referenced in COPY directives. This is necessary for the Rehosting Workbench-migrated assets, because copy files generated or inserted by the Rehosting Workbench have long names. In addition, in future evolutions of the asset, you will want to get rid of the short-names limitations imposed on the source platform.
NOCOPYLBR (default)
Treat copy-file names as plain paths, not library archives (.lbr files). This is necessary for the Rehosting Workbench-migrated assets, because copy files converted or generated by the Rehosting Workbench are not grouped in archives.
NOSPZERO / NOSIGN-FIXUP (default)
NOSIGN-FIXUP provides emulation of the mainframe compiler option NUMPROC(PFD) when using the HOST-NUMCOMPARE and HOST-NUMMOVE directives. This option gives the best performance, given that it is not possible to provide a complete emulation of NUMPROC(NOPFD) behavior.
REPORT-LINE"256" (default)
Specifies the maximum length of a Report Writer line.
COPYEXT"cpy,cbl" (to set explicitly)
Specifies the filename extension of the copy file that the compiler is to look for if a filename in a COPY statement is specified without an extension. This non-default setting is appropriate for Rehosting Workbench-migrated assets, because copy files generated by the Rehosting Workbench always have the .cpy extension and copy files converted by the Rehosting Workbench generally have the same extension. However, if you configure the COBOL converter for another extension, you will have to adapt the setting of this option appropriately.
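To make the ASSIGN"EXTERNAL" option described above more concrete, the following Python sketch mimics the lookup of a SELECT name in an environment variable of the form DD_NAME. It is only a picture of the principle, not the MicroFocus run-time code; the function name is hypothetical, and returning None for an unset variable is an assumption, since the behavior in that case is not described here.

```python
import os


def resolve_external_file(select_name, env=None):
    """Return the file path assigned to a SELECT name, or None if unset."""
    env = os.environ if env is None else env
    return env.get(f"DD_{select_name}")
```

A calling script that exports DD_INFILE before starting the program thereby decides, outside the program, which physical file the SELECT name INFILE designates.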
After taking into account default and explicit options, and dependencies, the required option list must start with the following:
Listing 10‑4 Validated COBOL Compiler Option List
P64 SIGN"EBCDIC" RWHARDPAGE INDD OUTDD HOSTARITHMETIC HOST-NUMMOVE"2"
HOST-NUMCOMPARE"2" ARITHMETIC"ENTCOBOL" CHECKDIV"ENTCOBOL"
FP-ROUNDING"ENTCOBOL" IBMCOMP ODOSLIDE PERFORM-TYPE"ENTCOBOL" RDW
RECMODE"ENTCOBOL" REMAINDER"2" ASSIGN"EXTERNAL" SYSPUNCH"80"
ZEROLENGTHFALSE COPYEXT"cpy,cbl"
 
No guarantee will be given for programs compiled with an option list which contradicts the above one. The current versions of Oracle Tuxedo Application Rehosting Workbench and Oracle Tuxedo Application Runtime for CICS have been validated with this option list.
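Before compiling a migrated asset, it can be useful to verify that your own directive list does not omit or contradict the validated one. The Python sketch below is one possible way to do so; the names REQUIRED_OPTIONS, split_option and missing_options are hypothetical, and the sketch assumes that directives are separated by white space. Options required without a value (such as INDD) are accepted with any value, so that INDD"SYSIN80L" is not reported.

```python
# The validated option list of Listing 10-4.
REQUIRED_OPTIONS = [
    'P64', 'SIGN"EBCDIC"', 'RWHARDPAGE', 'INDD', 'OUTDD', 'HOSTARITHMETIC',
    'HOST-NUMMOVE"2"', 'HOST-NUMCOMPARE"2"', 'ARITHMETIC"ENTCOBOL"',
    'CHECKDIV"ENTCOBOL"', 'FP-ROUNDING"ENTCOBOL"', 'IBMCOMP', 'ODOSLIDE',
    'PERFORM-TYPE"ENTCOBOL"', 'RDW', 'RECMODE"ENTCOBOL"', 'REMAINDER"2"',
    'ASSIGN"EXTERNAL"', 'SYSPUNCH"80"', 'ZEROLENGTHFALSE', 'COPYEXT"cpy,cbl"',
]


def split_option(token):
    """Split NAME"value" into (NAME, value); value is None when absent."""
    name, _, rest = token.partition('"')
    return name, (rest.rstrip('"') if rest else None)


def missing_options(directives_text):
    """Return the required options that are absent or set to another value."""
    present = dict(split_option(token) for token in directives_text.split())
    missing = []
    for required in REQUIRED_OPTIONS:
        name, value = split_option(required)
        if name not in present:
            missing.append(required)
        elif value is not None and present[name] != value:
            missing.append(required)
    return missing
```

An empty result means that every option of the validated list is present with the expected value; it says nothing about additional options that might conflict with it.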
Installation-dependent Options
These options are not strictly necessary but may help you handle assets in which programs contain a mixture of upper-case and lower-case letters:
FOLDCALLNAME"UPPER" (to set explicitly)
Directs the compiler to map subprogram names in CALL statements to upper case.
FOLDCOPYNAME"UPPER" (to set explicitly)
Directs the compiler to map copy file names in COPY directives to upper case.
MAPNAME (to set explicitly)
Makes the Compiler alter program-names and entry-point names to make them compatible with the source platform.
By experimenting with these settings, you may find the combination which is appropriate for your particular asset. For instance, FOLDCALLNAME"UPPER" and MAPNAME taken together provide a good enough emulation of the PGMNAME(COMPAT) source-compiler option, but there is no sure way to emulate the other values of this option.
Options Depending on Customer Choice
The following options influence the behavior of the target asset, but may be set more or less at will by the user of the ART system.
BOUND and SSRANGE
Checks that each index is between the correct bounds when accessing an array or in reference modifiers. This is similar to the SSRANGE option of the IBM compiler. We strongly recommend that both of these options be set, at least during migration tests and in the first few months of operation (note that setting SSRANGE also sets BOUND). This choice is a bit controversial because it can break some programs which apparently run correctly on the source platform (without the SSRANGE option). However, in our experience, the only programs which break are incorrect programs which just happen to work by chance on the source platform and would not work in the same way, or at all, on the target platform. The sooner we detect these programs and fix them, the better. In the same way, you could consider setting the CHECK option, which enables various (other) kinds of run-time checks and helps detect other kinds of seemingly-correct programs.
TRUNC
Specifies whether truncation to the given PIC size occurs when assigning a value to a BINARY variable (or COMP, or COMP-4). This is similar to the TRUNC(STD) option of the IBM compiler. However, with the present specification of the ART COBOL converter, all such variables are transformed into COMP-5 variables, which do not obey the TRUNC option. See the discussion in Use of COMP-5 type and the TRUNC compiler option above.
APOST and QUOTE
Allow you to choose which character, single or double quote, the QUOTE symbolic constant represents. This is similar to the IBM options of the same name. Use the same setting as on the source platform.
NOALTER
Forbids the presence of ALTER statements in the COBOL programs. Since ALTER statements are a thing of the past, and a very bad thing at that, we recommend that you take the opportunity of migrating your asset with the ART Workbench to chase out any remaining ALTER. Then, set this option to prevent their reappearance and to make compiled code more efficient and safe.
AREACHECK
Causes the Compiler to treat any token which starts in area A in the Procedure Division as a paragraph or section label, regardless of the preceding tokens. If AREACHECK is not specified, only tokens which follow a period are treated as possible labels. This directive provides closer compatibility with IBM error handling, where omitting a period before the label produces a less serious message. We recommend that such erroneous source code is corrected.
NOBYTE-MODE-MOVE
Controls behavior for alphanumeric moves between overlapping data items. If BYTE-MODE-MOVE is specified, data is moved one byte at a time from the source to the target. If NOBYTE-MODE-MOVE is specified, the data is moved in granules of two, four or more bytes at a time (depending on environment) from the source to the target. Consequently, if the overlap is less than the size of the granule, each granule moved overwrites part of the next granule to be moved. NOBYTE-MODE-MOVE gives better performance, but may yield incorrect code on some very rare programs which work correctly on the source platform; we suggest that you start with the “more compatible” setting (BYTE-MODE-MOVE), perform complete regression tests until satisfaction, then choose the other option and re-test.
DYNAM
Specifies that CANCEL statements are not to be ignored. This is similar to the IBM option of the same name (but not quite the same, see the MicroFocus documentation). We strongly recommend that you set this option, because the Tuxedo servers in the ART TP Run-time system, which execute the applicative CICS programs, use CANCEL statements to free the memory used to load and run those programs. If NODYNAM is in effect, the amount of memory used by these servers would grow as they execute more and more different programs.
NOFDCLEAR, NOHOSTFD
The “positive” settings of these options reproduce the restrictions on FD usage imposed by the IBM compiler (FD records allocated only at OPEN time, record contents lost after WRITE, etc.). We feel that these restrictions are silly and hence recommend that you don’t use these options.
NATIVEFLOATINGPOINT
See the discussion in Use of floating-point variables above.
NOSEG
Turns off segmentation and ignores all segment numbers. The resulting program is a single piece with no overlay. Who still uses segmentation, anyway?
STICKY-LINKAGE"2" / NOSTICKY-LINKAGE
Controls how a program parameter (Linkage Section item) which has been linked to some actual data item in a previous invocation of the program may be re-linked with the same item if the current invocation specifies no new linking (no actual argument supplied). The STICKY-LINKAGE"2" setting is “more compatible” with the behavior of the source platform, especially for CICS programs, but it is certainly non-standard and error-prone. It may also be incompatible with certain features of the ART TP run-time system, in particular the possibility to distribute TP transactions over several servers running in a cluster with no shared memory. We therefore strongly suggest using the default NOSTICKY-LINKAGE setting from the beginning and fixing any sticky-linkage-related bug discovered during regression testing. See also the discussion in Linkage-section arguments with NULL address above.
Options Influencing Compile-Time Operation
The following options influence only the production of the compilation listing and may be chosen at will:
LIST
Specifies the location and format of the compilation listing.
SETTINGS
Specifies whether to include the complete list of compiler options in the compilation listing.
TRACE
Specifies whether tracing statements (READY TRACE and RESET TRACE) are obeyed.
WARNING
Specifies the verbosity of error messages printed in the compilation listing.
FLAG"dialect"
Specifies whether the compiler must produce language-level certification flags when it finds syntax that is not part of the specified dialect of COBOL.
Mandatory Options
The mandatory compiler options are listed below. For each of them, we indicate whether it is set by default, we give a brief description and we justify why we require it.
DIALECT"MF" (default): sets the most "native" and efficient mode of operation. Since the aim of Oracle ART is not simply to emulate the source mainframe, but to forget about it and lead you towards the benefits of the target platform, this is the best choice. It will enable you to use the modern features of MF Cobol such as Unicode support and object-oriented programming.
CHARSET"ASCII" (default): sets the default character set and collating sequence to those supported natively on the target platform. This is an obvious choice, too.
SOURCEFORMAT"FIXED" (default): directs the compiler to stick with the "old" standard, fixed-format, column-based format. This may appear contradictory with our aim at modernity, but unfortunately the Oracle Pro*Cobol compiler, even the latest 11g version, is still not quite compatible with the MicroFocus free format, and we have to require fixed format to guarantee correct behavior.
ALIGN"8" (default): defines the alignment for top-level structures (01 and 77-level). Required to make sure that structures retain the same layout as on the source platform, and yields the best performance, at a slight expense in memory size.
COMP5-BYTE-ORDER"NATIVE" (default): uses native byte ordering for COMP-5 variables. Necessary for compatibility with the C language and the ART TP run-time system.
P64 (to set explicitly): compiles for 64-bit platforms. All the target platforms supported by ART are 64-bit.
SIGN"EBCDIC" (to set explicitly): uses the EBCDIC convention rather than the ASCII convention for representing the overpunched sign on DISPLAY numeric values. This is the same convention as on the source platform and hence leads to fewer differences of behavior when the last digit of such a numeric value is redefined by a character variable.
DEFAULTBYTE"00" (to set explicitly, except if the previous option is given): specifies the value with which to initialize all otherwise-undefined variables in the Working-Storage Section. Not strictly necessary, since on the source platform the Working-Storage Section is officially not implicitly initialized, but we found that it leads to fewer differences of behavior when a numeric variable is redefined by a character variable.
RWHARDPAGE (to set explicitly): Causes the Report Writer control module to execute a form feed after the last item has been printed on a page, instead of the usual multiple blank lines. All Unix printer systems correctly handle Form Feed characters.
INDD or INDD"SYSIN80L" (to set explicitly): causes "default" ACCEPT statements to read from the specified logical file instead of from the UNIX standard input file. This is the same behavior as on the source platform and is appropriate with ART-translated KSH scripts, which treat SYSIN as any other file for Cobol programs.
OUTDD or OUTDD"SYSOUT80L" (to set explicitly): causes "default" DISPLAY statements to write to the specified logical file instead of to the UNIX standard output file. This is the same behavior as on the source platform and is appropriate with ART-translated KSH scripts, which treat SYSOUT as any other file for Cobol programs.
HOSTARITHMETIC, HOST-NUMMOVE"2", HOST-NUMCOMPARE"2", ARITHMETIC"ENTCOBOL", CHECKDIV"ENTCOBOL", FP-ROUNDING"ENTCOBOL", REMAINDER"2" (to set explicitly): all these options control various aspects of the treatment of numeric variables and expressions. Together, these settings maximize compatibility with the source platform.
IBMCOMP (to set explicitly): turns on word-storage mode for the layout of the structures, the same mode as on the source platform and also the one yielding the most efficient performance, at a slight increase in memory consumption.
ODOSLIDE (to set explicitly): Moves data items that follow a variable-length table in the same record as the table's length changes. This affects data items that appear after a variable-length table in the same record; that is, after an item with an OCCURS DEPENDING clause, but not subordinate to it. With ODOSLIDE, these items always immediately follow the table, whatever its current size; this means their addresses change as the table's size changes. With NOODOSLIDE, these items have fixed addresses, and begin after the end of the space allocated for the table at its maximum length. Setting ODOSLIDE leads to the same behavior as on the source platform.
PERFORM-TYPE"ENTCOBOL" (to set explicitly): enables the same behavior as on the source platform regarding nested PERFORM statements with overlapping ranges. The default option PERFORM-TYPE"MF" is semantically cleaner and allows more efficient execution, but may lead to differences of behavior which are hard to detect and diagnose; hence, unless you know that your PERFORM ranges are "well-behaved" and never overlap, we can't recommend the default setting.
RDW (to set explicitly): Enables you to find out the length of a record that has just been read from a variable-length sequential file, by providing a "hidden" length variable just before the first record of the FD (see more details in the Micro Focus documentation). This "trick" is available on the source platform, although not explicitly advertised, and this option lets you reproduce the same behavior.
RECMODE"ENTCOBOL" (to set explicitly): directs the Micro Focus compiler to use the same algorithm as the source compiler to determine whether a file has fixed-length or variable-length format, depending on the length of the various records in the file definition.
ASSIGN"EXTERNAL" (to set explicitly): directs the MicroFocus compiler to use, by default, the EXTERNAL file-assignment method. In this method, the SELECT names are used as keys to search the actual file names in environment variables of the form DD_NAME. This is the mode chosen for ART, because it allows the file assignments to be specified outside the programs, namely in the calling KSH scripts. Not only is this closer in philosophy to the source behavior, but in our opinion this is the most flexible method.
SYSPUNCH"80" (to set explicitly): defines the record length for the SYSPUNCH logical file. Default setting (132) is not the same as on the source platform.
ZEROLENGTHFALSE (to set explicitly): When ZEROLENGTHFALSE is set, all comparisons between zero-length group items, and between zero-length items and figurative constants, return false; when it is not set, they all return true. To reproduce the same behavior as on the source platform, it must be set.
NOADV (default): don't use special printer-control characters on text files. Target-platform printing utilities will simply print a file with the same layout as it appears on the screen.
NOTRUNCCALLNAME (default): does not truncate names of subprograms referenced in CALL statements. This is necessary for ART-migrated assets, because data access routines generated by ART have long names. In addition, in future evolutions of the asset, you will want to get rid of the short-names limitations imposed on the source platform.
NOTRUNCCOPY (default): does not truncate names of copy files referenced in COPY directives. This is necessary for ART-migrated assets, because copy files generated or inserted by ART have long names. In addition, in future evolutions of the asset, you will want to get rid of the short-names limitations imposed on the source platform.
NOCOPYLBR (default): treat copy-file names as plain paths, not library archives (.lbr files). This is necessary for ART-migrated assets, because copy files converted or generated by ART are not grouped in archives.
REPORT-LINE"256" (default): Specifies the maximum length of a Report Writer line.
COPYEXT"cpy,cbl" (to set explicitly): Specifies the filename extension of the copyfile that the compiler is to look for if a filename in a COPY statement is specified without an extension. This non-default setting is appropriate for ART-migrated assets, because copy files generated by ART always have the .cpy extension and copy files converted by ART generally have the same extension. However, if you configure the Cobol converter for another extension, you will have to adapt the setting of this option appropriately.
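As a minimal sketch of the external file-assignment method selected by ASSIGN"EXTERNAL", a calling script can export one DD_ variable per SELECT name before launching the program. The helper name assign_dd, the logical names CUSTIN and CUSTOUT, and all paths are hypothetical; the KSH scripts produced by ART may set these variables differently.

```shell
#!/bin/sh
# Hypothetical helper: bind a logical file name (the SELECT name) to a
# physical path by exporting the corresponding DD_<NAME> variable.
assign_dd() {
    # $1 = logical name, $2 = physical path
    export "DD_$1=$2"
}

# Illustrative data directory; override DATA_DIR to suit the installation.
DATA_DIR=${DATA_DIR:-/tmp/art-data}

assign_dd CUSTIN  "$DATA_DIR/customers.in"
assign_dd CUSTOUT "$DATA_DIR/customers.out"

# Any program launched from this script inherits both assignments.
echo "CUSTIN  -> $DD_CUSTIN"
echo "CUSTOUT -> $DD_CUSTOUT"
```

Keeping the assignments in the script rather than in the program is what allows the same compiled program to be run against different files without recompilation.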
After taking into account default and explicit options, and dependencies, the required option list must start with the following:
Listing 10‑5 Required Compiler Option List
P64 SIGN"EBCDIC" RWHARDPAGE INDD OUTDD HOSTARITHMETIC HOST-NUMMOVE"2"
HOST-NUMCOMPARE"2" ARITHMETIC"ENTCOBOL" CHECKDIV"ENTCOBOL"
FP-ROUNDING"ENTCOBOL" IBMCOMP ODOSLIDE PERFORM-TYPE"ENTCOBOL" RDW
RECMODE"ENTCOBOL" REMAINDER"2" ASSIGN"EXTERNAL" SYSPUNCH"80"
ZEROLENGTHFALSE COPYEXT"cpy,cbl"
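Before building, it can be worth checking a project's directives file against the required list. The following is a minimal sketch under stated assumptions: the function name missing_directives is hypothetical, the file is assumed to hold directives separated by spaces, tabs or newlines, and the check is a plain textual comparison that knows nothing about directive synonyms or dependencies.

```shell
#!/bin/sh
# Required directives, copied from Listing 10-5.
REQUIRED='P64 SIGN"EBCDIC" RWHARDPAGE INDD OUTDD HOSTARITHMETIC HOST-NUMMOVE"2"
HOST-NUMCOMPARE"2" ARITHMETIC"ENTCOBOL" CHECKDIV"ENTCOBOL"
FP-ROUNDING"ENTCOBOL" IBMCOMP ODOSLIDE PERFORM-TYPE"ENTCOBOL" RDW
RECMODE"ENTCOBOL" REMAINDER"2" ASSIGN"EXTERNAL" SYSPUNCH"80"
ZEROLENGTHFALSE COPYEXT"cpy,cbl"'

# Print every required directive that does not appear as a whole token
# in the given directives file; print nothing if none is missing.
missing_directives() {
    # $1 = path of the directives file to check
    for d in $REQUIRED; do
        # Put each token of the file on its own line, then look for an
        # exact, fixed-string, whole-line match.
        tr -s ' \t' '\n\n' < "$1" | grep -Fqx -- "$d" || printf '%s\n' "$d"
    done
}
```

Calling missing_directives on a directives file prints one missing directive per line, so empty output means that every directive of the listing is present.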
Note:
Installation-dependent options
These options are not strictly necessary but may help you handle assets in which programs contain a mixture of upper-case and lower-case letters:
FOLDCALLNAME"UPPER" (to set explicitly): directs the compiler to map subprogram names in CALL statements to upper case.
FOLDCOPYNAME"UPPER" (to set explicitly): directs the compiler to map copy file names in COPY directives to upper case.
MAPNAME (to set explicitly): Makes the Compiler alter program-names and entry-point names to make them compatible with the source platform.
By experimenting with these settings, you may find the combination which is appropriate for your particular asset. For instance, FOLDCALLNAME"UPPER" and MAPNAME taken together provide a good enough emulation of the PGMNAME(COMPAT) source-compiler option, but there is no sure way to emulate the other values of this option...
Options depending on customer choice
The following options influence the behavior of the target asset, but may be set more or less at will by the user of the ART system.
BOUND and SSRANGE check that each index is between the correct bounds when accessing an array or in reference modifiers. This is similar to the SSRANGE option of the IBM compiler. We strongly recommend that both of these options be set, at least during migration tests and in the first few months of operation (note that setting SSRANGE also sets BOUND). This choice is a bit controversial because it can break some programs which apparently run correctly on the source platform (without the SSRANGE option). However, in our experience, the only programs which break are incorrect programs which just happen to work by chance on the source platform and would not work in the same way, or at all, on the target platform. The sooner we detect these programs and fix them, the better. In the same way, you could consider setting the CHECK option, which enables various other kinds of run-time checks and helps detect other kinds of seemingly-correct programs.
TRUNC: specifies whether truncation to the given PIC size occurs when assigning a value to a BINARY variable (or COMP, or COMP-4). This is similar to the TRUNC(STD) option of the IBM compiler. However, with the present specification of the ART Cobol converter, all such variables are transformed into COMP-5 variables, which do not obey the TRUNC option. See the discussion in Use of COMP-5 Type and the TRUNC Compiler Option.
APOST and QUOTE: let you choose which character, single or double quote, the QUOTE symbolic constant represents. This is similar to the IBM options of the same name. Use the same setting as on the source platform.
NOALTER: forbids the presence of ALTER statements in the Cobol programs. Since ALTER statements are a thing of the past, and a very bad thing at that, we recommend that you take the opportunity of migrating your asset with the ART Workbench to chase out any remaining ALTER clauses. Then, set this option to prevent their reappearance and to make compiled code more efficient and safe.
AREACHECK: Causes the Compiler to treat any token which starts in area A in the Procedure Division as a paragraph or section label, regardless of the preceding tokens. If AREACHECK is not specified, only tokens which follow a period are treated as possible labels. This directive provides closer compatibility with IBM error handling, where omitting a period before the label produces a less serious message. We recommend that such erroneous source code be corrected.
NOBYTE-MODE-MOVE: Controls behavior for alphanumeric moves between overlapping data items. If BYTE-MODE-MOVE is specified, data is moved one byte at a time from the source to the target. If NOBYTE-MODE-MOVE is specified, the data is moved in granules of two, four or more bytes at a time (depending on the environment) from the source to the target. Consequently, if the overlap is less than the size of the granule, each granule moved overwrites part of the next granule to be moved. NOBYTE-MODE-MOVE gives better performance, but may yield incorrect behavior on some very rare programs which work correctly on the source platform; we suggest that you start with the "more compatible" setting (BYTE-MODE-MOVE), perform complete regression tests until you are satisfied, then choose the other option and re-test.
DYNAM: Specifies that CANCEL statements are not to be ignored. This is similar to the IBM option of the same name (but not quite the same, see the Micro Focus documentation). We strongly recommend that you set this option, because the Tuxedo servers in the ART TP Runtime system, which execute the applicative CICS programs, use CANCEL statements to free the memory used to load and run those programs. If NODYNAM is in effect, the amount of memory used by these servers would grow as they execute more and more different programs.
NOFDCLEAR, NOHOSTFD: The "positive" settings of these options reproduce the restrictions on FD usage imposed by the IBM compiler (FD records allocated only at OPEN time, record contents lost after WRITE, etc.). We feel that these restrictions are silly and hence recommend that you don't use these options.
NATIVEFLOATINGPOINT: see the discussion in Use of Floating-point Variables.
NOSEG: turns off segmentation and ignores all segment numbers. The resulting program is a single piece with no overlay. Who still uses segmentation, anyway?
STICKY-LINKAGE"2" / NOSTICKY-LINKAGE: this option controls how a program parameter (Linkage Section item) which has been linked to some actual data item in a previous invocation of the program may be re-linked with the same item if the current invocation specifies no new linking (no actual argument supplied). The STICKY-LINKAGE"2" setting is "more compatible" with the behavior of the source platform, especially for CICS programs, but it is certainly non-standard and error-prone. It may also be incompatible with certain features of the ART TP runtime system, in particular the ability to distribute TP transactions over several servers running in a cluster with no shared memory. So we strongly suggest using the default NOSTICKY-LINKAGE setting from the beginning and fixing any sticky-linkage-related bugs discovered during regression testing. See also the discussion in Linkage-Section Arguments with NULL Address.
Options influencing compile-time operation
The following options influence only the production of the compilation listing and may be chosen at will:
LIST: specifies the location and format of the compilation listing.
SETTINGS: specifies whether to include the complete list of compiler options in the compilation listing.
TRACE: specifies whether tracing statements (READY TRACE and RESET TRACE) are obeyed.
WARNING: specifies the verbosity of error messages printed in the compilation listing.
FLAG"dialect": specifies whether the compiler must produce language-level certification flags when it finds syntax that is not part of the specified dialect of Cobol.
To guarantee identical behavior between the source programs and the target ones produced by the Cobol Converter, in light of the limitations previously described, the target programs must be compiled with a certain set of compiler options in effect. Indeed, some of the MicroFocus Cobol compiler options do change the behavior of the executed code. The transformations applied by the Rehosting Workbench Cobol Converter are hence tailored to the option set described below. No support will be provided for programs compiled with a different, or at least conflicting, option set. For more information, please see the MicroFocus Cobol documentation, in particular the Compiler Directives book.
The main behavior-influencing option to set mandatory is DIALECT"ENTCOBOL". Indeed, this dialect option sets a number of sub-options, such as PERFORM-TYPE"ENTCOBOL", which make the target program behave as closely as possible to the original source program compiled by the IBM Enterprise Cobol Compiler.
However, the Refine Cobol converter departs from the Enterprise Cobol basic choices by using the native character set of the target platform, namely ASCII. This requires setting the option CHARSET"ASCII". Conversely, it sticks to the IBM convention for representing the overpunched sign on DISPLAY numeric values, so the option SIGN"EBCDIC" must be set.
The following minor options must also be set to guarantee the correct behavior of target programs:
NOADV: do not use special printer-control characters on text files. Target-platform printing utilities will simply print a file with the same layout as it appears on the screen.
ALIGN"8": 01- and 77-level data items are aligned at the "most universal" memory boundary.
BOUND: check that each index is between the correct bounds when accessing an array. This choice is a bit controversial because it can break some programs which apparently run correctly on the source platform. However, in our experience, the only programs which break are incorrect programs which just happen to work by chance on the source platform and would not work in the same way, if at all, on the target platform. The sooner we detect these programs and fix them, the better.
COMP-5"2": use native byte ordering for COMP-5 variables. This is necessary for compatibility with the Rehosting Workbench CICS Runtime routines, among others.
NOCOPYLBR: copy files are just plain files, not .lbr library files.
HOSTARITHMETIC: try to comply with IBM behavior on size error conditions in arithmetic computations.
INTLEVEL"4": allow numeric variables up to 38 digits, and more generally use improved arithmetic behavior.
REPORT-LINE"256": specifies maximum line size for Report Writer.
RWHARDPAGE: use "hard" Form Feed (FF) characters to jump to a new page in Report Writer, instead of using multiple Line Feeds. FF is recognized as jumping to a new page by all target-platform printing utilities.
NOTRUNCCALLNAME: do not truncate the names of CALLed subprograms to 8 characters, as the ENTCOBOL dialect would normally do, because the Oracle Tuxedo Application Rehosting Workbench Cobol Converter uses longer names (and this is better for future evolution anyway).
NOTRUNCCOPY: same thing for names of COPY files.
COPYEXT: specifies the file extensions used for copy files. To set according to the choices you made during migration (see the target-copy-extension configuration clause).
SETTINGS: specifies whether to include the complete list of compiler options in the compilation listing.
TRACE: specifies whether tracing statements (READY TRACE and RESET TRACE) are obeyed.
WARNING: specifies the verbosity of error messages printed in the compilation listing.
COBOL-IT
To reproduce the source COBOL compiler behavior, COBOL-IT offers an IBM-compatible configuration file (ibm.conf). This configuration file will be used to compile the target COBOL asset.
In addition to the compiler options set in the configuration file ibm.conf, at least the following options must be added to improve compatibility between the source and the target COBOL environments.
External-mapping
If set to yes, the names of all files declared as EXTERNAL are resolved at run time using environment variables. It must be set to yes.
Binary-truncate
Binary-truncate is a boolean option that governs the behavior of the runtime when binary data is truncated. It must be set to no. This corresponds to the behavior of the MicroFocus compiler directive NOTRUNC.
Spzero
If set to yes, space characters moved to a NUMERIC USAGE field are converted to '0'. It must be set to no.
Depending on the customer's needs, other compiler options can be set. To learn about the COBOL-IT compiler options, please refer to the COBOL-IT Compiler Suite Enterprise Edition - Reference Manual.
Detailed Processing
Overview
When the COBOL Converter starts, it reads and checks the various configuration files, starting with the main one. If any inconsistency is detected at this stage, one or more error messages are printed and the converter exits. Otherwise, the converter uses both command-line options and configuration-file options to set its internal parameters, including the list of (source) programs to process. Then it proceeds to process each of these programs in turn; for each of them:
1.
2.
3.
4.
5.
6.
Lastly, if the deferred-copy-reconcil Clause is not given, either on the command-line or in the configuration file, the copy reconciliation process is applied to the target copy files in the private directory.
The converter can be executed by several concurrent processes at the same time, provided that the deferred-copy-reconcil Clause is given either on the command-line or in the configuration file; otherwise, the copy-reconciliation phase of these concurrent processes may run into access conflicts over the "data-base" of final, reconciled copy files, which could lead to corrupted results.
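The concurrency constraint above can be sketched as a small driver script: split the work-list into chunks, convert each chunk in its own process with copy reconciliation deferred, and wait for all of them. This is a minimal sketch under assumptions: the function names and the chunk-file prefix are hypothetical, SYSTEM_DESC and MAIN_CONFIG stand for the installation's own paths, and the final single-process reconciliation run is only indicated by a comment.

```shell
#!/bin/sh
# Distribute the lines of a work-list round-robin over N chunk files
# named <prefix>.0 .. <prefix>.(N-1).
split_worklist() {
    # $1 = work-list file, $2 = number of chunks, $3 = chunk-file prefix
    awk -v n="$2" -v p="$3" '{ print > (p "." ((NR - 1) % n)) }' "$1"
}

# Convert every chunk in its own background process. The -dcrp flag
# defers copy reconciliation, so the processes do not compete for the
# shared base of reconciled copy files.
convert_chunks() {
    # $1 = chunk-file prefix
    for chunk in "$1".*; do
        "$REFINEDIR/refine" cobol-convert \
            -s "$SYSTEM_DESC" -c "$MAIN_CONFIG" \
            -dcrp -f "$chunk" &
    done
    wait    # all conversions must finish before reconciliation
    # Copy reconciliation must then be run as one single process.
}
```

A typical use would be to call split_worklist with the full list of programs and the desired number of processes, then convert_chunks with the same prefix. Stale chunk files from an earlier run should be removed first, since the loop picks up every file that matches the prefix.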
Command-Line Syntax
Refine Launcher Interface
The COBOL Converter is designed to be run through the refine command, which is the generic Oracle Tuxedo Application Rehosting Workbench launcher and is also used to launch all "big" Oracle Tuxedo Application Rehosting Workbench tools. This launcher handles various aspects of the operation of these tools, such as execution log management and incremental/repetitive operation.
cobol-convert Command
Synopsis
$REFINEDIR/refine cobol-convert [ launcher-options… ] \
( -s | -system-desc-file ) system-desc-path \
( -c | -config ) main-config-file-path \
[ other-specific-flags… ] \
( source-file-path | ( -f | -file | -file-list-file ) file-of-files )…
Options
The mandatory options are:
( -s | -system-desc-file ) system-desc-path
Specifies the location of the System Description File. As usual for Unix/Linux commands, the given path can be absolute or relative to the current working directory. Note that many other paths used by many of the Rehosting Workbench tools are then derived from the location of this file, including that of the main configuration file (see next option); this makes it easy to run the same command from different working directories.
( -c | -config ) main-config-file-path
Specifies the location of the Main Conversion Configuration File. The given path can be either an absolute path or a relative path; in the latter case, it is relative to the directory containing the system description file, as usual for the Rehosting Workbench tools.
The generic options which define which source programs to process are:
source-file-path
Adds to the work-list the program source file designated by this path. The path must be given as relative to the root directory of the system, $SYSROOT, even if the current working directory is different.
( -f | -file | -file-list-file ) file-of-files
Adds to the work-list the program source files listed in the file designated by this path. The file-of-files itself may be located anywhere, and its path is either absolute or relative to the current working directory. The program source files listed in this file, though, must be given relative to the root directory of the system.
You can give as many individual programs and/or files-of-files as you wish. The work-list is built when the command line is analyzed by the COBOL Converter, see the detailed description above.
The optional specific flags or options are:
-dcrp or -deferred-copy-reconcil
Has the same effect as the deferred-copy-reconcil Clause of the configuration file, namely to not run the copy reconciliation process incrementally after converting each program. Only with this clause or flag can the COBOL converter run in multiple concurrent processes.
-tpe extension or -target-program-extension extension
This option has the same effect as the configuration-file clause of the same name, and overrides it when given.
-tce extension or -target-copy-extension extension
This option has the same effect as the configuration-file clause of the same name, and overrides it when given.
-keep or -keep-same-file-names
This option has the same effect as the configuration-file clause of the same name, and overrides it when given.
-force or -force-translation
This option has the same effect as the configuration-file clause of the same name, and overrides it when given.
-cics or -activate-cics-rules
This option has the same effect as the configuration-file clause of the same name, and overrides it when given.
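To make the path rules above concrete, the following sketch builds a file-of-files whose entries are relative to the system root, then prints (rather than runs) a matching command line. The directory layout, the .cbl suffix and the helper names are hypothetical; only flags shown in the synopsis are used.

```shell
#!/bin/sh
# List the program sources under one sub-directory of the system root,
# as paths relative to that root (one per line, sorted).
make_file_of_files() {
    # $1 = system root, $2 = sub-directory relative to it, $3 = output file
    ( cd "$1" && find "$2" -type f -name '*.cbl' | sort ) > "$3"
}

# Print the cobol-convert command line for a given file-of-files.
print_convert_command() {
    # $1 = system description file, $2 = main configuration file,
    # $3 = file-of-files
    printf '%s\n' "\$REFINEDIR/refine cobol-convert -s $1 -c $2 -f $3"
}
```

Because the entries are produced from inside the system root, they stay valid whatever the working directory from which the converter is later launched. The file-of-files itself may be written anywhere.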
Repetitive and Incremental Operation
Even with the powerful computing platforms easily available nowadays, processing a complete asset using the Rehosting Workbench remains a computing-intensive, long-running, memory-consuming task. The Workbench tools are hence designed to be easily stopped and restarted and, thanks to a make-like mechanism, do not repeat any piece of work which has already been done. This allows efficient operation in all phases of a migration project.
Initial Processing: Repetitive Operation
In the initial phase, when starting with a completely fresh asset and up to the end of the first conversion-translation-generation cycle of a stable asset, the make-like mechanism is used to allow repetitive operation, as follows:
1.
2.
3.
This mode is particularly well suited for tools or commands which operate globally on the whole asset such as the Cataloger, but it is also useful for component-wise tools such as the COBOL Converter. This is the normal mode of operation for the Rehosting Workbench tools and there is nothing specific to choose it.
Changes in the Asset: Incremental Operation
The COBOL converter knows the dependencies between the various components (main program files) and associated result files (POB files, target program files). Using this information, it is able to react incrementally when some change occurs in the asset. For example, when a COBOL source file is added, modified or removed, the Cataloger re-parses the affected programs, and then the COBOL converter re-converts only those. Again, this is the normal mode of operation for the Rehosting Workbench tools and there is nothing specific to choose it.
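The make-like principle can be illustrated with a timestamp comparison: a result is rebuilt only if it is missing or older than its source. This is a generic sketch of that principle, not the Workbench's actual dependency mechanism, which this document does not detail; the function name is hypothetical.

```shell
#!/bin/sh
# Succeed (exit status 0) when the result must be regenerated:
# either it does not exist yet, or the source is newer than it.
needs_rebuild() {
    # $1 = source file, $2 = result file derived from it
    [ -e "$2" ] || return 0
    [ -n "$(find "$1" -newer "$2")" ]
}
```

A driver could test each pair of source and result files with needs_rebuild and skip the pairs for which it fails, so that restarting after an interruption redoes only the unfinished work.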

1
In this context, matching simply means that the two blocks of lines must be identical when you reduce each sequence of spaces in both of them to a single space. This is basically "identical" with a little flexibility added.
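The matching rule can be sketched in a few lines. The function name blocks_match is hypothetical, and the sketch treats only the space character as white space, as the definition above does.

```shell
#!/bin/sh
# Succeed when two blocks of lines are identical once every run of
# spaces in each of them has been reduced to a single space.
blocks_match() {
    # $1 = first block of lines, $2 = second block of lines
    a=$(printf '%s\n' "$1" | tr -s ' ')
    b=$(printf '%s\n' "$2" | tr -s ' ')
    [ "$a" = "$b" ]
}
```

Under this rule, two lines that differ only in the width of their indentation or column alignment match, whereas a line with leading spaces does not match the same line with none.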

Copyright © 1994, 2017, Oracle and/or its affiliates. All rights reserved.