COBOL variables
Variables in the COBOL language.
Any COBOL program can be turned into a microservice and deployed in Kubernetes.
To do this, simply compile the program and generate an executable module that can be called dynamically or statically from another application program. You can refer to the examples to learn how to make both dynamic and static calls.
Like functions in a modern programming language, these programs, or subroutines, can accept a set of variables as input parameters. The USING clause in the PROCEDURE DIVISION must match the number, order, and type of the variables passed in the call from the main program.
In batch programs, the main program (associated with a STEP of a JCL job) does not normally receive its input variables through parameters (PARM). Instead, programs often use the COBOL ACCEPT
statement or read the required information directly from a file.
In online programs, the main program is associated with a transaction, and the transaction manager (CICS or IMS) is responsible for calling it. The program typically uses the statements of that transaction manager (EXEC CICS RECEIVE in CICS, GN in IMS) to receive a message comprising multiple variables in a predefined memory area.
Whatever the execution mode (online or batch), these main COBOL programs can in turn make multiple calls to COBOL subroutines using the CALL
statement (in the case of CICS, this call could be made using the EXEC CICS LINK statement).
Therefore, if we want to expose a COBOL program as a microservice that can be called by other microservices written in different programming languages, it is necessary to understand the different data types that a COBOL program can use and convert them to a standard type that can be used by any programming language.
Data types in COBOL
Defining a variable in a COBOL program can be confusing to someone used to working with modern programming languages. In addition, there are several data types that are commonly used by a COBOL programmer that have no equivalent in any modern programming language.
Let’s try to decipher the operation of the most commonly used COBOL data types in a standard program and define their translation to a modern language such as Go.
Variables that require double-byte characters are excluded.
As in any other language, defining a variable means declaring its name and type (string, int, float, bytes, etc.). In COBOL, however, the type is not stated directly; it is derived from the USAGE and PICTURE clauses.
01 VAR1 PIC S9(3)V9(2) COMP VALUE ZEROS.
The USAGE
clause defines the internal memory storage to be used by the variable.
The PICTURE
(or PIC
) clause defines the mask associated with the variable and its general characteristics; this definition is done by using a set of specific characters or symbols:
- A, the variable contains alphabetic characters.
- X, the variable contains alphanumeric characters.
- 9, the variable is numeric.
- S, the numeric variable is signed.
- V, marks the position of the implied decimal point; the digits that follow it are the decimal places.
- Etc.
USAGE DISPLAY
Variables defined as USAGE DISPLAY
can be of the following types
Alphabetical
They are defined by using the symbol A in the PICTURE
clause.
01 VAR-ALPHA PIC A(20).
They can only store characters from the Latin alphabet. Their use is not very common, as they are generally replaced by alphanumeric variables, which we will see below.
Alphanumeric
They are defined by using the symbol X in the PICTURE
clause.
01 VAR-CHAR PIC X(20).
As in the previous case, it is not necessary to define the USAGE
clause, since variables of type A or X are of type DISPLAY
by default.
Numeric
They are defined by using the symbol 9 in the PICTURE
clause.
01 VAR1 PIC S9(3)V9(2) USAGE DISPLAY.
In this case, we define a numeric variable of length 5 (3 integer places and 2 decimal places) with sign.
The definition of numeric variables of type DISPLAY
can be explicit, as in the previous example, or implicit if the USAGE
clause is not declared.
Internal storage
Each of the defined characters is stored in one byte (EBCDIC). In signed numeric variables, the sign is held in the first 4 bits of the last byte.
Let’s look at an example to better understand how this type of variable works.
Number | EBCDIC value |
---|---|
0 | x'F0' |
1 | x'F1' |
2 | x'F2' |
3 | x'F3' |
4 | x'F4' |
5 | x'F5' |
6 | x'F6' |
7 | x'F7' |
8 | x'F8' |
9 | x'F9' |
If we assign a value of 12345 to a variable of type
PIC 9(5) USAGE DISPLAY
it takes 5 bytes:
12345 = x'F1F2F3F4F5'
For signed variables
PIC S9(5) USAGE DISPLAY
the sign replaces the first 4 bits of the last byte (C for positive, D for negative):
+12345 = x'F1F2F3F4C5'
-12345 = x'F1F2F3F4D5'
USAGE COMP-1 or COMPUTATIONAL-1
32-bit (4 bytes) float variable.
USAGE COMP-2 or COMPUTATIONAL-2
64-bit (8 bytes) float variable.
It is not possible to define a PICTURE
clause associated with a COMP-1
or COMP-2
type variable.
USAGE COMP-3 or COMPUTATIONAL-3
Packed-Decimal variable.
01 VAR-PACKED PIC S9(3)V9(2) USAGE COMP-3.
Each digit of the variable is stored in 4 bits, and the sign is stored in the last 4 bits.
Such variables are generally defined with an odd number of digits so that, together with the sign, they fill a whole number of bytes.
A simple rule to calculate the memory used by a COMP-3
type variable is to divide the total length of the variable (integer positions + decimal positions) by 2, discarding the remainder, and add 1. In the above example, the memory required for the variable VAR-PACKED is:
5 / 2 = 2 -> total length 5 (3 integers + 2 decimals), integer division
2 + 1 = 3 -> storage requirement of 3 bytes
Let’s look at an example to better understand how this type of variable works.
Let’s assign the value 12345 to a variable of type COMP-3
, as in the previous example.
PIC S9(5) USAGE COMP-3.
We start with the DISPLAY format number
x'F1F2F3F4F5'.
We remove the first 4 bits of each byte and add the sign at the end.
+12345 = x'12345C'
-12345 = x'12345D'
USAGE COMP-4 or COMPUTATIONAL-4 or COMP or BINARY
In this case the data type is binary. Negative numbers are represented as two’s complement.
The size of storage required depends on the PICTURE
clause.
PICTURE | Storage | Value |
---|---|---|
PIC S9(1) - S9(4) | 2 bytes | -32,768 to +32,767 |
PIC S9(5) - S9(9) | 4 bytes | -2,147,483,648 to +2,147,483,647 |
PIC S9(10) - S9(18) | 8 bytes | -9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 |
PIC 9(1) - 9(4) | 2 bytes | 0 to 65,535 |
PIC 9(5) - 9(9) | 4 bytes | 0 to 4,294,967,295 |
PIC 9(10) - 9(18) | 8 bytes | 0 to 18,446,744,073,709,551,615 |
Up to this point, the operation of this type of variable would be equivalent to the representation of integer variables in most programming languages (short, long or double variables and their unsigned equivalents ushort, ulong, udouble), but by using the PICTURE
clause we can limit the maximum values of such a variable and define a decimal mask.
For example:
01 VAR-COMP PIC 9(4)V9(2) USAGE COMP.
It uses 4 bytes of memory (a 32-bit integer, unsigned in this case), but its value is limited to the range 0 to 9999.99.
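As an illustration, the following Go sketch derives the storage size from the number of digits (following the table above) and shows how the integer held in memory relates to the value seen through the decimal mask. Both function names are hypothetical.

```go
package main

import "fmt"

// binaryBytes returns the storage used by a COMP field with the given total
// number of digits (1 to 18), following the table above. It returns 0 for
// an unsupported length.
func binaryBytes(digits int) int {
	switch {
	case digits >= 1 && digits <= 4:
		return 2
	case digits >= 5 && digits <= 9:
		return 4
	case digits >= 10 && digits <= 18:
		return 8
	}
	return 0
}

// formatScaled renders the integer held in memory using `decimals` implied
// decimal places, e.g. 123456 with 2 decimals is shown as 1234.56.
func formatScaled(raw int64, decimals int) string {
	sign := ""
	if raw < 0 {
		sign = "-"
		raw = -raw
	}
	if decimals == 0 {
		return fmt.Sprintf("%s%d", sign, raw)
	}
	scale := int64(1)
	for i := 0; i < decimals; i++ {
		scale *= 10
	}
	return fmt.Sprintf("%s%d.%0*d", sign, raw/scale, decimals, raw%scale)
}

func main() {
	// PIC 9(4)V9(2) COMP: 6 digits in total, 2 of them decimals.
	fmt.Println("storage bytes:", binaryBytes(6))
	fmt.Println("largest value:", formatScaled(999999, 2))
	fmt.Println("stored 123456:", formatScaled(123456, 2))
}
```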
What is the point of defining integer variables and then defining a fixed mask with the number of integer and decimal places?
To restrict the use of floating point operations. This may seem strange today, but 30 years ago it made sense to reduce the number of CPU cycles used when the cost of computing was extremely high.
On the other hand, almost all operations performed by financial institutions with currencies do not require complex calculations, and by operating with integers we do not lose precision when we need decimals (e.g. cents).
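The precision argument can be checked with a small Go experiment. It is only a sketch: it adds ten cents ten times, once as a binary floating-point amount and once as an integer number of cents.

```go
package main

import "fmt"

// sumFloat adds 0.10 n times using binary floating point.
func sumFloat(n int) float64 {
	var total float64
	for i := 0; i < n; i++ {
		total += 0.10
	}
	return total
}

// sumCents adds 10 cents n times using an integer scaled by 100,
// which is what a decimal mask such as V9(2) does.
func sumCents(n int) int64 {
	var total int64
	for i := 0; i < n; i++ {
		total += 10
	}
	return total
}

func main() {
	fmt.Println("float sum equals 1.00:  ", sumFloat(10) == 1.0)
	fmt.Println("integer sum equals 1.00:", sumCents(10) == 100)
}
```

The floating-point comparison is false because 0.10 has no exact binary representation, while the integer sum is exact.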
USAGE COMP-5 or COMPUTATIONAL-5
This data type is also known as native binary.
It is equivalent to COMP-4, but the values it can store are not limited by the mask defined in the PICTURE clause.
PICTURE | Variable |
---|---|
PIC S9(4) COMP-5 | short (int16) |
PIC S9(9) COMP-5 | long (int32) |
PIC S9(18) COMP-5 | double (int64) |
PIC 9(4) COMP-5 | ushort (uint16) |
PIC 9(9) COMP-5 | ulong (uint32) |
PIC 9(18) COMP-5 | udouble (uint64) |
Converting variables to a standard type
It is important to note that only the variables exposed by the main programs need to be converted. Once the COBOL runtime has initialised and executed the main program, calls between COBOL programs made with the CALL statement are the responsibility of the runtime.
A special case is the call between COBOL programs deployed in different containers. A special mechanism has been designed for this, similar to the operation of the LINK statement of the CICS transaction manager.
How to make calls between COBOL programs deployed in different application containers can be found in the examples section d8link.
In general, any COBOL variable could be exposed in different forms (int, float, string, etc.) and then converted, but to simplify and optimise the conversion process, we will use a set of simple rules that we will describe below.
EBCDIC
As mentioned above, the IBM mainframe uses EBCDIC internally.
In an attempt to replicate as closely as possible the behaviour of COBOL programs across platforms, some vendors allow EBCDIC to continue to be used internally for application data handling.
While this strategy may facilitate code migration in the short term, it presents huge compatibility, evolution and support problems in the medium to long term. Therefore, migrated COBOL programs will use ASCII characters internally.
It is necessary to identify programs that use COBOL statements that handle hexadecimal characters and replace these strings.
MOVE X’F1F2F3F4F5’ TO VAR-NUM1.
Big-endian vs. little-endian
The IBM mainframe platform is big-endian: the most significant byte is stored at the lowest memory address, so the sign is stored in the first bit from the left.
In contrast, the x86 and ARM platforms typically use little-endian to represent binary variables.
As in the previous case, we believe that the best strategy is to use the target architecture natively, so the programs are compiled to use little-endian.
Maximum size of COMP variables
In general, the maximum size of a numeric variable is 18 digits, regardless of the type used (DISPLAY
, COMP
, COMP-3
, COMP-5
).
However, on the mainframe platform, it is possible to extend this limit to 31 digits for some types of variables (e.g. COMP-3
).
We are talking about numeric variables that are used to carry out arithmetic operations. Except in the specific case of a country with hyperinflationary episodes that have persisted over a long period of time, the limit of 18 digits is enough.
Binary Variables
Although it is not common to use binary variables with a decimal mask (COMP
or COMP-4
), they can be used to operate on large numbers using binary instructions, thus avoiding the use of floating point numbers.
In the case of COMP-5
or native binary variables, it does not make sense to use a decimal mask, so they are implemented directly into an integer variable type corresponding to their size.
Data types
COBOL type | Go type |
---|---|
PIC X(n) | string |
COMP-1 | float32 |
COMP-2 | float64 |
PIC S9(1 to 4) COMP-5 | int16 |
PIC S9(5 to 9) COMP-5 | int32 |
PIC S9(10 to 18) COMP-5 | int64 |
PIC 9(1 to 4) COMP-5 | uint16 |
PIC 9(5 to 9) COMP-5 | uint32 |
PIC 9(10 to 18) COMP-5 | uint64 |
COBOL has no variable-length string type. The size of a PIC X(n) variable is exactly the length defined in the PIC clause. If the string is shorter than the size defined in COBOL, it must be padded with spaces on the right.
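A minimal Go sketch of this padding rule follows. The helper names `toPicX` and `fromPicX` are hypothetical, and the code assumes single-byte characters, in line with the exclusion of double-byte variables stated earlier.

```go
package main

import (
	"fmt"
	"strings"
)

// toPicX (hypothetical helper) fits s into a PIC X(n) field by padding it
// with spaces on the right. It fails if s is longer than the field.
func toPicX(s string, n int) (string, error) {
	if len(s) > n {
		return "", fmt.Errorf("%q is longer than %d characters", s, n)
	}
	return s + strings.Repeat(" ", n-len(s)), nil
}

// fromPicX removes the trailing spaces of a field returned by COBOL.
func fromPicX(field string) string {
	return strings.TrimRight(field, " ")
}

func main() {
	field, err := toPicX("HELLO", 9)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("to COBOL:   %q (%d bytes)\n", field, len(field))
	fmt.Printf("from COBOL: %q\n", fromPicX(field))
}
```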
Now that we have defined the variables that are equivalent to a certain type of variable in the Go language, we will define the behaviour of the numeric data types with decimal mask.
These are the data types commonly used by COBOL programmers. In the event that the program is to be exposed for invocation from an external platform, variables of type DISPLAY
or zoned decimal are usually used to facilitate data conversion (ASCII-EBCDIC) between platforms and to simplify debugging.
In our case, to simplify data handling, all these variables are exposed as string
type.
COBOL type | Go type |
---|---|
PIC S9(n) | string |
PIC S9(n) COMP-3 | string |
PIC S9(n) COMP or COMP-4 or BINARY | string |
The calling program must ensure that the data sent corresponds to a numerical value.
The variables received must conform to the mask defined in the
PIC
clause, while respecting the number of integer and decimal places and, if necessary, justifying with leading zeros.
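The following Go sketch shows one way a caller could check a value against a mask before sending it. It rests on assumptions that the text does not specify: the input is a decimal string with an optional sign and an optional decimal point, and the result is the digit sequence with an implied decimal point plus a separate sign flag. The function name `toMask` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// toMask (hypothetical helper) checks that value is numeric and fits a mask
// with intDigits integer places and decDigits decimal places. It returns
// the digits padded with zeros (implied decimal point) and the sign.
func toMask(value string, intDigits, decDigits int) (string, bool, error) {
	negative := false
	if len(value) > 0 && (value[0] == '+' || value[0] == '-') {
		negative = value[0] == '-'
		value = value[1:]
	}
	intPart, decPart, _ := strings.Cut(value, ".")
	if intPart == "" && decPart == "" {
		return "", false, fmt.Errorf("empty value")
	}
	if len(intPart) > intDigits || len(decPart) > decDigits {
		return "", false, fmt.Errorf("%q does not fit the mask", value)
	}
	digits := strings.Repeat("0", intDigits-len(intPart)) + intPart +
		decPart + strings.Repeat("0", decDigits-len(decPart))
	for _, c := range digits {
		if c < '0' || c > '9' {
			return "", false, fmt.Errorf("%q is not numeric", value)
		}
	}
	return digits, negative, nil
}

func main() {
	// Mask PIC S9(3)V9(2): 3 integer places and 2 decimal places.
	for _, v := range []string{"-12.3", "7", "1234", "12a"} {
		digits, negative, err := toMask(v, 3, 2)
		if err != nil {
			fmt.Printf("%-6s rejected: %v\n", v, err)
			continue
		}
		fmt.Printf("%-6s digits=%s negative=%t\n", v, digits, negative)
	}
}
```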
The process of conversion
Let’s start with a simple example, a COBOL program that receives a data structure with different types of variables.
******************************************************************
IDENTIFICATION DIVISION.
PROGRAM-ID. vars.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
DATA DIVISION.
FILE SECTION.
WORKING-STORAGE SECTION.
* Declare variables in the WORKING-STORAGE section
LINKAGE SECTION.
* Data to share with COBOL subroutines
01 MY-RECORD.
10 CHAR PIC X(09) VALUE SPACES.
10 COMP2 COMP-2 VALUE ZEROES.
10 COMP1 COMP-1 VALUE ZEROES.
10 COMP5D PIC S9(18) COMP-5 VALUE ZEROES.
10 COMP5L PIC S9(9) COMP-5 VALUE ZEROES.
10 COMP5S PIC S9(4) COMP-5 VALUE ZEROES.
10 COMP5UD PIC 9(18) COMP-5 VALUE ZEROES.
10 COMP5UL PIC 9(9) COMP-5 VALUE ZEROES.
10 COMP5US PIC 9(4) COMP-5 VALUE ZEROES.
10 NDISPLAY PIC S9(3)V9(2) VALUE ZEROES.
10 COMP3 PIC S9(3)V9(2) COMP-3 VALUE ZEROES.
10 COMP4D PIC S9(8)V9(2) COMP VALUE ZEROES.
10 COMP4L PIC S9(3)V9(2) COMP VALUE ZEROES.
10 COMP4S PIC S9(2)V9(2) COMP VALUE ZEROES.
PROCEDURE DIVISION USING BY REFERENCE MY-RECORD.
* code goes here!
DISPLAY "char: " CHAR.
DISPLAY "comp-2: " COMP2.
DISPLAY "comp-1: " COMP1.
DISPLAY "comp5 double: " COMP5D.
DISPLAY "comp5 long: " COMP5L.
DISPLAY "comp5 short: " COMP5S.
DISPLAY "comp5 Udouble: " COMP5UD.
DISPLAY "comp5 Ulong: " COMP5UL.
DISPLAY "comp5 Ushort: " COMP5US.
DISPLAY "display: " NDISPLAY.
DISPLAY "comp-3: " COMP3.
DISPLAY "comp double: " COMP4D.
DISPLAY "comp long: " COMP4L.
DISPLAY "comp short: " COMP4S.
MOVE 0 TO RETURN-CODE.
GOBACK.
Then we will analyse this structure (it can be parsed automatically) and generate the following files:
A configuration file (vars.yml) with the characteristics of the COBOL variables present in the structure to be converted (name, type of variable, length, decimal places, sign).
type = 0 -> numeric, type DISPLAY
type = 1 -> numeric, type COMP-1
type = 2 -> numeric, type COMP-2
type = 3 -> numeric, type COMP-3
type = 4 -> numeric, type COMP-4 or BINARY or COMP
type = 5 -> numeric, type COMP-5
type = 9 -> alphanumeric, type CHAR
---
copy:
- field: "Char"
type: 9
length: 9
decimal: 0
sign: false
- field: "Comp2"
type: 2
length: 0
decimal: 0
sign: true
- field: "Comp1"
type: 1
length: 0
decimal: 0
sign: true
- field: "Comp5Double"
type: 5
length: 18
decimal: 0
sign: true
- field: "Comp5Long"
type: 5
length: 9
decimal: 0
sign: true
- field: "Comp5Short"
type: 5
length: 4
decimal: 0
sign: true
- field: "Comp5Udouble"
type: 5
length: 18
decimal: 0
sign: false
- field: "Comp5Ulong"
type: 5
length: 9
decimal: 0
sign: false
- field: "Comp5Ushort"
type: 5
length: 4
decimal: 0
sign: false
- field: "NumDisplay"
type: 0
length: 5
decimal: 2
sign: true
- field: "Comp3"
type: 3
length: 5
decimal: 2
sign: true
- field: "CompDouble"
type: 4
length: 10
decimal: 2
sign: true
- field: "CompLong"
type: 4
length: 5
decimal: 2
sign: true
- field: "CompShort"
type: 4
length: 4
decimal: 2
sign: true
A file (request.go) containing the Go representation of the COBOL structure.
package request
type Request struct {
Char string
Comp2 float64
Comp1 float32
Comp5Double int64
Comp5Long int32
Comp5Short int16
Comp5Udouble uint64
Comp5Ulong uint32
Comp5Ushort uint16
NumDisplay string
Comp3 string
CompDouble string
CompLong string
CompShort string
}
And finally, a file (response.go). It can reuse the structure above, because the parameters used by the program are both input and output.
package response
type Response struct {
Char string
Comp2 float64
Comp1 float32
Comp5Double int64
Comp5Long int32
Comp5Short int16
Comp5Udouble uint64
Comp5Ulong uint32
Comp5Ushort uint16
NumDisplay string
Comp3 string
CompDouble string
CompLong string
CompShort string
}
Now we have everything we need to run our COBOL program.
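To show how the attributes in the configuration file relate to the storage rules described earlier, here is a short Go sketch. The `Field` struct, the constant names and `storageBytes` are hypothetical; they mirror the keys of vars.yml but are not the types used by the example project.

```go
package main

import "fmt"

// Type codes as listed for the configuration file.
const (
	TypeDisplay = 0
	TypeComp1   = 1
	TypeComp2   = 2
	TypeComp3   = 3
	TypeComp4   = 4
	TypeComp5   = 5
	TypeChar    = 9
)

// Field (hypothetical) mirrors one entry of the copy list.
type Field struct {
	Name    string
	Type    int
	Length  int
	Decimal int
	Sign    bool
}

// storageBytes returns the memory occupied by a field, applying the rules
// given for each USAGE in this document. It returns 0 for unknown types.
func storageBytes(f Field) int {
	switch f.Type {
	case TypeDisplay, TypeChar:
		return f.Length // one byte per character
	case TypeComp1:
		return 4
	case TypeComp2:
		return 8
	case TypeComp3:
		return f.Length/2 + 1
	case TypeComp4, TypeComp5:
		switch {
		case f.Length <= 4:
			return 2
		case f.Length <= 9:
			return 4
		default:
			return 8
		}
	}
	return 0
}

func main() {
	fields := []Field{
		{Name: "Char", Type: TypeChar, Length: 9},
		{Name: "Comp2", Type: TypeComp2, Sign: true},
		{Name: "Comp3", Type: TypeComp3, Length: 5, Decimal: 2, Sign: true},
		{Name: "CompDouble", Type: TypeComp4, Length: 10, Decimal: 2, Sign: true},
	}
	for _, f := range fields {
		fmt.Printf("%-12s %d bytes\n", f.Name, storageBytes(f))
	}
}
```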
Running the test program
The following explains how to run the above test program.
The examples can be downloaded directly from the GitHub repo.
The directory structure of the example (d8vars) is as follows:
├── cmd
├── cobol
├── conf
├── internal
│   ├── cgocobol
│   ├── common
│   └── service
├── model
│   ├── request
│   └── response
├── test
├── go.mod
├── go.sum
└── Dockerfile
/cobol
Contains the compiled COBOL programs to be executed (*.dylib or *.so).
Use the following compilation options to define the required behaviour for binary type fields.
cobc -m vars.cbl -fbinary-byteorder=native -fbinary-size=2-4-8
/conf
Configuration files described above, used to describe the COBOL COPY structure.
/model
Contains the definition of the data structures in Go (request/response).
/internal
In this case we find a simplified version of the code needed to convert data between languages and to execute the COBOL programs.
/test
Finally, the test directory contains a utility for generating random test data according to the data types defined in the COBOL program.
To run the code, simply go to the /cmd directory, open a terminal and type
go run .
Remember to tell the COBOL runtime which directory contains the modules to be executed.
export COB_LIBRARY_PATH=/my_dir/.../cobol
The program will generate a random data structure, convert it to a format that can be used by the COBOL program, execute the program and convert the result to a format that can be used by the Go program.
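The conversion step can be pictured with the simplified Go sketch below. It is not the code found in the /internal directory: the `sample` struct, `marshal` and `unmarshal` are hypothetical, only three fields are handled, and the layout assumes little-endian binary fields with no padding between fields.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// sample (hypothetical) holds three fields of the test record.
type sample struct {
	Char       string // PIC X(09)
	Comp5Long  int32  // PIC S9(9) COMP-5
	Comp5Short int16  // PIC S9(4) COMP-5
}

const recordSize = 9 + 4 + 2

// marshal lays the fields out contiguously, as a COBOL record would.
func marshal(s sample) ([]byte, error) {
	if len(s.Char) > 9 {
		return nil, fmt.Errorf("Char is longer than 9 characters")
	}
	var buf bytes.Buffer
	field := bytes.Repeat([]byte(" "), 9) // space-filled PIC X(09)
	copy(field, s.Char)
	buf.Write(field)
	if err := binary.Write(&buf, binary.LittleEndian, s.Comp5Long); err != nil {
		return nil, err
	}
	if err := binary.Write(&buf, binary.LittleEndian, s.Comp5Short); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// unmarshal reads the fields back after the COBOL program has run.
func unmarshal(b []byte) (sample, error) {
	if len(b) != recordSize {
		return sample{}, fmt.Errorf("expected %d bytes, got %d", recordSize, len(b))
	}
	return sample{
		Char:       string(bytes.TrimRight(b[:9], " ")),
		Comp5Long:  int32(binary.LittleEndian.Uint32(b[9:13])),
		Comp5Short: int16(binary.LittleEndian.Uint16(b[13:15])),
	}, nil
}

func main() {
	in := sample{Char: "HELLO", Comp5Long: -12345, Comp5Short: 42}
	buf, err := marshal(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("record (%d bytes): % X\n", len(buf), buf)
	out, err := unmarshal(buf)
	fmt.Println(out, err)
}
```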
You may wish to use the example COBOL program loancalc.cbl for further testing purposes. To do so, simply compile the program and modify the configuration and data structure files.
Please modify the app.env file to define the name of the COBOL program to be executed and the name of the COBOL copy to be converted.
COBOL_PROGRAM="loancalc"
COBOL_CONFIG="loancalc.yml"
Copy the file loancalc.yml to the folder conf
---
copy:
- field: "PrincipalAmount"
type: 0
length: 7
decimal: 0
sign: true
- field: "InterestRate"
type: 0
length: 4
decimal: 2
sign: true
- field: "TimeYears"
type: 0
length: 2
decimal: 0
sign: true
- field: "Payment"
type: 0
length: 9
decimal: 2
sign: true
- field: "ErrorMsg"
type: 9
length: 20
decimal: 0
sign: false
And replace the request.go and response.go structures with the following:
package request
type Request struct {
PrincipalAmount string
InterestRate string
TimeYears string
}
package response
type Response struct {
Payment string
ErrorMsg string
}