Other Kinds of COBOL Efficiency
Arnold J. Trembley
Recently, I received an email question on COBOL programming efficiency. This question reminded me of an ANS COBOL programming course I took in the very early 1980s. Wall-clock runtime performance is not the only kind of programming efficiency. Here is the primary text of that message:
Subject: Cobol efficiency question
Date: Tue, 17 Dec 2002 11:21:02 -0500
Hello, I came upon your web site below and wonder if you don't
mind if I run a question by you.
I work in a shop where there's a cobolii standard which says
that all constants should be defined in working storage.
As a result, we have code like this in our programs:
05 w70-value-1 pic 9 value 1.
add w70-value-1 to w80-total.
In my opinion, this seems like unnecessary and wordy coding.
The fellow who sits next to me thinks this standard exists
for the purpose of optimizing the program's efficiency. I
think it was defined for the purpose of easier program
maintenance. In my opinion, people didn't understand the
purpose of this rule, took the rule too literally and then
I can see cases where it may make sense to code this way, for example:
05 w70-table-size pic 99 value 10.
perform update-table varying index from 1 by 1
until w80-table-index > w70-table-size.
Here, if the table size increases, you can simply change
the value in working storage and then there is no need to
change code in the procedure division. So here is my
question for you. With your knowledge of Cobol
optimization, does the first example above do anything to
improve a program's efficiency?
Thanks for your time!
Susan is absolutely correct that this kind of "Application Coding Standards" rule really exists to facilitate future program maintenance and readability. Defining constants in WORKING-STORAGE instead of using literals in the PROCEDURE DIVISION has virtually no measurable effect on program runtime efficiency.
My own shop has a similar rule in its application standards manual. Generally, the use of numeric and alphanumeric literals in the PROCEDURE DIVISION is prohibited. In 1992 I wrote a program to process COBOL source code files and produce a diagnostic listing, similar to a compile listing, that measures compliance with the local coding standards. But even this program allows the use of ZERO, ZEROS, 0, +0, 1, +1, and -1 as literals in the PROCEDURE DIVISION (actually ZERO and ZEROS are properly referred to as Figurative Constants in COBOL, along with SPACES, HIGH-VALUES, and LOW-VALUES).
Compare the following examples, adapted from Susan Deane's email:
PERFORM UPDATE-TABLE
    VARYING W80-TABLE-INDEX FROM W70-VALUE-1 BY W70-VALUE-1
    UNTIL W80-TABLE-INDEX > W70-TABLE-SIZE.
ADD W70-VALUE-1 TO W80-TOTAL.

PERFORM UPDATE-TABLE
    VARYING W80-TABLE-INDEX FROM +1 BY +1
    UNTIL W80-TABLE-INDEX > W70-TABLE-SIZE.
ADD +1 TO W80-TOTAL.
If there is any doubt as to whether or not the use of defined constants would affect the runtime efficiency of these two examples, the programmer can compile both versions and examine the disassembled code (to check the instruction path length) or run timed parallel tests. I think the second example is easier to code, easier to read, and still uses a defined constant where it is really needed. W70-TABLE-SIZE might easily be referenced multiple times in the program. As a defined constant, it only needs to be updated in one place and would take effect everywhere it is used. If a numeric literal is used in multiple places, missing just one when the table size is increased could create a nasty bug.
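For anyone who wants to try the comparison, here is a minimal sketch of the second style as a complete program. It is only an illustration: the program name CONSTDMO, the group names, and the body of UPDATE-TABLE (which just counts its own executions) are my assumptions, not part of Susan's code.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CONSTDMO.
      * Hypothetical demo. The W70-/W80- names follow the email;
      * the program name and UPDATE-TABLE body are assumptions.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  W70-CONSTANTS.
      * The one value that is likely to change is named once.
           05  W70-TABLE-SIZE      PIC 9(4) COMP VALUE 10.
       01  W80-WORK-FIELDS.
           05  W80-TABLE-INDEX     PIC 9(4) COMP VALUE ZERO.
           05  W80-TOTAL           PIC 9(4) COMP VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-LOGIC.
      * +1 stays a literal: its meaning can never change.
           PERFORM UPDATE-TABLE
               VARYING W80-TABLE-INDEX FROM +1 BY +1
               UNTIL W80-TABLE-INDEX > W70-TABLE-SIZE.
           DISPLAY "TOTAL = " W80-TOTAL.
           STOP RUN.
       UPDATE-TABLE.
           ADD +1 TO W80-TOTAL.
```

Changing the VALUE of W70-TABLE-SIZE is the only edit needed to change how many times the loop runs. To test the efficiency question, substitute W70-VALUE-1 for the +1 literals in a copy of this program and compare the generated code or the run times.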
Another mistake that programmers make when applying a simple rule ("No literals in the PROCEDURE DIVISION") is choosing a bad name for the defined constant. How about this example:
01 W70-VALUE-500 PIC 9(4) COMP VALUE 500.
PERFORM UPDATE-TABLE
    VARYING W80-TABLE-INDEX FROM W70-VALUE-1 BY W70-VALUE-1
    UNTIL W80-TABLE-INDEX > W70-VALUE-500.
In this case the size of the table is encoded in the name of the constant. If the program must be modified to increase the size of the table from 500 entries to 1000 entries, now the maintenance programmer must change the name everywhere. The whole point of the rule is to make future program maintenance easier and safer. W70-TABLE-SIZE is clearly a better choice for the variable name, because only its value needs to be changed.
Having a rule that discourages the use of literals in the PROCEDURE DIVISION is a reasonably good idea, but it requires intelligent use. I don't define constants in my COBOL programs to make the programs run faster. I define constants to make the program easier to read, and easier and safer to modify when business rules change.
I am not sure if I can remember all the details of the 1980s class I took on ANS COBOL Programming Efficiency, but I do remember that it described five categories of program efficiency.
- Runtime Efficiency
- Module Size Efficiency
- Compile Efficiency
- Input/Output Efficiency
- Maintenance Efficiency
Runtime Efficiency can be very important if a program takes too much time to run. A batch program might not be able to complete in its overnight processing window. An online program might have very slow response time, which irritates users and reduces their productivity. The first article on my COBOL performance page addresses some of these issues, and is worth reviewing. Efficient programs require proper algorithm choice. Putting unneeded instructions inside program loops will increase your instruction path length. A poor choice of numeric format may increase computation times, especially for calculations inside loops.
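As a rough illustration of the last two points, the sketch below does its loop-invariant calculation once, before the loop, and uses a binary (COMP) loop counter with packed-decimal (COMP-3) amounts. It assumes the common advice for IBM mainframe compilers about those formats; other compilers may behave differently, so measure before relying on it. The program name, data names, and values are all invented for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LOOPDEMO.
      * Hypothetical demo: all names and values are assumptions.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  W70-CONSTANTS.
           05  W70-LOOP-LIMIT      PIC 9(4) COMP   VALUE 1000.
           05  W70-UNIT-PRICE      PIC 9(3)V99 COMP-3 VALUE 2.50.
           05  W70-TAX-RATE        PIC V999 COMP-3 VALUE .075.
       01  W80-WORK-FIELDS.
      * Binary counter, packed-decimal amounts.
           05  W80-COUNTER         PIC 9(4) COMP   VALUE ZERO.
           05  W80-TAXED-PRICE     PIC 9(3)V9(5) COMP-3 VALUE ZERO.
           05  W80-TOTAL           PIC 9(9)V9(5) COMP-3 VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-LOGIC.
      * Loop-invariant work is done once, before the loop.
           COMPUTE W80-TAXED-PRICE =
               W70-UNIT-PRICE * (1 + W70-TAX-RATE).
           PERFORM ADD-TAXED-PRICE
               VARYING W80-COUNTER FROM +1 BY +1
               UNTIL W80-COUNTER > W70-LOOP-LIMIT.
           DISPLAY "TOTAL = " W80-TOTAL.
           STOP RUN.
       ADD-TAXED-PRICE.
      * Only the work that must repeat stays inside the loop.
           ADD W80-TAXED-PRICE TO W80-TOTAL.
```

Had the COMPUTE been placed inside ADD-TAXED-PRICE, the same multiplication would be repeated on every iteration with no change in the result.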
Module Size Efficiency is practically a non-issue these days. Older computers had less memory available, and a COBOL program might grow to be too large to load into storage. There were various optimization techniques to solve this problem, the most extreme being the use of sections and overlays. Only part of the program would be loaded into memory, and other sections (overlays) would be paged into a workarea for a time, used, and then discarded. Various tricks were used to make the size of the program smaller.
When I learned COBOL programming in 1978, I punched the program onto cards using an IBM-029 keypunch, and the program was compiled on an IBM-360 model 30 with 64K (65,536) bytes of memory. The Z900 mainframe I use now would allow a COBOL program to use nearly 2 gigabytes of main memory. This problem has largely been solved by hardware improvements.
Compile Efficiency is another case of a problem that does not occur with modern hardware and compilers. I remember a programmer telling me about writing programs for a Honeywell computer. Compiling a COBOL program ran for several hours and he had to submit it for overnight processing. If he had a syntax error, he had to wait until the next day to resubmit the compile. There were techniques for coding a program so it would compile more quickly. Even on the ancient IBM 360/30, a medium-sized COBOL program of about 1200 lines would compile in a few seconds or minutes.
Input/Output Efficiency has a large effect on the wall-clock runtime performance of a program. For the typical business program, I/O efficiency has a larger performance effect than instruction path length. This topic is discussed in my original article. Batch processing is generally faster with sequential files than with VSAM files. OPEN and CLOSE are expensive in terms of runtime. Small blocksizes increase runtime and larger blocksizes reduce runtime. If data is referenced over and over again, it should be cached in memory rather than reread from disk.
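The caching advice can be sketched as follows: a small reference file is read once into a WORKING-STORAGE table, with a single OPEN and CLOSE, so that later lookups never touch the disk. Everything here is hypothetical, including RATE-FILE, its record layout, and the data names. The ASSIGN clause is platform dependent, and I have not shown the lookup logic itself.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CACHEDMO.
      * Hypothetical sketch: RATE-FILE, its record layout, and all
      * data names are assumptions made up for illustration.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT RATE-FILE ASSIGN TO "RATEFILE"
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS W80-RATE-STATUS.
       DATA DIVISION.
       FILE SECTION.
       FD  RATE-FILE.
       01  RATE-RECORD.
           05  RATE-CODE               PIC X(2).
           05  RATE-AMOUNT             PIC 9(3)V99.
       WORKING-STORAGE SECTION.
       01  W70-CONSTANTS.
           05  W70-MAX-RATES           PIC 9(4) COMP VALUE 100.
       01  W80-WORK-FIELDS.
           05  W80-RATE-STATUS         PIC X(2)      VALUE SPACES.
               88  W80-RATE-OK                       VALUE "00".
           05  W80-RATE-COUNT          PIC 9(4) COMP VALUE ZERO.
       01  W90-RATE-TABLE.
      * OCCURS needs a literal; keep it equal to W70-MAX-RATES.
           05  W90-RATE-ENTRY          OCCURS 100 TIMES.
               10  W90-RATE-CODE       PIC X(2).
               10  W90-RATE-AMOUNT     PIC 9(3)V99.
       PROCEDURE DIVISION.
       MAIN-LOGIC.
           PERFORM LOAD-RATE-TABLE.
           DISPLAY "RATES CACHED = " W80-RATE-COUNT.
           STOP RUN.
       LOAD-RATE-TABLE.
      * One OPEN, one sequential pass, one CLOSE.
           OPEN INPUT RATE-FILE.
           IF W80-RATE-OK
               PERFORM READ-RATE-RECORD
               PERFORM STORE-RATE-ENTRY
                   UNTIL NOT W80-RATE-OK
                   OR W80-RATE-COUNT NOT < W70-MAX-RATES
               CLOSE RATE-FILE.
       READ-RATE-RECORD.
      * End of file shows up as a non-zero file status.
           READ RATE-FILE.
       STORE-RATE-ENTRY.
           ADD +1 TO W80-RATE-COUNT.
           MOVE RATE-CODE   TO W90-RATE-CODE (W80-RATE-COUNT).
           MOVE RATE-AMOUNT TO W90-RATE-AMOUNT (W80-RATE-COUNT).
           PERFORM READ-RATE-RECORD.
```

After LOAD-RATE-TABLE runs, the rest of the program would search W90-RATE-TABLE in memory instead of rereading the file for every transaction. Note that the table size still appears twice, once in W70-MAX-RATES and once in the OCCURS clause, because OCCURS requires a literal.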
Maintenance Efficiency is the last and most important type of efficiency. Even in 1980, it was becoming apparent that the cost of developing, maintaining, and installing software was greater than the cost of hardware. Making the program easier to read, understand, and modify allows the most efficient use of the programmer's time. And making the program easier to maintain can be done without impairing the program's runtime efficiency.
Having a rule that discourages the use of literals in the PROCEDURE DIVISION is an attempt to improve the program's Maintenance Efficiency. Indeed, large COBOL shops created coding style standards manuals in an attempt to improve Maintenance Efficiency. Perhaps the most difficult aspect of this is choosing meaningful data names and procedure names. Data names should either avoid using abbreviations, or else use a consistent standard for abbreviations. Procedure names should also be meaningful. The style-checker program in my shop enforces a rule that a paragraph name must begin with an action verb, followed by a direct object.
I once had to modify a program with a paragraph named "2500-CK". I figured out the "CK" was an abbreviation for "check", but the paragraph did not write a check, it validated an input record and should have been named "2500-EDIT-BILLING-RECORD".
Structured Programming techniques were introduced to improve maintenance efficiency by simplifying control flow structures. Modularization allowed code blocks to be re-used. Better naming conventions and more meaningful names help the maintenance programmer to understand a program written by someone else. Application Coding Standards can improve Maintenance Efficiency, which is the most important type of efficiency.
Someday I would like to write an article on Application Coding Standards. Having some is better than having none. But which standards are really useful? Since I have only written COBOL for two different companies, I can't say the standards I use are the best available. I would be interested in learning what kind of COBOL coding standards are used in other companies.