4 lessons for modern software developers from 1970s mainframe programming
Eight megabytes of memory is plenty. Or so we believed back in the late 1970s. Our mainframe programs usually ran in 8 MB virtual machines (VMs) that had to contain the program, shared libraries, and working storage. Though these days, you might liken those VMs more to containers, since the timesharing operating system didn’t occupy VM space. In fact, users couldn’t see the OS at all.
In that mainframe environment, we programmers learned how to be parsimonious with computing resources, which were expensive, limited, and not always available on demand. We learned how to minimize the costs of computation, develop headless applications, optimize code up front, and design for zero defects. If the very first compilation and execution of a program failed, I was seriously angry with myself.
Please join me on a walk down memory lane as I revisit four lessons I learned while programming mainframes and teaching mainframe programming in the era of Watergate, disco on vinyl records, and Star Wars—and which remain relevant today.
1. Minimize the cost of computation
Our university data center had three IBM mainframes. Some of the timesharing user accounts were for data center employees, with few resource limits. But most users’ accounts were paid for, either through departmental chargebacks or in real invoices sent out to local customers, such as an off-campus research center and a local hospital. (Undergrad students had limited-access accounts and were treated separately.)
Beyond the charges for the basic account and persistent storage, computation time was measured in CRUs: computer resource units. Users were charged based on the CRUs they consumed during the month.
I don’t recall how much a CRU cost, but it wasn’t cheap. Each mainframe job, whether submitted from a terminal in real time or as a batch job queued for later, had a limit on the number of CRUs it could consume before being killed by the mainframe operating system. If a job “ran away,” such as with an infinite loop, it would ABEND (abnormally end). My recollection is that the cost of hitting the CRU limit on a runaway job was several hundred dollars. And that was in the 1970s, when you could rent a great apartment for $300 per month. Ouch!
Once, we might have pointed and snickered at these computing limitations. But putting programmatic attention on efficient code has become more relevant today, thanks to cloud computing. Many of us were spoiled by the “free” resources inside our desktops, workstations, and even servers in our own data centers. But today, just as in the 1970s mainframe era, we are renting billable resources in terms of cloud processor utilization, bandwidth utilization, and storage utilization. Do things in a stupid way, and you can rack up significant expenses from your cloud provider—which might be several thousand dollars, or tens of thousands of dollars.
Organizations migrating existing applications to cloud hosts in a platform as a service (PaaS) or infrastructure as a service (IaaS) are really bitten by this. Their server-based data center code may have been horribly inefficient, but nobody cared. Now they care when they see the cloud service invoices.
So if you are developing cloud software, or looking to migrate existing homegrown applications to the cloud, this might be the first time in decades that your organization has had to pay for the actual resources consumed, including processor utilization, storage, and perhaps even bandwidth. It’s worth spending the time to optimize up front—both architecture and code—in order to minimize the resources those applications consume. How much of your design time is spent considering this issue? Odds are, you’re not considering the modern-day equivalent of CRUs.
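To make the modern-day equivalent of CRUs concrete, here is a minimal Python sketch of a back-of-the-envelope monthly estimate. Everything in it is an assumption for illustration: the `Rates` class, the `estimate_monthly_cost` function, the unit prices, and the workload figures are invented and do not come from any provider’s price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices in dollars; substitute your provider's real rates."""
    per_cpu_hour: float
    per_gb_stored: float
    per_gb_transferred: float


def estimate_monthly_cost(rates, cpu_hours, gb_stored, gb_transferred):
    """Add up the three billable resources named in the text."""
    return (
        cpu_hours * rates.per_cpu_hour
        + gb_stored * rates.per_gb_stored
        + gb_transferred * rates.per_gb_transferred
    )


# Made-up rates and made-up workloads, purely for comparison.
rates = Rates(per_cpu_hour=0.05, per_gb_stored=0.02, per_gb_transferred=0.09)
naive = estimate_monthly_cost(rates, cpu_hours=20_000, gb_stored=5_000, gb_transferred=10_000)
tuned = estimate_monthly_cost(rates, cpu_hours=8_000, gb_stored=5_000, gb_transferred=4_000)

print(f"naive design: ${naive:,.2f} per month")
print(f"tuned design: ${tuned:,.2f} per month")
```

Even a rough model like this, run at design time with real rates, shows which resource dominates the bill before the first invoice arrives.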
2. For data processing, think headless
It seems so crazy today—in the era of web surfing, Twitter follows, and movie streaming—but in the 1970s, we used computers for computing. Well, we usually said “data processing.” A program’s job was generally to take some data input, do something with it, and then present a user with the output. This is also known as the “input/process/output” workflow.
The input might be pulling in academic records from hard disk or tape, applying some sort of criteria (such as, “is in collegiate sports AND is not maintaining a 2.0 grade point average”), and then printing off a report for the college dean. The application might take as input electric meter readings, cross-reference the data against contract terms and financial records, and print out the month’s utility invoices. It might take mathematical formulas and draw beautiful graphs on a Calcomp pen plotter (I loved doing that). It might be source code you compiled into an executable.
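The academic-records example maps neatly onto an input/process/output job. The sketch below is a modern Python stand-in, not the original mainframe code; the column names (`student_id`, `name`, `in_sports`, `gpa`) and the function names are hypothetical. It reads records, applies the “in collegiate sports AND not maintaining a 2.0 GPA” test, and writes a report, with no prompts and no display.

```python
import csv
import io

GPA_THRESHOLD = 2.0  # the cutoff quoted in the text


def select_at_risk_athletes(rows):
    """Process step: yield athletes whose GPA is below the threshold."""
    for row in rows:
        if row["in_sports"] == "yes" and float(row["gpa"]) < GPA_THRESHOLD:
            yield row


def run_job(infile, outfile):
    """Input -> process -> output. Returns the number of records reported."""
    reader = csv.DictReader(infile)
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow(["student_id", "name", "gpa"])
    count = 0
    for row in select_at_risk_athletes(reader):
        writer.writerow([row["student_id"], row["name"], row["gpa"]])
        count += 1
    return count


# In-memory files stand in for the disk or tape files of the era.
records = io.StringIO(
    "student_id,name,in_sports,gpa\n"
    "1001,Pat,yes,1.8\n"
    "1002,Lee,no,1.5\n"
    "1003,Sam,yes,3.2\n"
)
report = io.StringIO()
reported = run_job(records, report)
print(report.getvalue(), end="")
print(f"{reported} record(s) reported")
```

Because `run_job` only touches the two file objects it is handed, the same function works unattended from a scheduler or a batch queue.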
For much of my mainframe career, these jobs were not run in real time on the display console; I didn’t look at the results on a computer monitor. They were “headless” applications, which operated without a monitor, graphical user interface (GUI), or peripheral devices. The programs were written and submitted to the timeshare system from the console, metaphorically like submitting punch cards as input and getting punch cards or paper as output. For most of my work, the input and output were disk files: I’d submit my job and come back a few hours later to find an output file in my personal storage space.
The lack of real-time processing meant that you had to think through what you were doing. Your data had to be clean; your program had to be clean; your output format had to be cleanly expressed. We couldn’t work by trial and error. Every program (with few exceptions) had to run headless, without direct user interaction or supervision.
That practice should extend today beyond embedded device programming to building headless back-end applications that run in the data center or in the cloud, and to building APIs. With this sort of back-end software, there is no user experience to worry about. You’re building pure functionality: the “process” part of the input/process/output workflow mentioned earlier. You know what the functions need to be; that’s in the specifications. There’s no room for fooling around here, since other parts of the complete application (like the user client) rely upon the back end working quickly, accurately, and at scale. If there’s one place to put in the extra design work to ensure correctness and efficiency, it’s in the back end.
3. Design and program for zero defects
Nobody wants to make stupid mistakes. There’s nothing worse than coming in on a Monday morning and seeing the results of your latest compilation… which halted with a SYNTAX ERROR ON LINE 7. Insert your favorite expletive here.
The 1970s toolsets were primitive compared with today’s computing tools. My mainframe programs were written in FORTRAN, COBOL, PL/I, or RPG, and occasionally for other systems like Customer Information Control System (CICS), a high-level transaction processing system, or Statistical Package for the Social Sciences (SPSS), a statistical package. They were impressively powerful, and we accomplished quite a bit. But we didn’t have beautiful programming development environments, such as Eclipse or Visual Studio, that auto-completed expressions, checked syntax, or highlighted errors. We didn’t have interactive debuggers. We didn’t have anything cool.
What did we have? Text editors in which to write our code, compilers that some privileged users (like me!) could run from the console (and get the SYNTAX ERROR ON LINE 7 errors right away), and a primitive code profiler that helped us identify where a program was spending its CPU time. The compilers had switches that enabled extended debug information, so we could track down problems by poring over six inches of printouts.
We learned, very quickly, to get programs right the first time. That meant up-front architecture. That meant outlining modules and complex syntax. It meant doing code reviews before we ever submitted the code to the compiler. And that meant ensuring that the logic was right, so when we went to run the correctly compiled code, we would get the desired output, not garbage.
This was especially important for transaction processing such as creating bills or updating database systems. An error in your logic generated flawed invoices or messed up a database. Those sorts of errors could be costly, time-consuming, and career-ending.
Unlike the stand-alone, isolated mainframe era, our applications today are interconnected. Make a mistake on a mainframe bill, and your head could roll. Make a mistake in providing mobile access to corporate data and as a result expose it to hackers? Your company could fail.
Ensuring an application’s correct functionality should not be a matter of trial and error. Unfortunately, that bad habit is too common in today’s era of iterative development. (I’m not talking about coding typos here; usually today’s IDEs take care of that issue.)
Being agile means, in part, efficiently creating code without formalizing complete specifications up front, which helps ensure that the application meets the customer’s (often poorly understood) needs. Agility does not mean using nightly or weekly builds to throw code against the wall and see if it passes muster.
Whether you follow an iterative agile process or use the mainframe-era waterfall methodologies, the goal should be to get a piece of code functioning correctly the first time, so that you can get it approved and move on to the next piece of the puzzle either the next day or in the next two-week sprint.
That all goes out the window if you spend too much time debugging or you accept that debugging is an inevitable part of the programming process. In that case, someone is being unforgivably sloppy. It means your team isn’t spending enough time designing the application correctly, and that the team has learned that bugs are no big deal. They are a big deal. Design for zero defects!
Note: I’m arguing against using iterative development for debugging. I am not arguing against testing. Testing is essential to demonstrating that the code is functioning correctly, whether you follow a methodology like Test-Driven Development, throw the code over the wall to a separate test team, or do something in between.
4. It’s not about refactoring: Optimize up front
I love the concept of refactoring: to optimize production code so that it does the same thing but more efficiently. That concept works great with FORTRAN or PL/I, where you can optimize subroutines and libraries, and it applies equally to modern code.
Sometimes programmers opt for a quick-and-dirty routine to get a program running, and they plan to refactor later. Need to sort data? Throw in a quick algorithm for now; if the sort runs too slowly, we can swap out that algorithm for a better one. Need to pull information from a database? Write a SQL statement that’s correct, verify that it works, and move on. If the database access is slow, optimize the SQL later.
I’m not saying that we didn’t do that in the mainframe era, especially on one-time-use applications such as custom reports. However, we learned that it was highly advantageous to put in the design work up front to optimize the routines. For paying customers, it paid off to reduce the number of CRUs required to run a complex program. Also, in that era, I/O was slow, especially on serial-access devices such as tapes. You had to optimize I/O so that, for example, you didn’t have to do multiple passes through the tape or tie up multiple tape drives. And you couldn’t load everything into memory: Remember that 8 MB limit?
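The tape constraint, read once from front to back and never hold it all in memory, still describes streaming data today. As a minimal sketch of that discipline in Python (not the original mainframe code), the function below gathers several statistics in a single sequential pass. The names `summarize_in_one_pass` and `tape` are hypothetical, and the generator merely imitates a serial device.

```python
def summarize_in_one_pass(readings):
    """Compute count, total, minimum, and maximum in one sequential pass."""
    count = 0
    total = 0.0
    lowest = None
    highest = None
    for value in readings:
        count += 1
        total += value
        if lowest is None or value < lowest:
            lowest = value
        if highest is None or value > highest:
            highest = value
    return {"count": count, "total": total, "min": lowest, "max": highest}


def tape(values):
    """Stand-in for a serial device: readable once, front to back."""
    for value in values:
        yield value


summary = summarize_in_one_pass(tape([3.0, 1.5, 4.0, 2.5]))
print(summary)
```

Calling `min()`, `max()`, and `sum()` separately would need three passes, and on a one-shot stream like `tape` the second call would find nothing left to read.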
Nothing has changed.
Sure, we have gigabytes of RAM, super-high-speed storage area networks, and solid-state drives. And a one-off program or a proof of concept does not require you to invest time optimizing its code to ensure that you are addressing the problem in an efficient way.
Modern application frameworks, runtime libraries, and even design patterns can help generate efficient binaries from suboptimal source code. However, if the approach used to solve the problem is not optimal, there’s no way the code can be as robust and scalable as possible. That’s particularly true in resource-constrained environments, such as mobile or IoT, or if you are writing using an interpreted scripting language. This is an architecture problem.
For example, where’s the most efficient place to perform some operations: on the mobile device or in the cloud? How can you design a data stream to use minimal bandwidth, while still allowing for future expansion? What’s the best means to compress data to balance compression/decompression CPU utilization against storage or data-transmission requirements? How can you minimize the number of database table lookups? Which parts of the application would benefit from aggressive threading, and where would thread management, including data re-integration and waiting for synchronization, add overhead without a positive payoff, or even risk hurting execution on a single-core processor?
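The compression question is one you can measure rather than guess at. Here is a minimal sketch using Python’s standard `zlib` module to compare compressed size against CPU time at a few compression levels. The `measure` helper and the sample payload are invented for illustration, and the timings will vary by machine, so treat the output as relative rather than absolute.

```python
import time
import zlib


def measure(payload, level):
    """Return (compressed size in bytes, seconds spent compressing)."""
    start = time.perf_counter()
    packed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    # Confirm the data survives a round trip before trusting the numbers.
    assert zlib.decompress(packed) == payload
    return len(packed), elapsed


# Hypothetical payload: repetitive readings, as a sensor feed might produce.
payload = b'{"sensor": "thermostat-1", "temp_c": 21.5}\n' * 20_000

print(f"original: {len(payload)} bytes")
for level in (1, 6, 9):
    size, seconds = measure(payload, level)
    print(f"level {level}: {size} bytes, {seconds * 1000:.2f} ms")
```

Running the same measurement on a sample of your real data, on the device that will actually do the work, is what turns the trade-off into a design decision.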
The first answer to questions like these is often to write the code in the simplest, most straightforward manner, and I’m a big believer in that. Certainly that’s one way to get the application to market quickly, and refactoring could speed it up or add scalability later. However, my argument is that it is wise to do the design work up front to build the best code the first time. I realize this can take more time, and that time may not make sense in some circumstances. You’ll have to be the judge of that.
Remember, your IoT home thermostat is probably more powerful than my old IBM System/370 Model 168.
Remembering old mainframes: Lessons for leaders
- Minimize the cost of computation, especially for cloud computing.
- Check your code up front against functional defects, instead of leaving it for debuggers.
- Optimize your application architecture for efficiency, and not only when coding for underpowered or mobile devices.