This was originally to be a panel discussion. The other panelists all wussed out, so shortly before the conference I was asked to turn my talk into a longer presentation. Here it is.

Porting from Proprietary Minicomputers to Unix

FORTRAN Issues

(a talk presented at UniForum in January, 1992)

Overview

While this is a discussion of porting FORTRAN applications to Unix, many of the concepts discussed here will also apply to porting any application to Unix, regardless of language.

In some ways porting an application to Unix is similar to porting between proprietary operating systems. If you are used to porting between dissimilar systems, the Unix port will not be nearly as traumatic as it will be for those used to a single-vendor environment. But even among Unix vendors there are dissimilarities, not only in the user environment but within the FORTRAN language itself.

Some things you may take for granted will almost certainly be missing or very different. Most vendors do not provide a method for getting command line arguments (without resorting to assembly language). Some vendors don't even use the same FORTRAN across different platforms. IBM, for example, uses a different version of FORTRAN on the RT, the RS/6000, and their mainframes.

On the other hand, most vendors who provide FORTRAN support do provide something very close to the ANSI FORTRAN 77 standard with the DOD (MIL-STD-1753) bit manipulation facilities such as ISHFT(), at a minimum. Many provide other DOD extensions, VAX FORTRAN extensions, and other "standard" extensions.

Much of the problem is due to the fact that FORTRAN has always been considered somewhat of an ugly stepchild in much of the Unix community. Unix was developed in and with the C language. C has typically been the language of choice in Unix shops, with other languages running far behind in popularity. Because of this, FORTRAN and other high-level languages have taken a back seat, both in availability and in the quality of the compilers and language environments provided.

This is changing as vendors realize the potential of the FORTRAN market. There are still hundreds of millions of lines of FORTRAN code in use that are not likely to be rewritten in C any time soon. To sell to the users of this code, vendors have finally begun to take FORTRAN seriously.

The earliest FORTRAN compilers available under Unix, and indeed many available today, were based on the original f77 compiler that came out of Bell Labs. This compiler, while portable, has two major drawbacks. While it comes pretty close to the 77 standard, it has almost none of the de facto "standard" extensions. It is also substantially below par in terms of optimization.

While some vendors provide their own compilers, others buy from a compiler vendor, much like in the PC world. This gives vendors a compiler fairly quickly that is reasonably strong, is usually robust, and typically meets most of the standards FORTRAN developers are used to. While these compilers often sport excellent compiler technology, they typically do not produce code as nice as that produced by the FORTRAN compilers provided with most of the proprietary operating systems.

Related somewhat to this is the fact that the RISC chip developers are often not working closely with developers of high-level languages. Many CISC designs, such as the IBM 370 architecture and the DEC VAX architecture, were developed precisely to work with high-level languages. While RISC has great promise in terms of price/performance, much of this gain may be eaten by software not designed and coded with optimization in mind.

If a RISC system runs 5 times as fast as a CISC system but the compiler is only 2/5 as efficient, the net gain in speed is only a factor of 2 (5 x 2/5 = 2). This is still faster; the point is simply that the RISC architectures so common in the Unix world are not a panacea. There is still a place for optimization, especially in the world of real-time process control.

Just for the record, C is not a high-level language. It is more of a very portable assembly language with many features of typical block-structured languages.

UNIX as an Environment

Let's take a look briefly at the user and development environment itself. While the command language interpreter (known as a shell in Unix) is similar in concept to the CLIs of most proprietary operating systems, there are some substantial differences. The commands are quite different, too.

Much of what you may have heard about Unix is true. The commands are usually terse, and often seem cryptic. This is also true of many minicomputers until you are familiar with them. On the other hand, many commands are quite similar at a basic level between the various vendors' proprietary operating systems. But the Unix world has certain advantages.

A typical command found on most minis is the type command for displaying a file to the terminal. Unix also has a command to do this - in fact it has several, and the one giving page control has two major variants, depending upon its parentage. (Most Unix systems available today are direct descendants of either System V from AT&T, or the BSD software from the University of California at Berkeley.) Such things take getting used to, but are not really that hard to deal with.

Fine control of asynchronous terminal lines, however, is another matter altogether. Getting into this is not unlike waking up one morning in a strange bed, in a strange family, in a strange house, in a strange country, where a foreign language is spoken. In reality little has changed - you are still human, you still consume the same basic foods, breathe the same basic air, and so forth. But the differences are what tend to grab your attention.

For FORTRAN developers another major difference is in how input and output are assigned to their respective files or devices. Unix has no assign statement at the command level. Since most Unix FORTRANs do not provide for command line argument passing, this is a nuisance. Some vendors allow you to set variables, or reassign logical units on the command line (interpreted by the CLI, not the FORTRAN application), but there is no standard way of handling this.

Most Unix configuration of this type is handled either in a command file (or "shell script") by renaming the standard file names the FORTRAN library associates with each logical unit, or via a configuration file the application reads at startup. Another common approach is to use environment variables. Even if the FORTRAN compiler and library provide no support for interpreting such variables, you can generally call a C subroutine or function to get them for you. This is not that complex, and is actually a fairly common practice.
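
As a minimal sketch of the environment variable approach, the C routines below copy a variable's value into a blank-padded FORTRAN CHARACTER buffer. The names env_to_fortran and fgetenv_ are made up for illustration. The calling convention is also an assumption: a trailing underscore on the external name, with CHARACTER lengths passed by value as extra integer arguments at the end of the argument list. Many f77-derived compilers work this way, but not all, so check your compiler's documentation.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy the value of environment variable `name`
 * into a fixed-length, blank-padded buffer (FORTRAN CHARACTER style).
 * Returns the number of characters copied, or -1 if the variable is unset.
 */
int env_to_fortran(const char *name, char *value, int value_len)
{
    const char *v = getenv(name);
    int n;

    memset(value, ' ', (size_t)value_len);  /* FORTRAN strings are blank padded */
    if (v == NULL)
        return -1;
    n = (int)strlen(v);
    if (n > value_len)
        n = value_len;                      /* value is silently truncated */
    memcpy(value, v, (size_t)n);
    return n;
}

/*
 * FORTRAN-callable entry point (ASSUMED convention: trailing underscore,
 * hidden string lengths appended to the argument list).
 */
int fgetenv_(const char *name, char *value, int name_len, int value_len)
{
    char buf[256];
    int n = name_len;

    /* FORTRAN passes no terminating NUL: strip trailing blanks, add one. */
    while (n > 0 && name[n - 1] == ' ')
        n--;
    if (n >= (int)sizeof buf) {
        memset(value, ' ', (size_t)value_len);
        return -1;
    }
    memcpy(buf, name, (size_t)n);
    buf[n] = '\0';
    return env_to_fortran(buf, value, value_len);
}
```

From FORTRAN this would be used along the lines of N = FGETENV('DATADIR', PATH), with FGETENV declared INTEGER and PATH a CHARACTER variable. A negative result means the variable is not set.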

Unix does provide a fairly comprehensive development environment compared to many proprietary operating systems. Text search capabilities and powerful editors (batch and interactive) are part of all Unix systems. A source code control system is provided with every Unix development environment I know of. Debuggers are also provided with development environments. The debugger provided with BSD-derived Unix systems is a decent source code debugger, but the standard AT&T debugger is almost universally detested. Good debugging environments with nice graphical user interfaces (GUIs) are available with some systems.

There are certainly other tools still not provided on some Unix platforms. There is a need for a standard documentation facility with the power of the help facility provided with VMS.

Once you get familiar with the Unix commands, the vast majority of them will be the same between systems. There are quite a few books available to help learn Unix as well, both as a user and as a developer.

General Discrepancies

FORTRAN performs special formatting on output files. Most minicomputer systems have handling for this built in at least within the printer drivers, and often in the terminal drivers. Unix does not typically provide this, but vendors usually provide either switches on the printer command or filters to handle this. Some provide filters for terminal display of such output as well. It is easy enough to write your own filter if necessary.
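
A carriage control filter of this kind can be sketched in a few lines of C. The function name fcc_filter is hypothetical, and the sketch assumes the usual column 1 conventions: blank for single spacing, 0 for double spacing, 1 for a new page, and + for overprinting the previous line. Any other character in column 1 is treated as a blank here, which is an assumption rather than a rule.

```c
#include <stdio.h>

/*
 * Hypothetical filter: interpret FORTRAN carriage control in column 1
 * of each input record and write plain ASCII output.
 *   ' '  single space      '0'  double space
 *   '1'  new page          '+'  overprint previous line
 */
void fcc_filter(FILE *in, FILE *out)
{
    int c;
    int first = 1;                  /* nothing written yet */

    while ((c = getc(in)) != EOF) {
        /* c is the control character, or '\n' for an empty record. */
        if (!first)
            putc(c == '+' ? '\r' : '\n', out);  /* end the previous line */
        first = 0;

        if (c == '0')
            putc('\n', out);        /* extra blank line */
        else if (c == '1')
            putc('\f', out);        /* form feed */

        if (c == '\n')
            continue;               /* empty record: nothing to copy */

        /* Copy the rest of the record, without its newline. */
        while ((c = getc(in)) != EOF && c != '\n')
            putc(c, out);
    }
    if (!first)
        putc('\n', out);            /* terminate the final line */
}
```

Wrapped in a main() that calls fcc_filter(stdin, stdout), this can sit in a pipeline between the application's output file and the printer command.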

Most BSD-based versions of Unix provide a function IOINIT() which helps with carriage control conversion and gives some flexibility in file naming from within the FORTRAN application.

Unformatted files on some proprietary operating systems are simple files with no control information. Most Unix implementations of FORTRAN create unformatted files with control information in them. A simple filter, often just a FORTRAN program to read in the old file as formatted and write the new, unformatted file, takes care of this.
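
To give a feel for what that control information can look like, here is a rough C sketch that adds it to a file of fixed-length records. It assumes a layout used by many f77-derived libraries for sequential unformatted files: a 4-byte record length, in native byte order, before and after each record's data. That layout is an assumption, not a standard, so check what your FORTRAN library actually writes. The names write_record and frame_fixed_records are invented for the example.

```c
#include <stdio.h>
#include <stdint.h>

/*
 * Write one record in the ASSUMED layout:
 *   [4-byte length][data][4-byte length], native byte order.
 * Returns 0 on success, -1 on a write error.
 */
int write_record(FILE *fp, const void *data, uint32_t len)
{
    if (fwrite(&len, sizeof len, 1, fp) != 1)
        return -1;
    if (len > 0 && fwrite(data, 1, len, fp) != len)
        return -1;
    if (fwrite(&len, sizeof len, 1, fp) != 1)
        return -1;
    return 0;
}

/*
 * Convert a file of bare fixed-length records (no control information)
 * into framed records. Returns the number of records written, or -1 on
 * error, including a partial record at the end of the input.
 */
long frame_fixed_records(FILE *in, FILE *out, uint32_t reclen)
{
    unsigned char buf[4096];
    long count = 0;
    size_t n;

    if (reclen == 0 || reclen > sizeof buf)
        return -1;
    while ((n = fread(buf, 1, reclen, in)) == reclen) {
        if (write_record(out, buf, reclen) != 0)
            return -1;
        count++;
    }
    return n == 0 ? count : -1;
}
```

This only handles the framing. Differences in byte order or floating point format between the old system and the new one still have to be dealt with separately.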

Most Unixes do not provide any method of guaranteeing contiguous file space in the standard file system. This capability is available by writing to raw devices (devices with no file systems) for those programs which really need this for speed. Otherwise, direct access files are handled like any other file by the Unix file system.

The Unix file system itself does not differentiate between different types of files; that is up to the application to handle. There is often some information embedded at the beginning of a file indicating its file type for applications which care. Some versions of Unix do provide further information and checking but this is strictly nonstandard.

Shared Memory

Many software systems in the non-Unix world depend upon multiple programs using shared commons for inter-process communications (IPC). Since Unix was designed with other IPC mechanisms in mind, and since software engineering principles being developed at that time warned against sharing common variables, shared memory among programs was not provided.

As Unix became more widely distributed and more applications were written for and ported to Unix, the lack of shared memory became a problem, especially for systems depending on really fast IPC. System calls to deal with inter-process shared memory were introduced in AT&T's System V Unix. Unlike the conventions of most minis involved in the real time marketplace, memory is not shared between processes without specific programmer intervention.

All System V and many BSD-derived systems now include the shared memory operations. Unfortunately, many FORTRAN compilers do not provide a library interface to these calls. Due to the nature of the data structures used, C routines are required to manipulate shared memory on such systems.
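
A rough sketch of such C glue routines follows, using the System V calls shmget(), shmat(), shmdt() and shmctl(). The routine names (shmopn_, shmwr_, shmrd_, shmcls_) are invented for this example, and the trailing underscore and pass-by-reference arguments are an assumed FORTRAN calling convention that varies by compiler. The sketch handles a single segment per process and does no locking; real shared commons would need semaphores or some other form of synchronization.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

static int    seg_id   = -1;    /* System V segment identifier */
static char  *seg_base = NULL;  /* address where the segment is attached */
static size_t seg_size = 0;

/* CALL SHMOPN(KEY, NBYTES, IERR): create or attach to the segment for KEY */
void shmopn_(const int *key, const int *nbytes, int *ierr)
{
    void *p;

    *ierr = -1;
    if (*nbytes <= 0)
        return;
    seg_id = shmget((key_t)*key, (size_t)*nbytes, IPC_CREAT | 0600);
    if (seg_id < 0)
        return;
    p = shmat(seg_id, NULL, 0);
    if (p == (void *)-1)
        return;
    seg_base = (char *)p;
    seg_size = (size_t)*nbytes;
    *ierr = 0;
}

/* True if [offset, offset + nbytes) lies inside the attached segment. */
static int in_range(int offset, int nbytes)
{
    return seg_base != NULL && offset >= 0 && nbytes >= 0 &&
           (size_t)offset + (size_t)nbytes <= seg_size;
}

/* CALL SHMWR(OFFSET, BUF, NBYTES, IERR): copy BUF into the segment */
void shmwr_(const int *offset, const void *buf, const int *nbytes, int *ierr)
{
    *ierr = -1;
    if (!in_range(*offset, *nbytes))
        return;
    memcpy(seg_base + *offset, buf, (size_t)*nbytes);
    *ierr = 0;
}

/* CALL SHMRD(OFFSET, BUF, NBYTES, IERR): copy from the segment into BUF */
void shmrd_(const int *offset, void *buf, const int *nbytes, int *ierr)
{
    *ierr = -1;
    if (!in_range(*offset, *nbytes))
        return;
    memcpy(buf, seg_base + *offset, (size_t)*nbytes);
    *ierr = 0;
}

/* CALL SHMCLS(IRM, IERR): detach; if IRM is nonzero, remove the segment too */
void shmcls_(const int *irm, int *ierr)
{
    *ierr = -1;
    if (seg_base == NULL || shmdt(seg_base) != 0)
        return;
    seg_base = NULL;
    seg_size = 0;
    if (*irm != 0 && shmctl(seg_id, IPC_RMID, NULL) != 0)
        return;
    *ierr = 0;
}
```

BUF would normally be an INTEGER or REAL array (or the first member of a COMMON block) rather than a CHARACTER variable, since many compilers pass hidden length arguments for CHARACTER data. Cooperating programs agree on KEY and on the byte offsets of the data they exchange.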

Real Time Communications & Process Control

Since Unix was never intended to perform real time operations, there has historically been little support for them. The major issues involved (other than those already discussed) are pre-emptability, general scheduling, and locking processes in memory. Vendors such as HP, DG and MODCOMP do provide some real time support, but many vendors do not. Real time is therefore one of the few areas into which Unix has made almost no inroads.

Pre-emptability is a major issue in a real time system.

Unix scheduling does not provide the flexibility many real time systems need.

Finally, to decrease context switching time in real time applications, critical tasks are typically locked into main memory. Again, standard Unix has provided no support for this; any task might be swapped or paged out to disk.

User Interface Issues

One of the areas in which Unix shines is the standardization of user interface support, through curses and termcap (or terminfo) for character-driven interfaces and X for graphics interfaces.

Most FORTRAN applications written for proprietary operating systems have used one of three approaches. Many have had simple, command-intensive user interfaces. Many of the older Unix applications excelled at this as well, but for many applications users want or need a better interface.

Other applications have made use of full screen presentation and editing techniques, but many of these have been tied to a particular OS. VMS is an example of an OS which provides good support for this, as long as DEC-compliant terminals are used. Unix, on the other hand, provides excellent support for screen-based programs through the curses library, with either termcap (BSD) or terminfo (System V) providing terminal independence.

Graphics interfaces abound in the proprietary operating systems world. Far too many of them, however, are tied not only to a particular OS, but to a particular graphics device or family, or to a standard which never garnered major support. The X11 window system (it's a window system named X, not a system named X windows) is prevalent throughout the vast majority of the Unix world. All major workstations and most PCs have X available either from the vendor or from an aftermarket company. The source for X is available free for anyone, so new ports appear at a fairly amazing rate. The libraries have been ported to run on a number of mainframes and minis running proprietary operating systems. With low-priced X terminals, portable, inexpensive graphics are available to almost anyone with a system based on a 32 bit (or wider) architecture.

There are two primary standards for the user-interface part of an X application. These are Motif and OpenLook. Almost every Unix workstation vendor provides one or the other of these, and some provide both. While two standards is still less optimal than one (from a portability standpoint), that's still far better than the situation in the rest of the computer world.

X is not yet a panacea for all graphics problems. For many real time applications X is still too slow, at least in its present form. Vendors with strong ties to the real time community have produced versions of X with much greater speed and robustness.

Some Unix vendors still provide their own graphics interfaces in addition to X, but this is primarily to support older customers. Very few applications are being written for Unix today making use of these older graphics products.

Porting Assistance

No tools are provided with Unix itself to specifically assist in porting FORTRAN applications. Other than the general development tools, about all that is available are profiling tools. Third-party porting tools are available, however, as both commercial and free products.

Such tools generally provide standards compliance verification or assist in converting FORTRAN to C. Tools in the first category are usually commercial products, while tools in the second category include commercial products and freely available source code.

There is a large Unix computer network spanning the globe known as the Internet, Usenet, or just The Net. Via ftp or uucp (network and RS-232 based file transfer programs) a user connected to this net can send and receive source code, documentation, and other files. Large software repositories are scattered around the world.

The other important functions (perhaps the most important) are electronic mail (email) and news. News is essentially the world's largest electronic bulletin board system, with hundreds of topics discussed online. Many of these topics are related to computers, engineering, and various scientific disciplines. Topics of direct interest include FORTRAN, various Unix systems, and VMS. A great deal of expertise is available for free via the net. Most of the people on the net are also happy to exchange email, and will help solve specific problems. The net is also a good place to find talented people with specific capabilities.

In Summary

In summary, porting an application from a proprietary operating system to Unix is no more or less difficult than porting between proprietary operating systems. After the initial port, moving to other Unix-based platforms is much easier.

Despite the lack of support for FORTRAN as compared to C, Unix still provides a strong development environment. There are several obstacles to porting certain classes of applications but these are not insurmountable.

The worst support is in the real time arena. Even here, porting to Unix can be a big win. While the number of players is reduced, you still have several vendors to choose from, whereas in the proprietary minicomputer world, every new vendor is typically a major port.

Finally, Unix seems to be the OS of choice for today and the foreseeable future for most people. Far more applications are being ported to Unix than to any other OS. The Unix portion of the market increases every year. Almost every college graduate with a CS or engineering background has Unix exposure; many have little or no exposure to anything else. The biggest single problem with this crowd is that they also have little or no exposure to FORTRAN, or indeed to any languages besides C and maybe C++.

Because of this tendency towards C and Unix, whenever a new application is being written, C should probably be considered along with FORTRAN. FORTRAN still excels at certain things, such as math-intensive engineering, and many applications written in FORTRAN are nowhere near the end of their useful life. FORTRAN will be around for a long while yet.


Copyright 1992, 1994 by Miles O'Neal, Austin, TX. This article may be freely distributed via computer network or other electronic media, or printed out from such media, for personal use only. Any non-personal (ie, commercial) use of this article requires the express, written permission of the author. Commercial copy permission may be granted if, in the author's sole opinion, such usage of this article is for purposes the author holds near and dear to his heart and/or wallet. For such permission, contact the author via email at meo@pencom.com or cs.utexas.edu!pensoft!meo, or via mail at the address below.
Rte 1, Box 558 / Leander, TX / 78641-9413 / USA
This copyright may be freely used, distributed and modified subject to the conditions noted above in the preceding paragraph.
Last updated: 10 July 1996
Miles O'Neal <roadkills.r.us@XYZZY.gmail.com> [remove the "XYZZY." to make things work!] c/o RNN / 1705 Oak Forest Dr / Round Rock, TX / 78681-1514