Two- and three-dimensional computations of multistage turbomachinery flows are becoming cost-effective on current workstation technology. The speed of STAGE-2 ranges between 3 and 15 Mflops on a variety of available workstations. STAGE-2 operates at 135 Mflops on a CRAY-YMP, so the code runs between 9 and 45 times faster on a CRAY-YMP than on a workstation. Unfortunately, queues on a fully loaded CRAY can be quite long. In some cases, the wall-clock turn-around time for a job exceeds the job CPU time by an order of magnitude or more. Because workstations provide dedicated computing, a single high-end workstation can then give turn-around performance comparable to that of a fully loaded CRAY.
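The turn-around argument above can be checked with a short calculation. The following is a minimal sketch in Python, chosen only for brevity; it is not part of the STAGE codes. The 135, 15 and 3 Mflops figures come from the text. The one-hour job size, the queue factor of exactly 10, and the assumption that a dedicated workstation incurs no queue wait are illustrative assumptions.

```python
# Illustrative turn-around estimate using the speeds quoted in the text.
CRAY_MFLOPS = 135.0  # STAGE-2 on the CRAY-YMP (from the text)


def speed_ratio(workstation_mflops):
    """How many times faster the CRAY-YMP runs the code than a workstation."""
    return CRAY_MFLOPS / workstation_mflops


def workstation_wall_clock(cray_cpu_hours, workstation_mflops):
    """Wall-clock hours on a dedicated workstation (no queue wait assumed)."""
    return cray_cpu_hours * speed_ratio(workstation_mflops)


def cray_wall_clock(cray_cpu_hours, queue_factor):
    """Wall-clock hours on the CRAY when turn-around is queue_factor x CPU time."""
    return cray_cpu_hours * queue_factor


if __name__ == "__main__":
    cpu_hours = 1.0      # hypothetical job size (assumption)
    queue_factor = 10.0  # "an order of magnitude", taken here as exactly 10
    for mflops in (3.0, 15.0):
        print(mflops,
              speed_ratio(mflops),
              workstation_wall_clock(cpu_hours, mflops),
              cray_wall_clock(cpu_hours, queue_factor))
```

Under these assumptions, the 15 Mflops workstation returns the job in 9 hours against 10 hours on the loaded CRAY, while the 3 Mflops workstation needs 45 hours. The comparison therefore favors only the high-end workstation, as the text states.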
Networks of workstations are becoming much more common in industry, government and academia. These networks can represent a valuable source of additional computing power. A natural extension of operating the STAGE-2 and STAGE-3 codes on a single workstation is then to modify them to run in parallel across a network of workstations. The zonal grid structure used in the codes is well suited to such a parallel processing environment. Each zone could simply be placed on a different workstation to run in parallel with the other zones. There are two ways in which this could be implemented. The first is to place the next zone to be computed on the next available workstation. However, this method would require transferring a minimum of 6 two-dimensional or 8 three-dimensional arrays at each subiteration. Since network communication time is often the limiting factor in the efficiency of a parallel processing effort, this method could result in an unacceptably large ratio of input-output time to computational time. The second method would be to assign each zone to a fixed workstation and transfer only the boundary information between the workstations. In this case, the running of the code would have to be synchronized across the workstations, so the slowest workstation would limit the speed of the entire computation. However, this method reduces the network communication time, which is likely to be more critical than the speed of any individual workstation.
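The trade-off between the two methods can be made concrete with a rough count of the values moved per subiteration. The sketch below, again in Python for brevity, takes the minimum of 6 two-dimensional arrays from the text. The 101 x 51 zone size, the one-point-deep boundary along all four edges, and the assumption that the same six arrays are exchanged at the boundaries are hypothetical choices made only for illustration. The last function models the synchronization cost of the second method, in which every workstation waits for the slowest one.

```python
# Rough per-subiteration communication and timing model (illustrative only).

def full_zone_values(ni, nj, n_arrays=6):
    """Values moved when an entire ni x nj zone is shipped (first method)."""
    return n_arrays * ni * nj


def boundary_values(ni, nj, n_arrays=6, depth=1):
    """Values moved when only a boundary layer is exchanged (second method).

    Assumes a layer `depth` points deep along all four edges of the zone.
    """
    interior = max(ni - 2 * depth, 0) * max(nj - 2 * depth, 0)
    return n_arrays * (ni * nj - interior)


def synchronized_step_seconds(zone_mflop, workstation_mflops, comm_seconds):
    """Wall-clock time of one synchronized subiteration.

    All workstations wait for the slowest zone before the boundary
    exchange, so the compute time is the maximum over the zones.
    """
    compute = max(work / speed
                  for work, speed in zip(zone_mflop, workstation_mflops))
    return compute + comm_seconds


if __name__ == "__main__":
    ni, nj = 101, 51  # hypothetical zone dimensions
    print(full_zone_values(ni, nj), boundary_values(ni, nj))
    # Two equal zones on a 3 Mflops and a 15 Mflops workstation.
    print(synchronized_step_seconds([30.0, 30.0], [3.0, 15.0], 0.5))
```

For this hypothetical zone the boundary exchange moves 1800 values instead of 30906, roughly a factor of 17 less traffic. The timing function shows the price of that saving: with equal zones on a 3 Mflops and a 15 Mflops workstation, the step time is set entirely by the slower machine.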
Many public-domain and proprietary software packages are available for distributed computing. Most of these packages utilize the "C" language, so the Fortran-based STAGE codes will have to be either driven by "C" or entirely rewritten in "C" in order to link efficiently with the chosen package. However, rewriting the codes entirely in "C" will likely reduce the floating-point operation rate on vector computers, because most compilers do not vectorize "C" efficiently. Another consideration involves the range of workstations to be used in the network. If workstations from different vendors are used, the data representation will likely vary from machine to machine, and the boundary-condition data will have to be converted before it is transferred between machines. Several of the packages provide drivers that convert data representations between machines, and this will be a factor in the final choice of the distributed processing package to be used. Hopefully, these efforts will provide a tool that can operate easily and efficiently across a network of workstations and give an effective alternative to traditional supercomputing.
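The data-representation issue can be illustrated by converting boundary data to a single agreed wire format before transfer. The sketch below uses Python's standard `struct` module and is not the mechanism of any particular distributed computing package. The function names, the 4-byte count header and the choice of big-endian IEEE 754 doubles as the wire format are all assumptions made for illustration. It addresses byte order only; machines with a non-IEEE floating-point format would need a fuller conversion.

```python
import struct

# Hypothetical wire format: a 4-byte big-endian count followed by that many
# big-endian IEEE 754 doubles. Every machine converts to and from this
# format, whatever its native byte order.


def encode_boundary(values):
    """Pack a sequence of floats into the machine-independent wire format."""
    header = struct.pack(">I", len(values))
    body = struct.pack(">%dd" % len(values), *values)
    return header + body


def decode_boundary(payload):
    """Unpack wire-format bytes back into a list of native floats."""
    (count,) = struct.unpack_from(">I", payload, 0)
    return list(struct.unpack_from(">%dd" % count, payload, 4))


if __name__ == "__main__":
    boundary = [1.0, -2.5, 3.25]  # hypothetical boundary-condition values
    wire = encode_boundary(boundary)
    print(len(wire), decode_boundary(wire))
```

Fixing one wire format means each machine needs only a single pair of conversion routines, rather than one pair for every other vendor on the network.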