Interprocess communications in the AN/BSY-2 distributed computer system: a case study

https://doi.org/10.1016/S0164-1212(01)00151-0

Abstract

This paper presents a case study of the design and implementation of the interprocess communications (IPC) facility developed for the AN/BSY-2 distributed computer system, the computer system for the Seawolf submarine. The IPC facility was identified as a critical design challenge for the AN/BSY-2 system because the system incorporated new component and network technology along with new run time system services and application programs. The requirements specified for the interprocess communications included aggressive performance as well as functional capabilities that had not previously been fielded. The AN/BSY-2 computer system comprises over 100 processors interconnected in multiple fault tolerant fiber optic rings. First, a description of the AN/BSY-2 distributed architecture is presented. The message passing semantics are then presented. A key feature of the IPC facility is its support for both synchronous and asynchronous communications based on logical addressing. Logical addressing within the AN/BSY-2 system supports point-to-point as well as group communications, and also supports the fault tolerance requirements of the system. The hardware developed to support fast real time messaging and fault tolerance is then discussed. Finally, the low level semantics of a message transfer through the system are outlined.

Introduction

This paper presents a case study of the design and implementation of the interprocess communications facility developed for the AN/BSY-2 system, the data and signal processing computer system for the Seawolf submarine. The AN/BSY-2 system, deployed in 1997, represented a major advance in embedded systems design, incorporating functional capabilities typically not included in a closed embedded system. A block diagram of the system is shown in Fig. 1. The AN/BSY-2 system provides nearly 100 processing nodes interconnected on a fiber optic redundant ring network, high speed parallel and serial communications channels from external interfaces, communication channels to display consoles, stand alone processors, and SCSI interfaces. In addition to meeting timeliness requirements, the system includes additional capabilities to support autonomous operation, dynamic system resource management, and fault detection and reconfiguration. The AN/BSY-2 system represents one of the first large distributed real time multiprocessing systems using the message passing paradigm, and is the largest real time system ever successfully fielded. Due to its size, complexity, and mission criticality, the AN/BSY-2 system has also been used to study the effect of Ada coding styles on execution performance.

While the AN/BSY-2 system was being developed in the late 1980s, standards for several open source message passing interfaces for non-real time systems were being defined (MPI, 1994; Saphir, 1993). These interfaces sought to make use of the most attractive features of previously existing message passing systems. As an example, the MPI standard was strongly influenced by the work at IBM T.J. Watson Research Center (Bala et al., 1992; Purushotham et al., 1994), Intel's NX/2 (Pierce, 1988), Express (Parasoft Corporation, 1992), nCUBE's Vertex (nCUBE Corporation, 1990), p4 (Butler and Lusk, 1992), and PARAMACS (Bomans and Hemple, 1990). Standard libraries based on the message passing paradigm have also been developed for specific applications (Gupta and Banerjee, 1992). Most of the early work in developing open source message passing interfaces and standards was not specifically targeted at real time systems. More recently, a working group has been developing MPI/RT (MPI/RT, 2000), a real time version of MPI. As implementations of message passing facilities became more efficient, the popularity of message passing as a scalable communications model continued to grow. The message passing model has now been adopted in a wide spectrum of application domains, including the newly emerging domain of next generation networked sensor systems based on micro-electro-mechanical systems (MEMS) (Hill et al., 2000).

The message passing facility was identified by the Navy at the beginning of the program as a critical component within the system due to aggressive performance and functional requirements that could not be met by any commercially available messaging software system. Developing the message passing facility and verifying that the system met all requirements were made more difficult as the AN/BSY-2 system was based on a custom hardware platform necessary to meet the overall system requirements. Further, the message passing facility would also be required by application programmers to support development of the 4 million source lines of Ada that would eventually run in the AN/BSY-2 system. Due to this need for simultaneous development of the hardware along with the run time system software, a scaled prototype system composed of commercially available components that functionally emulated the AN/BSY-2 system was specified for supporting functional development. The prototype system, along with the run time software, could then be used for prototyping and functional integration of application programs. Although this approach mitigated risk by providing a convenient development platform, final integration and verification of requirements would be performed on the actual hardware once the system was fielded.

Message passing interface overview

One of the first system challenges was to define the requirements for an augmentation to the commercially adopted Ada multitasking run time executive to support operation in a distributed, real time environment. The augmentation was defined to support asynchronous operation and co-ordination of distributed program tasks, and control of common resources distributed throughout the system. The augmentation also included the fault detection and recovery requirements of the system. The augmentation,
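The abstract describes logical addressing that serves point-to-point traffic, group traffic, and fault tolerance with one mechanism. The following is a minimal illustrative sketch of that idea in Python, not the AN/BSY-2 interface (which was built on an Ada run time). The class `LogicalRouter`, its methods, and the names and node numbers used are all hypothetical; the sketch assumes that a logical name resolves to a list of physical nodes and that recovery consists of swapping a spare node into that list.

```python
from collections import defaultdict


class LogicalRouter:
    """Toy name table (hypothetical): logical names resolve to node ids."""

    def __init__(self):
        self._members = defaultdict(list)   # logical name -> bound node ids
        self.inboxes = defaultdict(list)    # node id -> delivered payloads

    def bind(self, name, node):
        """Bind a node to a name; a single member gives point-to-point."""
        if node not in self._members[name]:
            self._members[name].append(node)

    def rebind(self, name, failed_node, spare_node):
        """Replace a failed node with a spare; senders keep the same name."""
        members = self._members[name]
        members[members.index(failed_node)] = spare_node

    def send(self, name, payload):
        """Deliver payload to every node bound to name; return the targets."""
        targets = list(self._members.get(name, []))
        for node in targets:
            self.inboxes[node].append(payload)
        return targets


if __name__ == "__main__":
    router = LogicalRouter()
    router.bind("tracker", 7)            # one member: point-to-point
    router.bind("displays", 20)          # two members: group send
    router.bind("displays", 21)
    print(router.send("tracker", b"contact"))
    print(router.send("displays", b"update"))
    router.rebind("tracker", 7, 42)      # node 7 fails, spare 42 takes over
    print(router.send("tracker", b"contact"))
```

Because senders only ever name the logical destination, the same `send` call covers all three cases, and reconfiguration after a fault is invisible to the sending program.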

IPC hardware support

The AN/BSY-2 computer system included built-in hardware support for message passing, including parallel DMA channels, mailbox interrupts between the network processor and application processor, dual ported memory, and fast exception handling. Low level hardware support was also included for fault location and detection, as well as fast, efficient processing of the variable message sizes input from real time signal processing hardware.

Fig. 5 shows the built-in low level hardware support

IPC implementation

The critical design features of the IPC facility included the organization of run time data structures to support the required functionality, efficient run time allocation/deallocation of these data structures, implementation of fast device drivers, and utilization of low level hardware resources supporting transfer of the data. Designing and implementing the data structures was a critical step in the overall design of the IPC facility. The real time requirements of the system placed hard
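The snippet above names efficient run time allocation and deallocation of the IPC data structures as a design concern but does not say how it was achieved. One common way to keep allocation time bounded in real time systems is to preallocate fixed-size buffers and hand them out through a free list. The sketch below illustrates that general technique only; it is an assumption, not the AN/BSY-2 design, and `BufferPool` and its methods are hypothetical names.

```python
class BufferPool:
    """Fixed set of equal-size buffers handed out through a free list.

    Illustrative assumption: all memory is reserved up front, so acquire
    and release do constant work and never call a general-purpose allocator.
    """

    def __init__(self, count, size):
        self._buffers = [bytearray(size) for _ in range(count)]
        self._free = list(range(count))   # indices of unused buffers
        self._in_use = set()              # indices currently handed out

    def acquire(self):
        """Return (index, buffer), or None when the pool is exhausted."""
        if not self._free:
            return None
        index = self._free.pop()
        self._in_use.add(index)
        return index, self._buffers[index]

    def release(self, index):
        """Return a buffer to the pool; double release is reported."""
        if index not in self._in_use:
            raise ValueError(f"buffer {index} is not in use")
        self._in_use.remove(index)
        self._free.append(index)

    def available(self):
        """Number of buffers that can still be acquired."""
        return len(self._free)


if __name__ == "__main__":
    pool = BufferPool(count=2, size=64)
    index, buf = pool.acquire()
    buf[:5] = b"hello"                    # fill the buffer in place
    print(pool.available())
    pool.release(index)
    print(pool.available())
```

Returning `None` on exhaustion, rather than blocking or growing the pool, keeps the worst-case cost of a call fixed and leaves the overload policy to the caller.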

IPC low level programming

IPC transfers can be broken into three separate stages. The first stage is initiated by the IPC call from the application program to the operating system. In this first stage, the message is broken into segments, and the individual segments are sent over the appropriate communications channel. Fig. 10 shows the transfer of a message to multiple destinations over the fiber optic ring. As shown in Fig. 10, multiple segments of the same message can exist in the network processor's multiport memory
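The first stage described above, in which a message is broken into segments that are then sent toward one or more destinations, can be sketched as follows. This is a minimal illustration under stated assumptions: the segment header of `(sequence, total, payload)`, the `max_payload` parameter, and the function names are hypothetical and are not taken from the AN/BSY-2 implementation. Only the first stage is modeled, since the remaining stages are not described in the available text.

```python
def segment(message, max_payload):
    """Split bytes into (sequence, total, payload) segments."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    # An empty message still produces one empty segment.
    chunks = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)] or [b""]
    total = len(chunks)
    return [(seq, total, chunk) for seq, chunk in enumerate(chunks)]


def reassemble(segments):
    """Rebuild the message from segments that may arrive out of order."""
    ordered = sorted(segments, key=lambda s: s[0])
    total = ordered[0][1]
    if [s[0] for s in ordered] != list(range(total)):
        raise ValueError("missing or duplicate segment")
    return b"".join(s[2] for s in ordered)


def send_to_all(message, destinations, max_payload):
    """Segment once, then queue the same segments for each destination."""
    segments = segment(message, max_payload)
    return {dest: list(segments) for dest in destinations}


if __name__ == "__main__":
    queues = send_to_all(b"sonar contact report", ["node_a", "node_b"], 8)
    for dest, segs in queues.items():
        print(dest, len(segs), reassemble(reversed(segs)))
```

Segmenting once and sharing the result across destinations mirrors the observation that several segments of the same message can be resident at the same time while a multi-destination transfer is in progress.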

Conclusion

This paper presented a case study of the design and implementation of the real time interprocess communications (IPC) facility developed for the AN/BSY-2 system. The IPC facility includes a platform independent message passing interface that supports the unique requirements of a real time distributed system. In addition to timeliness issues, the system requirements also included support for fault tolerance. The IPC facility allows programs to form virtual channels, separating the network

References (12)

  • Bomans, L., et al., 1990. The Argonne/GMD macros in FORTRAN for portable parallel programming and their implementation in the Intel iPSC/2. Parallel Computing.
  • Bala, V., Kipnis, S., Rudolph, L., Snir, M., 1992. Designing efficient, scalable, and portable collective communication...
  • Bangalore, P.V., Doss, N.E., Skjellum, A. MPI++: issues and features. In:...
  • Butler, R., Lusk, E., 1992. Users Guide to the p4 programming system. Technical Report TM-ANL-92/17, Argonne National...
  • Document for the Real-Time Message Passing Interface (MPI/RT-1.0), March 6,...
  • Gupta, M., Banerjee, P., 1992. A methodology for high-level synthesis of communication on multicomputers. In:...

Dr. Andrews joined the faculty at the University of Kansas in 2000 and is currently an associate professor. Prior to joining the faculty at the University of Kansas, Dr. Andrews worked for General Electric Company, and was on the faculty at the University of Arkansas. Since becoming a faculty member at the University of Kansas, Dr. Andrews has been focusing his research on embedded systems and real time architectures. He received his BSEE and MSEE degrees from the University of Missouri-Columbia, and the Ph.D. degree from Syracuse University. He is a senior member of IEEE.
