SRR -- QNX API compatible message passing for Linux


Description

This is an implementation of synchronous Send/Receive/Reply message passing, and is meant to be API compatible with QNX4 (probably a trademark) from QSSL, one of the best and most interesting OSs available.

The current state is "Beta". It's been used to successfully run a fairly large set of QNX applications under Linux, but hasn't been used extensively. I welcome any comments, and hope it will be useful for anybody porting code from QNX to Linux, or wanting to experiment with S/R/R under Linux, perhaps even with the intention of porting to QNX eventually.

I can be contacted at sam@cogent.ca.


Portability

Conceptually, this code should be portable to any Linux version, and potentially even to other Un*x systems. However, current development is with Linux kernels in the 2.2.* series. Previous development was with the 2.0.37 kernel. The module should still run on the 2.0.* versions, but unless there's a lot of interest I don't expect to keep testing it on a kernel that's so different from the 2.2.* series.


Installing

Notes on installation are still a little brief. I'm not entirely sure how users would prefer this kind of software to be installed. See INSTALL for a summary.


Contributors

So far, Andrew Thomas and me. See CHANGES for attributions.


Why?

Because it's interesting, and a good way to learn more about Linux, QNX, and OS architecture. It has also been useful.


Relation to the SIMPL Project

This module has some relationships with the SIMPL project, and I subscribe to the simpl mailing list. FC Software is using shared memory and Unix pipes to implement S/R/R message passing. So far our focuses have been different. My interests are making the API as close as possible to QNX's (to simplify porting of code) and learning more about the QNX and Linux kernels. FC is interested in supporting their ongoing work under Linux. See TODO in the distribution for the directions I'm taking this code.

Some differences you may bump into are that with this module you send and receive messages by process id, whereas with FC's you have to use the return value of a qnx_name_attach() (so Send(getppid(), ...) is not allowed).

The implementations seem similar in performance. I haven't ported my test app to their API yet, but it does run (without change) under both QNX and Linux (with this module). QNX is faster, of course, but not by an order of magnitude.


Currently Implemented

I have considered the "mx" versions of the above. It's not a technical difficulty, though it adds complexity. I'm not so sure I want to do the work, especially since they're mostly used for IO resource managers (in my experience).

And finally, qnx_prefix_[attach/detach/locate] are possible, but without an open() that uses prefixes they are useless. You could re-write the open() function in the glibc shared library like some of the Linux user-space filesystems have... any takers? ;-)


Copyright

The module is copyright by me, Sam Roberts, and parts of it (the timers in particular) are copyrighted by Andrew Thomas. It is distributed under the same conditions as the Linux kernel, the GNU General Public License. The API library (srr_lib.c, srr.h) is distributed under the same license as the GNU C Library used under Linux, the GNU Library General Public License. See COPYING and COPYING.LIB for the details of these licences.

The intention is that nobody be hindered from using this module in any kind of Linux-based project, commercial, non-commercial, or fun (assuming it's useful at all...), but it was developed for free and I want it to stay that way.

There's no warranty, none at all, and if you use this in a safety critical system you may be nuts.


The basic idea

The ideas behind this implementation of SRR under Linux are simple:

- extend the kernel by writing a loadable kernel module,

- register with that module by opening its dev file "/dev/srr" (either explicitly with SrrReg(), or implicitly by making one of the library calls),

- map all (pseudo-)system calls to an ioctl() to the driver -- like other system calls the ioctl() can block, not block, perform arbitrary reads/writes into the process' address space, etc., everything that the QNX S/R/R API appears to do.

The ioctl()/module implementation should look familiar to a QNX systems programmer: it is essentially an inversion of a QNX IO resource manager. In QNX, S/R/R is used to pass structures holding the arguments to POSIX system calls (read, write, ioctl!) to the IO manager; in Linux I use an ioctl() to pass structures holding the arguments to QNX system calls to the kernel module.
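
As a rough illustration of that mapping, here is a minimal user-space sketch. Everything in it is hypothetical: the struct layout, the request code MOCK_SRR_SEND, and the function names are invented for this example and are not taken from srr.h or the module. A stand-in function plays the role of the driver's ioctl() handler and simply echoes the sent message back as the reply, so the sketch runs without the module loaded.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical request code; the real module's values are not shown here. */
enum { MOCK_SRR_SEND = 1 };

/* Hypothetical argument block: one struct per emulated QNX call. */
struct mock_send_args {
    pid_t pid;           /* destination process */
    const void *smsg;    /* message to send */
    void *rmsg;          /* buffer for the reply */
    unsigned snbytes;    /* size of the message */
    unsigned rnbytes;    /* size of the reply buffer */
};

/*
 * Stand-in for the driver's ioctl() handler. A real driver would block
 * the caller until the receiver replies; this mock just echoes the
 * message back as the reply.
 */
static int mock_srr_ioctl(int request, void *argp)
{
    if (request != MOCK_SRR_SEND || argp == NULL) {
        errno = EINVAL;
        return -1;
    }
    struct mock_send_args *a = argp;
    unsigned n = a->snbytes < a->rnbytes ? a->snbytes : a->rnbytes;
    memcpy(a->rmsg, a->smsg, n);
    return 0;
}

/* Library-side wrapper with the same argument shape as QNX4 Send(). */
int mock_Send(pid_t pid, void *smsg, void *rmsg,
              unsigned snbytes, unsigned rnbytes)
{
    assert(smsg != NULL && rmsg != NULL);
    /* Pack the call's arguments into one struct and hand it to the "driver". */
    struct mock_send_args args = { pid, smsg, rmsg, snbytes, rnbytes };
    return mock_srr_ioctl(MOCK_SRR_SEND, &args);
}
```

The point of the sketch is only the shape of the call path: a library function packs its arguments into a struct, and a single entry point dispatches on a request code.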


Problems

There are a few wrinkles:

1) I don't know how, in Linux, to access the address space of anything but the currently running process. In fact, it is not generally possible: the process's memory may be swapped to disk. This forces me to buffer all messages in kernel space before causing a context switch to the process for which the message is destined. Unfortunately this means that all data must be copied twice, from sender to kernel and from kernel to receiver, whereas under QNX message passing results in a single copy, directly from the sender's address space to the receiver's. Oh well.
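
The two-copy path can be sketched in plain user-space C. This is an illustration only: struct mock_kmsg, MOCK_KBUF_MAX, and both function names are made up for the example, and memcpy() stands in for whatever user/kernel copy routines a module would actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_KBUF_MAX 256   /* hypothetical buffer limit for this sketch */

/* Stand-in for a message buffered in kernel space. */
struct mock_kmsg {
    size_t len;
    char data[MOCK_KBUF_MAX];
};

/* Copy 1: sender's buffer -> intermediate buffer, done while the sender runs. */
size_t mock_enqueue(struct mock_kmsg *k, const void *smsg, size_t snbytes)
{
    assert(k != NULL && smsg != NULL);
    k->len = snbytes < MOCK_KBUF_MAX ? snbytes : MOCK_KBUF_MAX;
    memcpy(k->data, smsg, k->len);
    return k->len;
}

/* Copy 2: intermediate buffer -> receiver's buffer, done while the receiver runs. */
size_t mock_deliver(const struct mock_kmsg *k, void *rmsg, size_t rnbytes)
{
    assert(k != NULL && rmsg != NULL);
    size_t n = k->len < rnbytes ? k->len : rnbytes;
    memcpy(rmsg, k->data, n);
    return n;
}
```

Each function touches only one process's buffer, which is the constraint described above; the price is that every byte is copied twice.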

2) There's a problem with pre-2.2 kernels involving the use of fork() when the parent exits, and the child never makes an SRR library call. This problem does not exist in the 2.2.0 and later kernels.

In Unix, when a process forks, its file descriptors are duplicated in the child. In Linux this dup() is done without the co-operation of the driver to which the fd points, usually a good thing. However, the only way my module can detect the death of a process is by that process closing the fd it uses to communicate with me. I need to know about process death to release any processes that may be blocked on it. The close() on process exit/death is guaranteed to reach the module only when a single process has a copy of the fd.

However, when a process forks, both the parent and the child have the fd open, and it *will not be closed until both processes close it*. This can happen explicitly via close(), or on exit. This can lead to scenarios where the parent exits, but the child still holds the fd open, so the kernel module never realizes a process has exited. I try to compensate for this by detecting when a process uses any of the APIs through a fd that it did not originally open, then forcing that process to open its own fd. Unfortunately, if the child never makes any API calls, I will never know the parent died.

In Linux 2.2 closing a fd is indicated by flush being delivered to the module. If it is the last close the flush is followed by a release. This is sufficient information that the module operates correctly under all conditions. See tfork.c in the distribution for an example demonstrating the bug. The example also demonstrates that the bug does not exist when using the module under a 2.2 or later kernel.
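
The shared-fd behaviour is easy to observe without the module. The sketch below is not tfork.c; it is a separate, hypothetical demonstration (last_close_demo() is a name invented here) that uses a pipe as an analogy: the reader only sees end-of-file, the pipe's equivalent of a final close, once every copy of the write end is closed, including the copy a forked child inherited.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the pipe stayed open while a forked child still held an
 * inherited copy of the write end, and reached EOF only after the child
 * exited. Returns 0 if that was not observed, -1 on setup failure.
 */
int last_close_demo(void)
{
    int data[2], ctl[2];
    char c;

    if (pipe(data) == -1 || pipe(ctl) == -1)
        return -1;

    pid_t child = fork();
    if (child == -1)
        return -1;

    if (child == 0) {
        /* Child: keeps its inherited data[1] open and never uses it. */
        close(data[0]);
        close(ctl[1]);
        /* Block until the parent closes its end of the control pipe. */
        if (read(ctl[0], &c, 1) < 0)
            _exit(1);
        _exit(0);
    }

    close(ctl[0]);
    close(data[1]);                 /* the parent's own copy is now closed */

    int flags = fcntl(data[0], F_GETFL);
    assert(flags != -1);
    fcntl(data[0], F_SETFL, flags | O_NONBLOCK);

    /* The child still holds the write end: no EOF yet, just "no data". */
    ssize_t r = read(data[0], &c, 1);
    int held_open = (r == -1 && errno == EAGAIN);

    close(ctl[1]);                  /* lets the child exit */
    waitpid(child, NULL, 0);

    /* Every copy of the write end is gone, so the reader sees EOF. */
    r = read(data[0], &c, 1);
    int saw_eof = (r == 0);

    close(data[0]);
    return held_open && saw_eof;
}
```

Note that the child also has to close its inherited copy of ctl[1]; otherwise it would wait forever for an end-of-file that its own open descriptor prevents, which is the same trap in miniature.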

3) Need to do some tests: QNX documentation is not quite complete enough to determine what error numbers should be returned for certain strange errors (sending to oneself, attaching a proxy to a proxy, etc.). You'll get an error, but perhaps not the correct errno.

Note: I have done this now, and discovered some interesting things about QNX. For instance, contrary to the docs, you can call qnx_proxy_detach() on *any* proxy, assuming you have sufficient privileges, not just on proxies attached to you. I still have to reread my code to make sure that I'm doing the correct things in these cases.


API documentation

Not yet. Works like QNX, except that a process doesn't exist for the purposes of sending to it or receiving from it until it has registered with the module. Calling any API function, or SrrReg(), will cause the process to be registered with the module. Exiting, or calling SrrUnreg(), will cause the process to be treated as exited from the perspective of S/R/R, causing ESRCH for processes blocked on the deregistering process.

See the 4 simple example files recv.c, send.c, reply.c, and wait.c for simple examples of how to call, and tserver.c for a more complex example.

Look at /proc/srr for info on names, task states, proxies, and timers.


Changes:

version 1.2:

- rearranged source directories, and made install targets

- fixed bugs in task death notification that caused deadlock (Andrew Thomas)

- added a /proc/srr directory with a lot more information (Andrew Thomas)

- ported back to linux 2.0.x (again)

version 1.1:

- re-implemented timers in the module to more closely follow POSIX semantics

- timers can now deliver signals as well as proxies

version 1.0:

- Work around for sleep bug (see TODO), user won't see the bug, but it still bothers me.

- Fixed fork bug (but only for 2.2 and later kernels, see README).

- Ported to 2.2 kernel (Andrew Thomas)

- Andrew Thomas implemented:

  - Creceive(),

  - task death notification (qnx_pflags),

  - proxy notification on timer expiry (timer_*), and

  - select() for 2.2.*, I back-ported it to 2.0.*.

- Fixed bug with proxies not having their args.pid set, despite being in a state of send blocked, which confused the task release procedure.

- Added SrrDebug() to control verbosity of debug messages in a module built with debugging compiled in.

- Fixed non-QNX names for task states (sorry, slight misunderstanding...).

version 0.4: saturday, may 01

- Fixed bug causing busy loop when module wasn't loaded.

- Can compile tserver against FC Software's SRR emulation, but due to incompatibilities between the QNX API and their interface it will take more work to do a runnable port.

- Default build is non-debuggable; a debug build is triggered by the SRRDEBUG env variable.

version 0.3: sunday, april 05

- Removed dynamic memory allocation for recv_q management.

- Added work-around for tasks being woken up when they shouldn't, don't know if it's my bug, or something I don't understand about wait queues.

- Made task structure simpler (and more like Linux's).

- tserver test now runs without problems.
