An ICE Rendezvous Mechanism for X Window System Clients


William D. Walker

Abstract

Although ICE provides a convenient mechanism for establishing a connection between clients, it is difficult for ICE-speaking clients to become aware of and initiate ICE connections with each other. This document describes the initiation problem and explores three alternatives for creating a proactive ICE rendezvous mechanism for X Window System clients.

Acknowledgments

Thanks to Sue Liebeskind, Keith Edwards, Beth Mynatt, Donna Converse, Ralph Swick, Daniel Dardailler, Ralph Mor, Anselm Baird-Smith, and Dave Hill for their ideas, design expertise, and knowledge of the X Window System. Thanks also to the DACX effort (Disability Action Committee for X) for providing the initial problem to solve: a screen reader for X.

Definitions

The terms client and agent are used throughout this document and are defined as follows:

Client. A client is a standard X Window System client.

Examples of a client include word processing applications, window managers, and terminal applications.

Agent. An agent is a special X Window System client that is interested in other clients.

Examples of agents include resource editors, audio feedback clients, screen readers, and testing tools.

Overview

As a result of the DACX effort to make the X Window System more accessible to people who are blind, the concept of RAP (Remote Access Protocol) was born. The idea behind RAP is to allow agents to communicate with any X Window System client to determine and possibly manipulate the state of the client's user interface.

The enabling technology for RAP can be broken into three major components:

The first component, the rendezvous mechanism, is the primary subject of this document. The ICE protocol [1], new with X11R6, provides the inter-client communication mechanism needed for the second component. The Xt hooks [6], also new for X11R6, provide half of the third component for Xt clients. The other half of the third component is the RAP protocol itself, and will be covered in a separate document.

Requirements

As mentioned in the previous section, a key requirement for RAP is the "under the covers" rendezvous mechanism that allows agents and clients to automatically establish communications with each other. The requirements for the rendezvous mechanism are as follows:

  1. The mechanism must not require source code modification on the part of the client, although it is acceptable to require a re-link of existing clients.

  2. The mechanism should be light-weight and not affect the performance of clients.

  3. The mechanism should be enabled as early in the client's life cycle as possible.

  4. Clients must be able to simultaneously communicate with multiple agents, even some using the same protocol.

  5. Clients and agents are started and stopped in no particular order.

  6. Clients may also be agents.

Achieving the first requirement will allow pre-existing applications to participate in RAP and RAP-like sessions. In addition, it helps eliminate the need for application developers to build in support for RAP-like protocols.

The second requirement addresses the possibility of clients not wishing to participate in a RAP-like session, and does not force these clients to suffer performance hits for functionality that will not be used.

Testing applications and other user interface monitoring tools may desire information about the client's user interface as early as possible, perhaps even at the time of opening the display. The third requirement has been listed to address the needs of these types of applications.

The fourth requirement allows multiple agents to work with one client at the same time. For example, a blind user using an agent to do testing on a client may also be using a screen reader [7] to access that client.

The fifth requirement allows clients and agents to start and stop as they please. This helps eliminate the "I must exist first" situation similar to that of the Session Management protocol [3].

The final requirement allows agents to also act as clients and is related to the fourth requirement. For example, a blind user using a testing agent may use a screen reading agent to access the testing agent.

Overview of ICE

The ICE protocol provides "inter-client" communication independent of X that is much quicker than the ClientMessage/Selection mechanism used by the Editres protocol [9]. The main process for establishing an ICE connection between Application A and Application B is as follows:

  1. Application A sets itself up for an ICE protocol reply for a given protocol layered on top of ICE. Application A then issues an IceListenForConnections request to ICE and is assigned a list of ICE network ID's. The ICE network ID's are similar in concept to an X Display ID in that other applications wishing to connect to Application A must know the ICE network ID's ahead of time.

  2. Application B sets itself up for an ICE protocol setup for a given protocol layered on top of ICE. When Application B learns that Application A exists, perhaps via a rendezvous mechanism described in this document, it uses Application A's network ID's to issue an IceOpenConnection request to open an ICE connection with Application A. Once the ICE connection is opened, Application B issues an IceProtocolSetup request to tell Application A what protocol it wishes to speak.
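The two steps above can be sketched as a small in-memory model. This is not a binding to ICElib: the classes, the "local/A:6000" style of network ID, and the protocol names "RAP" and "EDITRES" are assumptions made for illustration. The model only shows which side listens, which side opens the connection, and why Application B must already hold Application A's network ID's.

```python
# In-memory model of the ICE handshake; none of these classes exist in ICElib.
import itertools


class Application:
    def __init__(self, name, reply_protocols=()):
        self.name = name
        # Protocols this application has set itself up to accept (protocol reply).
        self.reply_protocols = set(reply_protocols)


class Connection:
    def __init__(self, acceptor):
        self.acceptor = acceptor
        self.protocol = None

    def protocol_setup(self, protocol):
        # Models IceProtocolSetup: succeeds only if the acceptor registered it.
        if protocol not in self.acceptor.reply_protocols:
            return False
        self.protocol = protocol
        return True


class IceRegistry:
    """Stands in for the network: maps network IDs to listening applications."""

    def __init__(self):
        self._listeners = {}
        self._ports = itertools.count(6000)

    def listen(self, app):
        # Models IceListenForConnections: the listener is handed network IDs.
        net_id = f"local/{app.name}:{next(self._ports)}"
        self._listeners[net_id] = app
        return [net_id]

    def open_connection(self, net_ids):
        # Models IceOpenConnection: try each ID until one resolves.
        for net_id in net_ids:
            if net_id in self._listeners:
                return Connection(self._listeners[net_id])
        raise ConnectionError("no listener at any of the given network IDs")


registry = IceRegistry()
app_a = Application("A", reply_protocols={"RAP"})
ids_a = registry.listen(app_a)            # step 1: A listens
conn = registry.open_connection(ids_a)    # step 2: B must already know ids_a
accepted = conn.protocol_setup("RAP")
rejected = registry.open_connection(ids_a).protocol_setup("EDITRES")
```

In this model nothing tells Application B what `ids_a` contains or which protocols Application A accepts; the script simply passes them along. That missing hand-off is the job of a rendezvous mechanism.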

The biggest problem in establishing communications using ICE is getting Application B the two crucial pieces of information it needs to initiate an ICE connection with Application A: Application A's ICE network ID's and whether or not Application A speaks the same protocol(s) as Application B. The remainder of this document focuses on this problem by exploring three alternative rendezvous mechanisms that supply Application B with this information.

Agents Use a ClientMessage Event

The first rendezvous mechanism puts the agent in the role of Application A and the client in the role of Application B. Upon startup, the agent registers the protocols it speaks with ICE, performs an IceListenForConnections, and puts the resulting ICE network ID's in an ICE_NETWORK_IDS property on its toplevel window.

The agent then finds all toplevel client windows and sends them a ClientMessage event that contains the ID of its toplevel window, the name of the protocol it wishes to speak, and a pointer to the ICE_NETWORK_IDS property. In addition, the agent registers for a SubstructureNotify event on the root window of the display. When the agent receives a SubstructureNotify event, it checks to see if it was the result of the creation of a new client toplevel window. If it was, the agent sends the new window a ClientMessage event identical to the one it sent to existing clients upon startup.

Upon startup, each client registers for a ClientMessage event. The handler for this event checks to see if the message is from an agent. If it is, the client checks to see if it speaks the same protocol as the agent. If it does, the client obtains the ICE network ID's from the ICE_NETWORK_IDS property on the agent's window and initiates an ICE connection.
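The exchange between agent and client can be modeled without an X server. The sketch below is a toy display written for this document, not Xlib code: the Display, Window, Agent, and Client classes, the dictionary used as the ClientMessage payload, and the network ID strings are all assumptions. Event delivery is a direct method call, and the ICE connection itself is reduced to recording the network ID the client would use.

```python
class Window:
    def __init__(self, owner):
        self.owner = owner
        self.properties = {}


class Display:
    """Toy display: toplevel windows plus agents watching for new ones."""

    def __init__(self):
        self.toplevels = []
        self.create_listeners = []   # agents that selected SubstructureNotify

    def create_toplevel(self, owner):
        win = Window(owner)
        self.toplevels.append(win)
        for agent in list(self.create_listeners):
            agent.announce(win)      # the agent reacts to the new window
        return win


class Agent:
    def __init__(self, display, protocol, network_ids):
        self.protocol = protocol
        self.window = display.create_toplevel(self)
        self.window.properties["ICE_NETWORK_IDS"] = list(network_ids)
        display.create_listeners.append(self)
        for win in display.toplevels:    # clients that already exist
            self.announce(win)

    def announce(self, win):
        if win is self.window:
            return
        # Models the ClientMessage: agent window, protocol name, property name.
        win.owner.on_client_message({
            "agent_window": self.window,
            "protocol": self.protocol,
            "property": "ICE_NETWORK_IDS",
        })

    def on_client_message(self, msg):
        pass  # a pure agent ignores announcements from other agents


class Client:
    def __init__(self, display, name, protocols):
        self.name = name
        self.protocols = set(protocols)
        self.connected_to = []
        self.window = display.create_toplevel(self)

    def on_client_message(self, msg):
        if msg["protocol"] not in self.protocols:
            return
        ids = msg["agent_window"].properties[msg["property"]]
        self.connected_to.append(ids[0])   # stands in for IceOpenConnection


display = Display()
early = Client(display, "xterm", {"RAP"})
agent = Agent(display, "RAP", ["local/agent:6000"])
late = Client(display, "editor", {"RAP"})
deaf = Client(display, "clock", {"OTHER"})
```

A client started before the agent and a client started after it both end up with the agent's network ID, while the client that does not speak the protocol does nothing beyond receiving the message.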

The primary advantage of this mechanism is that it eliminates the need for bookkeeping to determine what agents exist. In addition, the only work necessary for a client to do on startup is register for a ClientMessage event. If no agents start up during the lifetime of the client, the client will not suffer a performance hit.

The main disadvantage of this method is that it requires both the agent and client to have created a window prior to initiating the ICE connection. This may not be early enough for some agents, although it can be argued that the ICE communication really cannot take place until both the agent and clients are in their main event handling loops. Another disadvantage of this method is that some ill-behaved clients call exit() when they receive an event they do not understand. In particular, the ClientMessage events sent from the Editres application cause some of the receiving applications to die. Although this can be viewed as a bug in the clients, it should be avoided if possible.

Clients Maintain a Well-Known Property on their Toplevel Window

The second rendezvous mechanism is similar to the first except it puts the client in the role of Application A and the agent in the role of Application B. With this mechanism, the client keeps two properties on its toplevel window: an ICE_NETWORK_IDS property and a PROTO_NAMES property.

Upon startup, each client registers the protocols it speaks with ICE, and also puts those protocol names in the PROTO_NAMES property on its toplevel window. The client also performs an IceListenForConnections and puts the resulting ICE network ID's in the ICE_NETWORK_IDS property on its toplevel window.

Upon startup, each agent checks the toplevel windows of the display for the PROTO_NAMES property. If it finds a window with a protocol that it speaks and wishes to initiate an ICE connection with the client, the agent registers the protocol with ICE and uses the client's ICE_NETWORK_IDS property to initiate an ICE connection with the client. Also similar to the previous rendezvous mechanism, the agent registers for a SubstructureNotify event on the root window of the display. When the agent receives a SubstructureNotify event, it checks to see if the new window has a PROTO_NAMES property. If it does and the agent speaks that protocol, the agent gets the ICE_NETWORK_IDS property and initiates a connection with the new client.
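A comparable toy model shows the roles reversed. As before, this is an illustrative sketch and not Xlib code: the Display, Window, and Agent classes, the start_client helper, and the network ID strings are assumptions. The model also assumes that a client's properties are already in place when the agent looks at a new window.

```python
class Window:
    def __init__(self, properties=None):
        self.properties = dict(properties or {})


class Display:
    def __init__(self):
        self.toplevels = []
        self.create_listeners = []   # callbacks standing in for SubstructureNotify

    def create_toplevel(self, properties=None):
        win = Window(properties)
        self.toplevels.append(win)
        for listener in list(self.create_listeners):
            listener(win)
        return win


def start_client(display, protocols, network_ids):
    # The client listens and advertises whether or not an agent ever connects.
    return display.create_toplevel({
        "PROTO_NAMES": list(protocols),
        "ICE_NETWORK_IDS": list(network_ids),
    })


class Agent:
    def __init__(self, display, protocol):
        self.protocol = protocol
        self.connections = []
        for win in display.toplevels:            # clients that already exist
            self.inspect(win)
        display.create_listeners.append(self.inspect)

    def inspect(self, win):
        if self.protocol in win.properties.get("PROTO_NAMES", []):
            # Stands in for IceOpenConnection using the client's network IDs.
            self.connections.append(win.properties["ICE_NETWORK_IDS"][0])


display = Display()
start_client(display, ["RAP"], ["local/xterm:6001"])
start_client(display, ["OTHER"], ["local/clock:6002"])
agent = Agent(display, "RAP")
start_client(display, ["RAP"], ["local/editor:6003"])
```

Every client in this model allocates network ID's, including the one that no agent ever contacts.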

The main advantage this mechanism has over the previous one is that it eliminates the need for the ClientMessage event. The major disadvantage to this approach is that it requires a client to register ICE protocols and listen for ICE connections it may never use. Since clients will outnumber agents, this can result in a lot of wasted ICE network ID's. Another disadvantage is that the client may not have put the PROTO_NAMES property on its toplevel window by the time agents receive the SubstructureNotify event on the root window. This could result in some clients being ignored.
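The timing problem in the last two sentences can be reproduced in a few lines. The sketch below is a toy model, not Xlib code; windows are plain dictionaries and all names are assumptions. It delivers the creation notice synchronously so that the unlucky ordering is deterministic, whereas on a real display the ordering would depend on when the client sets its properties.

```python
toplevels = []
create_listeners = []


def create_toplevel():
    win = {}                              # property name -> value
    toplevels.append(win)
    for notify in list(create_listeners):
        notify(win)                       # the agent sees the window immediately
    return win


found_on_create = []


def agent_on_create(win):
    # The agent looks for PROTO_NAMES at the moment the window appears.
    if "RAP" in win.get("PROTO_NAMES", []):
        found_on_create.append(win["ICE_NETWORK_IDS"][0])


create_listeners.append(agent_on_create)

# The client creates its window first and sets the properties afterwards.
win = create_toplevel()
win["PROTO_NAMES"] = ["RAP"]
win["ICE_NETWORK_IDS"] = ["local/editor:6003"]

# A later scan of all toplevel windows would find the client.
found_on_rescan = [w["ICE_NETWORK_IDS"][0] for w in toplevels
                   if "RAP" in w.get("PROTO_NAMES", [])]
```

The agent's creation-time check finds nothing, even though the client is a perfectly good candidate a moment later.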

Agents Maintain a Well-Known Property on the Root Window

The final rendezvous mechanism puts the agent in the role of Application A and the client in the role of Application B. This mechanism relies upon agents to maintain a well-known property, EXT_AGENTS, on the root window of the display. This well-known property is a list of elements, where each element contains a unique ID for an agent, the agent's ICE network ID's, and the protocols the agent speaks.

Upon startup, each agent registers the protocols it speaks with ICE and performs an IceListenForConnections. It then adds the resulting ICE network ID's and protocol names to the EXT_AGENTS property on the root window.

Upon startup, each client checks the EXT_AGENTS property on the root window to determine if there is an agent that speaks the protocol(s) it knows. If one exists, the client registers the protocol with ICE and initiates an ICE connection with the agent. In addition, the client must register for PropertyNotify events on the EXT_AGENTS property. In the event that the property changes, the client will check to see if there are any new agents to talk to and initiate additional ICE connections accordingly.
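The following sketch models the EXT_AGENTS list and the client's two checks, one at startup and one on each property change. It is a toy model and not Xlib code: the Root and Client classes, the start_agent helper, the dictionary layout of each element, and the network ID strings are assumptions, since the encoding of the property is not specified here.

```python
class Root:
    """Toy root window holding the EXT_AGENTS property."""

    def __init__(self):
        self.ext_agents = []     # list of elements, one per agent
        self.watchers = []       # callbacks standing in for PropertyNotify

    def set_ext_agents(self, entries):
        self.ext_agents = list(entries)
        for watcher in list(self.watchers):
            watcher()


def start_agent(root, agent_id, network_ids, protocols):
    # The agent appends its own element to the shared list.
    entry = {"id": agent_id,
             "network_ids": list(network_ids),
             "protocols": list(protocols)}
    root.set_ext_agents(root.ext_agents + [entry])


class Client:
    def __init__(self, root, protocols):
        self.root = root
        self.protocols = set(protocols)
        self.connected = {}                  # agent id -> network ID used
        self.check()                         # startup check; needs no window
        root.watchers.append(self.check)     # later changes to the property

    def check(self):
        for entry in self.root.ext_agents:
            if entry["id"] in self.connected:
                continue                     # already talking to this agent
            if self.protocols & set(entry["protocols"]):
                # Stands in for IceOpenConnection to the agent.
                self.connected[entry["id"]] = entry["network_ids"][0]


root = Root()
start_agent(root, 1, ["local/reader:6000"], ["RAP"])
client = Client(root, ["RAP"])
start_agent(root, 2, ["local/tester:6001"], ["RAP"])
start_agent(root, 3, ["local/other:6002"], ["OTHER"])
```

The client connects to the agent that was already listed, picks up the second agent when the property changes, and ignores the agent whose protocol it does not speak.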

The primary advantage of this mechanism is that it allows for very early rendezvous between agents and clients: the rendezvous could happen as early as XOpenDisplay. In addition, EXT_AGENTS acts as a central repository that could be useful for other purposes. This mechanism also eliminates the need for the ClientMessage event that exists in the first mechanism.

The main disadvantage with this mechanism is the bookkeeping necessary to maintain the well-known property. If an agent unexpectedly dies, the EXT_AGENTS property can become out of date. The workaround to this problem requires all new agents and clients to update the EXT_AGENTS list, removing any information that points to non-existent agents. Although the check and updating of the EXT_AGENTS property on the part of clients is minimal, it adds to the startup time of the client.

In addition, every client will receive a PropertyNotify event each time an agent adds itself or removes itself from the EXT_AGENTS property. Although this is trivial for each client, the combined effect could be detrimental to the performance of the system. Also, if each client checked and updated the EXT_AGENTS property each time its value changed, the result could be an infinite series of changes to the property.
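The bookkeeping concerns in the last two paragraphs can be illustrated with a toy model. One possible guard against an endless series of changes, offered here as an assumption and not as part of the mechanism described above, is to rewrite the property only when pruning actually removes an element. The Root class, the prune helper, and the is_alive predicate are hypothetical; how a client would really decide that an agent has died is left open.

```python
class Root:
    """Toy root window that counts writes to the EXT_AGENTS property."""

    def __init__(self, entries):
        self.ext_agents = list(entries)
        self.watchers = []       # callbacks standing in for PropertyNotify
        self.writes = 0

    def set_ext_agents(self, entries):
        self.ext_agents = list(entries)
        self.writes += 1
        for watcher in list(self.watchers):
            watcher()


def prune(root, is_alive):
    live = [entry for entry in root.ext_agents if is_alive(entry)]
    if live != root.ext_agents:      # write only on a real change
        root.set_ext_agents(live)


alive_ids = {1}                      # agent 2 died without cleaning up


def is_alive(entry):
    # Hypothetical liveness test; a real one might try the network IDs.
    return entry["id"] in alive_ids


root = Root([
    {"id": 1, "network_ids": ["local/reader:6000"], "protocols": ["RAP"]},
    {"id": 2, "network_ids": ["local/tester:6001"], "protocols": ["RAP"]},
])

# Ten clients each re-check the list whenever the property changes.
for _ in range(10):
    root.watchers.append(lambda: prune(root, is_alive))

prune(root, is_alive)                # a newly started client cleans up
```

With the guard in place the property is written once, and the ten notified clients find nothing further to change. Without the comparison, each notified client would rewrite the property and trigger another round of notifications.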

Conclusion

The second mechanism is very similar to the first, except it eliminates the need for the ClientMessage event at the expense of requiring all clients to register their protocols with ICE whether they use them or not (as well as listen for ICE connections that may never be made). For this reason, the first mechanism is preferable to the second mechanism.

The third mechanism allows for a very early rendezvous between agents and clients at the expense of requiring clients to check and possibly update the EXT_AGENTS property upon startup. The size of the EXT_AGENTS list should be relatively small, however, so the overhead of maintaining it should be minimal.

Given the slight disadvantage of the third mechanism having to maintain the EXT_AGENTS property on the root window compared to the larger disadvantage of the first mechanism requiring the clients and agents to have already created windows before initiating ICE connections, the third mechanism looks the more attractive of the two. The choice remains open, however, because of the concern about maintaining the EXT_AGENTS property, and preferences from reviewers are welcome.

References

[1] Robert Scheifler, Jordan Brown. "Inter-Client Exchange (ICE) Protocol." Version 1.0. X Consortium Standard. Version 11, Release 6.

[2] Ralph Mor. "Inter-Client Exchange (ICE) Library." Version 1.0. X Consortium Standard. Version 11, Release 6.

[3] Mike Wexler. "X Session Management Protocol." X Consortium Standard. Version 11, Release 6.

[4] Ralph Mor. "X Session Management Library." Version 1.0. X Consortium Standard. Version 11, Release 6.

[5] James Gettys, Robert Scheifler. "Xlib -- C Language Interface." X Consortium Standard. Version 11, Release 6.

[6] Joel McCormack, Paul Asente, Ralph Swick. "X Toolkit Intrinsics -- C Language Interface." X Consortium Standard. Version 11, Release 6.

[7] Elizabeth Mynatt, Keith Edwards. "The Mercator Environment: A Nonvisual Interface to X Windows and Unix Workstations." Multimedia Computing Group, Georgia Institute of Technology.

[8] Anselm Baird-Smith, Philippe Kaplan. "The k-Edit System." Update 07/94. http://zenon.inria.fr:8003/koala/k-edit.html.

[9] Chris Peterson. "Editres -- A Graphical Resource Editor for Users and Programmers." The X Resource: A Practical Journal of the X Window System. Issue 0. Fall 1991.

