1996 CSUN Conference


The following narrative was transcribed from a tape made during the Disability Action Committee for X (DACX) Meeting held during the 1996 CSUN Conference.

A "best effort" attempt was made to get all views correct. Mark Novak, Chair


Mark: I'd like to welcome you all here. We've got a split agenda tonight. For the first part of the meeting, what we want to do is have a few people report on some of the different things that DACX has been doing in the last couple years, just to give us a brief update. Then for the latter part of the meeting, we're going to try to have a discussion focused in the area of some of the things we (i.e., DACX) might want to do from this stage forward. At that point, I'm going to turn the meeting over to various people and try to be more of a moderator for the rest of the evening. As a reminder, text versions of all the DACX meetings we've had since Closing the Gap, 1992, are available for review on the Trace Web site. To start things off, I'd like to have Will Walker give us a quick update on some of the developments in the area for people with physical disabilities.

Will: I have two pieces of really good news to present tonight. One is that XKB or the keyboard extension for X has been approved as a standard for X11R6.1, and what that means for DACX is that AccessX or the AccessDOS type capabilities are now a standard part of X. So any vendor that ships X11R6.1 will have accessibility features built in. So that's good.

The second thing is the ICE rendezvous protocol has been approved as a standard. And what that means is that the X protocol that we've been talking about for the last couple of years as part of Mercator, the screen reading project, the ICE rendezvous protocol, is now standard, which means it lays the foundation for RAP to become standard. And so what this shows is that even though the work has been slow, it's now a standard part of X.

Those two are the good announcements that we have. That's about it. Any questions?

?: What was the second thing that was accepted?

Will: The ICE Rendezvous Protocol. Okay, I'll explain more about that. For X11R6, they developed a mechanism where two clients can talk to each other through various fast transports and TCP/IP, very quick. And the problem was, those two clients wouldn't know that the other one existed. And there's no way for one to be connected to the other without prior knowledge of the other. And the ICE rendezvous protocol describes the mechanism of how one client can learn about the other client and establish a connection with it. And by doing that, you lay the foundation for a screen reader to go out and find out what clients are currently on the system and establish a connection with them. And then the remote access protocol (i.e., RAP) will allow the screen reader to access the GUI from this point and get the information necessary to present it in a different format.
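As a rough conceptual illustration of the discovery step being described, the sketch below models a rendezvous-style registry that lets a screen reader learn which clients exist and then open a connection to each one. The Python names here are invented for illustration only; they are not the actual ICE or RAP programming interface.

```python
# Conceptual sketch only: models the idea of rendezvous-style client
# discovery, not the real ICE/RAP programming interface.

class Client:
    """A running GUI client that a screen reader might want to query."""
    def __init__(self, name):
        self.name = name

    def connect(self):
        # In the real system this would establish an ICE connection and
        # speak RAP; here we just return the client itself.
        return self


class Rendezvous:
    """Stands in for the rendezvous service: a registry of live clients."""
    def __init__(self):
        self._clients = []

    def register(self, client):
        self._clients.append(client)

    def list_clients(self):
        return list(self._clients)


rendezvous = Rendezvous()
rendezvous.register(Client("mailer"))
rendezvous.register(Client("editor"))

# A screen reader starting up can now find every client already on the
# system and establish a connection with each, without prior knowledge.
for client in rendezvous.list_clients():
    session = client.connect()
    print("connected to", session.name)
```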

Mark: Thanks Will. For the rest of the night, I'm going to turn the meeting pretty much over to Earl Johnson to run as we look at some things that DACX might want to consider for future work.

Earl: This is an idea; hopefully I can get through my short presentation, it'll sound good, and it might be possible to work on. The goal here is to try to define a common API, both for the screen reader to ActiveX or RAP layer, and then also for the API between the screen reader user interface and the off-screen model.

The whole idea and the whole reason that we're here today is the result of a visit with T.V. Raman, and he said that if X doesn't have a screen reader out in six months, then you might as well forget it because everybody's going to be buying Windows 95 products. So I just thought about it and I said, well, okay, if that's the fact, if that's true, then how will we approach that. And so that's how the API thing came along. How do we get leverage, how do we find out, how do we take advantage of the products that are already out in Windows land. And then give you, the user, more choices.

Looking at some overheads, the three main starting points are: user needs are what determine what a screen reader needs to obtain from a GUI, so the user features, what the user asks for, all the features that you've got built in to the screen reader determine what the screen reader is going to be asking for from the GUI. We can tell from the work that Microsoft is doing, the work that the Mercator folks did, and the work that the Berkeley Systems folks have done, that we can modularize the major components of the screen reader. We can modularize the GUI part of it, the off-screen model part of it, and then the screen reader UI part of it. And then ultimately, if you accept those first two points, a screen reader is going to be information dependent and it doesn't have to be OS or GUI dependent. But the only way that that would be possible is by driving this product from the user's side over to the technology side. Identify what the screen reader needs, what the off-screen model needs from the system, and then use ActiveX, or RAP, or Access Aware, to program to some of these companies' APIs. So that's kind of what's going on.

And I put a quote up here from Bruce Tognazzini because I believe we should be starting from the user's side versus the technology's side. And what he said is, "effective applications, like effective buildings, are designed from the user's in rather than the technology's out." So I'm kind of taking these words and turning them into a picture, which I'll try to do a decent job of explaining.

This is what we get. We've got three black boxes up there. One of them is the GUI, one of them is the off-screen model and one of them is the screen reader. There's a connection coming from the user to the screen reader user interface module. In between each one of these black boxes is the API, where the function calls would be. Then coming out of the GUI we have the MAC, the OS/2, the Windows and X Windows going into the computer. So what I'm trying to say here is that, the user again, determines what the screen reader needs to get from the system. And we're trying right now, we're trying to modularize the API for the off-screen model and the API for the GUI. And right now X Windows and Microsoft are in the same process.

The question is, can we conceive of defining, of coming up with a set of common function calls, I guess we want to call them, or a common API, that we can agree upon across platforms, that still allows us to use our own, from the GUI perspective, ActiveX or RAP, yet have the same API so that if I want to use the Henter-Joyce system, for example, on my X system, all I have to do is, they're using the same API so they'll work across platforms. There will be a certain amount of dependency because you need to be able to synthesize keyboard and mouse events, you will need to be able to talk to the audio device, do Braille displays, so there will be a certain amount of OS dependency, but the screen reader and the off-screen model can be made from the components of what a screen reader is. Is there any question on the picture?

The goals here, and they've kind of changed I think since the Microsoft meeting yesterday, were to try and define a common API that sits between the screen reader module and the off-screen model, as well as between the off-screen model and the OS GUI. So we're looking to define what set of common words or common function calls we can agree upon across platforms, so that we can provide a kind of platform independence for screen readers, so that users can use their same technology across different platforms. So those would be the two goals of what this project would be.
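As a sketch of what "a common API across platforms" could look like in code, the layering being proposed can be pictured as the screen reader written against two abstract interfaces, with each platform supplying its own implementation behind them. All class and method names below are hypothetical, invented purely for illustration; none of them come from RAP, ActiveX Accessibility, or any shipping product.

```python
# Hypothetical sketch of the proposed layering, not any real product's API.
from abc import ABC, abstractmethod


class GUIInfoSource(ABC):
    """Common API between the off-screen model and the GUI (RAP, ActiveX
    Accessibility, or Access Aware would sit behind this on each platform)."""

    @abstractmethod
    def widgets(self):
        """Return the widgets the GUI currently knows about."""


class OffScreenModelAPI(ABC):
    """Common API between the screen reader UI and the off-screen model."""

    @abstractmethod
    def current_focus(self):
        """Return a description of the widget that has the focus."""


class XOffScreenModel(OffScreenModelAPI):
    """One platform-specific implementation plugged in behind the API."""

    def __init__(self, source: GUIInfoSource):
        self.source = source

    def current_focus(self):
        widgets = self.source.widgets()
        return widgets[0] if widgets else None


class StubRAPSource(GUIInfoSource):
    """Stand-in for a RAP-backed source, returning canned widget info."""

    def widgets(self):
        return [{"name": "OK", "role": "push button"}]


# The screen reader itself only ever talks to OffScreenModelAPI, so the same
# screen reader code could, in principle, run over RAP on X or over ActiveX
# Accessibility on Windows.
osm = XOffScreenModel(StubRAPSource())
print(osm.current_focus())   # {'name': 'OK', 'role': 'push button'}
```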

I tried to identify some of the work that I felt was going to be needed in defining what this API is, and I think what we need to do is identify the typical calls that the screen reader needs from the system. Identify the typical calls the off-screen model needs to have. What information do they need. And also what type of calls is the off-screen model pulling from the GUI. And then from there, we would go about developing a common set of functions to agree upon.

The way I envision it, at least at this point, is we bring all these calls together, find out what it is that these calls are trying to do, then see if we can kind of "genericize" this, provide the same capabilities, still allow ActiveX to do its stuff, RAP to do its stuff, we're just calling it the same set of words. So those are the things that I've identified so far, and hopefully discussion will identify more things.

And I'm going to save the email-based work group discussion for later, but we also need to determine how much time we have to accomplish this. After going to the Microsoft meeting, it looks like, if we're going to be looking at the screen reader to off-screen model, we've only got four months. Is that a question?

Chuck: Our schedule is incredibly, incredibly tight. We will be out, our specification will be locked down within less than two months from now. And, you know, the code will be final within six months.

Earl: Right. So that's kind of the impression that we were getting as we were sitting in the meeting last night. So the next question becomes, there's a certain amount of flexibility that we've lost in defining a common API because you guys are locked in right now; the Microsoft folks are locked in right now. So the next question would be, can we kind of approach this problem, can we work together with the Microsoft folks to see if there's a way to identify or use the same type of function calls that you've got. It's really unclear at this point how we would proceed. But there still is the possibility on the screen reader/off-screen model side. Last night you were saying you really haven't been working on that; you've been spending most of your time on that interface.

Chuck: Well, the interface. Maybe we should take a poll. How many people are familiar with our development efforts in this area? Do we need to go over that for starters?

Earl: You're going to have to describe some stuff...

Chuck: Basically, we're approaching the problem from the standpoint that the visual representation of the screen is usually described by applications as objects. Both internally, for componentization reasons and much larger reasons, and, well, for other reasons. In the Windows world we have what's known as OLE, what used to be called Object Linking and Embedding, it just means OLE now, and the way OLE objects talk to one another is with what is known as the Component Object Model (i.e., COM), which is a binary interface, and each object supports a certain number of interfaces.

The first one being IUnknown, and I'm giving you the really quick overview here, IUnknown, where the "I" stands for Interface, allows you to query the object and say, what other interfaces do you support. So let's say a chart object would support an "I" chart interface that has properties and methods and events that are specific to charts. So you can ask a property that says how many data elements are in this chart, and there may be five; what is the data element for the fifth item, and it may be fifty. We're proposing, we're implementing another interface called IOLEAccessibleObject. We've actually been talking about that name way too long; we'll probably just call it "IAccessible". And that's an interface that contains a certain number of properties and methods and a way to navigate between the various objects that appear on the screen and gather information that way.

So say we have a dialog box and it contains two edit fields, maybe "name" and "password," with labels, and two buttons, "okay" and "cancel." We would go ahead and, let's say the application wanted to do this in a completely custom fashion; they wanted to do all their own widgets and not use any system widgets. They would expose this as 1,2,3,4,5,6 objects, or alternatively the name edit field and its label would be one object, the password field and the label for that would be one object, and each button would be an object. And they're all part of the container known as the dialog, so the assumption would be that the parent, being the dialog, would contain, let's say, four children, each of them being child objects, and each one of these objects supports a certain set of properties such as name, description, role. So in this case the role for the name would be an edit field. Password would be an edit field. The role of the okay button would be a push button, and things like that. We define a whole bunch of roles, and if you don't fit into our certain pre-set roles, you can go ahead and define new ones as well.

You can navigate amongst these objects and say, what is the parent object, conceiving of the parent/child relationship here. You can say, give me the parent and start talking to that one, navigate to other ones, or you can sit on the same level and say, give me the next logical object, and you can navigate to that one. It's just kind of an object-oriented way of doing that.
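A minimal sketch, in Python rather than COM, of the kind of object tree and navigation being described here: a dialog parent with child objects, each exposing name and role properties, with sideways navigation between siblings. The class and method names are illustrative assumptions; the real interface is IAccessible with its own properties, methods, and roles.

```python
# Illustrative model of an accessible-object hierarchy: a dialog parent
# with child objects, each exposing name/role properties and navigation.

class AccessibleObject:
    def __init__(self, name, role, parent=None):
        self.name = name
        self.role = role
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def next_sibling(self):
        """Give me the next logical object at the same level."""
        if self.parent is None:
            return None
        siblings = self.parent.children
        i = siblings.index(self)
        return siblings[i + 1] if i + 1 < len(siblings) else None


# The dialog example from the discussion: two edit fields and two buttons,
# all children of the dialog container.
dialog = AccessibleObject("Login", "dialog")
AccessibleObject("Name", "edit field", dialog)
AccessibleObject("Password", "edit field", dialog)
AccessibleObject("OK", "push button", dialog)
AccessibleObject("Cancel", "push button", dialog)

# A screen reader can walk the children or step sideways between them.
for child in dialog.children:
    print(child.name, "-", child.role)
print("after Name comes:", dialog.children[0].next_sibling().name)
```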

Now, one of the cool things about ActiveX Accessibility, and don't pronounce the X, ActiveX by itself is a Microsoft brand, ActiveX Accessibility is what we're doing, is that if you're using the system-provided widgets, you get all this for free. So, in the standard case, if your application does it the system way, you're all set, you just go through the system now. But if you want to do it all on your own and implement all this, let's say you only had really one window here and you just had a bunch of text, you can still expose things logically.

The point being that objects are exposed, you can get to the objects, you can discover them, you can navigate amongst them. A screen reader would go ahead and use helper functions to say, I have a point on the screen, give me the object that is under that point, and navigate amongst them. So that, really, right there, is the heart of it.

So, it's one interface no matter how it's implemented. The system exposes its interface and applications expose their interface. Now, how the OSM works in with this is that, let's say that this dialog box was an old application that did its own private stuff; it was just the worst application in the world. It just did not support ActiveX Accessibility. The OSM would see the drawing of name, password, okay and cancel and would be trying to infer object information about it; we would probably not do a very good job with this particular case. The OSM exposes the same interface, so you can query on the interface and discover as much information as the OSM knows about it, and we can use another interface called "Itext," which allows you to go ahead and ask what the extents are, so you can find the screen location of the letter "a," things like that. It (i.e., the OSM) acts as a proxy on behalf of applications that don't support ActiveX Accessibility. In the case of an application that supports ActiveX Accessibility, the OSM is not needed unless you need exact screen positioning of text or if you need, let's say, the bits of a graphic.

Earl: So it seems two of the questions I can think of right now are, is there flexibility in the off-screen model, and how are we going to do, from the X world, how are we going to do the API between the GUI and the off-screen model and then the off-screen model and the screen reader. So now I'm looking for input; at this point there's more audience interaction here.

Chuck: We need more screen reader developers in the room.

Laura: I would say that, actually, the most important thing is what properties are supported on the platforms, do we need to have a superset of roles, are the widgets on one platform like the widgets on another. Because if you can do that, then you can make a wrapper that realizes that, you know, behind the scenes you use ActiveX Accessibility or Itext on our platform and then you can use RAP on another. It seems to me, at least, that the common properties that we have, that you guys have, and the MAC has, finding what those are, is by far the most important thing to get together on. Because if you can do that, then you can always make the API, pretty much.

Chuck: Okay, I think that leads us into, do we do this at the RAP level, or the ActiveX Accessibility level, or do we go up one higher and wrap both of them, abstract them out. You know, I'm thinking in this case especially of the GUI Access Toolkit from Berkeley, which does it at the 30,000-foot level, the platform independence.

Earl: So you're including the off-screen model into the GUI Access Toolkit.

Chuck: It does but it's optional. That's the key. You call it in when you want to. And, so when you're looking for an object on the screen, you have the option of asking the OSM to proxy it up for you or not. So, it is optional.

Earl: So this is kind of abstracting it to a higher level. You're talking about abstracting it in an off-screen model.

Chuck: For the screen reader interface? The screen reader interface would not be talking directly to ActiveX Accessibility or to RAP. It would be abstracted with a common set of verbs, whatever, a function process, which would then know how to talk to the platform specific implementation.

Jim: There was another model you talked about last night, the text model, and in building an X Windows screen reader, I really think you need to take into account all the applications that blind people have been using already, an xterm type model, a tty type model. It seemed to me, from what I understood about that, that that is the model you'd want to use, that type of off-screen model, for that type of application.

Chuck: I don't understand what the tty model would be.

Jim: You know, the scrolling up screen, text only, command line.

Earl: Like your DOS prompt.

Laura: You mean, like basically, the thing that sort of organizes stuff into lines and columns?

Jim: Well...

Chuck: How does that affect the OSM?

Jim: Well, last night, you talked about two different models. You talked about a text model and then the other.

Laura: Oh, I'm sorry. The text thing that we were talking about was getting into even way more detail about what is on the screen than just kind of the semantic information. When we say ActiveX Accessibility, it is kind of about the semantic layout of what's on your screen. There is a push button there and it is pressed.

The text stuff that we're talking about was getting into, the text object. It's not a discrete thing, it's not like icons in the client area, like a folder window on just about any platform; those are kind of discrete objects. Text is just kind of a continuous stream. And it's not just screen readers but applications for dyslexia and such that want to get at the actual characters and know to the pixel where they are on the screen, and so we think of Itext as this kind of extra, incredibly detailed information about what's there on the screen and how to manipulate text, which is a really complicated object compared to, you know, your title bar. So I guess maybe when you saw Itext and were thinking tty, that's what we meant.

Jim: Yeah, because at this point, blind people using X Windows can get along very well as long as you stick to telnetting in, or something on that order, or some X Windows applications, as long as you stay in that ASCII format; you know, simple things like that, you can get along. If they're command lines, you'll get along fairly well. But I wouldn't want to give those up right away. I don't see any reason to give them up, but maybe we don't need them.

Chuck: No, I don't think there's any model that says that it would. The OSM is for the exact screen representation. Let me give you a clear example. If you go ahead and resize the window, or one window is overlapping another, the object itself does not know that half of its text may be invisible, and if you ask that object, "what is your text," it will give you all the text of that line. But it would not know that it was clipped, because at that level it doesn't know. Whereas the OSM does know that it is clipped; it knows that only half the text is there. And it knows exactly where in A,B,C,D,E,F,G, it knows exactly where the E is. It knows it's so many pixels into that string, so the object at a higher level is not needed for that. And asking the object for that information would be a big burden on the object in the application.
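A small sketch of why the OSM, rather than the object, is the thing that knows about clipping and exact character positions. The record layout and field names below are simplified assumptions for illustration, not any vendor's actual OSM.

```python
# Illustrative off-screen-model text record: it remembers where each
# character landed on the screen and which part of the string is visible.

class OSMTextRecord:
    def __init__(self, text, x, y, char_width, visible_right_edge):
        self.text = text
        self.x = x                      # left edge of the string, in pixels
        self.y = y
        self.char_width = char_width    # assume a fixed-width font for simplicity
        self.visible_right_edge = visible_right_edge  # clip boundary

    def char_x(self, index):
        """Exactly how many pixels into the string a given letter starts."""
        return self.x + index * self.char_width

    def visible_text(self):
        """Only the part of the string that is actually on screen."""
        visible_chars = max(0, (self.visible_right_edge - self.x) // self.char_width)
        return self.text[:visible_chars]


record = OSMTextRecord("ABCDEFG", x=100, y=40, char_width=8, visible_right_edge=132)
print(record.visible_text())   # "ABCD" -- the object itself would report all of ABCDEFG
print(record.char_x(4))        # pixel position of "E", even though it is clipped
```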

Jim: So your off-screen model is a two-dimensional model, a Berkeley kind of model?

Chuck: Yes.

Laura: It turned out that the kind of application you were talking about, you can pretty much figure out everything, right? The text-based kind of application, where we don't really have graphics, they're just pulling text. I know this sounds like the best thing around. I mean, it's pretty straightforward.

Chuck: Yeah, in all fairness, we've come to the party late, there's been Access Aware, there's been RAP, you just happen to be molding these things onto a Microsoft technology which we call the component object model. Which, if you're technically inclined, it's a really cool way of getting things in the system talking to each other, it's a very powerful mechanism.

Earl: So again, the goal, well, the approach should be from the user. The goals are to see how, can we define a common API that allows, that will allow a screen reader to plug into the two platforms.

And then the next thing becomes how to proceed towards that point. Do we need to have the typical calls? Does Microsoft already have that information? The typical calls that the screen reader module would ask of an off-screen model, and then what an off-screen model would ask from the GUI or, you know, vice versa. The communication.

Laura: The best people to ask about that are the people that provide, like, those little toolkits. So you write to them, a superset, subset, API that they then compile up. You know, to be honest, I know something about the MAC and I know a lot about Windows and I feel the two can be similar. I'm not sure if you guys know or not, like, who's got the focus and who's got the capture. So those sorts of things at that level is really what you're talking about, right? Like, is there some commonality? Because that's what people get out of the GUI, what's the current state of the GUI?

Earl: Again, I'll go back to, I'm trying to keep things global. If there's a way to again identify what type of information is needed from the system or needs to be communicated to the system, that's kind of a general, that's a general starting point for this idea as to whether or not we can move forward. But then again, now it takes further collaboration, we're not really going to solve the problem right now. But, what we ought to try and do is see what the next step is that we ought to take.

Chuck: Let me give you the benefit of my experience over the last year. We, as many of you know, Microsoft licensed the OSM model from Henter-Joyce and we got a lot of criticism from the other screen reader vendors who say, well, no matter what you did with the OSM model, it still has its roots in Henter-Joyce and I'm not going to use it.

So, the point I'm trying to make here is that the calls that a screen reader makes to an OSM are very specific to that screen reader, and when we designed the OSM and refitted it, we said we're going to expose this in an entirely different fashion. We had previous APIs such as "retrieve next item" or "retrieve next unit," and you specified what unit. Did you want the next line, the next word, the next letter, and things like that. And the OSM would go to its database to pull out that information and return it back. And you know, it had a very sweet effect, it was a wonderful little model, but all the screen readers have totally different ways of doing that, and to keep this high level, they're going to want to go ahead and come up with very basic verbs, and if we get to the point where it gets too generic, they're not going to be interested. So, in order to accomplish your goals, the next meeting should include a bunch of screen reader developers.
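A sketch of the kind of unit-based retrieval verb being described, where the screen reader asks the OSM for the next line, word, or letter from its database of screen text. The class and method names are hypothetical stand-ins for calls like "retrieve next unit," not the actual Henter-Joyce or Microsoft API.

```python
# Illustrative unit-based retrieval: the screen reader asks the OSM for
# the next line, word, or letter relative to a reading position.

import re

class SimpleOSM:
    def __init__(self, lines):
        self.lines = lines          # the OSM's "database" of screen text
        self.line = 0
        self.offset = 0

    def retrieve_next(self, unit):
        text = self.lines[self.line]
        if unit == "line":
            result = text[self.offset:]
            self.line, self.offset = self.line + 1, 0
        elif unit == "word":
            match = re.search(r"\S+", text[self.offset:])
            result = match.group(0) if match else ""
            self.offset += (match.end() if match else 0)
        elif unit == "letter":
            result = text[self.offset:self.offset + 1]
            self.offset += 1
        else:
            raise ValueError("unknown unit: " + unit)
        return result


osm = SimpleOSM(["File  Edit  View", "Ready"])
print(osm.retrieve_next("word"))    # "File"
print(osm.retrieve_next("letter"))  # the space after "File"
print(osm.retrieve_next("line"))    # the rest of the first line
```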

Jim: For a while, I did some work with a three-dimensional model, which eliminated a lot of the problems we ran into with clipping and that kind of thing. It took some more space; in other words, we kept each active window separately, and only paid attention to the active window. Now that worked out okay in some applications and it works out okay in OS/2, in that I've never seen a place I needed to be outside of the active window, but when I go to Windows applications, it happens all the time, so I'm not sure what the trade-off is.

Chuck: Yes, the other thing to worry about with that model is that in a lot of Windows applications, they get smarter. They do their own redrawing optimization.

Laura: I would say, from what I'm hearing, I think that what an OSM presents to the screen reader, even though everybody's got their own, is pretty well defined and probably the same on every platform; it's a database of what is on the screen, I mean at the text kind of level. It seems to me finding a common API or this superset of whatever that is, and then having some kind of helper DLL, I see that as, that would be easy. And I see what properties there are, agreeing on those, is easy. I would say the hard part of the strategy, and maybe that's the one we do last and work towards the middle, is getting stuff out of the GUI. Just what is it, is it a subset or superset across platforms and everything like that, because I know that even with just PM, there's little complicated things; it's not the big things that will get you but the little things. I think it can be done. There are people who ship toolkits to lots of small ISVs out there; if we can get them to write to some standard, they can compile now so that they work on a MAC or now they work on OS/2. And for guys who are small, like a lot of the disability community, it's worth it to them to write to this layer, right, because they get all this benefit for free.

Earl: Now, Chuck, you were saying, I was kind of getting from you that you'd be looking at the off-screen model because everybody has their own different flavor, even though they want to get the same type of information. And now maybe not focusing on that is the best thing to do. Looking down, the next level down, between the GUI and the OSM.

Chuck: Actually, I think the former is more preferable than the latter. Because I think there are too many platform differences to make it, to abstract it, lower down rather than higher up. I'm just concerned that you make, my point in bringing up the OSM difference is that, be forewarned that there may be a lot of resistance to it.

Keith: This is a point of clarification from hearing Chuck's description of the Microsoft approach. It kind of struck me that the definition, the definition they're using for off-screen models is a bit different from the one that I had. So, it seems like in the Microsoft approach, there is kind of a structured object model based on getting information via COM and then there's a separate off-screen model which is the 2D, pixelized representation of what's on the screen.

Chuck: That is exactly right.

Keith: Do you have a feel for how much the screen reader developers use that 2D, pixelized representation? Because I know in Mercator we used a RAP-based screen reader and we really don't use that except for text, basically because there's no easier way to get text.

Chuck: The screen reader itself may not use it at all, or they may, so they can position the cursor on the letter as they read letter by letter. An LD application may want to know the exact "rect" of a word so it can invert it as it speaks it. And that's it. So that's why those are there. But a speech input utility may not want to know, so one of the things we did have to do is to define the role of the OSM and ActiveX Accessibility to include a lot more than just blind access.

Keith: My definition of off-screen model is sort of a higher level object based thing....

Chuck: There is definitely a separation.

At what level do you go ahead and get platform independence? Do you do it at the RAP or at the ActiveX Accessibility level or do you go higher up and put a layer in between them? Basically, do you mold RAP and ActiveX, ActiveX Accessibility to the individual platforms or do you abstract it one level up?

Peter: I'm going to talk a fair amount about all of this at my session at 9, but I'm going to give the five-minute overview here. We've got a platform-independent Toolkit with a platform-independent screen reader sitting on top of it. And the whole model behind the Toolkit was an attempt at promulgating forward the same model for accessing what's on the screen with adaptive technology and hiding all the insanity and junk underneath. The idea to make it cross-platform was, build a master accountant that keeps track of everything and have platform-specific hooks that feed it.

So, our master accountant generalizes data structures for text and graphics in windows. And windows, child windows, we've got a full window hierarchy modeled in the off-screen model. Off-screen bit maps, memory bit maps, they are also stored in the off-screen model. So, then the task becomes wiring up patches or hooks into the OS. And I define a patch as something more ugly and a hook as something vaguely OS-supported.

As to the entry points, to the master accountant's input messages: for example, under Microsoft Windows, I've patched the text-out routine. We take all the arguments, cook the data as we call it, we do clipping, we do font translation, we query the cache to figure out the font size, font style, all of that. We then turn around and make a call to the platform-independent Toolkit, the text create-and-insert call, that says, here it is. We create a new text record, which is a child of the generic off-screen model record, and that then gets inserted into the off-screen model. The off-screen model, as we've implemented it, not only takes that text and inserts it, but notices whether there is any text nearby. If that text is near enough by, it puts the two records together. So as you're typing letters in a word processor, every time you hit a new letter, that letter goes out with text-out; we take text-out, grab it and insert it into the off-screen model. Or rather, a platform-dependent patch calls the platform-independent text create-and-insert call, which says, oh, this is right next to another piece of text that we already have in our off-screen model, so let me just append it to that directly. And on and on and on. So we've got an off-screen model obliterate message, and we wire that up to erase rects, bit blits, etc.
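A sketch of the "master accountant" idea just described: a platform-dependent patch catches each text-out call and hands it to a platform-independent create-and-insert routine, which appends to an existing record when the new text lands right next to it. The names and the simple merge rule below are assumptions for illustration, not Berkeley's actual code.

```python
# Simplified master accountant: text-out events are inserted as records,
# and text that arrives immediately to the right of an existing record on
# the same baseline is appended to that record instead of starting a new one.

CHAR_WIDTH = 8  # assume a fixed-width font so the example stays small

class TextRecord:
    def __init__(self, x, y, text):
        self.x, self.y, self.text = x, y, text

    def right_edge(self):
        return self.x + len(self.text) * CHAR_WIDTH


class MasterAccountant:
    def __init__(self):
        self.records = []

    def text_create_insert(self, x, y, text):
        """Called by the platform-dependent text-out patch."""
        for record in self.records:
            if record.y == y and record.right_edge() == x:
                record.text += text          # near enough: merge into one record
                return
        self.records.append(TextRecord(x, y, text))

    def obliterate(self, x0, y0, x1, y1):
        """Wired up to erase-rect / bit-blit style events."""
        self.records = [r for r in self.records
                        if not (x0 <= r.x < x1 and y0 <= r.y < y1)]


osm = MasterAccountant()
# Typing "cat" in a word processor produces three text-out calls, one per letter.
osm.text_create_insert(100, 40, "c")
osm.text_create_insert(108, 40, "a")
osm.text_create_insert(116, 40, "t")
print([r.text for r in osm.records])   # ['cat'] -- one merged record
```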

So in this fashion, we have abstracted out all of the things that a GUI graphics engine would do, and we have platform-dependent patches and hooks that catch these and then funnel them into the master accountant. So this is happening asynchronously to the user; it's happening as the graphics stuff occurs. This has some nice side effects: we can read crash dialogs because crash dialogs go through GDI. We can read an immense amount of things that no other screen reader can read, because our off-screen model is piggy-backing onto GDI and fills the off-screen model as it happens. Similarly, we patch the bit blit routine, so if somebody renders into an off-screen bit map, it's in our off-screen model. The bit blit patch is hit, we look at the source, we look at the destination, we turn those into tables in our off-screen model, where you can think of each table as what we call the drawing context. And then we give that to the off-screen model master accountant.

Laura: Okay, we've got the drawing part of it down. What about the higher-level GUI thing?

Earl: I'd like to ask him about the original question. If you were to ask, I mean, if you, as the screen reader manufacturer, were to ask us to provide support....

Peter: Let me make a point. So, the idea then is we have a set of routines for querying and getting information out of the off-screen model. And so the platform-independent layer that we intend to provide, and are providing, on the various GUIs that we're running on is the platform-independent model of, here is what we define as a text record, what we define as a graphic record, here are the keys for navigating through the window hierarchy, left, right, up and down. So, looking at Access Aware on the Macintosh, ActiveX Accessibility here, RAP under X, etc., maybe some private things on all of these platforms if we need, for example, a private way of communicating with Frame under X. That would all just funnel into the off-screen model. We would then provide the layer of navigation that is platform-independent. We did a preliminary port to X; the off-screen model code ported in two days. Our programmers then wired up a bunch of, but not all of, the patches and hooks, so we're getting all of the text, all of the obliterates, and all of the window hierarchy, but we are not getting scrolls or bit blits, and we are not getting bit maps.

Earl: Peter I want to know how we kind of make this a cross-platform thing. What is your opinion on that?

Peter: How would you make the information cross platform? I'm not sure that it makes sense.

Earl: I mean, if we were to develop a common API, where would you want it to be? At the off-screen model level or at the GUI ActiveX Accessibility or RAP level?

Chuck: I think Peter's saying at the GUI Access level.

Peter: I would certainly recommend the GUI Access level as a common API. It's already implemented on one platform, or two platforms if you consider Windows 95 separate, and it's in prototype form on X, and we are looking into the MAC. What more of a cross-platform implementation do you want?

Earl: What we're trying to understand here is where that level is at. Does GUI Access include the off-screen model? It can include the off-screen model.

Peter: Well, GUI Access and the GUI Access API presume an off-screen model somewhere. Whether we would wrap it around the Microsoft off-screen model, or maintain our own, or have it be our own on some platforms and not our own on others. Looking forward, and kind of imagining, I would start building my off-screen model more and more with some of the tools that Microsoft provides. I expect, at least for the medium term, to continue to need to use some of my patches developed in my own off-screen model, until ActiveX Accessibility covers the 100% case.

Earl: We kind of had a discussion the other day about separating the third party from the GUI manufacturer. But looking at this part here, the bottom line is that GUI manufacturers would be providing their part, as far as support. Because we know that each one of them knows their own capability, and they can keep abreast of changes much better than a third party.

Peter: I would say that a minimum responsibility for a GUI manufacturer is to expose its OS, to expose its graphic rendering engine, to provide a mechanism for third party application developers to expose themselves.

Chuck: We're doing all that.

Peter: I understand that.

Peter: That's what I would have been asking OS vendors for from the beginning.

Laura: You mean in a higher semantic level?

Peter: Absolutely. I would like it at that level, but honestly there is enough knowledge involved in interpreting at that level that I would say the high semantic level isn't a minimum. Once we go beyond the minimum, I would definitely start looking at and wanting the higher semantic level. But I would say as a minimum we need that. In my experience, it's the third-party applications that are the biggest problems, the hardest, least tractable cases for screen reader vendors. I can patch and special-case the OS; that's only one thing I need to track. To the extent that the OS vendor can rationalize applications and encourage, cajole, provide mechanisms for them to be well-behaved.

Greg: What I wanted to say was, just to try and put things in a logical perspective or something like that. There are several things that we could be pursuing, and we have been talking about them over the last day between Microsoft and the DACX people. There is this question of whether it is possible to design a common API on which you could do accessibility that will work on both platforms. That might be an interesting question, but that's a really tough question, a very charged question. And I think some of the stuff that Will and I have been talking about is more along the lines of something more immediately practical, a smaller step, which is comparing the list of methods and properties, the information provided by the Microsoft solution, with the information provided by the RAP solution on X Windows. And with the idea of not necessarily trying to unify them, but trying to cross-fertilize ideas between the two, each one from the other, and trying to adopt what we're missing so that both of them move forward.

Chuck: I think that's a no-brainer. Definitely.

Earl: Okay, then the next step would be, I agree with Chuck. I think that's a good point. Now, about this commonality thing. We now are cross-fertilizing each other as far as between RAP and ActiveX Accessibility. But that still is not a common API per se.

Chuck: To answer your opening question, the effort of this group, or what you're proposing should be at the right side, between the screen-reader and the OSM, which would mean implementing an OSM on top of or beside RAP, just like we are implementing the OSM beside ActiveX Accessibility.

Earl: So then that's about the only place where you'd see a common API.

Chuck: Yes.

Earl: How about you Peter?

Chuck: I think that is what Peter's doing in his.

Peter: You need an OSM to do most of the hard problems in GUI accessibility. I would certainly propose we have one, and whether we decide that a common API is necessary or not is an open question. And if a common API is necessary, whether the API is our API or whether that API is Microsoft's API. There's a tremendous amount you need to have in place to use Microsoft's API, from what I understand. The whole semantic model behind ActiveX Accessibility, where you can do things like ask what is next, asking for all this information from applications which they will then provide, is a tremendous amount of infrastructure. GUI Access is in many ways much more modest, but it is certainly strong enough to provide for a powerful screen reader. So I would suggest that...

Earl: So the model that you have developed for GUI Access, which is kind of what Chuck was saying, is a good approach to take. Whether or not we use GUI Access is another issue.... But that seems to be the best place to focus on doing the common API. That's kind of what you're saying.

Peter: I would say so, yeah. Everything else is too platform-specific underneath. Or if you made it platform-general you wouldn't provide enough.

Chuck: I do want to point out that we did not design ActiveX Accessibility or our OSM to be platform independent and that this is wrapped up in Microsoft technologies. Whether those technologies appear in other platforms, which is a goal, we're wrapped up in Microsoft technologies, which may or may not necessarily tie us to the Windows platform. Right now they are tied to the Windows platform.

Earl: I understand. That's why I go back to one of the original things of starting from the user, user back. I don't know if it's possible or not, but seeing what is the information that a screen reader needs to get from an off-screen model and then kind of looking at whether or not it's possible to divorce the platform from that.

Chuck: I think you could, but it would require a layer on top of our code. I'm not familiar enough with X or Apple's programming methods to see, are you going to have a radical change in the way your program operates, so that you're really going to have to build a separate application.

Peter: You can provide a uniform way of interacting on all three platforms.

Chuck: That's not my area of expertise.

Laura: It seems to me what gets interesting is, let's say that there was an OSM, then the question really becomes, if I'm running a screen reader for all platforms, how am I exposing my information, again, the properties. Am I speaking in names, what it is?

Earl: Which brings us down to the ActiveX Accessibility and RAP stuff.

Laura: Yeah, so, again, I see that the OSM is the logical place where you could present a unified interface for screen readers. But it seems like the only way you're ever going to write a screen reader across platforms is if you can decide this is what I'm going to say.

Chuck: So what do we move into, where do we go from here?

Earl: Well, it sounds like the next step is to compare the RAP stuff and the ActiveX stuff and see where cross-fertilization is necessary. Then to look at the off-screen models and bring Peter into this, obviously, because you've looked across various platforms, and then see how we want to approach the off-screen model, the common off-screen model, whatever that is, if it's possible or not. Just look at it from there.

Chuck: Well, our interface is published as of right now.

Laura: We could just make an API today; the point being that you can always make properties. What I'm really trying to get at is it doesn't really matter, necessarily, if COM is on a particular platform, although it would be a lot less work to do this with RAP if it were, but the point being that you could do it without it.

Keith: I've actually, from the stuff I saw from you guys, I think RAP is kind of similar in that it is high level, semantic level, it talks about objects and properties.

Chuck: I think it's the same philosophy.

Greg: But there are a couple of big differences, though. For example, RAP doesn't allow the application to provide information about custom controls it's doing. Those are provided only at a lower level in the toolkit, so if the application is drawing a bit map of text, at the moment RAP doesn't handle that.

Chuck: So, I see that you wouldn't want to implement RAP on Windows and you wouldn't want to implement ActiveX Accessibility on X. So, I don't think that's the level you want to talk at. Your idea of cross-pollination, definitely. We can learn from each other on this. But that's kind of what I'm saying, you want to go one level up. I think Peter's got that level in his product. Peter wants to sell his product, too, so you've definitely got a problem there if you want to do this.

Will: Peter's model is important to have. It's not like you're saying that your model's not important, it does something similar to Mercator...

Peter: The other issue is how soon you can expect, on each individual platform, ActiveX Accessibility, RAP, etc., to cover the 50% case, the 70% case, the 90% case. I mean, RAP is X11R6 with the latest Xt versions with this, that, and the other. And that's not going to cover more than the 20% case for some time. ActiveX Accessibility is going to be rapidly adopted by Microsoft products. Does that mean the 80% case or the 5% case?

Earl: Are you talking about application developers adopting, putting the right calls in and stuff like that?

Peter: Right, I'm talking about coverage. I'm talking about software coverage.

Laura: I'd say 75% of all Windows applications and 50% of the other major ones is about what I would see, because of all the COM support in the OS.

Peter: When?

Laura: When it ships, right? I mean, 50% of just about any Windows application is standard stuff in the system and then 50% is private widgets. You mean the private stuff in applications?

Peter: Yeah, I'm talking about the private in applications. I'm talking about the text window of a word processor, or a spreadsheet or a database. ActiveX Accessibility is not going to cover that for most applications for a long time.

Chuck: But our OSM does.

Peter: I understand that.

Chuck: But don't discount it.

Peter: I'm making a different point. And that point is going to lead to the need for an OSM. So, given that these are just going to be semantic assistance requests for some time, you are going to need the OSM, that's the level to build it on. So I guess that's pretty much the consensus here.

Earl: So we need to set up some email. Should we set up, yet another list serve? I think that's the fourth one I've heard about since I've been here. Do we want to handle this as separate aliases? Why don't we pass a sheet of paper around and ask people to put their names and email addresses on it.

Laura: That's already been done.

Mark: Why don't people who want to be involved in that particular project just stick around for a couple of minutes afterwards.

Earl: So you can put a check mark by your names and stuff like that.

Mark: And I don't think that will be everyone. And that's the cross-fertilization, correct?

Earl: Is there anything else? Thanks.

------ list of attendees------

Janina Sajka

Linda Petty

Jim Caldwell

Judy Brewer

Earl Johnson

Sandra Wagner, Daryl Diller, Gary Day

Jim Hoover

Keith Edwards

Will Walker

Peter Korn

David Hermansen

Steve Jacobs

David M. Little

Charles Oppermann

Greg Lowney

Laura Butler

Peter Wong

Marc Stern

Gregg Vanderheiden

Mark Novak

