Use of audio-haptic interface techniques to allow nonvisual access to touchscreen appliances

Gregg C. Vanderheiden
Trace R&D Center, University of Wisconsin-Madison
S-151 Waisman Center, 1500 Highland Avenue
Madison, WI 53705
608/262-6966 gv@trace.wisc.edu

Abstract

A set of audio and haptic techniques has been combined to extend the usability of touchscreen kiosks and hand-held devices to users with visual or reading limitations, as well as to individuals in environments or activities that preclude the use of vision. This includes the use of touchscreen kiosks by individuals who have low vision or blindness, by individuals who are illiterate or have difficulty reading the particular language presented on the device, and by individuals whose vision is otherwise occupied, such as individuals attempting to use a touchscreen phone or information appliance while driving a car. The strategies may also have implications for the operation of field equipment in emergency or battlefield conditions where vision may be temporarily unusable or unavailable.

Keywords: touchscreen, access, nonvisual, kiosk, blindness, illiteracy, ADA, haptic, speech

Background

As information systems become more complex and more comprehensive, it is becoming increasingly difficult to create interfaces using discrete buttons and controls. ATMs and information services have grown to the point where they offer so many services that it is not feasible to have a discrete button for each function. Similarly, information kiosks may have as many as 300 different functions or services provided by a single station. Even small cellular phones are gaining new functions. The Simon™ cellular phone developed by IBM and sold through Bell South provides fax, e-mail, calendar, notepad, address book, calculator, and world clock functions as well as full phone functions, all within the cellular phone package. To address the interface needs of these systems, designers have increased their reliance on touchscreen technologies. These technologies allow the designer to break the functions of the device into discrete subsections which can be presented hierarchically on simple screens.

This type of display, however, presents particular access problems for people with visual impairments, blindness, or literacy problems. The number and arrangement of "keys" change, and there are no tactile cues on the touch panel. Furthermore, tactile cues cannot be added, since the number and arrangement of the "keys" usually vary from screen to screen. Memorization of the location and function of the keys is also not feasible, due to their sheer number. Because the lack of tactile cues means the user must look at the touchscreen to operate it, such displays are also difficult or impossible to operate for anyone whose eyes are otherwise occupied, as when driving a car.

The magnitude and significance of this problem are growing rapidly as such systems are increasingly used in automated transaction machines, government service kiosks, personal electronic telecommunication devices, and even home appliances. In some cases, this is an inconvenience, since other accessible devices currently exist. In other cases, especially public information systems, electronic building directories, automated transaction machines, and government service kiosks, these interfaces are blocking access. As they appear on car phones, pocket cellular phones, personal digital assistants, and other devices which may be used in a car, they may pose a significant public hazard, since they require the user to take their eyes off the road for extended periods of time in order to operate them.

Talking Fingertip Technique

A new technique called the "Talking Fingertip Technique" has been developed which can allow nonvisual access to touchscreen-based devices, as well as facilitate access by individuals with literacy, language, or other problems which prevent them from reading the text presented on the touchscreen. The technique uses a hybrid of haptic and auditory techniques to give individuals nonvisual access to touchscreen systems, even when the screens vary widely in form and format. In fact, a prime consideration in the development of the technique was the creation of an interface strategy which did not require the kiosk or touchscreen appliance designer to constrain the design of their product in order to incorporate accessibility.

Basic Elements

The basic elements and principles of the techniques are as follows:

Verbal names: All of the elements on the screen that are either actionable or provide information are identified and given a verbal name. (This name is generally presented auditorially, but could alternatively be presented via an external braille display.)

Screen description: An invisible button is located in the extreme upper left-hand corner of each screen. When the voice mode is active, touching this corner can elicit a description of the screen. (When the voice mode is not active, nothing occurs when the button is touched.)

Empty space sound: To use the Talking Fingertip technique, the individual touches the screen and slides their finger about on the screen. Whenever they are over an empty area of the screen (no text or buttons, etc.), a white noise is emitted, somewhat like the sound one might hear if one were dragging a finger around on a sheet of paper.

Auditory ridge around objects: Whenever the individual drags their finger across the edge of an object (button, field, etc.), a short click-like sound is heard. The sound is emitted both when entering and when leaving the field or button. The effect is similar to an auditory "ridge" that the individual encounters on the edge of the key.

Verbal announcement: Whenever an object is entered, the verbal name of the object is presented to the user.

Edge hysteresis: In order to avoid "chatter" when the individual is right on the edge of a button or field, hysteresis is used. The exact nature of the hysteresis varies depending upon the size and spacing of the buttons, keys, and fields. For example, large buttons which are widely spaced would have a different degree of hysteresis and a different centering of the hysteresis on the edge of the key than would small keys which were immediately adjacent or abutted. Various combinations of hysteresis are currently being explored.
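
To make the combination of the preceding elements concrete, the following Python sketch shows one way the empty space sound, auditory ridge, verbal announcement, and edge hysteresis might be wired together in the touch-tracking loop. It is a minimal sketch under stated assumptions: the class and method names (TouchExplorer, play_noise, play_click, speak, and the Rect helper) are illustrative stand-ins, not the actual kiosk software, and the hysteresis value would be tuned per screen as described above.

    from dataclasses import dataclass

    @dataclass
    class Rect:
        left: float
        top: float
        right: float
        bottom: float

        def contains(self, x, y, margin=0.0):
            # A positive margin grows the rectangle; this is how the
            # hysteresis band around an already-entered object is modeled.
            return (self.left - margin <= x <= self.right + margin and
                    self.top - margin <= y <= self.bottom + margin)

    @dataclass
    class ScreenObject:
        name: str
        bounds: Rect

    HYSTERESIS = 4.0  # pixels; in practice tuned to button size and spacing

    class TouchExplorer:
        def __init__(self, objects, audio):
            self.objects = objects   # all buttons and fields on the current screen
            self.audio = audio       # plays noise, clicks, and synthesized speech
            self.current = None      # object currently under the finger, if any

        def hit_test(self, x, y):
            for obj in self.objects:
                if obj.bounds.contains(x, y):
                    return obj
            return None              # empty screen area

        def on_finger_moved(self, x, y):
            # Hysteresis: once inside an object, the finger must move a
            # little past its edge before it is treated as having left,
            # which prevents "chatter" right on the boundary.
            if self.current and self.current.bounds.contains(x, y, margin=HYSTERESIS):
                hit = self.current
            else:
                hit = self.hit_test(x, y)

            if hit is self.current:
                if hit is None:
                    self.audio.play_noise()    # white noise over empty space
                return

            self.audio.play_click()            # auditory "ridge" at the edge
            if hit is not None:
                self.audio.speak(hit.name)     # verbal name of the entered object
            self.current = hit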

Text fields: Text fields that are small and contain only a few words are spoken in their entirety when touched. Larger fields are announced when touched but are spoken only when activated.

Separate activation button: Buttons on screen are not activated when they are touched. Instead, a separate activation button (which can be located on screen but has been located off-screen in our prototype) is used. The individual may activate this button with the same finger that they were using to explore the screen, or they can activate it with one hand while they are exploring the screen with the other, for greater speed.

Last current choice: In some cases, an individual may move their finger off a button while lifting it from the screen. This is particularly true if the individual was dragging their finger and stopped just as they entered the button and heard its name announced. Whenever the individual presses the confirm button (located below the screen), the system assumes that the individual meant to activate something, and will therefore activate the last legal actionable object that was touched by the individual prior to pressing the button.
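
A minimal sketch of how the separate activation button and the "last current choice" rule might interact is shown below. The names are hypothetical, and the confirm button is assumed to be a physical switch wired to on_confirm_pressed; only the general logic is taken from the description above.

    class ConfirmHandler:
        def __init__(self):
            self.last_choice = None    # last actionable object touched so far

        def on_object_entered(self, obj):
            # Called by the touch-tracking loop each time the finger enters
            # an object; only actionable objects are remembered.
            if obj is not None and obj.is_actionable:
                self.last_choice = obj

        def on_confirm_pressed(self):
            # The system assumes the user meant to activate something, so it
            # acts on the last legal choice rather than on whatever happens
            # to be under the finger at the moment of the press.
            if self.last_choice is not None:
                self.last_choice.activate()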

Hot lists: Some screens may contain scrolling fields with lists of items, where each item in the list is actionable: that is, touching a particular line in the hot list field would cause some action to be taken. When dragging one's finger up and down the hot list field, each line is announced as if it were a separate object.

Speedlists: In addition to direct exploration and activation of the objects on the screen, a "speedlist" which contains all of the action and information items on the screen is also provided. Touching the upper left-hand corner of the screen and dragging down the left margin causes the speedlist to appear (if voice mode is active). By running their finger down the list, the individual is able to select any item on the screen as if they were touching the actual object. In this fashion, the kiosk can be operated without ever taking one's finger out of the trough formed by the face of the screen and the edge of the cowl around the screen.

Incorporation of hot lists in the speedlist: On screens which include hot lists, the items currently visible in the hot list can be directly included in the speedlist, with each line in the hot list constituting a line in the speedlist. This flat access approach provides a much simpler and more straightforward interface than a hierarchical access technique (see below), and it is feasible for most screens on kiosks and touchscreen appliances. This is in part because the nature of touchscreen input prevents the screen from having too many items or too many lines on the display (since it must be operable by large fingertips). Whenever a hot list is scrolled up or down, the items in the speedlist change, and this fact is announced to the user, as it is whenever the contents of the speedlist change to reflect changes in the content of the screen.
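
One way such a flat speedlist might be assembled, including the visible hot list lines, is sketched below; attribute names such as is_hot_list and visible_lines are assumptions made for illustration rather than part of the actual implementation.

    def build_speedlist(screen):
        # Every actionable or informational object becomes one entry, and
        # each line currently visible in a scrolling hot list becomes its
        # own entry. The list must be rebuilt (and the change announced)
        # whenever the hot list is scrolled or the screen contents change.
        entries = []
        for obj in screen.objects:
            if obj.is_hot_list:
                entries.extend(obj.visible_lines)
            elif obj.is_actionable or obj.is_informational:
                entries.append(obj)
        return entries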

Hierarchical access: For very complex screens, a hierarchical access strategy may be used. In this strategy, individuals would first select among objects, fields or groups of objects on the screen. Whenever a list field or group of objects was selected, the individual could then navigate within the list or group. This type of hierarchical behavior is not generally necessary on touchscreens, as noted above. It is much more common on screens where a keyboard is used for navigation.

The Kiosk Platform Chosen

The kiosk platform chosen to implement and test the Talking Fingertip technique is a commercial kiosk developed by TRG, Inc., for use on college campuses. This particular kiosk was chosen for several reasons. Primarily, the kiosk represented a diverse set of screens, which included a screen with large icon-type buttons (Figure 1); screens with text-based buttons (Figures 2a and 2b); a screen with an onscreen numberpad such as might be found on an automated transaction machine (Figure 3); a screen with an onscreen keyboard used to enter names to look up in the campus phone book (Figure 4); screens with maps, where the user would touch individual buildings to get more information about them (Figure 5); and screens where the buttons were randomly arranged on the screen in order to provide interesting visual effects (Figure 6).

Figure 1: The main menu or start-up screen for the kiosk.

Figure 2a: A screen showing text buttons.

Figure 2b: The same screen, except that unfamiliar symbols have been substituted for each letter of the alphabet to simulate what this screen might look like to someone who was illiterate, and to demonstrate that it is still usable with the Talking Fingertip technique active.

Figure 3: A screen with a number pad used to enter the student's ID number and security code. This screen is similar to what might be found in an automated transaction machine.

Figure 4: An on-screen keyboard arranged in standard QWERTY order.

Figure 5: A screen showing a map of the campus, where touching the individual buildings brings up information about each building.

Figure 6: Screens where the "buttons" are represented by graphic devices which are randomly arranged on the screen for esthetic effect. In this case, the buttons appear as speech bubbles coming from the figures scattered around on the screen.

All told, the kiosk demo program has over 100 screens, almost all differing from the others. The objective was to make all of the screens accessible without changing the layout on any of the screens.

Integrating the Components: Experience with Users

The prototype, which has been in development for the past year, has been used by over 100 individuals with low vision or blindness. Since a primary interest in the initial design was the ability of novice users to access and use it, the prototype was taken to five major disability conferences, including three on blindness, in order to have it tried by a large number of individuals unfamiliar with the design. In addition to providing a very rich environment for people to try the device, the conference setting also provided a somewhat noisier environment, more akin to that which would be encountered in actual public settings.

As users were presented with the kiosk, the basic operation was explained to them, but not the layout of the screens. The users were then asked to carry out various tasks, such as:

Finding the Personal Information section, entering their phone number as a student number, and passing through the security screens;

Finding out where the restrooms were located in Building 4 on campus, which required them to find and navigate through the campus map; and

Going into the Directory Search screen and entering their name, which required them to operate an on-screen QWERTY keyboard.

Participants were observed while operating the device. In addition, they were encouraged to comment throughout the testing process. Suggestions for improvement, comments on aspects that presented problems, and aspects they liked were all encouraged. This process was carried out throughout the development and refinement of the prototype, and yielded much valuable insight.

Results of Phase 1

Activation Schemes

Initially, three different activation schemes were explored. With the first, the individual would run their finger around on the screen. When they got to the item they wanted, they would lift their finger from the screen. Although this seemed awkward at first, most users quickly got accustomed to it. However, during the learning period (usually 5-10 minutes), the users often made errors by lifting their hand when they did not mean to. Some individuals tended to lift their finger whenever they touched a wrong button -- a natural defensive action. As a result, this technique, although efficient, did not appear to be a good general technique, especially for use on kiosks with new users. The error rate was too high, and the users very quickly got lost in the kiosk without ever knowing exactly what they had done.

A second approach that was discussed was a technique using variable pressure. With this, when the individual got to a desired item, they would push harder. If an individual touched the desired button as soon as they touched the screen, they would simply give it an additional push to select it. Because very few touchscreens have Z-axis sensitivity, this technique was not explored in great depth. During some simulations, however, it appeared that there might be problems with false triggers, since it is difficult to distinguish an individual coming down directly on the desired item and then releasing from an individual landing on a wrong item and pulling their finger away.

The technique with the external confirm button was settled on for this prototype because it allowed free "safe" exploration of the screen by the individual, and also provided a very sure and certain confirmation movement.

This ability to clearly and certainly decide when to move forward was seen as important by the users who are blind, particularly as they were negotiating unfamiliar screens.

Need for nonspatial access techniques: Although some individuals were able to pick up the auditory exploration of the screen very quickly, others were not. Some individuals picked up the auditory exploration so quickly that they were even able to touch-type on both the numeric keypad and the QWERTY keyboard, hitting most keys on the first try. Other individuals seemed to have no ability to auditorially explore the screen, or even to find a button again after they had just touched it. For these individuals, the use of the speedlist (which does not require the user to move their hand away from the left edge of the screen) was more than a convenience. It was a necessary feature to allow them to access the kiosk.

Importance of screen description to blind users: All users who were blind, and some with low vision, found the speedlist to be the fastest and easiest way to select items from the screen. However, most of the individuals who were blind also said that they greatly enjoyed the ability to auditorially explore the overall screen and to "see" how it was laid out for sighted users.

For some, the ability to explore the layout of the touchscreen and to actually use it directly was a moving experience. Some of them expressed great excitement and joy at being able to perceive what had previously been imperceptible.

All who attempted to use the kiosk were able to do so, with the exception of one individual who seemed to lack spatial skills and who tried the kiosk prior to the implementation of the speedlists (which should address this problem).

Touch and searching behaviors: An interesting observation was that most people who were blind found it difficult at first to place their finger on the screen and then slide it around. Their natural exploratory behavior was to keep lifting their fingers off and to explore the screen in a sequence of gentle touching motions. In fact, often the touch was so light that it was not detectable on a pressure-sensitive screen. This "light touch" searching strategy appears to be a direct extrapolation of the technique that would be used in searching for things on a table, where sweeping motions would quickly knock things over or onto the floor. Asking users to slide their finger across the screen, however, quickly acquainted them with the increased efficiency of this approach, and they adapted quite quickly. On pressure-sensitive screens, the use of a fingernail rather than the fleshy part of the fingertip often facilitated smooth movement over the screen, and some users found it easier to use the blunt end of a pen or stylus rather than their fingertip, while others preferred the fingertip.

Finger roll problems: One problem faced by some individuals on closely spaced items resulted from the rocking motion of the fingertip as it was raised from the screen. Sliding their finger around on the screen, they would touch and enter the desired item. If the items were close enough together (and the hysteresis was not great enough), the individuals would sometimes find that, while raising their finger from the screen, they would roll the finger up, inadvertently selecting the item above. This was particularly true on vertical lists where the line height was small enough that only minimal hysteresis was possible. Hysteresis, the "last valid choice" technique, and appropriate spacing addressed most of this; user education handled the rest.

Conclusion

The primary objective of creating a set of interface strategies which would allow access to the touchscreen kiosk for people with visual impairments, without changing its layout or operation for users who do not have visual impairments, was achieved. Furthermore, the resulting solution worked out better and was received with greater enthusiasm than had been anticipated. Through careful and continuous interaction with users having a wide range of visual impairments and skills, the ease of use and robustness of the technique were greatly improved. There are several implications of this work.

First, it appears that access to touchscreen kiosks, as well as to personal data devices and household appliances, is possible for people with visual impairments or blindness, including the large percentage of that population who do not know braille. Furthermore, with the rapidly dropping price of voice synthesis, this access can be built into kiosks and other public information systems for essentially the price of the confirmation switch and a $.50-$50.00 synthesis licensing fee (current prices depending on quantity). In addition, these strategies provide access to the kiosks for individuals with illiteracy or other reading problems. In a relatively short time, the costs will drop further, to the point where the techniques can be incorporated into hand-held and other appliances for access by people with visual impairments, people who are driving, and people with literacy problems. A simple extension of the underlying architecture via an infrared link also provides mechanisms for individuals who are deaf-blind or who have severe physical disabilities to operate these kiosks using personal assistive technologies.

This paper covers the development to date. More extensive user testing and further development are continuing.