Review Of Pacific Bell AtHand Web Site


by Alex Cuthbert

Research related to spatial cognition has focused on map reading (Stasz & Thorndyke, 1980), direction giving, locative phrases (Slobin, 1996; Hayward & Tarr, 1995), mental rotation (Shepard, 198x), reformulation in problem-solving tasks (Kirsch, 1991), partitioning in perceptual tasks (Clark, 1983), navigation in virtual environments (Osberg, 1993; Harrison & Dourish, 1996), and the representation of spatial knowledge (Hernandez, 1994). Piaget (1952) identifies several components of spatial processing, including the ability to comprehend perspective, transformations, ordinal relations, classification, kinetic imagery, reciprocity, transitivity, probability, and conservation (cf. Osberg, 1993; Patterson & Milakofsky, 1980). Similarly, Stasz & Thorndyke (1980) identify six effective learning procedures for map reading: partitioning, imagery, memory-directed sampling, pattern encoding, relation encoding, and evaluation. Other studies treat external representations as indicators of how knowledge is represented internally. Unfortunately, attempts to formalize or directly access the contents of memory have generally failed to produce guidance for systems designers, perhaps due to a focus on knowledge representation rather than the relational dynamics of the setting.

In the case of the AtHand Web site, the relational dynamics of the setting are defined by the interaction between a single user and the computational system. To analyze the AtHand system, I begin with structural questions about the representational forms used to characterize a physical space. What are the relationships between these forms? And more specifically, what features become abstracted out and what features become highlighted at different levels of representation? Once these structural aspects are detailed, it becomes possible to answer more conceptual questions such as, "What are the cognitive demands of moving between representational forms?" and "How do these representational forms facilitate the task of locating resources such as businesses in a particular region?" Given the descriptive nature of this short paper, issues are identified rather than explored in depth.

This type of cognitive analysis serves two purposes: (a) it provides an organizational and explanatory framework for usability analyses, and (b) it details possible enhancements to the system that take into account the strengths and weaknesses of the visual and perceptual systems. Considering these human factors as part of the engineering and marketing process can shift the focus of development so that it aligns with the types of tasks and styles of interaction that influence the acceptance and level of continued use of the product.


Constraints & Affordances Of Spatial Representations


The need to use maps in order to create representations of a physical space provides certain constraints and affordances. This navigational origin of maps leads to several hypotheses about how people access and interpret spatial representations. These hypotheses can be summarized in the following manner:

Path residual: All maps have a residual element of path (i.e., human access) embedded within them either implicitly or explicitly.

Translation through perceptual icons: Movement between representational layers in a map occurs through identification of landmarks and pathways used in orientation in physical space.

Translation through notational systems: Extended use and the resulting familiarity with notational systems permits translation to occur through schematized relations in addition to perceptual icons.

Cost-benefit ratios: As the complexity of the representations increases, the processing costs increase relative to the informational benefits.

Alignment: Alignment of the representations and familiarity with the notational systems (conceptual alignment) can decrease the cognitive demands of moving between them.

Shifting levels of activation: When characterizing problem-solving tasks, it is more useful to think of shifting levels of activation rather than subjects moving between discrete modes and states.

When people give directions, the directions are frequently procedural in nature (e.g., turn left at the third light), taking advantage of perceptual icons such as landmarks to anchor the procedural instructions. Previous research indicates that localized, route knowledge develops into globalized, relational knowledge (Stasz & Thorndyke, 1980; Cohen & Schuepfer, 1980). Such a globalized perspective, accompanied by a formalized system of orientation, is found in most street maps with their legends, scales, coordinate systems, orientation, and color schemes. Insets frequently provide one level of nesting, typically a detail map of a downtown area. The links between these levels are indicated by outlining the area on the larger map or, in some cases, by offsetting the detailed view using perspective lines. The purpose of explicitly defining these affordances is to provide a framework for assessing the AtHand system based on the mapping between embodied cognition and navigation mediated by representations.

Overview Of AtHand


The AtHand system attempts to facilitate the process of locating physical structures, particularly businesses, by providing both a hierarchically indexed categorical breakdown of the sites and a search engine split on "business name" or "business category". The framework for the interface consists of a canvas frame on the top of the screen and a toolbar on the bottom (Figure 1). There are numerous issues involved both in how categories are defined (e.g., using empirical data from representative users vs. categories based on a proportional representation of the data) and in how the spatial information is presented. My focus here is on the presentation of the spatial information and not the structure of the categories.


Figure 1. AtHand Toolbar Frame

The AtHand system uses a single north-south view onto a region that can be defined by: (a) city, (b) zip code, or (c) a region relative to a specific address. The scale can then be adjusted by clicking on the Region-Street scale control (Figure 2).


Figure 2. AtHand Region-Street Scale Control

The viewpoint can be changed by clicking on some region of the map or by pressing the NSEW compass controls. All views onto the space are aerial, as illustrated by the street-level view in Figure 3. Color is used to differentiate major highways (red and blue) from smaller streets (black and grey). One-way streets are not indicated, and characteristics such as the topography and natural composition have been abstracted out, with the exception of larger regions such as Golden Gate Park (green shading). The orientation is always to the north, resulting in the street maps being skewed as in Figure 3.


Figure 3. Street-level view of a location

Familiarity with the notational systems lowers the cost-benefit ratio of using multiple representations. As the complexity of the representations increases, however, the processing costs come to outweigh the cognitive benefits. In other words, while accessibility might be increased through the use of multiple representations, at some point the complexity of relating numerous perspectives exceeds the benefit of providing different views onto the problem space. The AtHand system provides a single view onto the space, although multiple views can be extracted using the ten map views. The icons for the map view control are scaled to indicate perceptually the distance between the camera and the image. Alignment occurs in this situation because the size of the region increases relative to the size of the icon and relative to the camera distance. Figure 4 provides an example of this expanded view.


Figure 4. Expanded view of location

Computational demands can be off-loaded to the environment and propagated across different media (Hutchins, 1995) through the use of interlocking functional systems. For example, when landmarks are not easily identifiable due to the representational scheme, coordinate systems can be used to locate features such as street names. However, when these systems are not in alignment, the computational demands of interpreting the spatial relations increase significantly. I suspect that designers continually work to balance the relationship between simplicity and complexity or richness of information. However, the access demands will not always be verbally referenced or articulated in the final product. With AtHand, there is a trade-off between providing multiple views on the same display vs. multiple views on sequential displays (the choice made by the AtHand designers). This approach is a lowest-common-denominator (LCD) approach that minimizes the complexity of the system. This design works well for single interactions with the system where the task is targeted at locating a specific business. However, when we consider extended use of the system or use by a community that shares information, problems begin to arise in this LCD design. Cues in the physical environment that we implicitly or explicitly use to orient ourselves are stripped from this 2D view. For example, the TransAmerica building, Twin Towers, and the various hills in North Beach are used by San Francisco residents to orient themselves and to segment regions. The ability to customize the map by selecting landmarks or switching to a topological view would help people who are not referencing the space solely by street names.

The fact that users tend to interact repeatedly with small clusters of information (Card, 1996, p. 112) suggests that we need to provide tools that help communities working on similar problems share relevant information. Since collaborative information foraging introduces a number of additional complex issues, considering the individual user interacting over a period of time can reveal some but not all of the problems with collaborative systems. Consider the need to locate several different types of information or the possibility that a business falls into several categories. There is currently no information about the precision and recall for different combinations of categories. The result is that users frequently get either no matches or too many to differentiate effectively. The ability to refine the query based on the existing data set would let users iteratively differentiate sites. Similarly, an integrated map could be constructed that pooled searches for businesses and promoted chosen sites to the integrated map. This personalized map would contain frequently visited sites and allow for some type of proximity analysis. The possibilities for add-on features would then include optimized trip planning algorithms, calendar scheduling based on operating hours of the businesses, and searches through the integrated map. Profiles of users could also be constructed based on the emergent patterns from the searches. This tracking would allow the provider to suggest potentially interesting locations for the user in much the same way that Amazon books suggests reading materials to its customers.
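
To make the point about precision and recall concrete, the following is a minimal sketch of how those two measures could be tabulated for combinations of categories. It does not reflect how AtHand is implemented: the listings, category names, and the functions search, precision_recall, and evaluate are all hypothetical, and the "relevant" set stands in for whatever businesses a given user would actually accept.

```python
from itertools import combinations

# Hypothetical index: business name -> categories it is listed under.
LISTINGS = {
    "Cafe Uno": {"coffee", "bakery"},
    "Bean Stop": {"coffee"},
    "Loaf House": {"bakery"},
    "Page Turner": {"books", "coffee"},
    "Ink Well": {"books"},
}


def search(categories):
    """Return the businesses listed under every one of the given categories."""
    wanted = set(categories)
    return {name for name, cats in LISTINGS.items() if wanted <= cats}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant (0.0 if empty)."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def evaluate(relevant, all_categories, max_size=2):
    """Score each combination of up to max_size categories against `relevant`."""
    report = {}
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(all_categories), size):
            report[combo] = precision_recall(search(combo), relevant)
    return report


# Hypothetical user need: one place that offers both coffee and books.
RELEVANT = {"Page Turner"}
REPORT = evaluate(RELEVANT, {"coffee", "bakery", "books"})

if __name__ == "__main__":
    for combo, (precision, recall) in REPORT.items():
        print(combo, round(precision, 2), round(recall, 2))
```

In this toy data set, a single broad category retrieves the target along with unrelated businesses (high recall, low precision), while some category pairs return nothing at all. That is the "too many matches or none" pattern described above, and a table of this kind is one way a designer might decide which category combinations to offer as query refinements.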