Research related to spatial cognition has focused on map reading (Stasz
& Thorndyke, 1980), direction giving, locative phrases (Slobin, 1996;
Hayward & Tarr, 1995), mental rotation (Shepard, 198x), reformulation
in problem-solving tasks (Kirsch, 1991), partitioning in perceptual tasks
(Clark, 1983), navigation in virtual environments (Osberg, 1993; Harrison
& Dourish, 1996), and the representation of spatial knowledge (Hernandez,
1994). Piaget (1952) identifies several components of spatial processing
including the ability to comprehend perspective, transformations, ordinal
relations, classification, kinetic imagery, reciprocity, transitivity,
and conservation (cf. Osberg, 1993; Patterson & Milakofsky, 1980).
Stasz & Thorndyke (1980) identify six effective learning procedures
for map reading: partitioning, imagery, memory-directed sampling,
relation encoding, and evaluation. Other studies look at representations
as indicators of how knowledge is represented internally. Unfortunately,
attempts to formalize or directly access the contents of memory have
failed to produce guidance for systems designers, perhaps due to a focus
on knowledge representation rather than the relational dynamics of the setting.
In the case of the AtHand Web site, the relational dynamics of the setting are defined by the interaction between a single user and the computational system. To analyze the AtHand system, I begin with structural questions about the representational forms used to characterize a physical space. What are the relationships between these forms? And more specifically, what features become abstracted out and what features become highlighted at different levels of representation? Once these structural aspects are detailed, it becomes possible to answer more conceptual questions such as, "What are the cognitive demands of moving between representational forms?" and "How do these representational forms facilitate the task of locating resources such as businesses in a particular region?" Given the descriptive nature of this short paper, issues are identified rather than explored in-depth.
This type of cognitive analysis serves two purposes: (a) it provides an organizational and explanatory framework for usability analyses, and (b) it details possible enhancements to the system that take into account the strengths and weaknesses of the visual and perceptual systems. Considering these human factors as part of the engineering and marketing process can shift the focus of development so that it aligns with the types of tasks and styles of interaction that influence the acceptance and level of continued use of the product.
Path residual: All maps have a residual element of path (i.e., human access) embedded within them, either implicitly or explicitly. When people give directions, the directions are frequently procedural in nature (e.g., turn left at the third light), taking advantage of perceptual icons such as landmarks to anchor the procedural instructions. Previous research indicates that localized route knowledge develops into globalized, relational knowledge (Stasz & Thorndyke, 1980; Cohen & Schuepfer, 1980). Such a globalized perspective, accompanied by a formalized system of orientation, is found in most street maps with their legends, scales, coordinate systems, orientation, and color schemes. Insets frequently provide one level of nesting, typically a detail map of a downtown area. The links between these levels are indicated by outlining the area on the larger map or, in some cases, by offsetting the detailed view using perspective lines. The purpose of explicitly defining these affordances is to provide a framework for assessing the AtHand system based on a mapping between embodied cognition and navigation mediated by representations.
Translation through perceptual icons: Movement between representational layers in a map occurs through identification of landmarks and pathways used in orientation in physical space.
Translation through notational systems: Extended use and the resulting familiarity with notational systems permits translation to occur through schematized relations in addition to perceptual icons.
Cost-benefit ratios: As the complexity of the representations increases, the processing costs increase relative to the informational benefits.
Alignment: Alignment of the representations and familiarity with the notational systems (conceptual alignment) can decrease the cognitive demands of moving between them.
Shifting levels of activation: When characterizing problem solving tasks, it is more useful to think of shifting levels of activation rather than subjects moving between discrete modes and states.
The AtHand system uses a single north-south view onto a region that can
be defined by: (a) city, (b) zip code, or (c) a region relative to a specific
address. The scale can then be adjusted by clicking on the Region-Street
scale control (Figure 2).
The viewpoint can be changed by clicking on some region of the map or
by pressing the NSEW compass controls. All views onto the space are aerial
as illustrated by the street-level view in Figure 3. Color is used to distinguish
major highways (red and blue) from smaller streets (black and grey). One-way
streets are not indicated and characteristics such as the topography and
natural composition have been abstracted out with the exception of larger
regions such as Golden Gate Park (green shading). The orientation is always
to the north, resulting in skewed street maps, as in Figure 3.
The cost-benefit ratio for multiple representations is lowered
through familiarity with the notational systems. As the complexity of the
representations increases, the processing costs outweigh the cognitive benefits.
In other words, while accessibility might be increased through the use of
multiple representations, at some point the complexity of relating numerous
perspectives exceeds the benefit of providing different views onto the
space. The AtHand system provides a single view onto the space although
multiple views can be extracted using the ten map views. The icons for the
map view control are scaled to indicate perceptually the amount of distance
between the camera and the image. Alignment occurs in this situation because
the size of the region increases relative to the size of the icon and
to the camera distance. Figure 4 provides an example of this expanded
Computational demands can be off-loaded to the environment and propagated
across different media (Hutchins, 1995) through the use of interlocking
functional systems. For example, when landmarks are not easily identifiable
due to the representational scheme, coordinate systems can be used to locate
features such as street names. However, when these systems are not in
the computational demands of interpreting the spatial relations increase
significantly. I suspect that designers continually work to balance the
relationship between simplicity and complexity or richness of information.
However, the access demands will not always be verbally referenced or
in the final product. With AtHand, there is a trade-off between providing
multiple views on the same display vs. multiple views on sequential displays
(the choice made by the AtHand designers). This is a lowest-common-denominator
(LCD) approach that minimizes the complexity of the system. This design
works well for single interactions with the system where the task is targeted
at locating a specific business. However, when we consider extended use of
the system or use by a community that shares information, problems begin
to arise in this LCD design. Cues in the physical environment that we
implicitly or explicitly use to orient ourselves are stripped from this 2D view. For
example, the TransAmerica building, Twin Towers, and the various hills in
North Beach are used by San Francisco residents to orient themselves and
to segment regions. The ability to customize the map by selecting landmarks
or switching to a topographical view would help people who are not referencing
the space solely by street names.
The fact that users tend to interact repeatedly with small clusters of information (Card, 1996, p. 112) suggests that we need to provide tools that help communities working on similar problems share relevant information. Since collaborative information foraging introduces a number of additional complex issues, considering the individual user interacting over a period of time can reveal some, but not all, of the problems with collaborative systems. Consider the need to locate several different types of information, or the possibility that a business falls into several categories. There is currently no information about the precision and recall for different combinations of categories. The result is that users frequently get either no matches or too many to differentiate effectively. The ability to refine the query based on the existing data set would let users iteratively differentiate sites. Similarly, an integrated map could be constructed that pooled searches for businesses and promoted chosen sites to the integrated map. This personalized map would contain frequently visited sites and allow for some type of proximity analysis. The possibilities for add-on features would then include optimized trip-planning algorithms, calendar scheduling based on the operating hours of the businesses, and searches through the integrated map. Profiles of users could also be constructed based on the emergent patterns from the searches. This tracking would allow the provider to suggest potentially interesting locations for the user in much the same way that Amazon books suggests reading materials to its customers.
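The precision and recall of a category query, and the effect of refining it against the existing result set, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the business names, the category labels, and the functions `precision_recall` and `refine` are hypothetical and do not describe how AtHand stores or searches its listings.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = share of retrieved listings that are relevant
    recall    = share of relevant listings that were retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def refine(listings, result_names, category):
    """Narrow an existing result set to entries that also carry `category`."""
    return [name for name in result_names if category in listings[name]]


# Hypothetical listings: business name -> set of categories it falls into.
listings = {
    "Caffe Uno": {"cafe", "bakery"},
    "Book Nook": {"bookstore", "cafe"},
    "Bread Box": {"bakery"},
    "Page One": {"bookstore"},
}

# Assumed user goal: a place that sells books and also serves coffee.
relevant = {"Book Nook"}

# A broad single-category query returns extra, irrelevant matches.
broad = [name for name, cats in listings.items() if "cafe" in cats]

# Refining against the existing result set removes them.
narrow = refine(listings, broad, "bookstore")

print(precision_recall(broad, relevant))
print(precision_recall(narrow, relevant))
```

In this toy data set the broad query retrieves the target but with an unrelated listing alongside it, while the refined query isolates it. Reporting such figures for each combination of categories is one way a provider could show users which queries are likely to return too few or too many matches.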