CHAPTER 1
Background (Theory & Methods)

Robert C. Yamashita

The Museum Educational Site Licensing Project (MESL)1 provided a unique opportunity to examine the social and economic issues surrounding the digital distribution of cultural heritage images and their associated data from museums to universities for use in the classroom. MESL's digital storage and distribution offered an alternative to existing analog slide image repositories in academic slide libraries. The vast majority of these libraries have highly restrictive policies that hinder students' use of course images, whereas digital network technologies offered the opportunity to broaden access to digitized repositories of cultural heritage images. MESL was the first effort at providing such access to a large, multi-institutional image database.

This evaluation of MESL's social and economic aspects had to recognize the limits of the project. First, MESL constituted an experiment in the production and distribution of digital images and data. The MESL model needs to be understood as only one of many possible models for this kind of project. More significantly, because of its experimental character, this model would probably not be replicated in production environments. In addition, MESL's design focused on a variety of institutions because different institutions use images in very different ways. MESL image providers included several major art museums, a cultural anthropology museum, a photography museum, and a national library, and image users included both public and private research institutions. This made direct comparisons difficult at best. The analysis therefore needed to move beyond the specific structure of MESL to an examination of the general process of digital distribution. However, the variety of different institutions enables us to outline the range of practices and to identify the underlying rationales and mitigating circumstances that lead to the systems that are developed.

This evaluation sought to identify the different kinds of costs associated with the distribution of managed digital and analog images. We were not interested in the general kinds of images that might be found on consumer-level CD-ROMs or in 35mm slide sets. Rather, we focused on the managed image, which is housed in a collection designed to be accessed by multiple users. The way in which these images and their data are managed in a collection not only situates the images in their proper contexts, but also allows users to search the collection and gives many different users and types of audiences access to many images. We wanted to fully explore these issues.

Figure 1: Schematic of the Managed Image

The managed image consists of both the image itself and a linked set of associated text. The text, known as metadata, consists of information that can be descriptive, contextual, or control-oriented (illustrated in Figure 1). Descriptive metadata provide basic identifying information (e.g. artist name, picture title) about the original object that the surrogate image represents. Control metadata are specific information about the surrogate image that resides within the collection (e.g. call numbers, file name, resolution, compression, etc.). Contextual metadata are any additional information about the specific item (e.g. curatorial notes). The combined set of data enables the image to be managed, and permits shared access to the individual managed image as well as to the entire collection of managed images. 2
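The three kinds of metadata can be pictured as a small record structure. The following Python sketch is illustrative only: the class and field names (ManagedImage, DescriptiveMetadata, ControlMetadata, and so on) are assumptions chosen to mirror the examples in the text, not MESL's actual data dictionary.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DescriptiveMetadata:
    """Identifies the original object (hypothetical fields)."""
    artist_name: str
    title: str


@dataclass
class ControlMetadata:
    """Describes the surrogate image held in the collection (hypothetical fields)."""
    file_name: str
    resolution: str
    compression: str


@dataclass
class ManagedImage:
    """An image linked to its descriptive, control, and contextual metadata."""
    image_path: str
    descriptive: DescriptiveMetadata
    control: ControlMetadata
    # Free-text additions such as curatorial notes; may be empty.
    contextual: List[str] = field(default_factory=list)

    def is_searchable(self) -> bool:
        # A record can only be found by a text search if it carries identifying text.
        return bool(self.descriptive.artist_name or self.descriptive.title)
```

In this sketch the contextual notes are optional, which reflects the observation later in the chapter that many collections have no provision for contextual data.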

Managed images have traditionally resided in physical collections like slide libraries. Digital technologies have enabled the creation of a new kind of managed image that resides in virtual repositories, accessible through digital networks. This new modality promises to alter image access and usage in the classroom. In order to understand and develop an analysis of these different forms of distribution, we identify several layers of cost that are organized around the structural relationships of image production and distribution.

  • The practice environment is the broadest and most general category we can use to examine any image delivery model. Operations in this layer can be categorized into physically distinct zones: (1) the production of images and associated text, (2) the processing of image and text data and the creation of managed image volumes, (3) the deployment of these managed images through a distribution system, (4) the implementation of security measures to control access, and (5) training for end users so they can use the managed images. Our examination of the practice environments can similarly be broken into a number of distinct areas that define the economics of the digital distribution of images: structural elements, cost centers, infrastructure, and institutional organization.

  • Structural elements define the various steps that go into creating a managed digital image. For example, the production of an image requires accomplishing at least three distinct tasks: selecting an image, digitizing it, and creating the associated text. Procedurally, the tasks are independent; however, the creation of a managed digital image requires all three components.

  • A cost center constitutes a collection of linked activities that are required to accomplish a particular task. For any specific structural element, there are a number of discrete cost centers. For example, the process of securing the legal rights to digitize an object would be a necessary cost center for an image producer. This cost center would be different from the center relating to the process of identifying images that consumers might want to use, or the center that involves the actual digitization of the object. There are different kinds of relationships between the cost centers within the structural elements of each practice environment. Some are procedural with linear dependencies (e.g. one step requires another), others operate as parallel processes (e.g. they happen simultaneously), and some are discrete either/or operations.

  • The technological infrastructure refers to the connections between practice environments or between clusters of specific activity. For example, the delivery system between the production environment and processing environment could be the Internet or a mail delivery system like the postal service. In defining the social and economic costs of the infrastructure, it is necessary to identify the embedded layers of different kinds of shared and assumed costs. For example, all usable classrooms presumably have power and lights. All high-technology classrooms would have network connections. The kinds of costs associated with these features are generally distributed across a campus, and not limited to a specific department or project.

  • The institutional organization is the defining social structure that shapes practice. This can include participating operational units as well as sponsoring agencies. In both museums and universities, the department that assumed primary responsibility shaped how the MESL Project operated. The functional relationship can be top-down or bottom-up. Slide libraries are similarly influenced by their surroundings. Where they are organizationally located (as part of a department, college, library, or university) clearly impacts their responsibilities and budgets.
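The layered vocabulary above (structural elements made up of cost centers that are related sequentially, in parallel, or as either/or alternatives) can be sketched as a small data model. This is a minimal sketch under stated assumptions: the names are hypothetical, the cost values in any usage are arbitrary placeholder units rather than figures from this study, and the rule that an either/or group is costed at its cheapest alternative is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Relationship(Enum):
    SEQUENTIAL = "one step requires another"
    PARALLEL = "steps happen simultaneously"
    EITHER_OR = "only one alternative is carried out"


@dataclass
class CostCenter:
    name: str
    cost: float  # arbitrary placeholder units, not study data


@dataclass
class StructuralElement:
    name: str
    relationship: Relationship
    cost_centers: List[CostCenter] = field(default_factory=list)

    def total_cost(self) -> float:
        costs = [center.cost for center in self.cost_centers]
        if not costs:
            return 0.0
        if self.relationship is Relationship.EITHER_OR:
            # Assumption: only one alternative is performed, and it is the cheapest.
            return min(costs)
        # Sequential and parallel cost centers are all incurred.
        return sum(costs)
```

Under this model, sequential and parallel groups differ in elapsed time rather than in total cost, which is why both are summed.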

In order to understand the differences between digital and analog image distribution systems, it is important to analyze the structural design and framework of both the MESL and analog delivery systems. This effort helps to define assumptions, decisions, and their implications. This paper outlines the structure of the MESL digital distribution model and follows with a comparative analysis of our analog sample. We conclude with a functional comparison of digital and analog modalities.

Digital Distribution: the MESL Case Study

While the ultimate goal of the Museum Educational Site Licensing Project (MESL) experiment was to establish the site-licensing model of digital image distribution, the immediate goals, as expressed in the public documentation of the project, appeared to be more modest. MESL wanted to "define the terms and conditions under which museum images and information can be distributed over campus networks for educational use" (MESL: Goals and Objectives, 2/22/95:1).

MESL was designed as a prototype demonstration of using "digital imaging and network technologies" to "make cultural heritage information more broadly available." It consisted of three objectives:

  1. To develop, test and evaluate procedures and mechanisms for the collection and dissemination of museum images and information.
  2. Propose a framework for a broadly-based system for the distribution of museum images and information on an on-going basis to the academic community.
  3. Document and communicate experience and discoveries of the project.

    (MESL: Goals and Objectives, 2/22/95:1-2)

These objectives addressed the fundamental questions about the feasibility of designing a digital image delivery system, and of providing long-term access to the system. MESL provides us with a unique opportunity to examine costs and uses of digital images delivered over campus networks. The goal of the economic evaluation is to identify and evaluate the technical infrastructure requirements and the social and economic resources needed to accomplish the MESL project. Understanding the costs associated with various steps in the MESL distribution process is critical, not only for understanding the relative success of MESL, but also for framing the future direction of digital image distribution.

The primary MESL objective was a feasibility study to examine the question of whether the networked distribution of a large number of images from museums to universities was possible.

However, MESL was not simply about the distribution of digital images of museum objects. The goal was to provide a large number of managed images from a variety of different institutions to a number of different universities for use in different classroom situations. The task was to include other kinds of associated information that enhanced the understanding of the image (e.g. contextual metadata such as curatorial descriptions and notes). The technical hurdle in MESL was combining two sets of digital information: the relatively new digital images, and the legacy text data stored in local information systems. Extracting legacy data from systems designed around a local culture and putting it into a shared database requires a degree of standardization that needed to be negotiated between the participating MESL institutions.

As an exercise in feasibility, MESL appears to offer a publisher-buyer model where digital images and accompanying text documentation were created by museums and distributed to universities (Robert Ubell Associates, 1996). This formulation suggested a procedural flow from publisher to university to end user. Figure 2 illustrates the relationships in the basic MESL Distribution Pathway.

Figure 2: MESL Distribution Pathway

However, this model has natural limitations. While universities were the buyers of published data, they were not the users. The universities need to be understood as redistributors of these data to their respective user communities for classroom and research purposes. As such, the universities are a kind of "intermediary" in the exchange between publishers and buyers.

The second objective sought to explore these concerns by defining the character of this museum-university relationship. The digital distribution model in MESL made assumptions about users' interest level in accessing and using these materials. As with many digitization projects, it was assumed that MESL's methodology had enough inherent advantages that end users would use these systems provided there was sufficient, and appropriate, content. However, acquiring content required permissions to use the material from copyright holders. This issue focused attention on the legal and technical concerns of site licensing and the management of terms and conditions. These concerns frame the conditions for use of reproduced images: they give museums permission to reproduce digital images of objects and provide access to these reproductions to end users. This imposes an institutional arrangement on top of the distribution chain, where the terms and conditions of use direct the practice in the distribution pathway (illustrated by Figure 3).

Figure 3: MESL Terms & Conditions Delivery Model

This model also makes a number of assumptions about the interest level in accessing and using these materials. MESL, like many digitization projects, assumed that its methodology had enough inherent advantages that end users would use these systems provided there was sufficient, and appropriate, content. 3 The digital distribution model asserted that if publishers could gain permission to release enough of the appropriate content, users would be motivated to acquire their images digitally. Appropriate licensing frames would permit the buyers to deploy these images in ways that would grant their users permission to retrieve these images (see Figure 4).

Figure 4: MESL Theoretical Delivery Model

This framework operates as a push model, in which information suppliers (publishers) provide access to their repository data in a specified manner. It also constitutes a top-down framework where the supply of information determines possible choices, and where the supplier frames the uses of its information.

As an exercise in feasibility, MESL appears as a publisher-buyer model (Robert Ubell Associates, 1996). This formulation suggested a procedural flow from publisher to university to end user akin to the commercial publisher-buyer model with a one-to-one relationship between the terms and conditions, usage preferences, and delivery system. However, the commercial model has natural limitations.

  • The museums were not traditional publishers. They were providers of representations of the objects held in their collections. Traditionally, third parties (e.g. book publishers, image distributors) were the publishers of museum images. Other than basic fees for using the reproduction of their representations, museums did not receive any benefit from the commercial dissemination of these images. The third parties were always the primary beneficiaries. Under the MESL model, museums became the official publishers of their images.

  • Although the universities were the technical buyers of record for published data, they were not its users. Universities redistribute their acquired data to their respective user communities. As such, the universities collectively behave as a kind of "intermediary" in the exchange between publishers and buyers that simply deploys what it has acquired.

  • Unlike commercial buyers, universities are not interested in exerting direct control over the usage of the supplied information. This presents a challenge for the simple publisher-buyer model and the commercial exchange of information. Rather than controlling each instance of use, universities try to exert control over who is eligible to access their resources, and will take advantage of the underlying control technology in the digital distribution where limits on access can be defined. This kind of control creates a mismatch between the formal terms of usage and actual practice.

  • Finally, because financial gain from each transaction is not their primary motive, universities do not directly benefit from increased usage of their deployed resources. Importantly, universities are in a situation where they want individual end users to benefit from the use of their resources. As such, universities invest in infrastructure development and support efforts that are designed to expedite access of their resources.

These structural limitations suggest a four-part distribution pathway of (1) image producer or MESL museum, (2) image deployment by the MESL universities, (3) security or access control system, and (4) the end users' environment.

Figure 5: MESL Practical Delivery Model

Another feature of the MESL experiment was the participants' willingness to create a situation where standards for the digital distribution of images could emerge, rather than be imposed. However, this willingness to explore the possibility of emerging standards was immediately impacted by the many-to-many relationship. The high number of digital images, combined with the number of sender and receiver institutions, each with different digital imaging skill levels, increased the likelihood of errors in basic file structure. While each recipient could theoretically manage these differences (and errors), there was an early recognition that having every recipient create a minimally usable data set would result in duplicated effort. While several solutions were proposed, the MESL project opted to have a single center conduct basic file maintenance. The center's responsibilities were concentrated on basic quality assurance: file checking of digital images and associated text data sets for visible errors (e.g. in delimiters), standardization of the data sets, and packaging and delivery of these data to the universities. This model recognized that the integration of data elements into managed digital complexes and the aggregation of the multiple data sets required each MESL university to perform additional processing in order to develop its local database. Each university then used the corrected data to implement its own database and delivery systems. MESL's original objectives, together with these modifications, established MESL's actual distribution system (illustrated in Figure 6).
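The kind of file checking described here can be illustrated with a simple delimiter check on a text data set. The sketch below is hypothetical: the pipe delimiter, the three-field record layout, and the function name are assumptions for illustration and do not describe the procedures the central processing site actually used.

```python
from typing import List, Tuple


def find_delimiter_errors(
    lines: List[str], delimiter: str = "|", expected_fields: int = 3
) -> List[Tuple[int, int]]:
    """Return (line_number, field_count) for each record with the wrong number of fields."""
    errors = []
    for number, line in enumerate(lines, start=1):
        # Strip the trailing newline so it is not counted as part of the last field.
        field_count = len(line.rstrip("\n").split(delimiter))
        if field_count != expected_fields:
            errors.append((number, field_count))
    return errors


# Hypothetical records: artist | title | file name
records = [
    "Artist A|Title A|file_a.jpg",
    "Artist B|file_b.jpg",  # a missing delimiter leaves only two fields
]
print(find_delimiter_errors(records))
```

Records flagged in this way would be the ones referred back to the contributing museum for correction.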

Figure 6: MESL Actual Distribution Model

From the standpoint of social and economic costs, the MESL distribution pathway can be further defined by identifying some of the structural elements and their accompanying cost centers. 4

  • Providers create digital images and their associated text. This involves the organizational and technical processes needed to produce digital images and deliver them to a third party. In MESL, the providers were museums. In order to produce digital versions of images, the providers had to give permission to digitize the object. While meeting user needs is one of the main criteria when prioritizing what gets digitized and distributed, other issues, such as what materials are permissible or have already been digitized for other projects, can intrude on the decision-making process. This was especially visible during the first MESL distribution where several museums committed a significant amount of staff time to soliciting universities for requests. During the second distribution, these institutions tended to be more autocratic and selected images for the university (Notman, 1998). Provider cost centers include selection, rights clearance, object capture and digitization, image data conversion, object data selection, text data conversion, and some form of aggregation and shipping.

  • The processing task is to create a database container that houses the range of data elements in a specified order. This intermediate task includes taking delivery of the images from the museum and conducting file-checking procedures. If there are problems with any specific image, this needs to be corrected by working with the individual museum. The final task is to deliver the processed images to the universities. Cost centers include quality control, aggregation, and integration. The relationship between these cost centers tends to be procedural.

  • The distributors constitute a specific kind of intermediary. In MESL, their responsibility was to implement a functional database for local delivery systems. The individual delivery system includes the technological infrastructure required to use digital networks as well as the "back-room" production environment, including the storage space and database tools that make access to the images possible. Additional tasks include providing the basic public interface to the database. The environment is also dependent on getting licensing permission 5 to mount and distribute these images, as well as faculty interest in developing courses using digital images. Cost centers include image and data (both structured, e.g. descriptive and control metadata, and unstructured, e.g. contextual metadata) preparation, database creation, and the implementation of interface functionality (e.g. web interface, search engine, search interfaces, etc.).

  • Security (access control) restricts access to the database and sits between the university delivery system and the end user. The primary purpose is to limit access to a specific target community. Importantly, the security system also needs to be transparent enough to the end user so that it does not become an obstacle to access. Although security can take many different forms, we can loosely group it into three distinct categories: proprietary systems, general access controls, and enhanced authentication. Unlike other practice environments, the security environment tends to be discrete, and follows a standard set of procedures.

  • Usage is determined primarily by individual preferences and priorities. Although the university delivery systems can enable general, all-purpose access to the digital image datasets, the major impetus for general use is from the courses that use these images. Course development requires an available university delivery system, and is limited to the range of images selected for digitization. It consists of four basic elements: outreach to inform end users of the availability of the images, training in how to use the database, support for ongoing use, and actual use of the images.

Analog Distribution

Analog slide libraries provide useful points of comparison to the digital distribution model, since these collections have been the primary method for the distribution of visual cultural heritage information to the educational community. These resources are generally not available as campus-wide facilities; rather, they serve a specific discipline or, more commonly, a sub-specialty area. Most collections are thus closely linked to the peculiarities of a local department and faculty. This is reflected in the specific content, cataloging, and organization of individual collections. Slide libraries' image distribution systems need to be understood within this context.

Slide libraries began after image projection technology was developed in the late-nineteenth century. The invention of the standardized 35mm slide accelerated the development of image repositories (Irvine, 1979). For the most part, the development of these repositories was aimed at capturing visual information about cultural heritage that would be difficult to understand (or even imagine) through textual description alone. Therefore, many collections were started by art and art history faculties so students could see the cultural heritage objects without having to visit the museum where they were displayed. Similar collections were started in architectural programs so their students could see examples of building design. These programs tend to be highly specialized and relatively small, with severely constrained departmental budgets. To economize, the departments needed to share their visual resources across their faculty. Visual resource libraries were designed to facilitate this type of image sharing. Their funding was usually derived from departmental or college resources; their development was driven by the particular interests of resident faculty. The result was that many collections were highly idiosyncratic, with collections being organized, cataloged, and stored according to local interests. Over time, these repositories acquired a large number of images (the average library in our sample has existed for some 50 years and houses over 280,000 images).

Significantly, the physical acquisition rate in any given year at any facility is a relatively low percentage of the entire collection (approximately 2.5%, or about 7,000 images from a variety of sources). Collection development efforts do not target an infinite universe of images; rather, they respond to the needs of the user base. Collection development plans not only respond to individual requests but are also designed to anticipate trends in faculty areas of study. Copyright issues are of critical concern for slide librarians. Collection developers give priority to acquiring the highest-quality images, which usually means commercially available slides. However, when there is an urgent need for a particular image, slide libraries will often make a copy of the slide rather than initiate a new purchase. It is important to note that the acquisition process carries a hidden paperwork burden (e.g. purchasing approvals, facilities receiving and processing, etc.) that encourages the use of copy photography for small numbers of acquisitions.
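The acquisition figure follows directly from the average collection size reported earlier (over 280,000 images). A short Python check of the arithmetic, using only the two numbers given in the text (the function name is an illustrative assumption):

```python
def annual_acquisitions(collection_size: int, rate_percent: float) -> int:
    """Approximate number of images acquired per year at the given rate."""
    return round(collection_size * rate_percent / 100)


# Average collection size and approximate annual rate as reported in the text.
print(annual_acquisitions(280_000, 2.5))
```

At this rate a collection would take about 40 years to turn over completely, which is consistent with the long lifespans reported for the libraries in the sample.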

All slide collections have a physical security system that limits access to the collection. These access restrictions are generally framed by different classes of users (e.g. resident faculty, visitors, graduate students, undergraduates, etc.). Because most of the slide libraries are departmental rather than university-wide facilities, most primary faculty members have keys to the facility. Many sites restrict general undergraduate student access to the collection. In these cases, undergraduate students will have to resort to finding books in the library that contain the image, or, library policy permitting, view a limited subset of images that are posted as prints or in 35mm slide holders taped behind a window. These inconveniences have historically been framed by concerns over fair use, and the general fragility of the media.
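Access rules framed by user class can be expressed as a simple lookup. The sketch below is a hypothetical policy for a single library: the user classes follow the examples in the text, but the choice of which classes are admitted is an assumption for illustration, since actual policies varied from site to site.

```python
from enum import Enum
from typing import FrozenSet


class UserClass(Enum):
    FACULTY = "resident faculty"
    VISITOR = "visitor"
    GRADUATE = "graduate student"
    UNDERGRADUATE = "undergraduate"


# Hypothetical policy: this library admits only faculty and graduate students.
DEFAULT_POLICY: FrozenSet[UserClass] = frozenset({UserClass.FACULTY, UserClass.GRADUATE})


def may_enter_collection(
    user_class: UserClass, policy: FrozenSet[UserClass] = DEFAULT_POLICY
) -> bool:
    """Return True if the given class of user is admitted to the collection."""
    return user_class in policy
```

A site with a more open policy would simply pass a larger set of user classes as the policy argument.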

A final area is the usage environment. In order to facilitate the use of slides within the facility, most slide libraries provide a basic system where end users can select, sort, and project their images (slide tables, projectors, carousels, etc.). Library staff members typically handle slide re-filing, and every library has slide check-out and check-in policies and procedures.

The structural organization of slide libraries makes it easy to compare the analog and digital distribution systems. Structurally, individual slide libraries act both as image publishers (the role of the museums in MESL) and as image distributors (the role of the universities). Every slide library has some type of acquisition procedure where analog images are either produced or purchased. There is also a processing phase where the finished images and their data are checked and entered into a physical record and digital database. The slides are then loaded into a storage space with other analog images. The deployment stage requires a number of additional resources including check-in, check-out, and re-filing systems and procedures. There is usually a security environment that restricts access by end users, usually according to user status (e.g. faculty, graduate student, undergraduate). There is also a user training environment where individuals can access the images. This can mean physically removing the images from the site, looking at individual images on a light table, or simply holding up slides to a light or window. Figure 7 outlines the analog distribution model.

Figure 7: Analog Distribution Model

While the analog distribution pathway operates in the same direction as the digital distribution model, there are critical differences. In the analog world, user preferences determine what is in the collection. In turn, these interests frame the terms and conditions of use where classroom needs underlie and presume a claim of fair use. Formal access policies further restrict physical access to a well-defined user pool.

Like its digital counterpart, the analog distribution pathway can be further broken into a number of stages:

  • Acquisition: This environment centers around the tasks required to acquire images. The first task is to select the images. Many libraries have specific collection development efforts for defined faculty areas of research. The second task is to physically acquire the image. This is done through purchasing, copy photography, or gifts. Each method has a specific and distinct set of procedural steps. A third task is to acquire the associated text data: the descriptive metadata, control metadata, and, if needed, contextual metadata (although most slide libraries do not have provisions for contextual data). Depending on the source, additional research may be needed to complete all the appropriate data fields.

  • Processing: In this environment, the images and their data are compiled into usable managed objects. The initial tasks are error checking and quality control (this can include putting the slides into heavy-duty mounts). Associated metadata are created and placed in the appropriate cataloging system (this includes attaching data to the slide itself).

  • Deployment: This environment involves the physical deployment of the images into the shared storage system. Additional tasks include providing access to the image (e.g. circulation, slide tables, staff assistants, etc.) and implementing a system to check out individual slides. A final task is the physical return of slides (including check-in, re-filing, and maintenance).

  • Security: This environment establishes procedures for accessing the collection. Most slide libraries have distinct rules and access procedures for different classes of users (faculty, students, etc.). They also have specific regulations about room access and use (which are different for each class of user). For the most part, the specific check-in and check-out procedures are self-regulated by individual users.

  • User training: This environment includes the basic efforts to support end users. On the library side there is outreach and education (usually formal programs) as well as standard development and immediate user support (usually informal activities).

Functional Model

Within our framework, the digital MESL and analog distribution modalities appear to be structurally similar. The basic physical delivery path includes the fundamental elements of image and associated text creation, image and associated text processing, managed image distribution, access control, and user training. Under MESL, the distribution elements were located in separate institutions: museums published images and text, the University of Michigan operated the central processing facility, and universities delivered the images and text which end users accessed for particular purposes. Within each university, different units and departments provided different services; for example, computer centers or special library departments mounted and deployed the images, and courses in specific departments used these images. In contrast, the slide libraries housed most of these operations in a single unit. This basic schema allows for a comparative examination of the practice environments along the two image delivery pathways.

Figure 8: Comparative Distribution Environments

The schema outlined in Figure 8 highlights some critical points.

First, there is a procedural distinction between image production and image acquisition. Image production focuses on a one-time construction of a master image. Copies of the master image are then redistributed. Acquisition is purchasing or making a copy of the master image. Museums in MESL and in future digital distribution modalities would be producers of master images. Image consortia and universities would acquire copies of these images. Analog slide libraries often do both: they acquire copies of slides from third party vendors and they produce specific images for their users.

Second, the end products of both production and acquisition are only the basic data elements, not the managed image. The creation of a functional managed image occurs towards the end of the physical delivery process, at the point of database creation. In the case of digital technologies, managed images can be created at the time of delivery. Importantly, many MESL sites produced static web pages using managed images, the digital equivalents of analog managed slides.

Database creation is a new practice environment. Building a database—whether in the form of a catalog or as an image database—clearly should be a distinct practice environment from actual deployment. This activity was masked in MESL reporting where the production of university deployment systems focused on interface development rather than the accompanying database development effort.

In slide libraries, general cataloging was clearly distinguished from actual physical deployment. While all slide libraries reported that part of their database cataloging process involved putting the slide in a circulation drawer, it was usually not clear what kinds of other access pathways were provided. For example, some sites had a comprehensive local, online database of their entire collection; others provided a local online database to only part of their collection but had a comprehensive card catalog. Some sites implied that the user needed to go to the shelf to find out what was really housed in the collection. Regardless of how the slide libraries chose to provide access to their collections, all of them were functional, operational entities. The variety of different access pathways is instructive for future distribution models. Interface is a matter of local culture. What is required is a standard distribution database for image collections, where the images and data are stored in standard and interchangeable formats, but can be accessed through different interfaces depending on the needs of the institutions and the users.

Another issue in illustrating these practice environments is their different cost matrices. One-time expenditures and ongoing costs can frame the economics of the two distribution systems. One-time costs essentially cover the cost of developing the core database. These one-time expenses need to include the costs of acquiring images (such as production, acquisition, and processing). However, acquisition costs tend to be dynamic. For example, in a site-licensing model, acquisition spending remains more or less constant over time. In contrast, direct purchasing means that the collection of images is paid for once. Each has different attendant costs and benefits. The actual database is the critical "back end" to a functional managed image repository. The point at which one-time costs end depends on the circumstances under which the images were created. Analog distribution costs stop at the point where the managed image is loaded into the physical deployment environment.6 In the digital world, because managed images can be assembled on demand, one-time costs end after the deployment interface is built. Ongoing costs include security implementation, user training, and support. For the most part, basic security is relatively inexpensive. Providing a more sophisticated system for user authentication naturally requires greater expense.
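
The contrast between site licensing and direct purchase can be made concrete with a small calculation. The following is a minimal sketch in Python, not a model used in the study: the function names and all dollar figures are hypothetical, chosen only to show how a one-time cost plus a recurring cost accumulates under each approach.

```python
def cumulative_costs(one_time, annual, years):
    """Return the running total cost at the end of each year."""
    totals = []
    running = one_time
    for _ in range(years):
        running += annual
        totals.append(running)
    return totals


def first_year_license_exceeds(purchase_totals, license_totals):
    """Return the first year (1-based) in which cumulative licensing cost
    exceeds cumulative purchase cost, or None if it never does."""
    pairs = zip(purchase_totals, license_totals)
    for year, (purchase, licensed) in enumerate(pairs, start=1):
        if licensed > purchase:
            return year
    return None


# Hypothetical figures for illustration only (not MESL data).
# Direct purchase: large one-time cost, small ongoing maintenance.
purchase = cumulative_costs(one_time=50_000, annual=2_000, years=10)
# Site license: small setup cost, roughly constant annual fee.
site_license = cumulative_costs(one_time=5_000, annual=10_000, years=10)

crossover = first_year_license_exceeds(purchase, site_license)
```

With these assumed figures the licensing total overtakes the purchase total in year 6. The crossover point depends entirely on the inputs, which is why each approach has to be judged against local circumstances.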

A further significant difference between the analog and digital worlds is the character of the one-time costs. In the analog world, where the file format is standard, moving to an updated physical environment (e.g. a new room) does not add any cost (apart from the cost of remodeling the room!). In the digital world, moving to an updated environment often results in significant expense, because the data must be recompiled to meet the new digital standards. In the real, empirical world, the physical life of computer systems means that there will be routine, almost annual, changes, whereas some slide libraries have been in the same space for decades. In both environments there are periodic costs to enhance the deployment system. This can include adding more storage or modifying the interface (e.g. adding functionality to the interface or a new light table). Both environments would also have routine maintenance costs associated with them. 7

A final issue is the problem of image selection, a process occurring in both digital and analog systems. However, in digital collections, the costs are potentially enormous because they involve acquiring rights for both reproduction and distribution. In the analog world, selection costs appear to be minimal. But selection criteria will direct the amount of time and energy spent identifying the images to be acquired. For those images that are not commonly available, there will also be an anticipated periodic charge for producing local images.

A critical assessment of costs associated within this path needs to take into account the formal terms and conditions of use. Formal terms and conditions involve publisher (museum) permissions to digitize images and to allow access to these images at universities. Sometimes these elements include direct costs for buying digital images; at other times, they are simply factors in the technical implementation process. For example, a library may not have to spend money to acquire the digital images themselves, but will have to spend more on security in order to limit access and protect copyright. As noted previously, this framing is most significant for publishers (museums) or their consortia that offer a large number of images for an undefined user base, a situation commonly found in digital distribution models.

A final frame is user preference and perceived relevance. Related to this frame is the issue of what constitutes a comprehensive collection. Third-party distributors and consortia emphasize the fact that they give (or will give) access to "comprehensive collections" of images. However, what constitutes "comprehensive" is ultimately defined by what the user sees as being relevant. For example, a comprehensive collection for a single user could consist of the 500 images they use in the classroom, not the millions of images that are universally available. If one image is not part of the millions, then that body is, by definition, not comprehensive. The task of image providers is to meet the needs of the actual users, and not simply to acquire everything they can get their hands on.

This discussion outlines the required practice environments for an effective, functional image distribution model. This frame is illustrated in Figure 9.

Figure 9: Functional Distribution Environment

  • First there is an acquisition environment where raw images and their associated text are acquired. This can come through direct purchase (or site licensing) of images or through some other kinds of process where the relevant image and image formats are created. The production of digital images for distribution and the acquisition of images by libraries are not direct equivalents. Digital production involves the creation of a master image (even if that master is derived from an existing analog image). The creation of master representations requires using something that is close to the original, and it presumes that the rights to offer such a reproduction have already been granted. On the other hand, the acquisition process involves the work required to get a copy of the master. This environment involves the organizational and technical processes needed to produce images and deliver them to a processing facility.

  • The processing environment essentially constitutes error checking. Both analog and digital distribution require this step. Error checking corrects problems with any specific image and its data before any further work is done. Although the details are task-specific, this is a basic and fundamental environment in any model.

  • The database environment is the development of the structure for accessing the images. This is the most critical task for the sharing of the managed images. It requires the appropriate mapping of the different kinds of data so they can be found. In the analog world, this means attaching metadata to the slide and creating appropriate records (either electronic or card). In the virtual world, this means creating an electronic database. In slide libraries, the images and text data are compiled manually. In principle, the digital world allows for the managed image to be assembled virtually, at the point of access. 8

  • The deployment environment involves the mechanisms through which the images are delivered to end users. It includes the necessary infrastructure required to use digital networks or to provide access to analog collections, as well as the "backroom" production environment such as storage space and database tools that makes access to the images possible. Additional features include the basic public interface to the database. The environment is also dependent on having permission to mount and distribute these images as well as faculty or research interest for developing courses and using digital images.

  • Security (access control) sits between the image database and the end user. The primary purpose is to limit access to the collection to a specific community of users. As is the case elsewhere, the security system needs to be transparent enough to the end user so it does not become a barrier to access.

  • The usage environment is determined primarily by individual preferences and priorities. Within the university this environment includes user training and support. For the most part, use of 35mm slides requires very little training. However, under MESL, the digital interfaces required significant training; most lacked the features to make them useful for classroom use; and network connections and classroom displays became critical obstacles to the adoption of digital databases. More significant than any difficulties with the underlying technology were the lack of a comprehensive collection and the lack of guaranteed future access. The lack of a comprehensive collection could be managed through local digital production. However, the concern over long-term access to the digital collection cannot be so easily resolved, because many parties are involved. In physical collections, while new acquisitions can always be curtailed, all instructors can rely on the fact that the images they already use will always be available. In digital collections, however, the images are made available by outside parties which could opt to cut off access at any time. The rapidly changing nature of the technologies involved also raises concerns that, even if the images are still available, they might not be readable. Future distributors of digital images need to address the continuity of access to digital collections, especially given the realities of fluctuating budgets. In physical collections, all instructors can rely on the fact that the images they already use can be reused in future courses. Course development requires an available and stable delivery system. It is contingent not only on the range of image selection but also on the knowledge that these images will be available whenever the course is taught.
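
The idea that a digital managed image can be assembled at the point of access, behind an access-control check, can be sketched briefly. The Python below is an illustration only, not a description of any MESL implementation: the record layout, identifiers, file paths, and the `assemble_managed_image` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManagedImage:
    """An image joined with its descriptive text for one request."""
    image_id: str
    file_path: str
    caption: str


# Hypothetical stores: image files and descriptive text are kept separately,
# as they would be after the acquisition and processing environments.
IMAGE_FILES = {"obj-001": "images/obj-001.jpg"}
TEXT_RECORDS = {
    "obj-001": {"title": "Untitled landscape", "holder": "Example Museum"},
}
# Hypothetical access list standing in for the security environment.
AUTHORIZED_USERS = {"student@example.edu"}


def assemble_managed_image(image_id, user):
    """Join image and text at request time, after an access check."""
    if user not in AUTHORIZED_USERS:
        raise PermissionError(f"{user} is not in the licensed community")
    record = TEXT_RECORDS[image_id]
    caption = f"{record['title']} ({record['holder']})"
    return ManagedImage(image_id, IMAGE_FILES[image_id], caption)


managed = assemble_managed_image("obj-001", "student@example.edu")
```

In this sketch nothing is compiled in advance. A static web page, by contrast, would correspond to running the join once and storing the result, much as a labeled slide is filed in a drawer.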

Summary

These models permit a pragmatic specification of the cost parameters, one that begins to address costs associated with the digital and analog distribution modalities. The most basic and fundamental distinction is the driving organizational force. The top-down distribution model of digital image consortia emphasizes the need for comprehensive image database collections. Because these consortia target a general and unspecified population of users, what constitutes "comprehensive" is abstractly defined to include all available images from their accessible repositories. This mass collection dissemination primarily requires addressing the critical concerns of copyright and the terms and conditions of use. The bottom-up distribution modality of analog images focuses on user needs. A comprehensive collection is defined by what is needed at the local level. It therefore does not have to have images of everything in the world, simply the ones required by local users. Copyright and terms and conditions are acknowledged, but they are secondary to local needs and uses.

The realities of analog image collections directly challenge the "comprehensive" ideals of the consortia model. In our study, the evidence suggests that image collections consist of both purchased acquisitions and locally produced images. These locally produced images are not all copies of famous master images; many are of local interest, either specific to the region or area or relevant to the instructional interests of individual faculty. These collections of locally produced images offer a natural supplement to any single collection of acquired images (regardless of source). Any future design of a digital image deployment interface needs to be able to take these local interests into account.

An issue unique to MESL was that its database included the addition of supplemental text data (contextual metadata). Although only a few MESL university sites provided access to these data, there was at least one instance where individual faculty research provided invaluable additions to the existing metadata. In a wider digital distribution environment, there needs to be a way that end users can supplement existing metadata.

A final point is the changing character of ongoing costs. Managing images requires constant updating of the deployment environment. For example, re-filing costs will be eliminated, but they will be replaced by the costs of maintaining equipment and networks. Furthermore, despite the automation of many tasks, personnel costs will not disappear. In fact, the evidence suggests that relatively low-paid, unskilled work-study students will be replaced by fairly expensive professional staff. Nor does going electronic mean that there is no need for real-world space. The need for rooms to house collections will be replaced by the need for digital closets with lots of specialized, expensive equipment. This cost-shifting matrix needs to be explored.

Methods

This study had little on which to base its substantive data collection and evaluation (MESL was an experimental project and a study of the economics of slide libraries had never been formally conducted). Given the uniqueness of the investigation, we chose to collect data from multiple sources, gathering quantitative economic data and qualitative evaluations. For both MESL and slide library investigations, we used technical evaluations that collected quantitative estimates of what it took to complete certain tasks, qualitative site visits, and in-depth interviews. Where possible, we also examined additional reports such as user/re-filing logs. Additional data for MESL were collected as part of MESL's internal evaluation (e.g. user surveys).

The technical challenge of the economic evaluation is to reconstruct the full range of activity, both formal and informal, needed to accomplish MESL and to distribute analog images. However, from the beginning, this economic evaluation was confronted by practical problems in data collection. First, although the managed images were similar, their production and distribution environments at each institution were different. This meant that specific commonalities and cost centers along the production and distribution chain (e.g. from the museums' creation of the images and accompanying text, to their distribution to the universities, to their use by individuals) needed to be derived from diverse project implementations. This heterogeneity made standardized data-gathering problematic. Second, MESL was understood to be an experiment in the electronic distribution of digital images and data. MESL participants were on a steep learning curve, caught in a web of solving specific technical issues in order to accomplish the overall goals of the distribution. Thus, although individual sites were asked to keep accurate logs of what was happening during the course of MESL implementation, most did not. Much of the MESL experience therefore has to be recreated from memory after the fact. A parallel situation exists for analog slide libraries. Many of their routine practices are not regularly documented; in some instances they provided estimates or best guesses. This project's technical data was gathered in collaboration with MESL's evaluation team, which designed the data collection instruments. The reports reflected the institutional heterogeneity and experimental mindset of the MESL participants (and thus reflected different units of measurement and inconsistency of operations from one site to another).

Despite these limitations, the MESL-Mellon team identified data collection points where experiences could be compared. Figure 10 outlines the data collection relationship between the economic evaluation, MESL, and primary data sources.

Figure 10: Data Sources

For the economic evaluation, the primary data collection device was the technical report. It was jointly developed by MESL management and the Mellon Project and requested from each participating institution as part of their project reporting. Supporting data came from the MESL project archives, site visits, focus group interviews, surveys of MESL participants and end users, and web server log files.

  1. Technical reports. The reports review the implementation requirements of MESL and include documentation of the associated economic costs. Parallel reports were developed for museums and universities. Each has three sections: an institutional profile, technical implementation, and reflections on experience. The profile documents the resources of the institution, general procedures, and staffing dedicated to MESL. The final section asks open-ended questions that allowed project team members to expand on their experience.

    The heart of the report is the technical section that documents procedural steps. For museums, this includes information about collection management systems, content selection, and image and text processing. For universities, this includes system architecture, data preparation, functionality and support.

    Each question in the MESL technical report can be linked to a specific production environment and mapped to a logical procedural moment in the digital imaging distribution path. For example, Section 2.1.4 in the Museums report asks about the digital imaging process. This section includes questions about past experience and prior resources, as well as MESL experiences and resource commitment. On the other end of the distribution chain, Section 2.1.3 asks the universities about the data preparation and loading requirements to load MESL data onto their systems. Questions under this section cover resource needs and expenditures for different kinds of data preparation (e.g. images and both structured and unstructured text). The cost reports include both exact figures and anecdotal assessments. The individual steps in the production and distribution processes can be collected into functional "cost centers" in the workflow.

  2. Cost report from the central processing facility. The University of Michigan's central processing facility submitted its own cost report on its activities in text and image aggregation and correction. The report evaluated the types of problems the facility encountered, and included cost estimates. This report provides insight into the technical hurdles confronted by the museum image production environment. It also provides the core data for understanding the processing environment.

  3. MESL archival material. This section includes project announcements, a request for applications, project proposals, MESL electronic list messages, and published reports.

  4. Focus group interviews. Group interviews are useful for eliciting information on individual experiences that otherwise might go unreported. The focus groups were held during the final MESL project meeting.

  5. End user surveys. Objective measures for the end-user environment come from a cooperative effort of several teams working with different instruments that target distinct features of MESL. The University of Illinois at Urbana-Champaign was responsible for evaluating the classroom experience at the MESL academic sites. Cornell University is currently conducting an evaluation of end-user response at the MESL web sites and a survey of MESL participants' experiences.

  6. Site visits. The Mellon team members have conducted six formal and informal site visits to MESL institutions to evaluate the impact of the project. These visits were informative because they allowed participants to discuss and demonstrate their own MESL experiences. Specific meetings with faculty during these visits to universities have provided important clues about the creation of courses and impediments to use of digital images.

  7. Server logs. A few of the MESL university sites that use Internet web servers have made their log files for a single academic semester (Spring 1997) available for further investigation. Log files provide useful information, such as when the site was visited, where users came from, where they entered, and where they left. Data mining of these files may provide some insight into patterns of use of different functions.
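
The kind of information that server logs can yield (item 7 above) can be illustrated with a short script. This is a minimal sketch, not the analysis used in the study. It assumes the log files follow the widely used Common Log Format, which the text does not confirm. The sample lines are invented, and treating each host as a single visitor is a simplification that ignores shared machines and repeat visits.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout (Common Log Format):
# host ident user [timestamp] "METHOD path protocol" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)


def parse_line(line):
    """Return (host, timestamp, path) or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("host"), when, match.group("path")


def summarize(lines):
    """Count requests per hour, plus first and last page seen per host."""
    hits_by_hour = Counter()
    first_page = {}
    last_page = {}
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip malformed lines rather than guessing
        host, when, path = parsed
        hits_by_hour[when.hour] += 1
        first_page.setdefault(host, path)  # entry page: first request seen
        last_page[host] = path             # exit page: last request seen
    return hits_by_hour, Counter(first_page.values()), Counter(last_page.values())


# Invented sample lines for illustration only.
SAMPLE = [
    '10.0.0.1 - - [03/Feb/1997:09:15:00 -0800] "GET /index.html HTTP/1.0" 200 1024',
    '10.0.0.1 - - [03/Feb/1997:09:17:30 -0800] "GET /search.html HTTP/1.0" 200 2048',
    '10.0.0.2 - - [03/Feb/1997:14:02:10 -0800] "GET /index.html HTTP/1.0" 200 1024',
]

hours, entries, exits = summarize(SAMPLE)
```

Here `hours` counts requests by hour of day, `entries` counts the first page requested by each host, and `exits` counts the last. Patterns of use for specific functions would require mapping paths to interface features, which depends on how each site was built.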

The evaluation recognizes the variety of data type and quality. Interpretation requires linking "hard" or "objective" data to softer, experiential, or qualitative information. The sections in the technical report can thus be mapped to supporting documentation. Site visits and focus-group interviews help to interpret the technical reports. Data in the background sections of technical reports are derived from the applications and proposals submitted by each applicant. Supporting material for the technical implementation comes from meeting notes and electronic mailing lists, as well as data from the central processing report. The reflection section of the technical reports receives support from the various participant surveys. These relationships are illustrated by Figure 11.

Figure 11: Data Sources and Object Relationships

The formal design of MESL deliberately brought together different institutions with distinct strengths and interests. Although general cost center analyses and cost trajectories are instructive, the disparities between institutions are as important as their similarities. The heterogeneity of institutions means that any individual site can be viewed as an archetypal example of a specific type rather than a member of a collective. The analysis thus has to be both an intra- and an inter-institutional exercise situated within context and examined in terms of local social organization and local culture. This requires a two-step examination of costs that not only compares the data quality across institutions (correspondence) but also its link to accounts in individual experience (coherence).

In order to gather comparative data on the analog side, we used questionnaires that were structured similarly to the MESL reports. The questionnaires requested data on the known practice environments and cost centers in slide libraries. To a large extent, the kinds of questions asked paralleled the MESL-Mellon evaluation. We also included site visits to each institution, detailed discussions with individual slide librarians, and additional reports (e.g. re-filing logs and annual reports).

Both the MESL's technical reports and the Slide Library evaluations requested information on procedural steps and personnel requirements. These reports form the key data for determining hard economic costs. They were developed on the basis of what was known to be occurring in each institution. A review of the data in these reports demonstrates a number of problems commonly associated with any first-time analysis. The most significant problem was the presumed level of specificity. There were wide discrepancies in the number of personnel (or person hours) required for accomplishing a given task. It could be that the specific differences are an artifact of institutional culture (it has always been done that way), data quality, or projection of the number of bodies in a given unit. Assessment of the numbers therefore needs to begin with an internal examination of the institution, before making comparisons across sites.

Finally, the researchers did not ask specific questions about infrastructure requirements. In many instances it can be assumed that there is some minimal set of technical elements that needs to be in place before a given task can occur. The different levels of existing capabilities can lead to results of varying quality. Nonetheless, their physical impact on the overall cost picture needs to be documented. Despite these limitations, the kinds of data collected by the project provide significant help in understanding the socioeconomic parameters of digital and analog image distribution.

 


 

Notes

1. There have been a number of articles about MESL published by the project team (see Albrecht, 1995; Besser and Stephenson 1996; Giral and Dixon, 1996; Lebowitz 1996; Trant, 1994-1995; Trant 1996a; Trant 1996b; Trant 1997).

2. The basic concept of the managed image can be ported to a discussion of other kinds of collections (e.g. books, journals, articles, multimedia, etc.). These "managed volumes" are what distinguishes a library that is designed to share its volumes among a variety of unrelated users (e.g. the university library) from a collection of texts shelved in a faculty member's office.

3. This clearly appears to be the view of some of the MESL planners (see Bearman, 1997). In contrast, MESL participants knew they had a hard sell before they could get faculty to use MESL images. During the first years, University participants committed significant resources in the effort to attract faculty to MESL images.

4. An economic evaluation requires identifying and measuring the resource commitment to accomplish a given task: calculating the costs of machines, mechanical processes, skilled personnel, and the necessary associated infrastructure, and identifying the accompanying form of social organization. The UCB Mellon Study, as the first effort to identify the cost centers for image distribution, only collected and analyzed data on the centers that were clearly visible at the beginning of the study. As the team analyzed these data, it became clear that there were other cost centers that needed to be identified and examined in future investigations.

5. It is critical to distinguish between rights clearance and licensing permission. Rights are determined between the museum and the owner or donor of the particular museum object. Licensing is the agreement between the museum and the university (or end user) that dictates how the reproductions can be used (see Levering and Levine, 1998; Levine, 1998).

6. In principle, the implementation of pre-computed managed digital images would also end at their insertion into the delivery system. The technical problem involves not the managed images themselves so much as how to access them. In the physical world, one just needs to "go there" and look. In the virtual world, browsing is possible, but cumbersome. In both instances, you will need to "learn" the filing system and naming conventions.

7. While it might appear that moving to a new room (or redecorating/moving items) and improving the functionality of a computer system are significantly different, the realities of the change can be mapped onto each other. In the analog world, the move (any move) is usually accompanied by the claim of improving access and usability, even if the changes are only cosmetic. A similar claim is made in the upgrading of a digital delivery system. The empirical reality is in improving access and functional use (a subjective assessment, at best).

8. Interestingly, the majority of MESL implementations used formatted, static query result pages (which essentially are manually compiled sets of digital information) rather than dynamically generating the pages. These manually compiled pages are the digital equivalents of analog slides.

 


The Cost of Digital Image Distribution:
The Social and Economic Implications of
the Production, Distribution, and Usage of Image Data

By Howard Besser & Robert Yamashita
http://sunsite.berkeley.edu/Imaging/Databases/1998mellon