Howard Besser & Robert Yamashita
A number of communities are interested in the viability of digital libraries. The study summarized in this article is an important step towards understanding the issues likely to affect the economic viability of one specific type of digital library: a collection of photograph-like digital images of cultural heritage objects and their accompanying descriptive metadata, as distributed to university communities.
With generous support from the Andrew W. Mellon Foundation, researchers from UC Berkeley undertook a 22-month study (September 1996 through June 1998) to examine "The Costs of Digital Image Distribution for Educational Purposes" by looking at an experimental project, the Museum Educational Site Licensing Project (MESL). The focus of this study was to identify, define, and explore the primary cost centers in the digital network distribution of images and text through the MESL Project.
The MESL Project was the first attempt to take a collection of images and accompanying metadata from a variety of museums and deliver these in digital form to university users over campus networks. It was a two-year experimental collaboration among seven museums and seven universities that distributed over 9,000 digital images and associated text for classroom use.
The UC Berkeley Study initially planned to focus all of its attention on the MESL project. As the study progressed, however, it became clear that a study of MESL alone was inadequate for understanding issues of digital image distribution and long-range viability. Because there had been no prior studies of costs of departmental slide libraries, we embarked on a set of slide library sub-studies that could provide us with comparative "baseline" costs in the analog world. 1 And because any digital image distribution scheme would fail unless embraced by university instructors, we also created a study designed to explore the factors influencing faculty willingness to "buy in" to digital distribution and use digital images in teaching.
The resulting study compares costs of the MESL distribution method to previous analog methods of image distribution via 35mm slide library collections. The study's July 1998 Final Report (available at http://sunsite.berkeley.edu/Imaging/Databases/1998mellon) discusses the advantages and disadvantages of digital image distribution, and identifies impediments to user acceptance of digital image distribution.
From our study we have learned that, in its technical and operational details, the MESL distribution was a very special case that will not be repeated. However, the goal of museum/university digital image distribution is important to most of the parties involved, and we believe that there will be future attempts at digital image distribution that will follow other models. Our study has uncovered many features of analog slide libraries and how they are run. Some of this information will be important to architects of future distribution systems. We have also uncovered ways in which analog slide libraries are quite different from digital image distribution systems, and we expect that the two will exist side-by-side for many years to come.
Comparisons between emerging digital image distribution systems and any existing entity are limited at best. Though the content of these systems resembles that of analog slide libraries, their funding schemes, organizational structures, and settings will not. Digital image distribution schemes serving universities will require different types of institutional roles and responsibilities. As Clifford Lynch has said, their success will depend upon a complicated set of issues involving institutional readiness, commitment of campus-wide acquisition budgets, centralized support of infrastructure, and a host of other issues common to the introduction of digital collections (Lynch, 1997:5-13). This makes predictions about their future difficult.
In this summary of the Final Report, we discuss the advantages and limitations of extrapolating from a study of MESL, general findings from the MESL Project as a whole, important issues emerging from our focus group interviews with faculty, discoveries from the study of analog slide libraries, and important findings from the cost center analysis. We also make a number of general observations about what we regard as important issues in understanding the visual resources environment (several of which are not directly addressed within the body of the report).
Advantages and Limitations of Extrapolating from a Study of MESL
The promise of digital distribution is increased accessibility and the potential for enabling new uses. MESL was designed as an experiment to test whether increased access to images through digital distribution was feasible. The initial MESL proposal sought to use "digital imaging and network technologies" to "make cultural heritage information more broadly available." It cited two basic objectives: (1) to develop, test, and evaluate procedures and mechanisms for the collection and dissemination of museum images and information, and (2) to propose a framework for a broadly based system for the distribution of museum images and information on an ongoing basis to the academic community.
Although the goal of this study was not specifically to assess whether MESL met its stated goals and objectives, it is very clear that as a feasibility study MESL was a success: it demonstrated that a large number of cultural heritage images and accompanying metadata could be distributed over campus digital networks. On the other hand, the success of MESL as a demonstration of an ongoing, accessible, and multi-institutional database of digital art images is unclear. The MESL Project did not shed light on how institutions will organize the acquisition and management of image collections on an ongoing basis, and it highlighted serious consistency and accessibility issues. These are part of a fundamental set of differences between creating a true integrated digital library and simply merging records from different repositories. 2
The MESL project appeared to provide an excellent opportunity for a study, because it was the first attempt at large-scale distribution of digital images to the educational community. Studying seven different university distribution implementations, each involving images and text from seven different collections, seemed to offer a chance to compare different approaches to solving a similar problem.
But several of the factors that made MESL a rich environment to study also made it extremely difficult to obtain consistent and useable data; some of the heterogeneous environments were so different that it was not even possible to find common units of measurement to make comparisons. Our study, therefore, chose to focus on cost centers that were likely to be present as steps in the distribution chain (from the selection of the original object to the point where its digital representation reaches the end user) in most potential distribution models. For each cost center at each step, our study estimated costs, primarily in terms of hours committed to accomplishing the tasks. (Because dollar costs from MESL's heterogeneous institutions are difficult to identify with any degree of certainty, and are likely to fluctuate radically among institutions and over time, we avoided citing them.) And because MESL was a short-term project, its costs were not reflective of the "steady-state" costs which would accompany an ongoing long-term project.
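The hour-based accounting described above can be pictured as a simple tally. The following is only a minimal sketch: the cost center names are taken from those the study identifies for image providers, but the institution labels, the hour figures, and the function name are hypothetical placeholders, not data from the Final Report.

```python
from collections import defaultdict

def total_hours_by_cost_center(entries):
    """Sum reported hours per cost center across all institutions.

    Each entry is (institution, cost_center, hours). Hours are used
    instead of dollars, mirroring the study's choice of unit.
    """
    totals = defaultdict(float)
    for _institution, cost_center, hours in entries:
        totals[cost_center] += hours
    return dict(totals)

# Hypothetical figures for illustration only.
entries = [
    ("museum_a", "content selection", 12.0),
    ("museum_a", "image preparation", 40.5),
    ("museum_b", "content selection", 8.0),
    ("museum_b", "image preparation", 55.0),
    ("museum_b", "text data preparation", 20.0),
]

totals = total_hours_by_cost_center(entries)
for cost_center, hours in sorted(totals.items()):
    print(f"{cost_center}: {hours} hours")
```

Keying the tally on the cost center rather than on the institution reflects the point made above: a given cost center may sit with a museum, a central facility, or a university, depending on the distribution model.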
A key problem with trying to extrapolate from the MESL project is that, of necessity, MESL planners chose only one of many possible distribution/delivery models. In this case, each museum prepared images and text according to rough specifications. This data was then sent to a central site in Michigan and checked for delimiters there. Next, the universities engaged in similar procedures of merging the data from the seven museums, importing the text into their own database structures, indexing the text in different ways, converting images into the sizes they wanted to support (including thumbnails), etc. The MESL distribution model was designed for a demonstration project; it is highly unlikely that this model would be used in production mode. Follow-on projects would likely involve either direct delivery to end users from a single central site that also takes responsibility for data integrity (instead of the MESL model which involves local mounting at each campus and barely addresses data integrity issues), or complete local mounting of data obtained from a variety of sites (the analog slide collection model). This is yet another reason why we have chosen to focus on relatively broad cost centers that may shift between a museum, a central facility, or a university (depending on the distribution/delivery model). And even though the costs in any given center may increase or decrease (even to near zero) depending upon the model, the cost centers we have chosen are broad and critical enough that just about every one of them is still likely to exist somewhere within any model in the foreseeable future.
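To make the central-site delimiter check concrete, the sketch below shows one way such a check might work. It is an illustration under stated assumptions, not a description of the software MESL actually used: the pipe delimiter, the four-field record layout, the sample records, and the function name are all hypothetical.

```python
def find_malformed_records(lines, delimiter="|", expected_fields=4):
    """Return (line_number, field_count) for every record whose number
    of delimited fields differs from the expected count.

    The delimiter and field count are assumptions for this sketch.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        record = line.rstrip("\n")
        if not record:
            continue  # ignore blank lines
        field_count = len(record.split(delimiter))
        if field_count != expected_fields:
            problems.append((number, field_count))
    return problems

# Hypothetical records: identifier | title | date | medium
sample = [
    "M1-0001|Vase|ca. 1850|Ceramic",
    "M2-0417|Portrait of a Woman|1902",      # one field missing
    "M3-0032|Mask|19th century|Wood|extra",  # one field too many
]

print(find_malformed_records(sample))
```

A check of this kind confirms only that records are structurally parseable. It says nothing about whether the values inside the fields are consistent between museums, which is the data integrity gap noted above.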
Extrapolations from a study of MESL must be limited for a variety of reasons. The procedural organization of the MESL distribution model is not likely ever to be repeated. The project was begun ahead of its time on the technology curve, and many of the technologies and procedures attempted were then experimental or unknown. It was a voluntary project with few incentives for the participants to act or respond as they would in an ongoing distribution arrangement. And gathering data for analysis was problematic because of differing units of measurement.
We view our primary contributions as identifying the major cost centers and providing methods for examining each of these cost centers by estimating their associated time commitments, while acknowledging contextual variability such as learning curves. The study also identifies the human knowledge, background, and organizational infrastructures needed to create, distribute, and mount the digitized materials in order to complete the MESL project. By identifying and isolating these cost centers, as well as the knowledge and resources needed to accomplish each phase of digital distribution, we believe that we can provide an important framework for future projects, even those using models where any given cost center may move into a different type of organization. Most importantly, we have also identified a variety of impediments that must be overcome before digital image distribution schemes are widely adopted.
Because this study examined relatively virgin territory, it was necessary to explore the uncharted surrounding areas of image delivery in order to make sense of what digital distribution modalities mean. To effectively understand MESL's digital distribution process and costs, we needed to understand the analog distribution model that has been provided by 35mm slide libraries. By compiling cost and distribution data from six slide libraries at five university campuses, we performed the first extensive cross-institutional study ever done of slide library costs. These analog distribution costs form an important foundation for examining digital distribution costs. However, because the role and functionalities of analog slide libraries differ from those of digital distribution schemes like MESL, and because sustainable digital distribution systems will differ markedly from experiments like MESL, we would strongly caution against direct comparisons.
The MESL Project has shown us that merely mastering the complex technical process of delivering images and text to the desktop does not by itself make such a system viable. A number of accessibility issues are hidden within that process, such as how to get a set of repositories to adopt standards (and how to enforce standardization), how to ensure consistent use of data values between repositories, how to map terminology into the vernacular for users, how to determine what kind of user interfaces and searching capabilities are needed, etc. In addition, many critical issues exist outside of that technical delivery process, including how to provide enough of the images that users actually need, and how to ensure that instructors who invest in curriculum development will continue to have the rights to ongoing access to the images they develop their courses around.
From MESL we have learned that a digital distribution system is a very complex and interlinked process. Its viability depends on supply, accessibility, and demand; images need to be available, easily accessed, and, perhaps most importantly, wanted and needed by the intended users. The various parts of the system are dependent upon decisions made in the other parts (e.g. usage depends upon image and metadata quality, critical mass, delivery and infrastructure, etc.). Examining any single function in isolation (such as image storage and digital database access) would lead to misunderstandings and misrepresentation of the system.
General Findings from the MESL Project
MESL demonstrated that while some factors encourage the use of multi-institutional digital image databases of cultural heritage objects, there are also significant barriers to their widespread use. The following are critical observations and suggestions in the areas of viewing, content, searching and access, technology, infrastructure, and policy.
The digital distribution environment, as a whole, appears to be good for individual usage, and provides access from multiple locations. Most users' home environments are currently inadequate for comfortable use of digital images, but that should change with increased bandwidth, processing power, and screen size. We are even beginning to see wired dormitories as part of campus networks. Shifts to off-site use may alleviate the need for more on-campus computer labs, but will require more sophisticated user authentication systems. However, groups that are not central to the university mission (e.g. alumni, visiting faculty, other visitors) which currently enjoy walk-in access to analog resources may lose access to these additional resources altogether, as authentication systems and licensing arrangements for digital materials become able to distinguish more finely between user groups.
Digital image distribution in its existing form is problematic for group viewing situations, such as in the classroom, where analog delivery is simple, fast, cheap, dependable, and requires little technological infrastructure. Electronic classrooms, computing and network infrastructure, technical and instructional support, and image quality issues need to be addressed before digital distribution to the classroom becomes viable.
The lack of comprehensive content made the database extremely problematic for coursework purposes. For a digital image distribution scheme to be successful, a repository must be able to provide a critical core of important images, what Clifford Lynch has called a "reference collection" (Lynch, 1997:5-13). Most significantly, the definition of "critical core" is likely to be dynamic. New approaches to disciplinary understanding are constantly changing what is considered to be central material for pedagogical purposes (for example "popular art" and "art and gender"). For most users, even a critical core will not offer a comprehensive corpus. Many faculty teaching with MESL images voiced a need for a "critical mass" of images that would approach the corpus size of their analog slide libraries.
Because faculty content needs can be robust and shifting, a digital image distribution scheme will almost certainly also need to give faculty the option of integrating locally produced material. (Many MESL universities reported having to supplement the MESL database with custom images drawn from their slide libraries.) Future systems must be both extensible and easy to supplement.
MESL content attempted to be responsive to faculty needs. A content selection process solicited faculty input. This kind of active connection between content selection and instructional programs is not likely to scale up. Collection development for future systems will probably be museum-centric, with museums defining a core reference collection that they will distribute.
The different metadata vocabulary and general language used by different institutions made the creation of an integrated and consistent database problematic at best. It is glaringly evident that a project like this needs guidelines and standards at many levels (from field delimiters to controlled vocabulary), 3 and that the standards developed within MESL were not, by themselves, enough. And it is likely that this type of problem will increase as the corpus or domain of coverage scales up. The MESL data dictionary managed to map actual field names into a common exchange format, but the project neither addressed what those field names meant to the body of end users, nor addressed the differing ways in which the contributing repositories used vocabulary within a given field. And since most object metadata was taken from collection management system records, most vocabulary was in the language used by museum curators and registrars. Digital distribution schemes like this could be much more effective if we better understood vocabulary issues in general: how to translate the specialized vocabulary used by specialists into the vernacular used by general users, and how to better map between the various knowledge organization frameworks of different domains.
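The distinction drawn above, between mapping field names and reconciling the vocabulary used inside a field, can be sketched in a few lines. Everything in this example is hypothetical: the museum labels, local field names, common field names, and term lists are invented for illustration and are not taken from the MESL data dictionary.

```python
# Step 1 (the kind of mapping a data dictionary provides): local field
# names are mapped to common exchange-format names. All names are invented.
FIELD_MAP = {
    "museum_a": {"ObjName": "object_type", "Maker": "creator"},
    "museum_b": {"classification": "object_type", "artist": "creator"},
}

# Step 2 (the gap described above): a crosswalk from each repository's
# terms to one shared term. The term lists are invented.
VOCAB_CROSSWALK = {
    "object_type": {
        "paintings": "painting",
        "ptg.": "painting",
        "oil on canvas painting": "painting",
    },
}

def normalize(source, record):
    """Map a local record to common field names and shared vocabulary.

    Fields with no mapping are dropped; terms with no crosswalk entry
    are passed through unchanged.
    """
    normalized = {}
    for local_name, value in record.items():
        common_name = FIELD_MAP[source].get(local_name)
        if common_name is None:
            continue
        terms = VOCAB_CROSSWALK.get(common_name, {})
        normalized[common_name] = terms.get(value.strip().lower(), value)
    return normalized

record_a = normalize("museum_a", {"ObjName": "Paintings", "Maker": "Unknown"})
record_b = normalize("museum_b", {"classification": "Ptg.", "artist": "Unknown"})
print(record_a == record_b)
```

Without the second step, a search for "painting" would miss both records even though their field names had been unified, which is the kind of inconsistency the merged database exhibited.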
The interface and the ability to query and manipulate the database are critical for future use. Additional tools for examining, organizing, and saving retrieved sets are also necessary. The MESL model of localized control over distribution discouraged development of expensive retrieval systems. A more centralized model would be able to spread the development costs over a wide body of sites, and would likely lead to better retrieval tools. But local customization of such a system may still be desirable, and this poses an interesting research issue in system design.
At the time of the MESL project, storage space was becoming less of an impediment, but network speed and bottlenecks at routers and servers may remain an issue for image and multimedia delivery, particularly if users are accessing remotely mounted collections.
Universities appear capable of constructing systems that can provide some security and protection for intellectual property. Recently developed university user authentication systems seem to be good enough to meet today's museum requirements. But in the long run museums may want control over image use (such as copying and reposting) rather than the control only over initial image access that today's security systems provide.
Humanities departments tend to be underfunded and technologically inexperienced compared to engineering and science departments, even at technologically advanced universities. Though the MESL Project did hasten the arrival of wiring to some humanities buildings, workstations to humanities faculty desks, computer projection to humanities classrooms, and computing support to humanities departments, these departments may continue to lag behind in speed, processing power, and training needed to use new systems.
There is much university enthusiasm for the use of digital surrogates for cultural heritage material, but many problems must still be addressed before there is widespread end-user acceptance. Instructors are particularly concerned about lack of departmental recognition for what, in their experience, has been a vastly increased workload from teaching with digital images. Tools need to be developed to make digital images easier to use and particularly to make it easier to use them to build curriculum material. But who in the institution will have the responsibility, funding, and expertise to develop these tools is a serious question. Will this be the responsibility of the central library, the departmental library, central computer services, or the individual instructor?
The MESL Project has shown that universities and museums have common interests in providing images and metadata to users. Though conflicts arose periodically, in general the project proved that they have more common than divergent interests and can work together well. But new areas of conflict may arise (such as when faculty present enhanced content back to the museum, or when faculty want to distribute new products they create using museum images).
Audiences for a museum's digital information also exist outside the university community. All involved hope that museums can leverage their efforts at digital distribution to universities to help them deliver to additional audiences. But museums need to take into consideration the special needs of those additional audiences, paying particular attention to the need for different descriptive vocabulary and for organizing sets of images contextually. For instance, the large K-12 community probably needs thematic arrangements of images complete with descriptive information in vocabularies much different than those of curators or art historians. Museum consortia should consider encouraging teachers and others to create added-value packages that can then be redistributed to others.
Copyright issues are significant. Museums tend to be cautious about distributing digital images of works unless they are absolutely certain about rights clearance on the original work (though through the MESL project some museums became less strict about this). Because current copyright law leaves reproduction rights for original works with the artist's estate for some period after the artist's death, and because most museums have not explicitly obtained digital reproduction rights when acquiring a work for their collection, very few 20th-century works will be distributed in digital form by museums for some time to come. Proposed legislation on databases and on works in digital form may greatly affect projects involving the distribution of digital image collections.
The value that museums can provide to universities in projects like MESL may lie more in the authoritative metadata than in the digital images themselves, and it is a mistake to view these as merely imaging projects. The expertise of the museum in the form of authoritative metadata describing an object and its context is critical for scholarly research, and appears to be something the museum can copyright, own, and sell. This situation stands in contrast to the lack of agreement in the legal community on whether digital images of art works in the public domain are copyrightable.
Comparisons with Analog Slide Libraries
The study of analog slide libraries has shed some light on certain functions that exist in the analog versus the digital distribution environment, and has also demonstrated how certain cost centers may differ between these environments. Our study of the analog environment was not extensive enough to answer all the important questions, but it answered some and suggested further comparative studies that should be undertaken:
Analog slide libraries provide a valuable set of services, some of which would be lost in currently emerging models for digital distribution. Slide libraries are customized for their local environments and metadata is customized to meet local needs. Acquisition is end-user driven, and responds quickly to local user demands. A research agenda for digital distribution schemes should consider how future models might support these types of services.
Not all images needed by university users are of the sort held by museums. Therefore the university's image needs extend beyond what can be met through museum consortia. Cultural heritage slide libraries often include images from architecture, religious structures (churches), popular culture, private collections, public site-specific art (cemetery art, monuments, fountains), lesser-known and local artists, and community-based art (such as murals). Collections also frequently include other types of images to provide context for a time period, place, style, or theme.
Analog slide libraries are primarily based in individual campus departments. Digital distribution schemes, however, are likely to be housed in or contracted by campus-wide units. Therefore funding schemes and institutional roles and responsibilities will be much different than the departmental models that characterize slide libraries. This makes comparisons and predictions very difficult.
As we discover some of the types of functionality that users of analog slide libraries find useful and perhaps necessary (such as slide-sorting functions), we can outline some of the functions that digital image libraries are likely to need in order to attract and retain users. Further study of user behavior in selecting and arranging slides for classroom presentation will be helpful in determining functional requirements for desktop toolsets.
Circulation statistics from analog slide libraries can give us benchmarks against which to compare likely overall use of digital image libraries, and indicate likely periods of heavy use. We know that digital delivery removes time and location constraints that limit analog slide use, and we expect that use will increase once digital delivery systems are adopted by users. We also know that use will increase as these collections begin to serve users outside of core departments. As long as digital image collections still strain systems resources, this analog use data can help system architects and planners by suggesting times and levels of high activity.
Our study has revealed a small but significant group of analog slide users that come from outside the primary slide library community. We can expect that these numbers will increase in a digital world where gaining access does not involve visiting an analog slide library located in a particular academic department.
We know that some analog slide library costs (such as re-filing) are just not applicable to a digital environment. If we know how significant those costs are, we can begin to discuss likely cost savings in a digital environment. Our analog study suggests that while the amount of time involved in re-filing is significant, the actual cost of this effort is not great due to the use of low-paid personnel. But we have no idea of the impact of misfiled or lost slides on scholarship.
Our study of MESL identified broad cost centers for the image providers in the preparation process (content selection, image preparation, text data preparation, image transmission, and text data transmission) as well as for the image distributors in the delivery and deployment processes (preparing images, preparing structured text data, preparing unstructured data, creating functionality tools, providing security/access control, outreach, usage training, and technical development). 4 This study of MESL cost centers and related studies of analog slide libraries and faculty attitudes shed light on a number of interesting issues. Administrators and others involved in planning digital distribution should be particularly interested in the following observations:
We believe that the MESL Project was one of the first steps in a transition towards digital image libraries, and that digital collections may eventually replace analog slide libraries. Our study of MESL has revealed some of the differences between slide libraries and digital distribution schemes, and has identified some of the problems that must be resolved before digital image distribution is widely accepted. This study has uncovered important information for designers of digital image distribution schemes. We have highlighted issues of cost, content, infrastructure, and user acceptance. We have shown the serious access issues that emerge from combining text records from museums that use different forms of vocabulary control, and have demonstrated that different distribution approaches towards indexing can yield vastly different search results. We have noted how analog slide libraries differ from any digital image distribution scheme proposed thus far. And we have expressed concerns about where digital image distribution schemes might fit within an institutional hierarchy.
We believe that, in the long run, it will be difficult to financially justify repetitive isolated collections of images on different university campuses. Yet, the tailoring of local collections to local needs (provided by analog slide libraries) is critical to the current instructional environment. We think that it is important that analog slide libraries and digital image distribution consortia coexist for many years to come. But we are very concerned that university administrators will be unwilling or unable to support the financial burden of such hybrid systems.
We feel that this is a propitious time for this study, as two museum consortia are currently developing plans to distribute digital images to the museum community. Both the Art Museum Directors Association's AMICO project and the American Association of Museums' Museum Digital Licensing Consortium are currently designing their distribution and delivery schemes. Their business plans could benefit from a better understanding of the cost centers and efforts involved in making digital images and accompanying text available to the educational community. Thus far these consortia have focused their attention on framing issues (such as terms and conditions of use). Their work on models for production, processing, and deployment, as well as development of cost models, can be greatly informed by the results of this Mellon-sponsored study.
We gratefully acknowledge the support of the Andrew W. Mellon Foundation, which funded this study. We would also like to thank MESL Project Director Christie Stephenson who offered valuable assistance at all levels of this project. Project staff Rosalie Lack, Joanne Miller, and Lena Stebley and project consultants Beth Sandore, Christine Sundt, and Nancy Van House all helped with key areas of this project. We are also thankful for the generous amount of time contributed by interested slide librarians and MESL Project Coordinators, all of whom helped us better understand the important work that they've done. Gary Marchionini, Malcolm Getz, Margaret Radin, Clifford Lynch, Marvin Sirbu, and the late Paul Peters provided important suggestions and guidance as members of our grant's Advisory Board. Katherine Falk supplied editorial assistance. The opinions expressed in this work are those of the authors and project team, and are not necessarily reflective of the views of our parent institutions, sponsoring agency, or advisory board members. More details about this study can be found at http://sunsite.berkeley.edu/Imaging/Databases/1998mellon.
1. While there had been no significant baseline economic studies of departmental slide libraries, there are a number of important studies of the costs of running central library services for a campus (see citations in Mellon Final Report bibliography). These central library studies may prove valuable in the future as part of the baseline comparisons for centralized digital image library collections.
2. See Besser and Stephenson, 1996; Besser, 1997:317-25.
3. We do not mean to imply the necessity of adherence to one single standard in all cases at every level. With controlled vocabulary, for example, a solution might emerge employing "crosswalks" between a limited set of controlled vocabulary lists, with each institution adhering to one of those lists (rather than all institutions conforming to a single one).
4. In future distribution schemes some of the MESL distribution functions may well be handled by local entities.
5. For example, the entire $27,000 annual operational budget of slide library #5 couldn't possibly cover the likely infrastructure costs of creating a digital distribution system (where just functionality in the first year cost $24,000).
6. For example, functionality costs in the first year of MESL averaged $24,000, but dropped to $8,900 in the second year after most functionality was already in place.
Cost of Digital Image Distribution: