A version of this paper appeared in P. Bryan Heydorn and Beth Sandore (eds.), Digital Image Access & Retrieval (Papers Presented at the 1996 Clinic on Library Applications of Data Processing, March 24-26, 1996), Urbana: Univ of Illinois, 1997, pages 11-28
At the time of this writing (1996) Howard Besser was Visiting Associate Professor at the University of California's School of Information Management & Systems, and then and now he consults for libraries, museums, and arts organizations. He has authored more than a dozen articles on image and multimedia databases, is a frequent speaker at both professional and commercial conferences, and regularly conducts preconference workshops on image databases at the meetings of a number of different professional organizations. He has been on the Management Committee of the Museum Education Site License Project since its inception. Dr. Besser has also written extensively on the social and cultural impact of new information technologies, on distance learning, and on multimedia digital publishing. He has been on the faculty of both the University of Michigan and the University of Pittsburgh.
We have seen an explosion of image database developments in the decade since work began on the first multi-user networked system. This paper explores the state of technology a decade ago, revisits one of the earliest systems, identifies current interesting projects, discusses the major issues that we're facing today, and forecasts issues and trends that will emerge in the future.
This paper reflects the biases of the author whose primary interests lie in building image databases of cultural heritage materials, and who was involved in the development of the Berkeley Image Database System (ImageQuery).
In 1986 the idea of large-scale image databases seemed quite far-fetched. By today's standards, storage capacity was minuscule, networks were unbearably slow, and visual display devices were poor. The market penetration was very low for most of the tools needed for image database development.
In the past several years we have seen a spurt in the growth of image databases. It is now possible to overcome the once insurmountable technological impediments. Recent increases in storage capacity, network bandwidth, processing power, and display resolution have enabled a tremendous growth in image database development. Literally hundreds of such projects have begun in the last few years.
Technical capabilities in 1986 look primitive when viewed from our current perspective. Future forecasters a decade ago wrote about how technological change would eventually make digital image databases viable (Besser 1987a, Besser 1987b, Besser 1987c, Lynch & Brownrigg 1986), but few people (even those forecasters) were certain that this would happen within their lifetimes.
In this section we will examine the technological capabilities a decade ago, both to try to understand the impediments that we faced at that time, and to give us insight into how we might plan today for changes in the coming decade.
Storage: Hard disks had just recently been introduced in personal computers (such as the IBM XT), and were a fairly new idea for desktop machines. A 30 megabyte disk was considered very large for a personal computer. Large disks for mainframe computers (such as the one hosting the University of California's Melvyl system) each had a capacity of about 600 megabytes and were the size of a washing machine. In an environment like this, proposing the development of collections of one megabyte image files sounded impractical, and the advocacy of 50 megabyte files sounded ridiculous.
Today it is hard to find a new personal computer with a disk much smaller than 100 megabytes, and multi-gigabyte disks are commonplace and smaller than the floppy drives of a decade ago.
Processors: The IBM AT was the newest personal computer. IBM XTs and Apple Macintosh Plus machines had the widest penetration, and the most common processor at the time was the 8086. PCs had an internal memory (RAM) limit of 640K. Mainframe computers such as the IBM 4300 had 16M-32M of RAM, executed 2 million instructions/second (MIPS), and cost around $1 million. Image processing (which is unbearably slow if one cannot have quick and easy random access to the entire image) was impractical, and generally confined to specialized machines.
Today most computers come with a minimum of 8M of RAM, and desktop machines with more power than the mainframes of a decade ago are cheap and commonplace. Today's machines are fast enough and have enough RAM to hold and manipulate an image without the purchase of specialized hardware.
Networks: Networking within a site was not very common. Wiring to the desktop was usually twisted-pair wires carrying signals for terminals or terminal-emulation. Ethernet wiring had come out just a few years before and was still rare. Wide area networks had not really penetrated beyond the defense industry and large universities. Sites were connected to the predecessor of the Internet (the Arpanet) at approximately 56 Kilobits/sec.
Today most wiring is designed to carry full-scale networking. The Internet is commonplace, and large to mid-sized organizations tend to be connected to it at speeds of T-1 to T-3 (1.5 Megabits to 45 Megabits/sec).
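To make the contrast concrete, the following minimal sketch computes idealized transfer times for a single image file at the line rates mentioned above. It assumes decimal megabytes and ignores protocol overhead and congestion, so real transfers would be slower; the function and constant names are illustrative only.

```python
# Nominal line rates in bits per second, taken from the figures in the text.
LINE_RATES_BPS = {
    "Arpanet-era link (56 Kilobits/sec)": 56_000,
    "T-1 (1.5 Megabits/sec)": 1_500_000,
    "T-3 (45 Megabits/sec)": 45_000_000,
}

ONE_MEGABYTE = 1_000_000  # assumption: decimal megabyte, in bytes


def transfer_seconds(file_bytes: int, bits_per_second: int) -> float:
    """Ideal transfer time in seconds, ignoring overhead and congestion."""
    return file_bytes * 8 / bits_per_second


if __name__ == "__main__":
    for name, rate in LINE_RATES_BPS.items():
        seconds = transfer_seconds(ONE_MEGABYTE, rate)
        print(f"{name}: {seconds:.2f} seconds per one-megabyte image")
```

Under these assumptions a one-megabyte image needs well over two minutes at 56 Kilobits/sec, a little over five seconds on a T-1, and a fraction of a second on a T-3.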
Display Devices: Few display devices could handle wide ranges of colors. Eight-bit display devices (256 colors) were considered high-end in the PC market, and required a special card and monitor. In public lectures people were surprised to see images of works of art displayed on a computer screen.
Today 24-bit displays (16 million colors) come as a standard feature on new PCs, and no special cards or monitors are required. Onscreen graphic images are frequently used to promote computer and software sales.
Scanners: Scanners were expensive and rare. The only advertisements for scanners appeared in catalogs of instrumentation devices. Scanning software had poor user interfaces, and most scanners required programming skills in order to make use of them. Most software did not permit immediate onscreen viewing of the image, and frequently the user had to scan on one workstation, run programs on the scanned file, and move it to another workstation to view it. Even when attached to a powerful CPU, scanners were slow (a 45 minute scan was not out of the question) and frequently required so much light and accompanying heat that scanning of delicate objects, such as works of art, was impossible.
Today very good scanners sell for under $750 and are available through most sources that sell computer peripherals. Virtually all scanners come with point-and-click software that quickly displays images on the screen. Today a scan that takes more than a few minutes is considered unbearably slow, and light and heat exposure are within tolerance levels for most objects.
Compression: The only image compression scheme with wide implementation was the CCITT Group III standard employed in fax machines. Work on defining compression standards for color images was just beginning. With this lack of sophisticated compression standards, individuals developed their own compression schemes, and images compressed using these schemes could not be decompressed by others.
Today compression schemes such as JPEG and LZW are widely accepted standards, and the capability to decompress these files is included in a wide variety of image display and processing software, as well as in generic viewing and browsing tools, such as Web browsers.
Client-Server Architecture: X-Windows was the only client-server architecture with a significant installed base, but its deployment at the time was very small (limited primarily to a small percentage of Unix-based workstations on major university campuses). Because image database designers could not rely upon distributing processing to the client, most designs had to assume all image processing would be done at the server, and that high bandwidth would be required in order to send compressed files to the client.
Today the widespread deployment of Web browsers permits image display and processing functionality to be off-loaded to the client. This puts less strain on the server and on the use of network bandwidth.
In 1986 UC Berkeley's office of Information Systems and Technology began work on a project to deliver high quality digital images from its Art Museum, Architecture Slide Library and Geography Department. The developers believe that this software (eventually called ImageQuery) was the first deployed multi-user networked digital image database system. The software was first shown publicly at the conferences of the American Association of Museums and the American Library Association in June of 1987.
ImageQuery was an X-Windows based system with a number of features that were relatively new for the time: a graphic user interface, point-and-click searching, thumbnail images to permit browsing and sorting, tools for annotation of images, and the linking of images to locations on maps. In addition, ImageQuery was designed for networked accessibility, had client-server features, and permitted boolean searches. ImageQuery design and features have been described in more detail elsewhere (Besser 1991b, Besser & Snow 1990, Besser 1990, Besser 1988a, Besser 1988b). Here we will focus on some key elements from ImageQuery, and analyze them with the benefit of a decade of hindsight.
ImageQuery featured thumbnail images linked to a list of brief records for each image (see figure 1). Clicking on an image highlighted that image as well as the related text record. Clicking on a text record highlighted the related image. This proved to be a powerful method both for finding the correct image off a list of hits, and for quickly identifying an image displayed on the screen.
figure 1 ImageQuery Screendump
(images courtesy of Phoebe Hearst Museum of Anthropology, UC Berkeley)
Each displayed thumbnail image was linked to both a full text record and a larger version of that image. A pulldown menu (triggered by pointing to a thumbnail image and holding down a mouse button) would give the user the choice of displaying the full image or text (see menu below thumbnail of jacket in figure 1). Again this proved to be a powerful tool to link browsing to fuller information, though in today's environment small buttons appear to be more effective than pulldown menus.
ImageQuery's architecture was modular (see figure 2). The user interface sent queries to a database that resided separately, so different databases and structures could serve as the "back-end". For a number of years ImageQuery could only support back-end structures that had been collapsed into flat files, but eventually capabilities were added to support SQL-type queries. Another limitation of ImageQuery was that the text database structure had to be pre-identified and coded into a short preferences file, rather than dynamically discovered.
figure 2 ImageQuery's modular structure
figure 3 Generalized structural model for Image Database
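The separation between user interface and back-end described above can be sketched as follows. This is not ImageQuery's actual code: the class names, the field list, and the sample records are all hypothetical, and the field names are supplied up front to mirror the pre-identified structure held in the preferences file.

```python
from typing import Protocol


class BackEnd(Protocol):
    """Anything the user interface can send a query to."""

    def search(self, field: str, term: str) -> list[dict[str, str]]:
        ...


class FlatFileBackEnd:
    """Back-end over delimited text lines; field names are declared in advance."""

    def __init__(self, lines: list[str], field_names: list[str], delimiter: str = "|"):
        self.records = [dict(zip(field_names, line.split(delimiter))) for line in lines]

    def search(self, field: str, term: str) -> list[dict[str, str]]:
        needle = term.lower()
        return [r for r in self.records if needle in r.get(field, "").lower()]


class DictBackEnd:
    """A second back-end holding records that are already parsed."""

    def __init__(self, records: list[dict[str, str]]):
        self.records = list(records)

    def search(self, field: str, term: str) -> list[dict[str, str]]:
        needle = term.lower()
        return [r for r in self.records if needle in r.get(field, "").lower()]


def brief_records(backend: BackEnd, field: str, term: str, brief_field: str) -> list[str]:
    """Front-end helper: it only knows the BackEnd interface, not the storage."""
    return [record[brief_field] for record in backend.search(field, term)]


if __name__ == "__main__":
    fields = ["id", "title", "material"]  # hypothetical preferences-file content
    flat = FlatFileBackEnd(["1|Feather jacket|feather", "2|Basket|willow"], fields)
    print(brief_records(flat, "material", "willow", "title"))
```

Because the front-end helper depends only on the `search` method, either back-end can be swapped in without changing the interface code.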
ImageQuery also employed modularization to link in sets of tools for users to view and process images. By pointing to an onscreen image, a user could pull down a menu and choose a variety of image processing tools that could be applied to that image. ImageQuery would then invoke software (such as paint programs for annotation or color-map programs for balancing and altering colors, or processing programs for zooming) that would allow them to analyze or alter the current image.
This idea of linking to external tools is still very important. One can expect that a variety of tools will emerge for image manipulation, for image organization, and for classroom presentation. Image database developers cannot hope to keep up with the latest developments in all these areas (particularly in areas like image processing and display which will respond quickly to software and hardware developments). By providing modular links to external software, image database developers can instead leverage off of the large image processing and consumer markets and the continuous upgrading of functionality that is likely to take place within those markets. But in order to do this effectively, the image database community needs to define standard links it will use to invoke these programs.
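One way to picture such standard links is a small registry that maps a menu choice to an external program. The sketch below is an assumption for illustration, not a description of ImageQuery: the tool names and command names (`hypothetical-paint` and so on) are invented, and only the standard-library `subprocess.run` call is real.

```python
import subprocess

# Hypothetical registry: menu label -> command template for an external program.
# "{path}" is replaced by the file name of the image the user pointed at.
TOOLS = {
    "annotate": ["hypothetical-paint", "{path}"],
    "adjust colors": ["hypothetical-colormap", "--input", "{path}"],
    "zoom": ["hypothetical-zoom", "{path}"],
}


def build_command(tool: str, image_path: str) -> list[str]:
    """Fill in the command template registered for a tool."""
    if tool not in TOOLS:
        raise KeyError(f"no external tool registered for {tool!r}")
    return [part.format(path=image_path) for part in TOOLS[tool]]


def invoke(tool: str, image_path: str) -> int:
    """Run the external program and return its exit status."""
    return subprocess.run(build_command(tool, image_path), check=False).returncode


if __name__ == "__main__":
    # Only the command line is printed here, because the programs are invented.
    print(build_command("zoom", "jacket.tif"))
```

Keeping the registry as data rather than code means a newer external program can be linked in by editing one entry, which is the kind of leverage the paragraph above argues for.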
The ImageQuery team's idea of links to external tools was part of a broader view of what an image database should be. The team's philosophy was that (particularly in an academic environment) simply providing access to a database was not enough; developers had the responsibility to provide the user with tools to integrate the results of database retrieval into their normal work processes. This was part of a general notion then beginning to emerge within the academic community that libraries, computer centers, instructional designers, and users should be working together to build "scholars' workstations" (Rosenberg 1985; Moran 1987). Over the years these ideas have been implemented in a variety of areas including the capability of downloading records from an online public access catalog into software for handling personal bibliographies and footnotes (Stigleman 1996), or the development of templates to help instructors build instructional material incorporating images from a database (Stephenson & Ashmore). A key factor that has enabled the joining of tools to databases is the adoption of standards (Phillips 1992).
The ImageQuery developers recognized the importance of a client-server architecture, both to assure that the image database could be accessed from a wide variety of platforms, and to put less of a strain on the server and network by off-loading some of the functionality onto client workstations. But the ImageQuery team expected that environment to be an X-Windows based environment. For many years they waited patiently for a variety of developments over which they had no control -- the porting of X-Windows onto Intel and Macintosh platforms, an increase in the installed base of X-Windows machines, and the development of the X Imaging Extensions (MIT X Consortium 1993). No one on the ImageQuery development team anticipated the phenomenal growth in WorldWide Web browsers that would clearly make this the delivery platform of choice. Web browsers not only solved the multi-platform and central database load problems, but they implemented client functionality in a much more sophisticated way than ImageQuery. Web browser helper applications recognize a variety of image file formats, handle decompression, and can spawn external viewing software (all of which combine to lessen the load on the network and the server, and to increase the number of file storage options).
Another key philosophy behind ImageQuery was the implementation of a user interface that would provide a common "look and feel" across all image collections. Prior to ImageQuery, each campus object collection had its own idiosyncratic retrieval system and user interface (Besser and Snow 1990). Users had to make a substantial investment of time to learn to use one of these retrieval systems, and most appeared reluctant to invest the time to learn a second. The ImageQuery team believed that a common user interface would encourage cross-disciplinary use of these collections, so they designed a system that on the surface always appeared the same to the user. Only the names and contents of fields differed from database to database, and an "authority preview" function was developed to permit users (particularly those unfamiliar with valid terms associated with a field name) to view a list of terms that had been assigned within a given field. It is likely that much of the appeal of WorldWide Web browsers lies in the fact that they act as a universal interface, providing a common "look and feel" to anything they access. Though a function to preview the actual contents of a field within a database still appears powerful, this has not yet been widely implemented.
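The "authority preview" idea amounts to listing the distinct values that have actually been assigned within a field. A minimal sketch, assuming records are held as dictionaries and using invented sample data (the original implementation is not documented here):

```python
from collections import Counter


def authority_preview(records: list[dict[str, str]], field: str) -> list[tuple[str, int]]:
    """Return each term assigned within a field, with its count, in alphabetical order."""
    counts = Counter(record[field] for record in records if record.get(field))
    return sorted(counts.items())


if __name__ == "__main__":
    # Hypothetical records; only the field names differ between collections.
    sample = [
        {"title": "Jacket", "material": "feather"},
        {"title": "Basket", "material": "willow"},
        {"title": "Tray", "material": "willow"},
        {"title": "Fragment"},  # no material recorded
    ]
    for term, count in authority_preview(sample, "material"):
        print(f"{term} ({count})")
```

A user unfamiliar with the valid terms for a field can scan this list before composing a query.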
There are a number of areas in which the designs for ImageQuery look naive in retrospect. Though the notion of interoperability still appears important, the functionality to allow searching across image databases of different objects (each having different fieldnames and contents) is vastly more complex than the ImageQuery team anticipated (Besser and Snow 1990; Besser 1994b; Beauregard 1994). The ImageQuery team was also naive in dealing with the issue of scaling up. Though some thought was put into methods for decreasing storage cost and topologies which would limit the impact on a particular server or a particular segment of a network, very little thought was put into issues of how to handle queries that might retrieve thousands of initial hits. ImageQuery did provide for important functionality like visual browsing to narrow down query sets (by clicking on the thumbnail images that the user wanted to save), but by itself this would not help the user whose initial query retrieved more than 100 hits. In retrospect, functions like relevancy feedback look critical to dealing with large image databases (see the section "Where do we go from here?: Retrieval").
The landscape today is far different from that of a decade ago. A combination of technological developments and adventurous pioneering projects has paved the way for serious image database development. In recent years there has been such a rapid explosion in image database projects and developments that any attempt to publish an article compiling these would be outdated before it was printed. Here the author will just make brief mention of the most recent important developments; he sporadically maintains a more current list on the WorldWide Web (Besser Image Database Resources).
Important issues facing image databases in the recent past have been outlined elsewhere (Besser 1995a; 1995b; 1992; 1991a; Cawkell 1993). Guides to building image databases in environments such as cultural repositories have begun to appear (Besser & Trant 1995). A listserv is now devoted to image database issues (ImageLib Listserv), and the same group at University of Arizona's Library also provides a clearinghouse of image database products (ImageLib Clearinghouse). An online image database bibliography is also available (Besser Image Database Bibliography).
Many hundreds (probably thousands) of collections are at least partially accessible on the WorldWide Web. Photographic stock houses have begun digitizing their images, and there are now well over a dozen commercial vendors with collections of over 100,000 digital images. New competitors (such as Bill Gates' Corbis, Kodak's KPX, and Picture Network Inc's Seymour) are trying to market digital images to a wide variety of markets.
The Museum Educational Site Licensing Project (MESL) has given us the first serious testbed for image databases in a multi-site academic environment. Images from 7 museums are being distributed and deployed on 7 university campuses (Museum Educational Site Licensing Project). This project is already helping us to identify intellectual property issues (Trant this Proceedings), standards and issues needed for image distribution (Besser & Stephenson forthcoming), and the infrastructure and tools needed to deploy an image database in an environment with many users. This project will also help us understand what we will need to incorporate the use of image databases into the instructional environment.
The Computerized Interchange of Museum Information (CIMI) project is designed to define interchange issues for the museum environment. Most of the work thus far has taken unstructured and database-generated textual information that in some way relates to museum objects, and inserted SGML tags into this text so that it conforms to the structured text standard developed by the project team. CIMI's work is likely to provide us with keen insight into interchange issues involving images and accompanying text.
A number of impediments to the widespread deployment of image databases still remain. Some of these will be solved whether or not the library and information science (LIS) communities choose to participate, while others can only be solved by the LIS communities.
Impediments due to the limitations of storage capacity and cost, bandwidth, client-server functionality, and scanner capabilities will be solved without LIS participation. Storage capacity will continue to increase, storage costs will fall, network speeds will accelerate, and client-server functionality will continue to grow. Scanner throughput and reliability will increase, image capture quality (in terms of resolution, bit-depth, and fidelity) will improve, and scanner software will develop even better user interfaces and increased interoperability with image processing and other software. The driving force behind these changes is a constituent market so large that the LIS community probably couldn't have much of an impact even if it tried to.
The LIS community needs to focus attention where it can play a critical role. One such key area is around issues of image longevity. The LIS community has begun to identify issues of long-term preservation and access to digital information in general. The author has participated in a task force on digital preservation issues co-sponsored by the Commission on Preservation and Access and the Research Libraries Group. This task force has put forward the notion of data migration as far superior to data refreshing, and has made a variety of recommendations to assure long-term preservation and access of materials in digital form. These include: creation of certified storehouses for cultural heritage materials, development of metadata standards, and development of migration strategies (Waters 1996).
The LIS community also needs to work on ensuring integrity and authenticity of digital information. The widespread use of image processing tools has led to widespread dissemination of "altered" images, particularly over the WorldWide Web. Our community needs to find ways to assure users that an image is truly what it purports to be. This is an area where it might be most promising to intervene in industry discussions about security and control over access to digital information. Security tools like digital signatures, encapsulation, and cryptography might also be adapted to ensure integrity and authenticity. Because publishers and technologists are currently experimenting and developing standards for security, it is critical that the LIS community becomes immediately involved in shaping these standards so that they do not preclude extensions which will ensure integrity and authenticity.
Metadata standards for digital images are critical. Current practices for image header information are sufficient to provide most of today's applications with enough information (about file format and compression) to successfully view the image, but it is doubtful that these will be sufficient to view these images a decade from now (let alone view them a century later). Today it is difficult for applications to recognize or view documents created with the most widely-used word processing program of a decade ago (Wordstar). We must take the steps necessary to ensure that digital images produced today will be viewable well into the future, and a key step in making that happen is the provision of adequate metadata.
The first set of metadata we need to define is technical imaging information. This is the information that applications will need in order to open the image and view it appropriately. For this, we will need to include basic information about the image (dimensions and dynamic range), the scheme used to encode the image (file formats such as TIFF, GIF, JFIF, SPIFF, PICT, PCD, Photoshop, EPS, CGM, TGA, etc.), and the method used to compress it (JPEG, LZW, Quicktime, etc.). We will also need to note information about color, including the color lookup table and color metric (such as RGB or CMYK).
A second area for which we need to develop metadata standards is information about the capture process. We need to store information about what was scanned (a slide, a transparency, a photographic print, an original object), some type of scale to relate the size of the scanned image to the dimensions of the original object and/or the item scanned, and the type of light source (full spectrum or infrared). For quality control and accurate viewing, processing information (such as scanner make and model, date of scan, scanning personnel, audit trail of cropping and color adjustments, etc.) is likely to prove helpful. When color management systems improve their handling of onscreen display, having information about the model of scanner used to create an image will be critical in order to view that image with appropriate colors.
We also need to consider information about the quality and veracity of the image. Who was responsible for scanning (for certain purposes we might need to distinguish between an image scanned by the Metropolitan Museum of Art and an image of the same object scanned by a teenager on her home scanner)? What source image was scanned (the original, a high quality transparency, or a page out of an art book)? It would also be useful to be able to recursively track the source of the image. Our communities have not yet reached a consensus on whether digital copies are equivalent to other digital copies, particularly if they differ in compression scheme, file format, resolution or bit-depth, or if one is a close-up derived from a portion of the other. We have just begun to identify the issues in image equivalency (Besser & Weise 1995), and need to come to common agreement on vocabulary with which to discuss this (such as versions and editions). This kind of identification is also critical for us to be able to enter a new stage of networked information where we begin to identify digital information as distinct works (which may reside in multiple locations as the same or different versions) rather than the (very dangerous) current situation where we identify networked information as a particular location in the form of a URL. Separating a work from its location (through URNs and URCs) will be a critical development for networked access to information in the next few years.
Another critical factor involving veracity is to develop ways of assuring that the image is indeed what the metadata contends that it is. Today many images on the WorldWide Web purport to be what they are not (Besser Ethics). As mentioned earlier, systems for data encryption, encapsulation, and digital signatures need to be adapted so that they can help assure authenticity and veracity of images.
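As a rough illustration of how such tools detect alteration, the sketch below seals an image's bytes with a keyed digest and later checks the seal. It is an assumption for illustration only: it uses a shared-secret HMAC from the Python standard library rather than a true public-key digital signature, and the key and byte strings are invented.

```python
import hashlib
import hmac


def seal(image_bytes: bytes, key: bytes) -> str:
    """Compute a keyed digest of the image; any change to the bytes changes it."""
    return hmac.new(key, image_bytes, hashlib.sha256).hexdigest()


def is_unaltered(image_bytes: bytes, key: bytes, expected_seal: str) -> bool:
    """Check the image against a previously issued seal."""
    return hmac.compare_digest(seal(image_bytes, key), expected_seal)


if __name__ == "__main__":
    key = b"hypothetical repository key"          # assumption: held by the repository
    original = b"stand-in for the bytes of an image file"
    issued = seal(original, key)

    print(is_unaltered(original, key, issued))            # the untouched file
    print(is_unaltered(original + b"edit", key, issued))  # an altered copy
```

A scheme of this kind tells the user only that the bytes match what the sealing party issued; whether the metadata's claims about the image are true still depends on trusting that party.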
The final area that will be important is information about rights and reproduction of the image. It would be advantageous for metadata to note basic information such as use restrictions related to viewing, printing, reproducing, etc. Contact information for the rightsholder should also be included. Some of this information should be stored where it cannot be separated from the image (i.e., in the header or footer), while some of the information should be stored where it can easily be accessed by a retrieval program (i.e., in an external database). Because each derivative of an image inherits rights restrictions from its parent but may also convey certain rights to the derivative creator, the rights metadata for a given image might be complex (including restrictions on the original, a photographic copy, and a scan of that photographic copy).
Much work still needs to be done in refining each of these areas of image standards. The constituent communities (LIS, commercial imaging, networked information) need to come to some common agreement about these standards. They need to agree on what types of information must be placed in the image header (where it is less likely to become disassociated from the image), what types of information should be placed in an accompanying text record, and what information should be duplicated in both. For each piece of this metadata, these communities must identify a field to house it and define a set of controlled vocabulary or rules for filling in that field. Wherever possible, these communities should adapt existing standards to incorporate the needs of images. In some areas we will have to work with other bodies to make sure the standards they adopt will incorporate our needs, and in other areas we will have to set the standards ourselves. And in many cases we will have to follow the standard-adoption cycle with a strong public relations campaign in order to convince application vendors to implement the standard we adopt.
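The areas of metadata discussed above (technical, capture, and rights information, plus a link back to the source image) can be pictured as a simple record structure. The sketch below is only one possible arrangement: every field name is an assumption rather than an agreed standard, and the sample values are invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TechnicalInfo:
    width_px: int
    height_px: int
    bit_depth: int
    file_format: str   # e.g. "TIFF"
    compression: str   # e.g. "LZW"
    color_metric: str  # e.g. "RGB" or "CMYK"


@dataclass
class CaptureInfo:
    source_type: str    # e.g. "slide", "transparency", "original object"
    light_source: str   # e.g. "full spectrum" or "infrared"
    scanner_model: str
    scan_date: str
    adjustments: list[str] = field(default_factory=list)  # audit trail


@dataclass
class RightsInfo:
    rightsholder_contact: str
    restrictions: list[str] = field(default_factory=list)


@dataclass
class ImageMetadata:
    technical: TechnicalInfo
    capture: CaptureInfo
    rights: RightsInfo
    derived_from: Optional["ImageMetadata"] = None  # parent image, if any


def inherited_restrictions(meta: ImageMetadata) -> list[str]:
    """Collect restrictions from an image and, recursively, from its sources."""
    collected: list[str] = []
    node: Optional[ImageMetadata] = meta
    while node is not None:
        for restriction in node.rights.restrictions:
            if restriction not in collected:
                collected.append(restriction)
        node = node.derived_from
    return collected
```

Walking the `derived_from` chain shows why rights metadata can become complex: a scan of a photographic copy carries its own restrictions together with those of the copy and of the original.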
Because we are still constrained by the technological limitations of storage and bandwidth, we clearly have to separate the issue of the quality of image we capture and save from the quality of image we choose to deliver today. It is certainly possible (and perhaps preferable) to capture an image at a higher quality than we can afford to deliver, and derive a lower-quality image that we will deliver today. Then, as our technological capabilities improve, we can go back to those stored images and derive better-quality ones (without having to repeat the more costly step of image capture).
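The relationship between a stored master and a derived delivery image can be sketched in a few lines. The example assumes a grayscale image held as a list of rows of integer pixel values and uses simple block averaging; production systems would use an imaging library and a better resampling filter.

```python
def downsample(pixels: list[list[int]], factor: int) -> list[list[int]]:
    """Derive a smaller image by averaging each factor x factor block of the master."""
    height, width = len(pixels), len(pixels[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be multiples of the factor")
    derived = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                pixels[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) // len(block))  # integer mean of the block
        derived.append(row)
    return derived


if __name__ == "__main__":
    master = [
        [0, 0, 255, 255],
        [0, 0, 255, 255],
        [100, 100, 50, 50],
        [100, 100, 50, 50],
    ]
    print(downsample(master, 2))  # delivery image at half the resolution
```

The master is never modified, so a delivery image at a smaller reduction factor can be derived later from the same stored file.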
We still know very little about image quality needs. In the area of cultural heritage, there has only been one set of serious studies examining the quality of image we need to provide to users (Ester 1990, 1994). This set of studies (by the Getty Art History Information Program) had a small population, studied a small set of images, and did not examine the effects of compression. But the methodology of this set of studies (identification of the points at which users could not discern differences in image quality, plotting these on discernability/cost axes, and suggesting that delivery systems should choose the quality at the beginning of the various flat points on the curve) is very sound, and should prove useful for further studies.
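The "beginning of the flat points" rule can be expressed as a small procedure. The sketch below is an interpretation, not the studies' own method: the tolerance threshold and the data points are invented for illustration and do not come from the Getty AHIP results.

```python
def plateau_starts(points: list[tuple[float, float]], tolerance: float) -> list[tuple[float, float]]:
    """Return the points where a discernability/cost curve begins to flatten.

    points: (cost, discernability) pairs sorted by increasing cost.
    A plateau starts where the gain to the next point is within tolerance
    but the gain from the previous point was larger than tolerance.
    """
    starts = []
    for i in range(len(points) - 1):
        next_gain = points[i + 1][1] - points[i][1]
        prev_gain = points[i][1] - points[i - 1][1] if i > 0 else float("inf")
        if next_gain <= tolerance < prev_gain:
            starts.append(points[i])
    return starts


if __name__ == "__main__":
    # Hypothetical measurements: (relative cost, share of viewers noticing a difference).
    curve = [(1, 0.20), (2, 0.55), (3, 0.57), (4, 0.58), (5, 0.80), (6, 0.81)]
    print(plateau_starts(curve, tolerance=0.05))
```

Each returned point is a candidate delivery quality: spending more within the same plateau buys an improvement that viewers in the study population could not see.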
We must be careful not to let the perceptions of our current users affect our long-term custodianship over digital images. We know that users' perception of image quality changes over time, and is shaped by the quality of the images they see in their daily lives. In the early 1950s, a grainy 6-bit image on a screen would have looked excellent to a viewer used to black & white television. A decade ago, 8-bit images were really impressive; today they look inferior to people who have 24-bit display capabilities. If HDTV comes into widespread use, the average person's idea of what constitutes a quality image will again change significantly.
It is perhaps more relevant to seriously explore the use that is made of images in particular domains. In some domains it will be important for digital images to preserve some of the artifactual nature of the object (such as the paper grain on a manuscript page), while in other domains it will only be important to preserve the information content of the object (such as the words on a page). We need a better understanding of these differences.
We need many more studies like those done at Getty AHIP, stratified by user type (undergraduate student, faculty researcher, curator, research scientist), domain (art history, archeology, coronary medicine, astronomy) and type of object represented by the image (painting, pottery, X-ray). This will give us some guidance as to the level of image quality we need to deliver to current users. And we need to use what we learn from such studies to distinguish between different classes of purposes for image digitization (preservation, scholarly research, consumer access, etc.).
Because most collections of images have very little textual information already accompanying them, our traditional means of retrieval cannot easily be applied to images (Besser & Snow 1990). Museums, which collectively house one of the largest bodies of images that do have accompanying text, often assign terms to an image which are not at all helpful to the average layperson. Vocabulary for scientists, art historians, and doctors appears foreign to the average user searching for images.
Few collections anywhere in the world provide item-level access to images using terminology that is useful to the average person or to anyone outside the very narrow domain for which access was designed. While most collections wish to expand their usefulness to other "markets", very few will be able to afford the cost of assigning terms to each individual image within their collections. Two methods for dealing with this appear to hold promise: user-assigned terminology and content-based retrieval.
If we can develop systems for user-assigned terminology, collection managers can rely upon users to assign terms or keywords to individual images. Under such a system, when a user finds an image, the system would ask what words that user might have used to search for this image. Those words are then entered into the retrieval system, and subsequent users searching on these words will find the image. As the number of people using such a system grows, so does the number of access points for many of the images.
It is essential that such systems allow searches against officially-assigned terms both independently of user-contributed terms and in conjunction with them. We can expect two types of searches: one that only looks at terms assigned by catalogers, and the other that looks at both cataloger-assigned terms and at user-assigned terms. Systems like this will also be able to serve as aids to catalogers. One can envision a system where periodically user-contributed terms will be "upgraded" to officially-assigned terms by a cataloger (and will then be retrievable by both methods).
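The two search modes and the "upgrade" step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a description of any existing system: the class name `TermIndex`, its method names, and the sample image id and terms are all hypothetical.

```python
from collections import defaultdict


class TermIndex:
    """Toy index that keeps cataloger-assigned and user-assigned terms apart."""

    def __init__(self):
        # term -> set of image ids, one mapping per source of terms
        self.official = defaultdict(set)
        self.user = defaultdict(set)

    def add_official(self, image_id, term):
        self.official[term.lower()].add(image_id)

    def add_user(self, image_id, term):
        self.user[term.lower()].add(image_id)

    def search(self, term, include_user_terms=False):
        """Search official terms alone, or official and user terms together."""
        key = term.lower()
        hits = set(self.official.get(key, set()))
        if include_user_terms:
            hits |= self.user.get(key, set())
        return hits

    def upgrade(self, image_id, term):
        """A cataloger promotes a user-contributed term to official status."""
        key = term.lower()
        if image_id in self.user.get(key, set()):
            self.user[key].discard(image_id)
            self.official[key].add(image_id)


# Invented sample data, for illustration only.
index = TermIndex()
index.add_official("img-001", "manuscript page")
index.add_user("img-001", "old handwriting")
```

Keeping the two mappings separate is what lets a search run against cataloger terms only; once a term is upgraded it is found by either kind of search.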
As systems like this grow, future users may want to limit their searches to terms assigned by people whom they trust (perhaps because they come from the same field, or because they assign terms more reliably). So these systems will likely develop both a searchable "ownership" feature for each term assigned, and a "confidence level" that a user can set which applies to a group of owners. Design of systems like this will also have to be sensitive to the privacy of term contributors. Users setting confidence levels for term-assigners may locate these people through basic profiles of their subject expertise and position (but not name), or they may locate them by finding correlations between how other term-assigners and the user himself or herself assign terms to other images (as incorporated in current systems such as Firefly).
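One way to picture the "ownership" and "confidence level" features is the following Python sketch. It is a simplification under stated assumptions: confidence is set per contributor rather than per group of owners, contributors are identified only by pseudonymous ids (to respect their privacy), and all names, ids, and scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    image_id: str
    term: str
    owner: str  # pseudonymous contributor id, never a real name


def search_with_confidence(assignments, term, confidence, threshold):
    """Return image ids whose matching term came from a sufficiently trusted owner.

    `confidence` maps an owner id to a score in [0, 1] chosen by the
    searching user; owners missing from the map count as 0.
    """
    wanted = term.lower()
    return {
        a.image_id
        for a in assignments
        if a.term.lower() == wanted and confidence.get(a.owner, 0.0) >= threshold
    }


# Invented sample data, for illustration only.
assignments = [
    Assignment("img-001", "mask", owner="contributor-17"),
    Assignment("img-002", "mask", owner="contributor-42"),
]
# This user trusts contributor-17 more than contributor-42.
my_confidence = {"contributor-17": 0.9, "contributor-42": 0.3}
trusted_hits = search_with_confidence(assignments, "mask", my_confidence, 0.5)
```

Here only the image tagged by the more trusted contributor is returned; lowering the threshold would admit both.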
User-assigned terms are likely to be part of a broader trend that will affect collection access. As resources for cataloging diminish while digitally based material becomes more available, collection managers will begin to rely more heavily upon input from their users. Recently, a professor at the University of Virginia has been contributing information to the Fowler Museum in Los Angeles about the objects pictured in the digital image he is using through the Museum Educational Site Licensing Project. We will have to develop feedback mechanisms to channel information from scholars back into the collections and collection records.
In the past, we have maintained that image browsing functions will help overcome some of the problems associated with the paucity of associated text (Besser 1990). But recent breakthroughs in content-based retrieval hold the promise of even more far-reaching effects. Content-based retrieval systems such as Virage (see other paper in this volume), UC Berkeley's Cypress (see other paper in this volume), and IBM's QBIC offer users the opportunity to ask the system to "find more images like this one". The two critical pieces to content-based retrieval are image extraction (the system's capability of automatically finding colors, shapes, texture, or objects within an image) and relevance feedback (the capability to retrieve images in a ranked order in relation to attributes identified [usually as part of the extraction process]).
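A "find more images like this one" query can be illustrated with a deliberately simple Python sketch: extract a coarse color histogram from each image, then rank the other images by histogram intersection with the query. This is an assumption-laden toy, not how Virage, Cypress, or QBIC are implemented; the function names and the tiny synthetic "images" are invented for illustration, and real systems also use shape, texture, and object features.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Extract a normalized, coarse RGB histogram from (r, g, b) pixels (0-255).

    Assumes bins_per_channel divides 256 evenly and pixels is non-empty.
    """
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[idx] += 1.0
    total = len(pixels)
    return [count / total for count in hist]


def similarity(h1, h2):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def find_more_like(query_id, features):
    """Rank every other image by similarity to the query image, best first."""
    query = features[query_id]
    scored = [(similarity(query, h), image_id)
              for image_id, h in features.items() if image_id != query_id]
    return [image_id for score, image_id in sorted(scored, reverse=True)]


# Tiny synthetic "images": lists of pixels, invented for illustration.
images = {
    "red": [(250, 10, 10)] * 8,
    "mostly_red": [(250, 10, 10)] * 6 + [(10, 10, 250)] * 2,
    "blue": [(10, 10, 250)] * 8,
}
features = {image_id: color_histogram(px) for image_id, px in images.items()}
ranking = find_more_like("red", features)
```

The extraction step runs once per image when it enters the collection; only the cheap comparison runs at query time, which is why the two pieces are treated separately.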
Currently, some content-based retrieval systems are extending relevance feedback functions to incorporate existing text records in addition to image features, and this will prove to be a very powerful tool for image retrieval. In the coming years these systems will also need to adapt their measures of similarity to work differently for various user populations (for example, the meaning of similarity in color or texture may be different for a graphic designer than for an art historian).
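The idea of blending image features with text records, and of weighting them differently for different user populations, might look something like the sketch below. Everything here is hypothetical: the profile names, the weights (placeholders, not findings), and the use of simple term overlap as the text measure.

```python
def jaccard(a, b):
    """Overlap between two sets of record terms, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 0.0


# Hypothetical per-population weights; the numbers are placeholders.
PROFILES = {
    "graphic_designer": {"color": 0.7, "text": 0.3},
    "art_historian": {"color": 0.2, "text": 0.8},
}


def combined_score(color_sim, query_terms, candidate_terms, profile):
    """Weighted blend of an image-feature similarity and a text-record similarity."""
    w = PROFILES[profile]
    return w["color"] * color_sim + w["text"] * jaccard(query_terms, candidate_terms)


# Candidate A looks alike but is described differently; B is the reverse.
query_terms = {"portrait", "oil"}
a = (0.9, {"landscape"})
b = (0.3, {"portrait", "oil"})
designer_prefers_a = (combined_score(a[0], query_terms, a[1], "graphic_designer")
                      > combined_score(b[0], query_terms, b[1], "graphic_designer"))
historian_prefers_b = (combined_score(b[0], query_terms, b[1], "art_historian")
                       > combined_score(a[0], query_terms, a[1], "art_historian"))
```

With these placeholder weights the same pair of candidates is ranked in opposite order for the two populations, which is the behavior the paragraph above anticipates.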
In the future we can expect the emergence of new types of user interfaces. Virtual reality techniques will provide new ways of seeing and navigating through a body of information, and provide us with new metaphors for relating to that information.
Another key issue will be the development of analytical tools to view, recombine, and manipulate images. As was explained in the earlier section on ImageQuery, software and learning materials to manipulate images are critical parts of building a Scholar's Workstation. Tools like Mark Handel's CLens (Handel) (which lets a user move a digital magnifying glass over an image and move through different registered images [such as infrared or radiograph versions]) and Christie Stephenson and Lara Ashmore's templates (to help instructors create instructional exercises using images) are critical to making image databases useful as more than mere retrieval tools.
A final critical issue is that of scalability. No one has yet built a very large, highly used image database. Though we can identify key issues that we know will cause problems (such as how to handle queries that retrieve thousands of hits, or how to migrate images between primary, secondary, and tertiary storage), we really don't know how various architectures and functions will scale up.
From reviewing the past, it should be clear that what seem like insurmountable technological impediments can disappear in just a decade. From this we should learn not to let current impediments distract us from seriously moving towards the implementation of image databases for the future. Thinking about how today's impediments might be viewed a decade from now might help us move towards that future without being saddled with the limitations imposed by today's technologies.
This paper has outlined some immediate steps that must be taken in order to move forward. We must move from constructing a collection of discrete images to building a library of material that inter-relates and inter-operates. The digital library of the future will not simply be a collection of discrete objects, but will also provide the tools for analyzing, combining, and repurposing the objects. Digital objects housed in a library will become the raw material used to shape still newer information objects. Builders of image databases must develop a broad vision that goes beyond merely capturing and storing a discrete set of digital images.
Discussions with Christie Stephenson helped expand upon a number of points in this paper. Clifford Lynch, Cecilia Preston, and Janet Vratney helped identify the technical capabilities for 1986. Steve Jacobson, Randy Ballew, and Ken Lindahl wrote the code for ImageQuery. Maria Bonn provided editorial assistance.
Beauregard, Louise, Luc Bissonnette, and Judy Silverman. (1994). Report on Collections and Library Authorities Databases Mapping Project, Montréal: Centre Canadien d'Architecture, 1993-11-10 (unpublished internal document).
Besser, Howard. Ethics and Images on the Net, (WorldWide Web site) [http://sunsite.berkeley.edu/Imaging/Databases/Ethics]
Besser, Howard. Homepage, (WorldWide Web site) [http://www.sims.berkeley.edu/~howard]
Besser, Howard. Image Database Bibliography, (WorldWide Web site) [http://sunsite.berkeley.edu/Imaging/Databases/Bibliography]
Besser, Howard. Image Database Resources, (WorldWide Web site) [http://sunsite.berkeley.edu/Imaging/Databases/]
Besser, Howard. (1995a). Image Databases Update: Issues Facing the Field, Resources, and Projects, in Mimi King (ed.), Going Digital: Electronic Images in the Library Catalog and Beyond (pages 29-34). Chicago: Library Information Technology Association.
Besser, Howard. (1995b). Image Databases, Database 18 (2), April, pages 12-19.
Besser, Howard. (1994a). Image Databases, Encyclopedia of Library and Information Sciences 53 (16), New York: Marcel Dekker, pages 1-15.
Besser, Howard. (1994b). RFP for Library Information Systems for the Canadian Centre for Architecture, Montréal, (unpublished document) Montréal: Centre Canadien d'Architecture.
Besser, Howard. (1992). Adding an Image Database to an Existing Library and Computer Environment: Design and Technical Considerations, in Susan Stone and Michael Buckland (eds.), Studies in Multimedia (Proceedings of the 1991 Mid-Year Meeting of the American Society for Information Science), Medford, NJ: Learned Information, Inc, pages 31-45.
Besser, Howard. (1991a). Advanced Applications of Imaging: Fine Arts, Journal of the American Society of Information Science, September, pages 589-596.
Besser, Howard. (1991b). User Interfaces for Museums, Visual Resources 7, pages 293-309.
Besser, Howard. (1990). Visual Access to Visual Images: The UC Berkeley Image Database Project, Library Trends 38 (4), Spring, pages 787-798.
Besser, Howard. (1988a). Image Processing Integrated Into a Database for Photographic Images: Applications in Art, Architecture, and Geography, Electronic Imaging '88 vol. 2 (Advanced Paper Summaries), Waltham, MA: Institute for Graphic Communication.
Besser, Howard. (1988b). Adding Analysis Tools to Image Databases: Facilitating Research in Geography & Art History, Proceedings of RIAO 88, March, volume 2, pages 972-990.
Besser, Howard. (1987a). Digital Images for Museums, Museum Studies Journal 3 (1), Fall/Winter, pages 74-81.
Besser, Howard. (1987b). The Changing Museum, in Ching-chih Chen (ed), Information: The Transformation of Society (Proceedings of the 50th Annual Meeting of the American Society for Information Science), Medford, NJ: Learned Information, Inc, pages 14-19.
Besser, Howard. (1987c). Computers for Art Analysis, in R. A. Braden, et al. (ed), Visible & Viable: The Role of Images in Instruction & Communication (Readings from the 18th Annual Conference of the International Visual Literacy Association), Blacksburg, VA: IVLA.
Besser, Howard and Maryly Snow. (1990). Access to Diverse Collections in University Settings: The Berkeley Dilemma, in Toni Petersen and Pat Moholt (eds.), Beyond the Book: Extending MARC for Subject Access, Boston: G. K. Hall, pages 203-224.
Besser, Howard and Christie Stephenson. (forthcoming). The Museum Educational Site Licensing Project: Technical Issues in the Distribution of Museum Images and Textual Data to Universities, in Proceedings of the 1996 Electronic Imaging and the Visual Arts Conference, Fleet (Hampshire, UK): Vasari Enterprises.
Besser, Howard and Jennifer Trant. (1995). Introduction to Imaging: Issues in Constructing an Image Database, Santa Monica: Getty Art History Information Program. [http://www.ahip.getty.edu/intro_imaging/home.html].
Besser, Howard and John Weise. (1995). "Don't I Already Have that Image?" Issues in Equivalency of Digital Images (unpublished paper).
Cawkell, A. E. (1993). Developments in Indexing Picture Collections, Information Services and Use 13 (4), pages 381-388.
Ester, Michael. (1990). Image Quality and Viewer Perception, Leonardo 23 (1), pages 51-63.
Ester, Michael. (1994). Digital Images in the Context of Visual Collections and Scholarship, Visual Resources X (1), pages 11-24.
Handel, Mark. CLens, (Java Applet) [http://www.sils.umich.edu/~handel/java/Lens/]
ImageLib Listserv, (Internet site) [SUB imagelib to email@example.com]
ImageLib Clearinghouse, (WorldWide Web site), [http://www.library.arizona.edu/images/image_projects.html]
Lynch, Clifford A. and Edward Brownrigg. (1986). Conservation, Preservation & Digitization, College and Research Libraries 47, pages 379-82.
Moran, Barbara. (1987). The Electronic Campus: The Impact of the Scholar's Workstation Project on the Libraries at Brown, College and Research Libraries 48 (1), Jan, pages 5-16.
Museum Educational Site Licensing Project. Homepage, (WorldWide Web site) [http://www.ahip.getty.edu/mesl]
Phillips, Gary Lee. (1992). Z39.50 and the Scholar's Workstation Concept, Information Technology and Libraries 11 (3), Sept, pages 261-70.
Rosenberg, Victor. (1985). The scholar's workstation, College & Research Libraries News 10, Nov, pages 546-9.
Stigleman, Sue. (1996). Bibliography programs do Windows, Database 19 (2), April/May, pages 57-66.
Stephenson, Christie and Lara Ashmore. Examples [using MESL images to teach Art History], (WorldWide Web site) [http://jefferson.village.virginia.edu/uvamesl/example_projects/home.html]
United Kingdom Office for Library & Information Networking and OCLC Online Computer Library Center (organizers). Metadata Workshop II, April 1 - 3, 1996, University of Warwick, UK [http://www.purl.org/OCLC/RSCH/MetadataII/]
MIT X Consortium. (1993). X Image Extensions Protocol Reference Manual version 4.1.2. The X Resource: a practical journal of the X window system, "Special Issue C", January (O'Reilly & Associates).
Waters, Don et al. (1996). Preserving Digital Information: Report of the Task Force on Archiving of Digital Information, Washington: Commission on Preservation & Access (in press) [http://purl.org/net/archtf/].
Weibel, Stuart, et al. The Dublin Core Metadata Element Set Home Page, [http://www.purl.org/metadata/dublin_core]
For the purpose of this discussion, what we call the "LIS community" consists of a number of different communities: library, information science, cultural heritage, and the
Metadata is "data about data". A cataloging record and a bibliographic citation are both metadata for a book.
At some point in the future, a repository may discover that a particular scanning staff member was colorblind to orange or that a scanning device lost its blue sensitivity. This information will help identify (and possibly even restore) problem images.
This is similar to many OPACs today which permit subject searches against cataloger-assigned subject terms, but also allow keyword searches which run against words in a number of fields (including Subject).
Benjamin C. Ray of the Religious Studies department. The Fowler Museum does not currently have a curator to cover this domain, and in some ways Professor Ray is effectively acting as a remote curator for them.