AN EXAMINATION OF THE UC BERKELEY LIBRARY GOPHER 'INFOLIB'

Hella Heydorn
FINAL PAPER LIS 296A
SPRING 1994
Instructor: Howard Besser

Part I: Information and the structure and organization of InfoLib

  1. Information as a resource

  2. Information as a service

  3. Information as a pattern

  4. How easy is it to get to know InfoLib? A short practical evaluation

Part II: Cataloging issues and procedures

  1. Types of computer files-field: 008/26 and 256-fields

  2. Electronic location and access

  3. Classification

Part III: Who makes decisions for selection?

It seems like a risky endeavor to try to understand the nature of Gophers, tools that are still evolving but that have already been hailed for their potential as research instruments[1]. Until now, the Internet "architects" have been the guarantors of quality, since they include educational institutions, government agencies and other organizations involved in research. This paper will examine one small segment of the Internet, the UC Berkeley Library Gopher InfoLib.

The examination consists of three parts. In part one, I examine the structure and organization of InfoLib, keeping in mind a changing attitude towards information and information retrieval. In part two, I adopt a librarian's point of view in discussing cataloging issues, primarily for electronic journals. In part three, I consider InfoLib as part of a giant reference source. Who makes the decision to include and exclude an information source?

Part I: Information and the Structure and Organization of InfoLib

The information explosion goes hand in hand with the emergence of new information sources. The librarian and the library which s/he creates are affected by the interplay of information, the need for control, and control technologies. A technological and social phenomenon like the Internet invites us to revisit the mechanisms that bring together human beings and recorded knowledge. At the center of it all is information, and the filtering and delivery of it from the enormous quantity available.

There are a number of different definitions of information and of how it is filtered and delivered[2]. Blokdijk unites communication and knowledge in his definition[3]. Langefors includes the concept of value and action[4]. And Davis and Olson add that information reduces uncertainty[5].

Three ways of perceiving information (as a resource, as a service, and as a pattern) and how each affects the structure and organization of InfoLib are discussed below.

1. Information as a resource

This is the most general, accessible and flexible concept of information. Ravault [6] points out that this definition emphasizes the uses people make of information rather than its effects upon people and society. It is the least controversial one and the most widely used.

This is also the definition underlying the understanding of information on the Internet. However, when information is defined as a resource, librarians encounter problems. An array of difficulties stems from the basic fact that, unlike physically available material, information in the electronic environment cannot be shelved and physically pointed at. (This raises questions in cataloging procedures which will be dealt with in part II.) When information is viewed as a(n electronic) resource, its creators, processors and users appear as discrete and isolated entities. Information comes in pieces, unrelated to bodies of knowledge or information flows. Jesse Shera called this the "(...) stimulus which we perceive through our senses. This information may be a single isolated fact or it may be a whole cluster of facts; but it is still a unit; it is a unit of thought. It can have any dimension but it is that intellectual entity which we receive, the building block of knowledge" [7]. Shera's thesis is that librarianship is not only built upon documentation but also on dissemination of information: "it means taking the initiative in creating channels along which information may pass quickly to those who can use it" [8].

2. Information as a service

This definition applies to InfoLib insofar as UCB Libraries are offering a service to the UC Berkeley campus and a virtual scholarly community.

Information as a service increases its scope as it involves social, cultural and educational values. It incorporates the exchange and use of information among people (in the widest sense of the word). The social structure, too, is more articulated, comprising librarians, patrons and the university (as the organization that sustains InfoLib). This take on information also alludes to the power of knowledge -- and the role of librarians to contextualize knowledge and therefore to increase power.

3. Information as a pattern

This approach, too, adds context. Seen from this perspective, information has a past and a future. It is affected by changes and may be altered by decisions. To perceive information as a pattern reduces uncertainty. This model finds its most practical application in an economist's point of view (and less in an academic or intellectual setting). It is the underlying assumption for quantifying and valuing information. Information flow as a pattern may be used by those who select and control information resources (this question will be dealt with in part III).

These concepts of information provide us with a framework for understanding the advent of new information technologies (like the Internet and InfoLib) in the library. They broaden the concept of the library from a book storage facility to a social and intellectual phenomenon[9]. New technologies, therefore, are one factor (besides demographic and organizational factors) that brings about changes in the organization of libraries[10]. Computers seemed both to pose a fundamental threat and to offer great potential to librarianship as a profession and libraries as an institution. They created a dilemma for librarians. On the one hand, they facilitated circulation and revolutionized catalogs and therefore released librarians from repetitive, low-status tasks. On the other hand, computerization of library services required an almost complete standardization of bibliographic records, cataloging procedures, indexing etc. and profoundly shook up the "area of judgment that made librarians professionals"[11]. Just like computers, the Internet and electronic publishing were not invented by librarians. But they have forced librarians to rethink their professional ethos and not only adapt to but also organize the information chaos thrown up by the Internet[12]. While librarians have developed libraries into a sophisticated form of information control, it is now necessary that they invent forms of control for the Internet in order to bring people and knowledge together. With InfoLib, the library attempts to domesticate at least one portion of the Internet by filtering out the resources that may be most relevant to the academic disciplines of the university.

InfoLib is undoubtedly more part of a virtual library than of the UC Berkeley Libraries. It forces us to view the library as a logical entity and it offers a great opportunity for librarianship to shift the focus from library materials to the librarian.

Gapen[13] suggests viewing the virtual library as a metaphor for a control revolution. With this metaphor in mind, it is worth noting that the voices complaining about the Internet's general lack of control and organization have been as loud as those praising it as an invaluable reference source.

InfoLib is designed for use primarily by the UC campus and a wider scholarly community. In the most basic sense it is a tool to help search a variety of online resources. It is a substantial source of information. As with all information sources in the electronic environment, the value of a source is dependent not only upon the intrinsic quality of the information itself but also upon how easily the information can be accessed [14]. For printed materials, access and location are generally well understood (pages, indexes, etc.). Relative to printed matter, Gophers are still a new tool, and conceptual methods for using them efficiently are still evolving. The truly new aspect of Gophers is that they offer more than one possible path through information, thereby taking into account that information is multifaceted and that information sources are pluralistic [15]. This pluralistic approach is enticing in its flexibility and openness but it does not offer any concrete guidance in choosing from the many sources available, each one characterized by its own constraints and motivations. InfoLib is a menu-based service. Its hierarchical structure attempts to offer a solution here. InfoLib tries to introduce order into the chaos of the Internet by allowing searches to be defined by scope (from broad to narrow) and subject. The conceptual structure is a matrix of functional subsystems, each of which is divided into yet more subsystems. For example, "Other Gophers" offers a whole subset of menus. The user is unaware of the client-server connections being made. S/he only sees the hierarchical menu system - a set of linked menus that are easy to navigate - with an interface that does not change throughout the system.

For example, if a patron of the virtual library wants to look up President Clinton's latest speech, s/he chooses as one possible path: Other Gophers-Domestic Gophers-LC Marvel-Federal Government Information-Federal Information Resources-Information by Agency-Executive Branch-White House-White House Speeches-1993 or February or March or April. The Gopher "prowls" through the Internet to look up the President's speeches. For the library patron, InfoLib meets inquiries in three ways: 1. as an information resource; 2. as an information service; and 3. as an information pattern [16].
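The menu path in this example can be pictured as a walk through a tree of nested menus. The following Python fragment is only an illustrative sketch: it models the hierarchy as nested dictionaries and makes no network connections, whereas a real Gopher client fetches each menu from a server. The names build_menu, follow, SPEECH_PATH and LAST_LEVEL are hypothetical and chosen for this illustration; only the menu titles come from the example above.

```python
# Illustrative sketch only: InfoLib's real menus are served by Gopher servers.
# build_menu, follow, SPEECH_PATH and LAST_LEVEL are hypothetical names.

def build_menu(paths):
    """Build a nested-dict menu tree from a list of menu paths."""
    root = {}
    for path in paths:
        node = root
        for title in path:
            # Create the submenu the first time it is seen, then descend.
            node = node.setdefault(title, {})
    return root


def follow(menu, path):
    """Make one selection per menu level and return the submenu reached."""
    node = menu
    for depth, title in enumerate(path, start=1):
        if title not in node:
            raise KeyError(f"no menu item {title!r} at level {depth}")
        node = node[title]
    return node


# Menu titles taken from the example path in the text.
SPEECH_PATH = [
    "Other Gophers",
    "Domestic Gophers",
    "LC Marvel",
    "Federal Government Information",
    "Federal Information Resources",
    "Information by Agency",
    "Executive Branch",
    "White House",
    "White House Speeches",
]
LAST_LEVEL = ["1993", "February", "March", "April"]

menu = build_menu([SPEECH_PATH + [item] for item in LAST_LEVEL])
choices = sorted(follow(menu, SPEECH_PATH))
print(f"{len(SPEECH_PATH)} selections lead to: {choices}")
```

The sketch makes the trade-off visible: the interface stays the same at every level, but the patron has to make nine selections before the speeches come into view.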

4. How easy is it to get to know InfoLib? A short practical evaluation.

A key source of Gopher's fame is that it is user-friendly and requires minimal instruction.

Shortly after InfoLib's introduction, the UC Berkeley Teaching Library offered a series of instruction sessions which were well attended, mainly by graduate students and by library professionals from the area. Even though, according to the head of the teaching library [17], students are fairly open to new technologies involving libraries in general and information retrieval, explanations pertained mostly to lowering the threshold and to drawing attention to the Library Gopher in general. Nearly 30,000 connections were made through the UC Berkeley InfoLib between February 1 and March 1.

InfoLib is conveniently accessible via the Melvyl and Gladis welcome-screen on terminals throughout the Libraries. As we approach the universal catalog, both welcome-screens are somewhat cluttered. InfoLib appears as yet one more menu-item on the screen and is in fact not very obviously or outstandingly introduced as a unique research tool[18].

Once the InfoLib screen pops up, it presents itself as a line by line menu, ranging from the most general "About InfoLib" to more specific "title search techniques in this Gopher". The underlying metaphors which may be useful in understanding the nature of InfoLib are the library and the encyclopedia. The information base is best viewed overall. Like a library patron the user must bring his or her own agenda to the information base. And the non-linear approaches to the information base make it encyclopedic in style[19].

If the Internet can be compared to a giant information resource, Gopher can vaguely be compared to a librarian who with the help of a subject index guides the user to the desired part of the virtual library. As Lynch and Preston point out, users do not always need access to the entire universe of information [20]. What really is needed is a subset that applies to the user's particular interests [21].

Part II: Cataloging Issues and Procedures

This part presents suggestions to describe and classify networked information sources. Of particular interest are electronic journals because of their phenomenal proliferation. They have sparked heated debate about cataloging issues and procedures[22]. But there is no serious question as to whether librarians should get involved in describing these resources on the Internet (and electronically published material in particular). Quite the opposite. Librarians are called upon to provide "bibliographic" control to this wealth of information because of their reputation as experts in organizing material and their professional ethos of providing services to the academic community.

There are two USMARC-fields that lend themselves to expansion or are genuinely flexible. These are: the types of files-field and the electronic location and access-field (holding codes).

In the library world, these suggestions are accompanied by a more theoretical debate about the usability of AACR2 for networked electronic information, and the agony and excitement that go with changing well-established rules and practices. At the center of the discussion is "Proposal 93-4"[23], which makes recommendations for changing the bibliographic format in order to house electronic journals (and other electronic resources accessible on the Internet)[24].

1. Types of computer files-field: 008/26 and 256-fields

The 008/26 field is used for retrieval of specific types of computer files. It is suggested that the list of computer file types in this field be enlarged in order to make this data element more useful.

The more specific codes are believed to be necessary to adequately describe types such as: bibliographic data, font, game, sound, and graphic. These suggestions meet some opposition in parts of the library world, primarily due to the method by which these suggestions were derived and the spectrum of librarians who cataloged in OCLC's computer files-experiment.

There is particularly a clash between catalogers of large research institutions and librarians of smaller college, public or school libraries. For the latter, the term "graphic" for example may be precise enough since they do not catalog a very diverse spectrum of graphic materials. For a research university cataloger, however, defining a file as "graphic" may represent a loss of information. For example, a satellite picture has traditionally been described as "representational" - a term that is more precise than "graphic".

The 256-field (file characteristics) is used as a collation field for computer files and includes a statement of the type of file. To this date, "computer data", "computer program" and "computer data and program" have been used in this field. Some librarians criticize its scope as too limited. For example, patrons looking for an electronic journal would not naturally think of it as computer data. A possible expansion of field 256 would embrace "electronic document" and "electronic journal".

Changes to field 256 have to be preceded by changes to AACR2. This has to be carefully evaluated. Changes to the cataloging rules, as some librarians argue, should be driven by a careful and precise description of materials. Catalogers have to evaluate whether a term like "electronic journal" does not, in the final analysis, water down the descriptive wealth contained in a term like "computer data", and whether its specificity may be deceptive.

2. Electronic location and access

As identified by Dalton and others[25], the most challenging aspect associated with cataloging e-journals is the frequently changing location and means of access of Internet resources[26]. In this respect, their "behavior" is similar to serials[27]. But the necessity of hardware and software locates them in the vicinity of computer files. The decision to catalog computer files as (monographic or serial) computer files or as serials depends ultimately on the local cataloging practice. Unless there are plans to print out on paper or to download on diskette any of the selected materials, these titles will have no physical location in the sense that other more traditional materials do. They will, instead, reside somewhere in the electronic environment of the Internet (or InfoLib). Therefore, creating a new holding code to structure the electronic environment will be helpful to library patrons in a variety of ways.

One of the problems is keeping the holding statement concise enough that it can be included in a bibliographic record. Creating an entirely new holding code for the electronic part of the library collection offers the opportunity to include instructions for access and sufficient flexibility to add, delete or change this information as needed.

Proposal 93-4 introduced a new field for "Electronic Location and Access" (856) that includes data elements for type of access (for example e-mail, FTP, Telnet and other), host name, path name, file name and other information necessary to access or retrieve a source over the Internet. Field 856 is a holdings and location field identifying a particular location of an item. It can be repeated when an item is available in more than one form. For example, a compressed file may have a different file name than the uncompressed file. In this case two 856 fields would be given with different filenames. This may also be the case for very long documents that had to be broken up.

The main TCP/IP protocols would be defined in "indicator 1 - Access method" which will reflect upon how the remaining data in the field will be used. The subfields are: $a Host name, $b IP address, $c compression information, $d path, $f filename, $g name of publication or conference, $h processor of request, $i instruction, $k password, $l logon/login, $m contact person for information, assistance, $n name of location of host, $o operating system, $p port, $q file mode, $s file size, $t terminal emulation, $x non-public note, $z public note, $2 source of access (if "other"). This information is judged valuable since electronic resources can be found in several formats in a number of directories, at a number of hosts, including non-Internet resources[28].
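The structure of field 856 described above can be illustrated with a small Python sketch. It uses the subfield codes and labels listed in the text and renders two 856 fields for one item, one for an uncompressed file and one for its compressed copy. Several details are assumptions made for illustration only: the function name format_856, the host ftp.example.edu, the path and file names, the use of "1" as the indicator value for FTP, and the "#" display convention for a blank second indicator are not taken from Proposal 93-4.

```python
# Illustrative sketch only, not a MARC implementation.
# Subfield codes and labels follow the list given in the text.
SUBFIELD_LABELS = {
    "a": "Host name",
    "b": "IP address",
    "c": "Compression information",
    "d": "Path",
    "f": "Filename",
    "g": "Name of publication or conference",
    "h": "Processor of request",
    "i": "Instruction",
    "k": "Password",
    "l": "Logon/login",
    "m": "Contact person for information, assistance",
    "n": "Name of location of host",
    "o": "Operating system",
    "p": "Port",
    "q": "File mode",
    "s": "File size",
    "t": "Terminal emulation",
    "x": "Non-public note",
    "z": "Public note",
    "2": "Source of access",
}


def format_856(access_method, subfields):
    """Render one 856 field as a display string.

    access_method is the first indicator (assumed here: "1" means FTP);
    subfields is a list of (code, value) pairs in the order they appear.
    """
    parts = [f"856 {access_method}#"]  # "#" stands for a blank indicator
    for code, value in subfields:
        if code not in SUBFIELD_LABELS:
            raise ValueError(f"unknown 856 subfield ${code}")
        parts.append(f"${code} {value}")
    return " ".join(parts)


# Hypothetical item held both uncompressed and compressed: two 856 fields.
plain = format_856("1", [("a", "ftp.example.edu"),
                         ("d", "/pub/ejournals"),
                         ("f", "issue1.txt")])
packed = format_856("1", [("a", "ftp.example.edu"),
                          ("c", "compress"),
                          ("d", "/pub/ejournals"),
                          ("f", "issue1.txt.Z")])
print(plain)
print(packed)
```

Keeping location data in a repeatable field of its own, rather than in the descriptive part of the record, means that a change of host or path touches only the 856 lines.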

3. Classification

Classification (and subject headings) for computer files is likely to be more general and therefore simpler (than for printed works). In addition, classification of computer files will not result in shelf locations as classification of materials in other formats does. However, there are a number of good reasons to classify these materials. First of all, classification will provide the librarian with a collection of data that will be useful in assessing the library's collection's strengths and weaknesses.

In addition, classification will supply a call number which in turn will constitute an additional access point to material on InfoLib. Call numbers are a constant element that will link titles under the earlier title and the later title (in case of a title change), improving access and assuring continuity[29].

The discussion revolving around cataloging issues, definitions and terminology underlines the pluralistic character of information and its retrieval. It confirms that description of networked information resources will ultimately depend upon what users want[30]. Librarians are called upon to alter and refine their techniques so that they evolve simultaneously with the progress of society itself. Without quality cataloging, however, the chances of successfully creating the knowledge link are lowered[31].

Part III: Who Makes Decisions for Selection?

Library material selection procedures have always been accompanied by internal departmental rivalries revolving around who is in charge, and by frustration among patrons who cannot fathom that the particular item they are looking for was not selected and is therefore not included in the library collection.

Selection of material for a large research library is usually regarded as a scholarly activity. Educational and economic considerations come into play as well as a certain vision of the future. The latter is certainly appealed to in selecting (re-)sources for the Library and Information Studies portion of InfoLib. As we move toward the virtual library with universal access it seems to become less and less important who writes the collection policy and whether a collection policy is implemented, since the "network of networks" will assure access to anything anywhere. Gophers in particular seem a step away from a "corporate mentality" [32] ("a mainframe servicing hard-wired terminals"). They allow individual departments to manage, arrange and distribute information sources in a way that suits them best (as done with InfoLib).

Librarians as selectors of materials have two qualities going for them: 1. they are affiliated with an educational institution that stands for a set of values, and 2. they fulfill a public service as they protect open access to information.

Who Controls It?

At second glance, however, it becomes apparent that the information highway is actually controlled by various interest groups (private industry, government agencies, universities etc.). Access to less mainstream databases and resources may be jeopardized at some point in the future.

In a university environment, the question of who controls information and the availability of information does not naturally rise to prominence[33]. The environment is generally considered as "benign", competent, democratic and socially responsible. But in the electronic age where information is predominantly viewed as a commodity, the evolving interdependencies between universities, private industry and Government may be less fruitful for library collections.

On a larger scale, the United States is trying to assume a leadership role in the Information Age. Congress is currently considering a number of bills designed to promote private sector investments in the National Information Infrastructure, while trying to protect the public interest. These bills include proposals to update telecommunications regulations, promote applications of the NII, and develop information policies for the digital era"[34]. Even though many see a role for Government in developing public service applications (for schools and libraries), most material contained in the major research libraries is not digitized and questions remain about copyright issues and who will pay the cost of conversion[35]. In addition, the private sector firmly maintains that companies, not the Government, already have created much of the information superhighway, and that the private sector has a critical role in the evolution of the National Information Infrastructure. Industry assumes responsibility for developing, planning, designing and implementing the NII for the market place. It expects the Government to create a conducive legal and regulatory environment.

This does not directly and presently affect InfoLib but it certainly may affect the larger picture of the information superhighway and therefore may have repercussions for InfoLib in an attempt to curtail pluralism.

Bibliography

Abbott, Andrew: The system of professions. Chicago: University of Chicago Press, 1988.

Beniger, James R.: The control revolution: technological and economic origins of the information society. Cambridge: Harvard University Press, 1986.

Blokdijk, André and Paul: Planning and design of information systems. London: Academic Press, 1987.

Borah, Eloisa Gomez: "Beyond navigation: Librarians as architects of information tools". In: Research Strategies. A Journal of Library Concepts and Instruction. Vol. 10, no. 3, Summer 1992.

Braman, Sandra: "Defining information. An approach for policymakers." In: Telecommunications Policy. September 1989.

Brett, George H.: "Accessing information on the Internet". In: Electronic Networking. Spring 1992, vol. 2, no.1.

Bridges, Karl: "Gopher your library". In: Wilson Library Bulletin. November 1993.

Caplan, Priscilla: "Cataloging Internet resources". In: PACS Review 1993, vol. 4.

Congressional Research Service.

Convey, John: Online information retrieval. 3rd ed. London: Clive Bingley, 1989.

Dalton, Marian L.: "Does anybody have a map? Accessing information in the Internet's virtual library." In: Electronic Networking. Vol. 1, no. 1, Fall 1991.

The Daily Californian. Vol. CXXIII, no. 68, April 15, 1994.

Dillon, Martin; Jul, Erik; Burge, Mark; Hickey, Carol: "Assessing information on the Internet: Toward providing library services for computer-mediated communication". In: Internet Research. Vol. 3, no. 1, Spring 1993.

Gould, Carol C. (ed.): The information web: ethical and social implications of computer networking. Boulder: Westview Press, 1989.

Internet accessible guide to library catalogs. 1992.

Internet resource guide. 1989.

Kalin, Sally W.: "Beyond OPACS...The wealth of information resources on the Internet". In: Database 14(4), August, 1991.

Krol, Ed: The whole Internet. User's guide and catalog. Sebastopol: O'Reilly, 1992.

Lancaster, F. Wilfrid; Warner, Amy J.: Information retrieval today. Arlington, VA: Information Resources Press, 1993.

Langefors, Börje: Theoretical analysis of information systems. Philadelphia, 1973.

Lynch, Clifford A.; Preston, Cecilia M.: "Describing and classifying networked information resources". In: Electronic Networking. Vol. 2, no. 1, Spring 1992.

Machlup, Fritz; Mansfield, Una (eds.): The study of information: interdisciplinary messages. New York: John Wiley & Sons, 1983.

OCLC Research Report 93/1.

Ploman, Edward W.: International law governing communications and information. Westport, Connecticut, Greenwood, 1982.

Riley, M.J.: Management information systems. 2nd ed. San Francisco: Holden Day, 1981.

Ravault, Rene J.: "Information flow: which way is the wrong way?". In: Journal of Communication. Vol. 31, no. 4, 1981.

Saunders, Laverna M.: The virtual library. Visions and realities. Westport: Meckler, 1993.

Shera, Jesse H.: Libraries and the organization of knowledge. Hamden, Connecticut: Archon Books, 1965.

Shera, Jesse H.: Sociological foundations of librarianship. New York: Asia Publishing House, 1970.

Tillett, Barbara: Bibliographic relationships: Toward a conceptual structure of bibliographic information used in cataloging. Ann Arbor: U.M.I., 1987.

U.S. Library of Congress: Delivering electronic information in a knowledge-based democracy. Washington, July 14, 1993.

Witteman, Mark: The feasibility and desirability of cataloging Internet files: A study for the UC Berkeley Libraries. (unpublished), Berkeley, 1993.

Wynar, Bohdan S.: Introduction to cataloging and classification. Littleton, Colorado: Libraries Unlimited, 1985.


[1] Kalin, 1991, p. 28.

[2] There are more than 40 academic fields that deal with information. A survey discovered that more than 100 definitions of information relating to "communications" were being used more than 10 years ago. (Edward W. Ploman, International law governing communications and information, Greenwood, Westport, CT, 1982.)

[3] "what reaches man's consciousness and contributes to his knowledge" (p. 421).

[4] "Information is any kind of knowledge or message that can be used to improve or make possible a decision or action" (p.319).

[5] ..."underlying the use of the term in information systems are several common ideas: information adds to a representation, corrects or confirms previous information, or has "surprise" value in that it tells something the receiver did not know or could not predict" (p.200).

[6] Rene J. Ravault,"Information flow: which way is the wrong way?" In: Journal of Communication, vol. 31, no. 4, 1981, p. 129-134.

[7] Jesse H. Shera: Sociological foundations of librarianship. 1st ed. New York: Asia Publishing House, 1970, p. 96-97.

[8] Shera, 1965, p. X.

[9] Saunders, 1993, p. 5.

[10] Andrew Abbott, 1988, p.220.

[11] Andrew Abbott, 1988, p. 220.

[12] Shera, 1965, p. 54: "...the rapid proliferation of new techniques for information retrieval, many of which are as yet untested; all of these quickly invalidate any attempt to return to standardization (and) necessitate a re-examination of the fundamental theory of librarianship."

[13] D. Kay Gapen, "The virtual library: knowledge, society, and the librarian". In: The virtual library, 1993, p. 2.

[14] Dalton, 1991, p. 31.

[15] Franklin Ford argued that information appears differently when perceived by the individual, the class and the whole. To IBM, information operates simultaneously as an asset, a resource, and a commodity. (Braman, 1989, p. 235.)

[16] Braman offers a fourth category: Information as a constitutive force in society. 1989, p. 235.

[17] The Daily Californian, vol. CXXIII, no. 68, April 15, 1994, front page.

[18] There is, for example, no indication as to the wealth of resources that InfoLib is a gateway to.

[19] InfoLib's potential seems to lie more in its future use than in its present use, particularly as some features are still in experimental mode or material cannot be accessed yet.

[20] They discovered a need for directories of networked information resources in the dizzying array of online library catalogs, electronic documents etc. So far, there is Internet accessible guide to library catalogs (St. George and Larson, 1992) and Internet Resource Guide (Partridge Roubicek, 1989). Though Lynch and Preston do not call into question the value of these print-oriented directories, they believe that they cannot solve the long-term problem of describing and classifying networked information sources.

[21] Lynch and Preston, 1992, p.21.

[22] For example, Caplan, 1993, p. 61-66.

[23] OCLC Research Report OCLC/OR/RR 93/1.

[24] In an initial experiment, a group of volunteer catalogers were asked to catalog electronic resources applying the USMARC computer files format and AACR2 as best as they could and to keep a journal as they go along, documenting their difficulties, observations and suggestions.

[25] Dalton, 1991, p. 36; Witteman, 1993, p. 4.

[26] As Lynch and Preston point out, the resources which will be selected for a library collection will be less volatile than a lot of other resources on the Internet (1992, p. 20).

[27] I have not found the argument very convincing that some material on the Internet should not be cataloged because of its ephemeral character. More traditional materials like monographs and serials get lost or stolen, are mutilated, go out of print, receive a new edition, new title, new author; publishing houses cease to exist, merge etc. Needless to say, those are vexing incidents for both users and catalogers but on the whole they don't call into question the value of cataloging principles and the usability of the entire catalog.

[28] In a provocative manner, it can be argued that information on location in the record itself is not necessary. Here is where URIs and URLs come into play. "Only URI's would be imbedded in the bibliographic description, and computers would associate the URI with one or more URL's in much the same way an Internet host name (HARVARDA.HARVARD.EDU) is associated with its IP address (128.103.60.11) by the name server system." (Caplan, 1993).

[29] Lynch and Preston, 1992, p. 20: "One problem with this approach is that it depends on LCSH being universally understood (...) One does not want too many classification schemes since this will "balkanize" the information resources on the network."

[30] Internet Research, 1993.

[31] Levels, 1993.

[32] Karl Bridges, 1993, p. 36.

[33] Machlup, 1983:"A largely hidden information community has arisen to help feed the enormous and ever increasing appetites of information users. The information community is made up of a variety of institutional participants, including publishers, information clearinghouses, educational institutions, broadcast companies, vendors, brokers, document delivery services, libraries and information centers that serve information creators, processors and users."

[34] Congressional Research Service; CRS Report for Congress.

[35] U.S. Library of Congress, Delivering electronic information in a knowledge-based democracy. Summary of Conference Proceedings, Washington, July 14, 1993.

