Web Site Administrators
Laurie Gelb
Administrative Manager, Department of Biomathematics
University of Texas M.D. Anderson Cancer Center
Abstract
The objectives of this paper are:
1. To delineate the difference between information and data.
2. To introduce the difference between using a menu item as a location, and using it as a sign.
3. To explore the process by which the Web user structures a question or need which requires information access.
4. To identify important information and transactional variables, such as:
* type (opinion or perspective, fact, fantasy, creation)
* format (text, graphics, sound, animation, search engine)
* subject (what is described)
* source (where did information come from)
* location (server)
* provider (person/institution who creates access)
* link (how accessed)
* user (planned and actual usage)
and their implications, such as:
>>The need to anticipate and accommodate diverse user decision trees and models of reality.
>>The need to design "right" and "left" turns from each information item, not just "level up" and "level down." Turns should reference data type as well as subject.
>>The necessity of bidirectional feedback systems--users must be able to report inaccurate, obsolete links, information easily; also "wish lists" of features and information. Providers must convey limitations, plans, structure of information to greater degree. Providers must also implement formal mechanisms for receiving and processing user feedback.
Author's note and suggestion: This paper contains no links to other text. As you read through this paper, please note exactly where you would expect to see a hypertext link. Then, show this paper to a colleague and ask her to do the same. The degree to which your answers agree or disagree indicates the degree to which your information models agree. As discussed below, these models profoundly affect the information exchanged via the Web.
Introduction
In the abstract sense, the Web is an immense network of minds, connected on demand by machines. As they create and develop the strands of the Web, those who control the machines "are not modelling reality, but the way information about reality is processed by people (1)." When people communicate and utilize information, they impose their concepts and their models of reality on these outputs. A representational data model, or information model, for the Web proposes that:
1. Data are representations rather than subsets of reality, and, therefore, can be neither affirmed nor denied.
2. Information is data in context--transformed and evaluated by human perceptions of reality. Information need not map exactly to any particular data.
3. The Web functions as a communication medium to transmit information between and among users and providers.
4. The Web functions as a creative tool to facilitate synergistic exchanges of information, in which the products created jointly are superior to those possible without the Web.
5. The Web's role as a commodity exchange is to enable mutually-beneficial exchanges between individuals, utilizing all available information, in the potential absence of conscious contact.
6. Navigating the Web may be analogous to using a two-dimensional road map in which signs represent actual locations, or a three-dimensional radar screen, in which signs are only indicative of locations. Assuming either case is possible, the Web provider should consider the location, content and format of signs from both a driver's and pilot's viewpoint.
1. Data as representations of reality
Data are raw, unprocessed representations of some reality--not pieces of any one definitive existence. Consider a photograph, which one might suppose is as close to "reality" as can be. The color palette of the developing service, the quality of the paper on which the photo is printed, the point of view of the photographer, and the type of camera used are only a few determinants of what "reality" appears on the picture. Even beyond these variables, "reality" as seen through a camera's viewfinder is still only one subset of the total scene; objects are cropped out of the picture, the scale differs by photo, etc. Thus, a photo is representational data, not reality.
Data may be graphic, aural, textual, numerical or in some other format, but all are without context. For this reason, data have no truth value. One cannot judge the "accuracy" of a picture. Even numbers, which are often incorrectly assumed to contain their own truth values, are simply humanistic representations of a particular event or phenomenon.
2. Information as data in context
Information is representational data; it is data placed in some sort of context. If Jane sits at her computer, writes a sentence and saves this sentence to the clipboard, it remains data. If she sends it through cyberspace as a contribution to an interactive fiction product, it becomes information. The sentence as data now has gained a context. Information expresses conclusions, concepts, ideas--all of which can be evaluated, refuted, supported or at least endlessly debated.
Information and data need not correspond with each other on any sort of one-to-one basis. A menu item condensing a news article does not represent the original news article. It represents information about the subject and perspective and author of the news article. But this information is not fact; it is perspective. One user may perceive the reporter as biased toward a national health care plan; another may perceive the reporter as objective and fair.
The Web helps providers and users transform data into information. One can submit information and receive information (or merchandise or a certain data set) in return, whether in an individualized message or with an on-line form. One can click an object, receive part of it, and be asked to indicate if the remainder of the download is desired. One can assist a Web provider in maintaining up-to-date links to other information. It is the informational aspect of the Web which holds the most promise for users and providers, even as providers purchase more gigabytes and more sophisticated servers through which they intend to offer massive amounts of data.
3. The Web as a communication medium
Like several other media, the Web permits the exchange of information by using our transmitted images, words and sounds to substitute for a physical and/or real-time presence. Unlike voice or electronic mail, however, the Web allows communication to occur without the specification of an individual or group as the recipient of information. Yet being able to broadcast information does not ensure that those who need it most will receive it.
One obvious motivation for maintaining a Web is "sharing and communicating information"(2), just as humans, as social animals, have always sought to do. However, this task requires the willingness to offer a flexible, rather than rigid structure by which this information is communicated. The assertion that "agreeing on standard data exchange protocols and domain-specific vocabularies and codes is our greatest challenge"(3) stresses that the Web is a collection of smaller domains. Yet, as discussed below, a basic flexibility in presenting information is in itself a basic structure.
Often, Web providers are more concerned with what subject-oriented information can be provided than with how it is provided. Thus, a Web site administrator may presume that the classification schemes she uses to structure resources for the user are sufficiently universal, and focus instead on "content." However, as Marshall McLuhan noted, the medium is the message. The information provided by a Web site as to the data classification schemes available to the user is not only descriptive of the medium, but in itself is content.
Thus, when providers focus on the raw material their sites offer, they lose sight of the Web as an information exchange, and of their role in the transformation of data into information. This transformation cannot occur without the provision of a context or framework for the data. Whether minutes, hours or days go into the creation of this framework, the end result is the most important aspect of any Web provider's "product." The framework's accessibility, consistency, and viability may well determine the extent to which those who need or want certain information from the Web actually find and use it. Even the use of robots to extract information involves usage of the context of the programmer who developed the robot.
Fortunately, communication implies only the transmission of information, and does not require that either party share the other's basic models of reality. We do not believe that our classifications of data are necessarily shared by a television newscaster, a newspaper reporter, or even our network administrator. Nor can a user assume that the providers of Web servers with which she interacts support her models. Every menu item on every Web site is information about a Web provider's model, even as it represents the subject to which its sign points. The crucial question is, does the information presented about a provider's model facilitate or hinder users' productive use of the site involved?
4. The Web as a creative tool
Traditionally, creation has involved the intersection of physical and mental capabilities. However, the cyberspace age emphasizes the mental aspect. No longer is manual labor required to create; no longer must one face the audience or deliver the product in person. The Web permits infinitely-branching creative efforts, because contributors can enter the process at any point, contribute in several different ways, and leave the process at any point. In the future, as more users become providers, they will find it easier to initiate creative projects via the Web, thus producing a wider variety of possible outputs.
5. The Web as a commodities exchange
Use of the Web for commerce will likely increase geometrically as secure and confidential access becomes a certainty. However, in order to manage user expectations, it should be clear before a Web site is entered whether or not commercial activity is possible at that site. Also, the question of how a site adds value to the transaction remains important. It is difficult to provide on-line product information that is always as aesthetically pleasing and complete as a printed catalog or an in-person shopping trip would provide. The possible availability of instant information and searchable databases regarding a product's specifications, availability and delivery time is attractive in some product categories and redundant in others.
It is now a media cliché that "despite the power of the technology on one's desktop, one can often be frustrated by the small decisions"(4), such as where and how to find needed resources, or even how to begin communicating via e-mail. The Web's development as a commercial medium awaits the conversion of thousands of new users into willing participants.
6. Facilitating Web users' navigation
- Posting signs: some pitfalls
Many models of reality yield a user interface which relies greatly on signs. These signs are only sometimes even somewhat proximal to the locations they are said to indicate. It is easy to assume that a location which is referenced by a sign is also represented by that sign. Of course, that is not always the case. An information provider often equates signs with links, and provides no further path to the information. And in the Web world, an icon is a sign leading to data; a subject heading is a sign leading to an icon. Yet an icon also points to an actual location for certain files; thus, it is convenient for the provider to equate folders, signs and links.
The architecture of some Web browsers and many sites encourages a "level up" and "level down" mentality, e.g. a connection to an ftp site usually offers a "parent directory," departmental gophers may be subordinate to a Main Gopher, etc. However, a user who takes an early wrong turn may not only get lost but lose interest in the destination. In physical travel, if you miss the turnoff, you can turn back and find your destination with a minimum of wasted time, because there is only one possible error. If you choose the wrong Web sign, you may waste time going in the wrong direction, yet be unaware that you are doing so, because no specific landmark alerts you to that possibility.
The Web's tremendous potential for productive exchanges is often thwarted by signage which lacks meaningful information from the traveler's perspective. If a software archive has a directory entitled, "utilities," and a subdirectory entitled, "graphic," it is assumed by the provider that a graphic utility program is being represented as well as pointed to by these two signs. The user, however, may not have the vaguest idea as to what software should, could be or is pointed to by either of these two signs, let alone the pair in sequence. Also, using different models, various software archives often categorize the same package using different signs. Of course, if one knows the name of the program, it is possible to search for it by keyword, but often users do not know what program they want or need, or what "is out there." They lack the basic knowledge necessary to construct a representational scheme for graphic utilities, or software in general, and are now subjected to multiple schemes without any benchmark whatsoever for judging them. Thus, they cannot readily judge which archive might group software in a manner they are most likely to comprehend and utilize easily.
This difficulty illustrates the importance of providing information about the model being used to structure information. The structure of most software archives and other sites is not transparent upon entry. One must go into various sublevels to detect even the beginning of the classification scheme used by the archivist(s). The frequent positioning of an index and/or read-me's in a different menu position than the actual items is another "road block." Nor should users have to avoid certain information sources because they discover, eventually, that these sources use a different model than they do.
- Scope of the map
When you commute to work, you probably have several possible routes in mind, each of these known to you by virtue of your familiarity with both your destination and its location. Those unfamiliar with either the destination or its locale will be unable to use anything but a street map or a verbal description to get there. If a visitor to your city asked how to find your favorite restaurant, would you draw a map showing the route from your house or her hotel? Many current home pages, unfortunately, are routes from the provider's home, with little thought given to the user's starting point or model. If your map covers the wrong region, it is even more difficult for the user to anticipate the result of clicking on a given link or item. Without a transparent classification structure, the user has no way to know if "Joe's favorite links to educational resources" are anything like "Julie's links to educational resources" in terms of content, format, timeliness, geographical scope, etc. Yet each of these variables could be addressed by a provider.
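One way a provider might address these variables is to publish a short self-description alongside each collection of links. The sketch below is only an illustration: the `LinkCollection` record, its field names, and the sample values for Joe's and Julie's pages are all hypothetical and do not come from any existing Web standard or site.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class LinkCollection:
    """Hypothetical self-description published with a list of links."""
    title: str
    content: str            # what the links are about
    formats: tuple          # e.g. ("text", "graphics", "sound")
    last_checked: date      # timeliness: when the links were last verified
    geographic_scope: str   # e.g. "worldwide"
    selection_basis: str    # personal taste, or a stated criterion


def differing_fields(a, b):
    """Names of the descriptive fields (title excluded) on which a and b differ."""
    return [
        f.name
        for f in fields(a)
        if f.name != "title" and getattr(a, f.name) != getattr(b, f.name)
    ]


# Invented sample values, for illustration only.
joe = LinkCollection(
    title="Joe's favorite links to educational resources",
    content="K-12 lesson plans",
    formats=("text",),
    last_checked=date(1994, 5, 1),
    geographic_scope="worldwide",
    selection_basis="personal favorites",
)
julie = LinkCollection(
    title="Julie's links to educational resources",
    content="university courseware",
    formats=("text", "graphics"),
    last_checked=date(1994, 9, 1),
    geographic_scope="worldwide",
    selection_basis="every site known to the maintainer",
)

print(differing_fields(joe, julie))
```

With descriptions of this kind available before entry, a user could see that the two similarly titled collections differ in content, format, timeliness and selection basis without having to explore either one.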
- Using search engines as a substitute for directional signs
Unless one does a literal search, the search engine provides both the subject list (major signs) and the interpretation of what items are "represented" by what sign. So the notion that users need not be concerned with the actual location of information is somewhat unfounded, because the scheme by which the provider has located certain information is likely to be reflected in the scheme by which s/he provides signs and links to that information. Possible inconsistencies in the underlying classification scheme are therefore evident at each stage of the interface. And, of course, the major reason that more information is not discovered through literal searches is that one must "know the answer" to determine the best keyword. Moreover, those users who prefer to "surf" are unlikely to use search engines except as a last resort, and may have less time then to contemplate the most appropriate search terms.
- Is this a Web or a highway?
It is not surprising that the information superhighway metaphor has reached cliché status, because a highway implies guided travel to a destination. Since the "superhighway" is largely a political creation, its putative benefits have been made implicit in its name. The danger here is that the rush to communicate data will inhibit the development of user-oriented interfaces (i.e. the best possible utilization of provider contexts).
Our responsibility as information providers is to link each strand of the Web to the world outside, and, of course, to provide as many exits as entrances. But right and left turns are equally important. A user who becomes frustrated with a lack of signs, or who is frequently blocked by "dead-ends," loses the willingness to navigate and begins to focus on exits. Worrying about exiting too soon may focus the user's energy on possible exits and deprive her of an optimally-efficient search. On the other hand, that same user could become more than a mere passerby--s/he could become involved in "the global community and consciousness that is the Internet"(5) if appropriately supported and encouraged to do so. Thus, a traveller "passing through" your server can stop and become a contributor to it. The more users who become providers, the tighter the strands connecting users and providers--the stronger the Web.
-"Rules of the road" for providers
A Web contains potentially infinite links from point to point. Thus, it is a viable metaphor on the provider side. But to the user, the Web must appear a finite rather than infinite world. Since a search engine, a home page and a menu item all transmit information about the provider's viewpoint of reality, care must be taken to ensure that this viewpoint is communicated to the traveller at as many steps, signs and turns as possible. In designing searches, providers should:
1. Put enough gauges in the vehicle. Yield continuous feedback on the appropriateness of the user's search method. The algorithms used in search processing must become sophisticated enough to comment when a search is demonstrably circular or prolonged, and to offer assistance. Another possibility is to offer a very limited subset of the search result at first, then to offer the user the option to view or not view the complete result. For instance, if a user begins with a search term "paintings" and then adds the word, "art," the computer should be able to inform the user that the second term will not change the results of the first before fetching the entire set of results again. Similarly, if a search on "paintings" leads to 3425 matches, the user should be so informed, and offered the option to modify the search before viewing the results. Even limiting the number of results returned is arbitrary; one user may only want 10, another 100. Much hinges on the motives for and scope of the search.
2. Allow the traveller to disregard signs as needed. Yield continuous feedback on the results of the last search, and offer ways to modify those results (sorting, culling, expanding). If a user's search yields 37 links, and he can tell from the headings that only 25 are relevant, he should ideally be able to cull the unnecessary links from the menu dynamically, and be able to save the result as HTML in order to test the links later. At present, providers can only request that users e-mail those providers who have obsolete links; hopefully, some sort of meta-engine or robot will be developed to identify these links sooner. Also, those sites which permit users to add links must consider the issue of quality control in maintaining an information structure, as discussed above. Creating additional possibilities for accessing information X is useless if information X is still never found by 60% of users who could benefit from it.
3. Leave the driving to the driver. Offer means for the user's own taxonomy of data to be reflected in every search. A provider might offer a series of directories, such as Art/Pictures/Paintings/Renaissance/Oils. The user might think more in terms of Art/Paintings/Oils/Renaissance/Still-Lifes. So at each level a link must exist to change classification schemes, not just the current label. At the Renaissance oils subdirectory, the user should be able to link to another historical period, another painting type, and another art type of the Renaissance period, e.g. sculpture. She should also be able to jump to a list of all the categories under "Renaissance," all the categories under "painting" and all the categories under "oils." The user should also have a link to the top level, "Art," and an escape from the entire location. In short, the directory must be structured in a way sufficiently flexible to accommodate a variety of art classification models, rather than only the provider's.
4. Space signs at frequent intervals. Downloading data is no longer nearly as time-consuming an activity as evaluating "samples" of large sounds, pictures and documents to determine whether or not downloading them is either practical or desirable. Sites should specify item size, and must offer "preview sized" subsets of the information presented.
5. Don't point toward a dead end. Data type and format are as important as subject. Too often, these are not revealed until the lowest level of the menu is reached. But if the user is looking only for pictures, it is wasteful to lead her through a labyrinth of submenus before revealing that only text is available concerning the subject heading of choice. More consistent iconization of menu items which clearly identifies the data formats which will be available at each level is obviously labor-intensive. Hopefully, programmers can develop and distribute more efficient code for creating more informative menus.
6. Distinguish directions from decisions. When information is presented as a matter of taste (when a provider is supplying images of her favorite Renaissance oils rather than a collection based on any objective criterion), the user should be so informed prior to entering the collection. The interface must allow feedback from the provider as to what user expectations will be met at each branch point.
7. Assume that the traveller has not received a pre-printed map. Recognize that the driver most often carries no map, but is creating it as she drives. Don't design sharp turns and one-way streets (let alone air pockets) which inhibit accurate cartography. Since the person who is the best map designer often needs the map the least, help the traveller define the dimensions she should include in her map. A traveller entering a building knows that both floor and room are important. A pilot knows that longitude, latitude and elevation are important. An inexperienced Web traveller lacks this knowledge of the relevant variables. What must a traveller know about the "dimensions" of your site in order to decide (intelligently) whether and how to use it?
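The "gauges" of rule 1 can be sketched in a few lines. This is a minimal illustration, not a description of any existing search engine: it assumes a small in-memory index that maps each term to the set of matching item identifiers, so that comparing result sets is cheap. The names `search`, `search_feedback`, `INDEX`, `warn_above` and the item identifiers are all hypothetical.

```python
def search(index, terms):
    """Return the identifiers that match every term (logical AND)."""
    results = None
    for term in terms:
        matches = index.get(term.lower(), set())
        results = matches if results is None else results & matches
    return results or set()


def search_feedback(index, previous_terms, new_term, warn_above=100):
    """Comment on the effect of adding new_term before any results are shown."""
    before = search(index, previous_terms)
    after = search(index, previous_terms + [new_term])
    messages = []
    if after == before:
        messages.append(
            f'Adding "{new_term}" does not change your {len(before)} results.'
        )
    else:
        messages.append(
            f'Adding "{new_term}" narrows {len(before)} results to {len(after)}.'
        )
    if len(after) > warn_above:
        # Let the user decide whether to refine first; the threshold is arbitrary.
        messages.append(
            f"{len(after)} matches found. Modify the search before viewing?"
        )
    return messages


# Invented index: every painting is also filed under "art".
INDEX = {
    "paintings": {"p1", "p2", "p3"},
    "art": {"p1", "p2", "p3", "s1", "s2"},
    "oils": {"p1", "p3"},
}

for line in search_feedback(INDEX, ["paintings"], "art"):
    print(line)
```

Because one user may want 10 results and another 100, `warn_above` is left as a parameter that a site could let each user set, rather than a fixed limit.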
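The "right and left turns" of rule 3 can be approximated by describing each item with independent facets instead of a single directory path. The sketch below is one possible approach under that assumption; the facet names (`art_type`, `period`, `medium`), the sample records and the helper functions are hypothetical and chosen only to mirror the Renaissance oils example.

```python
# Invented catalogue records: each item carries one value per facet.
ITEMS = [
    {"art_type": "painting", "period": "Renaissance", "medium": "oils"},
    {"art_type": "painting", "period": "Renaissance", "medium": "tempera"},
    {"art_type": "painting", "period": "Baroque", "medium": "oils"},
    {"art_type": "sculpture", "period": "Renaissance", "medium": "marble"},
]


def matching(items, fixed):
    """Items whose facets agree with every facet/value pair in fixed."""
    return [it for it in items if all(it[f] == v for f, v in fixed.items())]


def turns(items, position, facet, keep):
    """Other values of facet among items that match position on the facets in keep."""
    fixed = {f: position[f] for f in keep}
    values = {it[facet] for it in matching(items, fixed)}
    return sorted(values - {position[facet]})


def path(position, order):
    """Render the same position as a directory path in the user's facet order."""
    return "/".join(["Art"] + [position[f] for f in order])


here = {"art_type": "painting", "period": "Renaissance", "medium": "oils"}

print(turns(ITEMS, here, "period", keep=["art_type", "medium"]))   # another period
print(turns(ITEMS, here, "medium", keep=["art_type", "period"]))   # another painting type
print(turns(ITEMS, here, "art_type", keep=["period"]))             # another Renaissance art type
print(path(here, ["art_type", "period", "medium"]))                # provider's ordering
print(path(here, ["art_type", "medium", "period"]))                # user's ordering
```

Since the facets are stored separately from any one ordering, the provider's path and the user's path are two views of the same position, and a lateral link at any level is simply a change to one facet value.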
Whither the Web?
Consider some sample consequences of users' information models:
1. Numerous individuals' home pages are coming on-line. While these collections can identify fun and important information, the time and difficulty involved in moving between home pages and the information sought cannot be quantified in advance. (It is worth noting that users have very different attitudes toward ten minutes of effortless "surfing" and ten minutes of painful "digging." How do you distinguish the two?) Furthermore, the potential usefulness of home pages varies widely, as many of them point to well-known sites and others point to more "subterranean" sites. This will ultimately lead many users, particularly those in business and the sciences, back to the "known" search procedures, using subject trees and robots.
2. The need to "know the question" before beginning a search for information will encourage many users to read books on the subject before navigating the Web, particularly if they are charged for surfing by the minute. Yet such books can be difficult to understand in their own right, and some reduce the Web to a one-dimensional telephone directory. After exploring these written aids, users' methods for exploring the Web may differ from those who start on their own.
3. "All-in-one" Internet software packages with various preset jump points are popular, but their long-term effect on the manner through which new users approach the Web has yet to be seen. In any case, one must believe that hypertext-supporting browsers encourage rapid, almost random exploration to a degree that menu-driven gopher and ftp have not. Ironically, the relative accessibility of the Web protocol may result in rapid acclimation and equally rapid "burnout" among new users if providers do not present appropriate information about their models, their sites and the Web as a whole.
Conclusion
Many providers are undertaking or planning innovations which address some of the above concerns. The papers submitted to WWWF '94 include projects such as automated updates of HTML pages, the voluntary creation of user profile pages, better search engines for text databases, and much more. Each of these initiatives signals providers' willingness to help provide an environment in which users can utilize their own models and still interact productively within the Web. More conscious enhancements of this type will help ensure the Web's continued viability as a communication, creative and commercial medium.
References
1. Kent, William, Data and Reality, Amsterdam: North Holland, 1978, p. 19
2. Orthner WF, Scherrer JR, Dahlen R, "Sharing and communicating health care information: summary and recommendations," International Journal of Biomedical Computing, Jan. 1994, 34(1-4), pp. 303-318
3. Ibid.
4. Frisse ME, Kelly EA, Metcalfe ES, "An Internet primer: resources and responsibilities," Academic Medicine, Jan. 1994, 69(1), pp. 20-24
5. Butler DL, Anderson PS, "The use of wide area computers in disaster management and the implications for hospital/medical networks," Annals of the New York Academy of Sciences, Dec 17, 1992, (670), pp. 202-210
About the Author
Laurie Gelb is Administrative Manager, Department of Biomathematics, at the University of Texas M. D. Anderson Cancer Center. She has also served as Project Director for a major marketing research and consulting firm. She contributes regular book reviews to the Journal of Health Care Marketing. She holds a BA in Philosophy and Business Administration and a Master's Degree in Public Health. Her e-mail address: laurie@biomath.mda.uth.tmc.edu