Beyond the Back of the Book:
Indexing in a Shrinking World

by Laura Fillmore

Part 1

Presented to the American Society of Indexers, Inc.
and Societe Canadienne pour l'analyse des documents
Montreal, Canada
June 10, 1995

Copyright © 1995 by Laura Fillmore; written permission required to reprint.

"Many that are first shall be last; and the last shall be first." Matthew 19:30

"Everyone has his own Internet" --Bruce Sterling

My first look into the mysterious business of indexing came in the person of Mrs. Blache, who lived in a low, red house in Ramsey, New Jersey, far back from the road, under tall pine trees. When we were kids, we would pay her quiet domain an occasional visit, and she would hurry us through the front part of the house, past the seemingly chaotic stacks of index cards, piled every which way, here and there, some in shoeboxes, some balanced on the backs of couches, on the arms, and in neat rows across the floor. "I'm indexing a geography textbook now," she'd explain, passing the typewriter. "That's Egypt on the couch. Step over Africa and come have a cup of grape juice."

It seems as if geological time applied for most of the lifetime of the indexing business. Maybe twenty years passed between that hot summer afternoon at Mrs. Blache's house, and a broiling August day in 1983 when I came to interview Gordon Brumm, indexer and philosopher, at what he referred to as the "park" in the middle of Central Square in Cambridge. I had just started a company, Editorial Inc., representing freelance professionals to publishers, and needed an indexer. Gordon suggested this pocket park as an interview site, a small triangle with stone benches sporting no greenery whatsoever, and surrounded on three sides by busy streets. A philosophy professor at Northeastern, he proceeded to educate me on what he called the "logic of indexing," an art at once analytic, synthetic, and interpretive.

I strained to hear him through the traffic and after a while, when we got into the torturously niggly negotiations, each argument justified by detailed explanations and calmly stated logic, about whether he would charge his indexing work by the job, by the hour, by the page, by the entry, or by the line (the sun began to set...), and then whether there would be paragraph or columnar style, and how many letters per line for each--he saw that all this logic was taking its toll on my attention span and invited me to his apartment close by for a glass of grape juice. The quiet room held stacks of index cards, some in shoe boxes, some on the folding chairs, the table, the couch, marching orderly across the floor, up to the typewriter. The location had changed since when I was a kid visiting Mrs. Blache, but the business hadn't changed much. I was witnessing evidence of the work of a human being, the process of dissecting a book's idea and information structure, breaking it into bits first on the manuscript itself, and then on cross-indexed and color-coded index cards. The end result would be a page-specific index manuscript, a reader-driven logical construct that identified and opened many doors for many readers into the contents of the book.

I mention Gordon Brumm not because he is the King of Cards (indexing cards), but because his approach to indexing had an effect on the way I think, and also because, when considering our opportunities on the Internet today, we do have to look at the larger issues through the prism of individual experience, to examine why we think certain ways and how we can best communicate those thoughtpaths to others. So while the Internet is vast and exponentially increasing in size and complexity, the subtle shifts in content and direction made by a human here and a human there, in this so crucial and malleable time in Internet evolution, can make all the difference in the ultimate direction of something such as, say, indexing technology and popular information retrieval on the Internet. Thus my drawing on personal experience enables me not only to talk about what I know, but also to illustrate that at this juncture, it is individuals like us who are determining the knowledge structure of information on the Net.

While Gordon pointed out to me the exhaustive complications of indexing, he also inadvertently pointed me in the direction of hypertext, just by the way he thought about his job. In the mid-eighties, when many felt trepidation about the new small computers beginning to proliferate, Gordon understood their place in the indexer's life. "...A computer or word processor can help greatly with the clerical functions involved in indexing, but it cannot make any of the innumerable decisions which go into the creation of an index. Only a human being can make those decisions, and computers should not be allowed to distort or interfere with any of the human functions necessary to making a good index. In particular, an index created by computer scan would almost certainly be a disaster.

"And finally, only a human being can experience the satisfaction which--over and above the utilitarian considerations--arises from the pure logical activity of weaving the web of words which is an index." [Gordon Brumm, "The Logic of Indexing," 1984, unpublished, as far as I know.] He went on to outline the clear distinction, which today has become so blurred, between what the machine can do, and what only the human being can do.

That blurring of the borders of the HMI (Human/Machine Interface) has manifested itself during the last decade in a sea of mediocre indexes published in this country, starting of course with the books and manuals about the very computers which initiated the labor-saving problem in the first place, and ending with today's increasingly great search engines of the Internet, logical constructs which mechanically scan millions of words and links and files at the flick of an "enter" key for any user who happens by sites such as Yahoo, Lycos, Einet, and WebCrawler.

Like their brainless cousins the spellcheckers, key word indexes represent an era yearning for extinction, a blight on language and meaning, a reduction of thought to static and disconnected chunks of letters. Yet at this stage there is nothing better that works, and we need them. Surfing is fun, but wanting in direction, subordination, nuance, interpretation, analysis--wanting in thought itself, actually. The key word lists that launch our surfs may be seen as shovelware at best and useless regurgitations of data at worst--yet, in the absence of intelligent human indexers informing our online search capabilities, in the absence of global indexes working in active-seek mode as opposed to the comfortable idea-retrieval mode familiar to us from the "contained" paper environment, we have nothing better and quite clearly need and are grateful for what we do have.

But even before the Internet began to need and deploy its great self-searching mechanisms, publishers had accepted and even promoted such key-word (one might better say "random word") indexes for the sake of expediency, schedule, and budget. A computer book's automatically generated index will indeed contain computer words attached to page numbers rather than entries dealing with noncomputer subjects such as apple trees and orange groves. But the key words do not reveal the concepts which are their real meaning, which meaning will stay asleep for lack of voice to all but those who read the book.

Such key word indexes offer bulk to a book, and added value only in the sense that the index offers the reader a semi-useful back door into the book (the front door being a detailed table of contents). More importantly, however, the online key- and boolean search machines offer a significant commercial advantage to those companies who understand the nature of these online search mechanisms as they operate today. For some, they offer the opportunity to be everywhere, all the time--an optimal position in an environment where "what's for sale" is the rare and migratory bird we call attention, attention of the reader.

Because the online info-real estate is so new, and the politics of pointing still in its nascent stages (bipolar linking as a favor? As a financial exchange? On the per-month, hit-based sponsorship model? The new Gumball machine model?), today's searchable entries of companies on the most frequently accessed search engines can define corporate success and also, ultimately, significantly shape the structure of tomorrow's access to information. This is because one essential aspect of success on the net is getting your site or your self indexed, then being found, and getting traffic. This self-citing of sites means structuring one's site, structuring oneself, using the proper (search-appropriate) key words in headers, URLs, descriptive paragraphs, and siting these words in the necessary searchable sites around the net. In the last year, and especially the last six months, such efforts have yielded many companies high traffic and high advertising dollar value--irrespective of their content, only by virtue of their positioning and pointing strategies.

One analogy is to the Oklahoma Land Rush; if you grab turf today, it may be yours to build a mall on tomorrow. Power comes in naming what you publish and what you intend to publish, in staking out your turf, because then people can find you, and that's the big issue when there are an estimated 60 million people milling and clicking around online.

In an environment where it seems the forces are gathering behind the big commercial wave and it is all just about to happen, we see some companies making huge content grabs by virtue of pointing and attempting to encompass others, building virtual malls and laying the groundwork for resources that extend through all areas of human knowledge and commerce. It's free, it's open, and anyone can effect this distributive "info-real estate grab" at this point. If you plan to eventually include "Dogs" or "Chocolates" in your repertoire of indexing consulting services, for example, you can get yourself listed under those categories in an online index today. Even if your consulting service means only pointing at someone else involved in this subject--or pointing at nothing at all, just mentioning the word "Chocolate"--still you can register your named specialty on an online index. We can see this happening today in Lycos and Einet with some mall or mall-like information services that appear to be everywhere by virtue of their staking out areas of content without always having content behind the key words. I've seen this more than once with both Meckler and GNN, for example. For the reader, this self-promoting practice can prove frustrating.

The ultimate benchmark, of course, is usefulness to the reader; that's where the door stands open to the "humans" in the HMI equation, where conceptually strong indexers have an opportunity on the Internet that may be missing in print publications. Large companies are making major investments to assume a superset position in information online, and experience in the realm of recorded and reconstructed thought, if packaged correctly to, say, a phone company or a networking company, could give talented and experienced indexers the competitive edge over their key-word compatriots. In other words, indexers can make money on the Internet if they "play their cards right."

Book publishers, on the other hand, continue to seek inferior indexes for the most part, it seems to me. Expedience and schedule are no small matters in the vastly accelerated production pace for books in our computer age. Say a publisher is putting out a book about new Apple hardware or software (the kind of Apple with a keyboard), and the new machines are due to ship immediately. The first book in the stores to accompany the release of the machine will get the most sales. In such a case then, would a smart publisher hire a Gordon Brumm to labor over "see also"s and "subsub entries", to create a rich semantic net of meaning for the book--or simply run a program, pull off an index from the latest PageMaker files, and get his book product to market? We all know the answer to this dilemma, and the answer makes good sense from a perspective of market sense and dollars. But who's looking after the thoughtful reader in this equation, and what becomes of the indexer?
