ABSTRACT
Gerard Salton is often credited with developing the vector space model (VSM) for information retrieval (IR). Citations to Salton give the impression that the VSM must have been articulated as an IR model sometime between 1970 and 1975. However, the VSM as it is understood today evolved over a longer time period than is usually acknowledged, and an articulation of the model and its assumptions did not appear in print until several years after those assumptions had been criticized and alternative models proposed. An often-cited overview paper titled "A Vector Space Model for Information Retrieval" (alleged to have been published in 1975) does not exist, and citations to it represent a confusion of two 1975 articles, neither of which was an overview of the VSM as a model of information retrieval. Until the late 1970s, Salton did not present vector spaces as models of IR generally but rather as models of specific computations. Citations to the phantom paper reflect an apparently widely held misconception that the operational features and explanatory devices now associated with the VSM must have been introduced at the same time it was first proposed as an IR model.
INTRODUCTION
In a tribute written for the Journal of the American Society for Information Science (JASIS) (Crouch et al., 1996), Carolyn Crouch declares that Gerard Salton was more than just the leading authority in the field of information retrieval (IR). For thirty years, Crouch writes, "Gerry Salton was information retrieval" (p. 108). During times when the significance of computational IR research was in doubt, Salton defended and supported it "through the sheer force of his own personality and reputation" (Crouch et al., 1996, p. 108). Crouch's sentiments are echoed in the memoriam by Salton's other colleagues and former protégés, who reflect on his many contributions in research, teaching, writing, editing, and service to scholarly societies. They cite the textbooks he wrote, the SMART system developed under his leadership, the scholars he mentored, and many other contributions. Donna Harman reminds the reader that Salton investigated "the use of the vector space model in clustering, relevance feedback, automatic linking, book indexing, passage retrieval, visualization, and many other areas" (Crouch et al., 1996, p. 108).
It is hardly surprising that Dr. Harman would...