This paper addresses the problem of automatically assigning a Library of Congress Classification (LCC) to a work given its set of Library of Congress Subject Headings (LCSH). LCCs are organized in a tree: the root node of this hierarchy represents all possible topics, and the leaf nodes correspond to the most specialized topic areas defined. We describe a procedure that, given a resource identified by its LCSH, automatically places that resource in the LCC hierarchy. The procedure uses machine learning techniques and training data from a large library catalog to learn a model that maps sets of LCSH to classifications in the LCC tree. We present empirical results showing the accuracy of our technique on an independent collection of 50,000 LCSH/LCC pairs. [PUBLICATION ABSTRACT]
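The abstract does not say which learning method the paper uses, so the following is only a minimal sketch of the general idea, under stated assumptions: each training record pairs a set of subject headings with a path through a toy LCC tree, and classification descends from the root by following the child with the most co-occurrence votes. The function names `train` and `classify`, the tuple-based path encoding, and the sample records are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Count how often each subject heading co-occurs with each LCC node.

    pairs: iterable of (set_of_headings, lcc_path), where lcc_path is a
    tuple running from the top-level class down to the assigned node,
    e.g. ("Q", "QA", "QA76"). This encoding is an assumption of the sketch.
    """
    counts = defaultdict(Counter)
    for headings, path in pairs:
        for heading in headings:
            # Credit every node on the path so ancestors gather evidence too.
            for depth in range(1, len(path) + 1):
                counts[heading][path[:depth]] += 1
    return counts


def classify(counts, headings):
    """Descend from the root, picking the best-supported child at each level."""
    votes = Counter()
    for heading in headings:
        votes.update(counts.get(heading, {}))
    node = ()  # the empty tuple stands for the root (all topics)
    while True:
        children = {
            n: v
            for n, v in votes.items()
            if len(n) == len(node) + 1 and n[: len(node)] == node
        }
        if not children:
            return node
        node = max(children, key=children.get)


# Hypothetical toy catalog records (not real catalog data).
training = [
    ({"Machine learning", "Classification"}, ("Q", "QA", "QA76")),
    ({"Machine learning"}, ("Q", "QA", "QA76")),
    ({"Library cataloging", "Classification"}, ("Z", "Z693")),
    ({"Subject headings", "Library cataloging"}, ("Z", "Z695")),
]
model = train(training)
print(classify(model, {"Subject headings", "Library cataloging"}))
```

Crediting every ancestor on the path means a work whose headings give weak evidence at the leaf level can still be placed at an interior node of the hierarchy, and a work with no known headings stays at the root. A real system would likely replace the raw vote counts with a trained probabilistic or discriminative model.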
Details
Libraries;
Library cataloging;
Classification;
Machine learning;
Maps;
Topics;
Metadata;
Science;
Subject heading schemes;
Library of Congress Subject Headings;
Library catalogs;
Librarians;
Access to information;
Internet resources;
Information science;
Digital libraries;
Subject headings;
Legislatures;
Hierarchies
