This paper discusses the applications and importance of content-based information retrieval technology in digital libraries. It generalizes the process and analyzes current examples in four areas of the technology. Content-based information retrieval has been shown to be an effective way to search for the type of multimedia documents that are increasingly stored in digital libraries. As a good complement to traditional text-based information retrieval technology, content-based information retrieval will be a significant trend for the development of digital libraries.
After several decades of development, digital libraries are no longer a myth. In fact, some general digital libraries, such as the National Science Digital Library (NSDL) and the Internet Public Library, are widely known and used. Advances in computer technology make it possible to include a colossal amount of information in various formats in a digital library. In addition to traditional text-based documents such as books and articles, other types of materials, including images, audio, and video, can also be easily digitized and stored. How to retrieve and present this multimedia information effectively through the interface of a digital library has therefore become a significant research topic.
Currently, there are three methods of retrieving information in a digital library. The first and easiest is free browsing: a user browses through a collection and looks for the desired information. The second method, the most popular technique used today, is text-based retrieval. With this method, textual information (the full text of text-based documents and/or the metadata of multimedia documents) is indexed so that a user can search the digital library by using keywords or controlled terms. The third method is content-based retrieval, which enables a user to search multimedia information in terms of the actual content of an image, audio, or video (Marques and Furht 2002). Content features studied so far include color, texture, size, shape, motion, and pitch.
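To make the third method concrete, the following is a minimal sketch of content-based retrieval using color, one of the features listed above. It is an illustration under stated assumptions, not a description of any particular digital library system: images are represented as plain lists of (r, g, b) pixels, similarity is measured by histogram intersection (one common choice among many), and the function names, the toy collection, and the bin count are all hypothetical.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized color histogram for a list of (r, g, b) pixels (0-255).

    Each channel is quantized into `bins_per_channel` buckets, so similar
    shades fall into the same bucket. The bin count is an assumed default.
    """
    if not pixels:
        return {}
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(share, h2.get(bucket, 0.0)) for bucket, share in h1.items())


def rank_by_color(query_pixels, collection):
    """Rank items in `collection` (name -> pixel list) by similarity to the query."""
    query_hist = color_histogram(query_pixels)
    scores = {
        name: histogram_intersection(query_hist, color_histogram(pixels))
        for name, pixels in collection.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): each "image" is just a flat list of pixels.
collection = {
    "red_beetle": [(200, 30, 30)] * 8 + [(20, 20, 20)] * 2,
    "green_leaf": [(30, 180, 40)] * 9 + [(20, 20, 20)] * 1,
    "blue_sky": [(40, 80, 220)] * 10,
}
query = [(210, 40, 35)] * 7 + [(10, 10, 10)] * 3

print(rank_by_color(query, collection))
# prints ['red_beetle', 'green_leaf', 'blue_sky']
```

The query here is an image rather than a keyword, and no metadata is consulted: the mostly red query ranks the mostly red item first purely on pixel content. A practical system would combine several features (texture, shape, and so on) and use an index rather than comparing against every item.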
While some may argue that text-based retrieval techniques are good enough to locate desired multimedia information as long as it is assigned proper metadata or tags, words are sometimes not sufficient to describe what is in a person's mind. Consider a few examples: A patron comes to a public library with a picture of a rare insect. Without expertise in entomology, the librarian won't know...