Abstract
Purpose - This study aims to examine the differences between web image and textual queries.
Design/methodology/approach - A large number of web queries from image and textual search engines were analysed and compared based on their factual characteristics, query types, and search interests.
Findings - Web users tend to input short queries whether searching for visual or textual information; image requests produce more zero hits, have higher specificity, and contain more refined queries; web image requests are more focused than textual requests on some popular search interests; and the variety of textual queries is greater than that of image requests.
Originality/value - This study provides results that may enhance one's understanding of web-searching behaviour and the inherent implications for the improvement of current web image retrieval systems.
Keywords Worldwide web, Visual databases, Query languages, Text retrieval, User studies
Paper type Research paper
Introduction
There has been a substantial increase in the availability of image collections on the web. Several web search engines now attempt to index publicly accessible image files and offer image search capabilities (Tomaiuolo, 2002). For example, Google claims to have more than 11.8 billion indexed images (Google, 2004). Currently, such web image search engines provide keyword search options as in textual information retrieval, that is, by matching users' typed-in keywords with the textual information attached to web images, including their filenames and available captions, HTML ALT tags, and the surrounding text appearing on the same web pages (Smith and Chang, 1997). Using text to search images may not prove successful in every case, however, especially when web images are not organized in controlled collections and adequate annotations are not available for searching. Searches that produce no hits are not uncommon. For example, in our analysis of a three-month log with over 2.4 million real image queries, almost 19 percent of the total queries resulted in no hits.
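The keyword-matching approach described above, and the way missing annotations lead to zero-hit searches, can be illustrated with a minimal Python sketch. This is not the implementation of any actual search engine: the `WebImage` class, the `search` and `zero_hit_rate` functions, the sample data, and the rule that every query term must appear in the attached text are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


def tokenize(text: str) -> set:
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class WebImage:
    """Hypothetical record of the text attached to one web image."""
    filename: str
    caption: str = ""
    alt_text: str = ""
    surrounding_text: str = ""

    def text_tokens(self) -> set:
        # Pool every textual field that keyword search can match against.
        return tokenize(" ".join(
            [self.filename, self.caption, self.alt_text, self.surrounding_text]
        ))


def search(query: str, images: list) -> list:
    """Return images whose attached text contains every query term.

    Requiring all terms to match is an assumption of this sketch;
    real engines typically rank partial matches as well.
    """
    terms = tokenize(query)
    if not terms:
        return []
    return [img for img in images if terms <= img.text_tokens()]


def zero_hit_rate(queries: list, images: list) -> float:
    """Fraction of queries that return no images at all."""
    if not queries:
        return 0.0
    misses = sum(1 for q in queries if not search(q, images))
    return misses / len(queries)


# Illustrative data: one annotated image and one with no useful text.
images = [
    WebImage("eiffel_tower.jpg", alt_text="Eiffel Tower at night"),
    WebImage("IMG_0042.jpg"),
]
queries = ["eiffel tower", "sunset beach"]
rate = zero_hit_rate(queries, images)
```

In this toy collection the second image carries no descriptive text, so a query can only retrieve it by matching its opaque filename, which is the situation the paragraph above describes for images without adequate annotations.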
Undoubtedly, it is necessary to understand the differences between users' queries for image and textual information before a more effective image retrieval system can be developed. Several studies on web textual queries have revealed that users tend to type in short queries, conduct short query sessions, and view few result pages (Jansen et al.,...





