1. Introduction
The dramatic increase in the number of web sites, to around 172,338,726 by June 2008 ([22] Netcraft, 2008), makes searching for and accessing specific materials difficult, and almost impossible without some form of aid. The software robot is one of the most widespread tools used for such data-searching services. For example, search engines such as Google ([12] Google, 2008a), Yahoo ([36] Yahoo, 2008a) and MSN ([20] MSN, 2008a) all use software robots to collect data on the internet and then provide that information to the public. Despite their obvious benefits, software robots have some underlying disadvantages: unrestricted access by robots may overload the Web server, and re-use of the data collected by robots may result in copyright and other legal disputes. To reduce such conflicts, Robots.txt and its supplement, Robots Meta tags, are the most common and straightforward instruments that webmasters can use to exclude unwelcome robots or, as some recent legal cases have suggested, to grant a license in terms of digital copyright that permits legitimate access by specific robots.
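As a brief illustration of the exclusion mechanism mentioned above, the following minimal sketch shows how a compliant robot might interpret a Robots.txt file, using the `urllib.robotparser` module from the Python standard library. The file contents, the robot names `BadBot` and `GoodBot`, and the `example.com` URLs are hypothetical and chosen only for illustration; they are not taken from any site or robot discussed in this paper.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical Robots.txt contents (an assumption for illustration):
# one named robot is excluded from the whole site, and every other
# robot is excluded only from the /private/ directory.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_text):
    """Parse Robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def is_allowed(parser, robot_name, url):
    """Return True if the rules permit robot_name to fetch url."""
    return parser.can_fetch(robot_name, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for robot, url in [
        ("BadBot", "http://example.com/index.html"),
        ("GoodBot", "http://example.com/index.html"),
        ("GoodBot", "http://example.com/private/report.html"),
    ]:
        print(robot, url, is_allowed(rules, robot, url))
```

The sketch only models a robot that chooses to honor the file: nothing in Robots.txt itself prevents a non-compliant robot from ignoring the rules, which is part of the reason the relationship between robots and webmasters merits review.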
Even though Robots.txt and Robots Meta tags are taking on more significant roles today, they have not been fully investigated by researchers. Only a few peer-reviewed academic papers on this topic have been published ([5] Chau and Chen, 2003) and, as a result, the sporadic amendment proposals that exist are based on personal experience rather than general principles ([8] Conner, 1996; [16] Koster, 1994). Given the current popularity of Robots.txt and Robots Meta tags, it is time for a wide-ranging review of both and for a recommendation of a comprehensive mechanism to regulate the relationship between robots and webmasters.
At the beginning of this paper, we turn our attention to the software robot, with Robots.txt and Robots Meta tags being reviewed in Section 2 and Section 3. In Section 4, before calling attention to the functions of Robots.txt and Robots Meta tags, we raise some issues related to software robots and webmasters. In the following sections, we review both the original function of Robots.txt and Robots Meta tags and their newly developed function of expressing online copyright authorization, and reveal some uncertainties and a few disadvantages that have arisen. Next,...