ARCHIE: THE WHY
Public access archive sites on the Internet are host machines commonly referred to as "archive servers" that maintain vast collections of electronic files. Types of files include computer software (public domain and shareware), source code, electronic documents, graphics and data sets. The good-natured folk who administer these sites make these resources freely available to anyone on the Internet, via anonymous FTP (File Transfer Protocol). You can log on to a host using the login name "anonymous," and then you are free to download anything of interest, within the reasonable bounds of etiquette.
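For readers who want to see what such a session looks like when scripted, the following is a minimal sketch in Python using the standard library's ftplib module. The helper name fetch_anonymous, the placeholder e-mail address, and the host and path in the usage comment are all illustrative assumptions, not real archive locations.

```python
from ftplib import FTP


def fetch_anonymous(ftp, remote_path, email="guest@example.org"):
    """Log in as "anonymous" and return the bytes of one remote file.

    `ftp` is an already-connected ftplib.FTP object (or any object
    offering the same login/retrbinary methods). By convention the
    password is the caller's e-mail address; the default here is a
    placeholder.
    """
    ftp.login(user="anonymous", passwd=email)
    chunks = []
    # retrbinary invokes the callback once per block of data received.
    ftp.retrbinary("RETR " + remote_path, chunks.append)
    return b"".join(chunks)


# Usage (hypothetical host and path; substitute a real archive site):
#
#     with FTP("archive.example.org") as ftp:
#         data = fetch_anonymous(ftp, "/pub/README")
#         print(len(data), "bytes received")
```

Passing the connection in as an argument keeps the helper easy to try out against a stand-in object before pointing it at a live host.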
There are hundreds of archive sites around the Internet that maintain literally millions of files, a huge information resource ripe for the picking. Sounds good, doesn't it? Good, that is, if only you could find out what is available, which host has it, and where the file sits in that host's directory tree. A utility known as Archie has made this possible, with increasing degrees of sophistication.
ARCHIE: THE WHAT
In library-speak, Archie is an online union catalog of the "electronic" holdings of many anonymous FTP archive sites available on the Internet. According to Ed Krol in The Whole Internet [1], Archie is currently indexing approximately 1,200 sites worldwide representing more than 2.1 million files. This is an enormous resource base that is continually growing.
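The union-catalog idea can be illustrated with a toy sketch in Python. It is not a description of how Archie itself is built; the function names build_catalog and find, and the example host names in the test data, are assumptions made up for illustration. The sketch merges per-site file listings into a single table, then answers the question "which host has it, and in which directory?"

```python
from collections import defaultdict


def build_catalog(listings):
    """Merge per-site file listings into one lookup table.

    `listings` maps a host name to the full paths that host holds.
    The result maps each file name to every (host, directory) pair
    where a file of that name was seen.
    """
    catalog = defaultdict(list)
    for host, paths in listings.items():
        for path in paths:
            # Split "/pub/gnu/emacs.tar" into "/pub/gnu" and "emacs.tar".
            directory, _, name = path.rpartition("/")
            catalog[name].append((host, directory or "/"))
    return catalog


def find(catalog, substring):
    """Return sorted (name, host, directory) hits whose file name
    contains `substring`, ignoring case."""
    needle = substring.lower()
    return sorted(
        (name, host, directory)
        for name, locations in catalog.items()
        if needle in name.lower()
        for host, directory in locations
    )
```

A search for "emacs" against two hypothetical sites that both carry a file named emacs.tar would return one entry per site, each giving the host and the directory to visit by anonymous FTP.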
Archie was originally conceived in 1989 at McGill University Computing Centre in Montreal as a personal project to monitor public domain software at a few archive sites [2]. Its potential as a network resource was quickly recognized and developed in the computing center at McGill. This first electronic directory service for the Internet was commissioned on the network in November 1990 and christened Archie (Archive minus the
In broad terms Archie can be described as an intelligent database system. It is intelligent because it builds and maintains its database both automatically and regularly through its subsystems. Archie comprises three distinct subsystems: the Database Gathering Component (DGC), the Database Maintenance Component (DMC), and the User Access Component (UAC).
The first two components are transparent to the user. In simple terms the DGC logs on to each of its "known" archive sites over a 30-day cycle, and takes a recursive listing of the directory contents of that...