This column is intended to help UNIXoids find free software, documentation, and other interesting files. It will concentrate on Internet archives, but will mention other ways (e.g., email, tangible media) of getting files when appropriate. Each column will cover a different set of freeware.

To assist you in saving this information for future reference, the editor and I have agreed to put each column on a single page, unencumbered by ads, etc. It is our hope that you will find the material to be worth saving, whether in clipping or photocopy form. The material will also be available online at www.ptf.com/tin/.

As these columns emerge, you may wonder why a particular topic hasn't been covered, become annoyed that your favorite archive has been overlooked, etc. Please write (rdm@cfcl.com). I am always happy to know of new freeware resources, and your comments might well help me create an interesting column.

This month, instead of leaping into any particular set of freeware, I am going to discuss the Internet in general: what it is, how to get onto it, how to use it, etc.

The Internet

The Internet is a world-wide electronic network. Member systems are characterized by the fact that they have unique, official IP (Internet Protocol) addresses. Using these addresses, they are able to exchange packets of information. My machine in San Bruno can send packets to Australia, Japan, and San Francisco with equal ease.

The time taken is not equal, to be sure, nor is the number of intervening systems. As an Internet user, however, I don't have to worry about all that. I simply tell ftp which system I wish to contact, and the bits get routed automagically. Great Stuff.

I don't even need to remember the IP addresses. Internet sites have unique domain-based names. The domain sun.com, for instance, is commercial and belongs to Sun. The machine foo.sun.com, if it exists, is guaranteed to be under Sun's control. Note that it could be located anywhere; proximity and control are quite different things. Most nodes have some way to translate these names into IP addresses. On my machine, this is handled by the /etc/hosts file, which I edit to include interesting sites. Many sites use the Domain Name System (DNS), which looks up IP addresses on demand.
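
As a rough illustration of the name-to-address step, here is a minimal Python sketch of a lookup table built from text in /etc/hosts format. The function name parse_hosts, the sample host names, and the addresses (taken from a range reserved for documentation) are all assumptions made for the example, not entries from any real site.

```python
def parse_hosts(text):
    """Build a name -> address table from text in /etc/hosts format."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and surrounding space
        if not line:
            continue                           # skip blank lines
        address, *names = line.split()         # address first, then name and aliases
        for name in names:
            table.setdefault(name.lower(), address)  # first entry for a name wins
    return table


# Hypothetical entries; a real file lives at /etc/hosts.
SAMPLE = """\
# address      official name         aliases
192.0.2.10     foo.example.com       foo
192.0.2.20     archive.example.org
"""

hosts = parse_hosts(SAMPLE)
print(hosts["foo.example.com"])
```

On a system that relies on DNS instead, the standard library call socket.gethostbyname(name) asks the system resolver to do the lookup on demand.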

The IP address is the Internet's official designation for a specific machine. You can specify an address, ignoring the name entirely, and things will work just fine. The domain-based name is more mnemonic, however, and may survive while the underlying address changes. So it is best to know both, just in case.

Not all sites with domain-based names are really on the Internet. Many UUCP sites have found it convenient to register their domains with a cooperative Internet host. You can send mail using the domain-based name, but any attempt at an online (packet-based) transaction will fail.

Getting on Board

Most universities, research organizations, and large corporations in the UNIX arena already have Internet access. Check with your system and/or network administrator to see if your firm does. If not, while trying to convince the management to spring for a connection, you can arrange for your own personal Internet access.

Find a local public-access UNIX system that offers Internet access. In this area (Silicon Valley), both Portal Communications and The WELL do so. In Boston, Software Tool & Die offers the service. In some areas, local universities offer accounts to the general public. Get an account ($10-$20 per month is typical) and start to look around. The first tool you'll need for this is FTP.

File Transfer Protocol (FTP)

FTP allows a user to log in to a system, look around a bit, and pick up or drop off files. It's pretty primitive and can be somewhat frustrating to use. Nonetheless, it is universally supported on the Internet, and it gives you access to something like 150 GB of publicly available archives.

You can get an idea of FTP's capabilities by reading the ftp(1) manual page. For a much clearer description, however, get a copy of "The Whole Internet User's Guide & Catalog" (Ed Krol, O'Reilly, 1992, ISBN 1-56592-025-2). In fact, this book is a must buy for anyone interested in foraging on the Internet. Get it!

Publicly available archives accept FTP logins under the account name "anonymous". (Most archives accept "ftp" as well, saving the fingers of poor typists like myself.) In place of a password, they then ask for the user's real email address, to help track usage and resolve problems.

Once you are logged in, do an ls to see what's there. You will probably want to cd to the /pub directory, where most archives keep their publicly accessible files. The ftp(1) command language contains several commands that are very similar to their UNIX equivalents. This, along with the very limited number of total commands, makes ftp(1) an easy tool to master.
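
For readers who would rather script a session than type it, the same steps (anonymous login, cd to /pub, ls) can be sketched with Python's standard ftplib module. This is only a sketch under stated assumptions: the function name list_archive, its connect parameter, and the host and email address in the comment are hypothetical, and error handling is kept to a minimum.

```python
from ftplib import FTP


def list_archive(host, email, directory="/pub", connect=FTP):
    """Log in anonymously, change to `directory`, and return the names found there.

    `connect` is any callable that takes a host name and returns an object
    with ftplib.FTP's login, cwd, nlst, and quit methods. It defaults to FTP
    itself; passing a stand-in makes it possible to try the function offline.
    """
    ftp = connect(host)                  # open the control connection
    try:
        ftp.login("anonymous", email)    # email address goes in the password slot
        ftp.cwd(directory)               # the counterpart of cd
        return ftp.nlst()                # the counterpart of ls: bare names only
    finally:
        ftp.quit()                       # say goodbye and close the connection


# Hypothetical usage (needs a network connection and a real archive):
#     names = list_archive("archive.example.org", "you@example.com")
#     print("\n".join(names))
```

To pick up a file once you have found it, ftplib's retrbinary method (for example, ftp.retrbinary("RETR " + name, out.write) with an open binary file out) plays the role of the get command.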

Good Pickings

Some Internet sites carry only a few offerings, frequently specializing in a few areas. The Icon and SR Projects, for instance, reside on ftp://cs.arizona.edu/. Other archives are more like huge and overflowing closets. In this camp, I suggest forays into ftp://ftp.uu.net/, ftp://gatekeeper.dec.com/, and ftp://wuarchive.wustl.edu/.
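
The ftp:// notation used above packs a host name and a starting directory into one string. A small Python sketch, using the standard urllib.parse module, shows how the pieces come apart; the function name host_and_path is an assumption made for the example.

```python
from urllib.parse import urlsplit

# The archives named in the text.
ARCHIVES = [
    "ftp://cs.arizona.edu/",
    "ftp://ftp.uu.net/",
    "ftp://gatekeeper.dec.com/",
    "ftp://wuarchive.wustl.edu/",
]


def host_and_path(url):
    """Split an ftp:// URL into (host name, starting directory)."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an ftp URL: " + url)
    return parts.hostname, parts.path or "/"   # an empty path means the top level


for url in ARCHIVES:
    host, path = host_and_path(url)
    print(host, path)
```

The host name is what you hand to ftp when connecting; the path is where to cd once you are logged in.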