The Internet FTP archives contain many interesting repositories of text. This month, I will touch upon several of these. Readers are invited, as always, to send email about archives I've overlooked.

Project Gutenberg

Michael S. Hart (hart@vmd.cso.uiuc.edu) is the Director of Project Gutenberg. For the last two decades, he has been leading an effort to build an electronic library of essential documents. Some of these started out in public domain electronic form. Most, however, fell into the public domain with the passage of time and were then free to be converted, via scanners and OCR, to "etexts." Borrowing liberally from the project's blurb.gut and NEWUSER.GUT files, here is a brief synopsis of the project:

The project is well on its way. The 1991-1993 directories contain, respectively, about 14, 30, and 40 MB of material. The project's scope and influence are growing. As new users arrive and (sometimes) turn into volunteers, it will grow ever more rapidly. FTP to ftp://mrcnext.cso.uiuc.edu/gutnberg.doc/ and crawl around the etext directory.
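
For readers who would rather script the "FTP to a site and crawl around" step than type it by hand, here is a minimal sketch using Python's standard ftplib module. The function names split_ftp_url and list_directory are my own, invented for illustration. The sketch assumes the server accepts anonymous logins and is still reachable at the address given, which may not be true of any particular archive.

```python
from ftplib import FTP
from urllib.parse import urlparse


def split_ftp_url(url):
    """Split an ftp:// URL into (host, directory)."""
    parts = urlparse(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an ftp:// URL: {url!r}")
    # An empty path means the top of the archive.
    return parts.hostname, parts.path or "/"


def list_directory(url, timeout=30):
    """Log in anonymously and return the names found in the URL's directory.

    Assumes the server permits anonymous FTP; raises an ftplib or
    socket error if it does not or if the host cannot be reached.
    """
    host, directory = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()          # no arguments means an anonymous login
        ftp.cwd(directory)   # move to the directory named in the URL
        return ftp.nlst()    # plain list of file and directory names
```

Calling list_directory("ftp://mrcnext.cso.uiuc.edu/gutnberg.doc/") would return the names in that directory if the server answers. The same function works for any of the other ftp:// addresses in this column.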

Open Book Initiative

Barry Shein (bzs@world.std.com) is a very busy guy. When he's not acting as a Technical Editor for SunExpert Magazine, sitting on the boards of SUG and USENIX, or running The World (a public-access UNIX system), he promotes the idea of electronically accessible text archives.

To check out the results of Barry's efforts, FTP to ftp://ftp.std.com/obi/. There are about two hundred top-level directories, on a wide variety of topics. Check it out!

While you're at it, check out ftp://nctuccca.edu.tw/documents/. Nctuccca is a huge (14 GB) archive of text files, etc. It is provided by the Campus Computer Communication Association of the National Chiao Tung University in Taiwan. Unless you're located in Asia, I suggest that you use this site to find out about interesting items, then FTP them from the original (mirrored) sites.

Technical Reports, etc.

Universities and research laboratories create large numbers of technical reports. Unfortunately, most of these never reach a wide audience. The difficulty of locating the reports, whether in printed or electronic form, is simply too much for most casual inquirers.

Enter Vincent Cate (vac@cs.cmu.edu) and Alex, with a solution. Alex is a user-mode daemon that allows UNIX systems to "mount" the Internet as an NFS file tree. As a demonstration project, Vincent created a database of computer science technical reports.

The information is somewhat out of date (April, 1992), but still quite interesting. And, because universities don't tend to disappear, most of the information should still be valid. Vincent has also used Alex to index a few dozen other topics, from audio to weather.

To check out Alex, FTP to ftp://alex.sp.cs.cmu.edu/ and look around. The computer science technical reports are in cs-techreports. The links to miscellaneous topics are kept in links.

FAQs, IENs, IETFs, RFCs, etc.

There are numerous archives of Internet- and USENET-related text, far too many to summarize here. I can give you some useful starting points, however. For answers to USENET Frequently Asked Questions (FAQs) and related files, try ftp://rtfm.mit.edu/.

PSI's FTP archive, ftp://ftp.psi.com/, contains a wealth of Internet memoranda and related text. The Internet Experiment Notes (IENs) are kept in ien. The Internet Engineering Task Force (IETF) reports are kept in ietf. Requests For Comments (RFCs) are kept in rfc. Et Cetera.

Miscellanea

Dave Lampson (lampson@pulse.com) maintains a Classical Music Information Archive at ftp://cs.uwp.edu/pub/music/. This includes a CD Buying Guide, with extensive lists of recommended CDs, information on manufacturers and distributors, mail order sources, and publications. There is also a Basic Repertoire List, containing information on music considered to be part of the basic repertoire, organized by musical period and composer. Finally, the Timeline file graphically depicts the life spans of 80+ composers.

Denis Howe (dbh@doc.ic.ac.uk) maintains a Free On-line Dictionary of Computing at http://wombat.doc.ic.ac.uk/. The dictionary covers programming languages, architectures, domain theory, mathematics, networking; in fact, anything to do with computing. Eric Raymond (esr@snark.thyrsus.com) maintains the essential, if somewhat more whimsical, "Jargon File." It is available from ftp://prep.ai.mit.edu/pub/gnu.

Further Reading

Ed Krol's book, "The Whole Internet: User's Guide & Catalog" (O'Reilly, 1992, ISBN 1-56592-025-2), is a great jumping-off point for finding oddball Internet resources. It is also a useful guide to using the Internet, in general.