REPOST: Artificial Intelligence FAQ: AI Web Directories & Online Papers 5/6 [Monthly posting]

[[Repost because of rogue cancel by net vandal Gennady Kalmykov / Bloxy / jc]]

Archive-name: ai-faq/general/part5
Posting-Frequency: monthly
Version: 2.1
Maintainer: Ric Crabbe <> and Amit Dubey <>
Size: 46277 bytes, 1114 lines

;;; ****************************************************************
;;; Answers to Questions about Artificial Intelligence *************
;;; ****************************************************************
;;; Written by Amit Dubey, Ric Crabbe, and Mark Kantrowitz
;;; ai_5.faq 

If you think of questions that are appropriate for this FAQ, or would
like to improve an answer, please send email to the maintainers.

Parts 5 and 6 of the FAQ are now under heavy construction.  The FTP & WWW
resources have been combined, since both are browser accessible these
days.  We're also pruning the entries down to sites that offer
information beyond whatever project is being done at that
university, etc.

Part 5 (WWW & FTP Resources):
  [5-1] Weblogs, repositories, web directories and communities not
        aimed primarily at researchers
  [5-2] Repositories and web directories aimed primarily at researchers
  [5-3] AI Bibliographies available by FTP and WWW
  [5-4] Technical Reports available by FTP and WWW
  [5-5] Technical Reports for/by undergraduate students
  [5-6] Where can I get a machine readable dictionary, thesaurus, and
        other text corpora?
  [5-7] Where can I get training sets for machine learning algorithms?
  [5-8] What on-line journals are there?

Search for [#] to get to question number # quickly.

Subject: [5-1] Repositories and directories not aimed primarily at researchers

AI Topics:

   Presented by AAAI, AI Topics is a "...web site provided ... for
   students, teachers, journalists, and everyone who would like to
   learn more about what artificial intelligence is, and what AI
   scientists do. [Their] goal is to offer a limited number of
   authoritative, non-technical resources that [they] have organized
   and annotated to provide you with meaningful access to basic
   information about the AI universe. Each of the AI Topics (see the
   navigation buttons to the left) will lead you to online and
   in-print sources of information."

There has been an explosion in the number of Websites that catalog
locations of AI information in a Yahoo-style directory. Although they
often duplicate functionality, in the interest of fairness, I will
list all the ones I know about here.

Neuron AI Directory

Neural Network Information in Polish:

Generation 5:

   "Generation5 is aimed at presenting a website that will educate
   the viewer on Artificial Intelligence -- whatever the level of
   expertise. We have essays on the applications and history of AI
   for those unfamiliar, to essays on programming and philosophy, all
   the way to full blown mathematically-orientated essays on genetic
   algorithms and neural networks. Generation5 prides itself also in
   its interviews sections with exclusive interviews from top AI
   scientists like Marvin Minsky, Craig Reynolds, Roger Schank, Andre
   LaMothe and many others. Generation5 also has a comprehensive
   collection of original programs, all with source included.
   Demonstration programs like image recognizors, number recognizors,
   cellular automata creators, NLP demonstrator and more. All
   programs have an accompanying essay describing the workings of the
   programs. The aim of Generation5 is not only to educate the
   viewer, but to allow the viewer to contribute to further other
   people's knowledge. They can do this through the discussion
   boards, voting systems, and soon through AI Solutions (a scheme to
   submit code - with a accompanying monthly competition)."

Yahoo Clubs:

   Yahoo maintains a number of AI clubs. There is the general AI
   Group, an online community that discusses AI (a resource for
   beginners). Their website is:

   There is also a resource for amateur robot enthusiasts at:

Hubat:

   Hubat automatically generates subject hierarchies of links. It is
   interesting because building the engine itself took some AI.

Pentomino Site:

   Student-run site at T.I.D. Ronse, Belgium, on searching Pentomino
   spaces.

Subject: [5-2] Repositories aimed primarily at researchers

CMU AI Repository:

   The CMU Artificial Intelligence Repository was established by
   Carnegie Mellon University to contain public domain and freely
   distributable software, publications, and other materials of
   interest to AI researchers, educators, students, and
   practitioners. The AI Repository currently contains more than a
   gigabyte of material and is growing steadily.

   The AI Repository is accessible from:

Artificial Life Online and the Artificial Life BBS:

   Sponsored by MIT Press and the Santa Fe Institute, Artificial Life
   Online and the Artificial Life BBS is intended to be a central
   information collection and distribution site on the Internet for
   any and all aspects of the Artificial Life endeavor. A special
   feature of the BBS is a collection of 40 or so local newsgroups
   dedicated to a wide variety of topics in Artificial Life.

   Artificial Life Online is accessible by World-Wide Web from

Case based reasoning:

   ai-cbr aims to provide a comprehensive information base to
   Case-Based Reasoning academics and commercial developers. Through
   the dissemination of information it is hoped a stronger world-wide
   community of people interested in Case-Based Reasoning will be
   fostered and the commercial use of Case-Based Reasoning will
   increase. (added to the FAQ 2/2/00)

   A very complete list of resources including tutorials
   (added to the FAQ 2/2/00)

Consortium for Lexical Research:

   [] Archive containing a variety of programs and data files related
   to natural language processing research, with a particular focus
   on lexical research. The file is a good place to start. See the
   file catalog or for a listing of the contents of the archive. Long
   descriptions are in the info/ subdirectory. Materials for paid-up
   members of the Consortium are in the members-only/ subdirectory.

   Public materials include the Alvey Natural Language Tools, Sowa's
   Conceptual Graph parser implemented in YACC by Maurice Pagnucco, a
   morphological parsing lexicon of English, a phonological rule
   compiler for PC-KIMMO, C source code for the NIST SGML parser,
   PC-KIMMO sources, the 1911 Roget Thesaurus, and a variety of word
   lists (including English, Dutch, and male/female/last names).

   Comments and questions may be directed to
   There are also some materials in unrelated to the archive.

Fuzzy Logic Repositories:

   [] contains information concerning fuzzy logic, including
   bibliographies (bib/), product descriptions and demo versions
   (com/), machine readable published papers (lit/), miscellaneous
   information, documents and reports (txt/), and programs, code and
   compilers (prog/). You may download new items into the new/
   subdirectory. If you deposit anything in new/, please inform
   The repository is maintained by Timothy Butler,

Genetic Algorithms:

   The Genetic Algorithms Repository is accessible from
   There is also a WWW version at

   The information files include Nici Schraudolph's survey of free
   and commercial GA software (send email to <> to add to the list).
   The software includes GAC (a simple GA written in C), GAL (a
   simple GA written in Common Lisp), GAucsd, GECO (a Common Lisp
   toolbox for constructing genetic algorithms), GENESIS, GENOCOP,
   Paragenesis (a parallel version of GENESIS that runs on the
   CM-200), and SGA-C (a C implementation/extension of Goldberg's SGA
   system).

Intelliwise:

   Sergio Navega maintains a large collection of AI links:

Funic Neural Nets Archive Site:

   The Finnish University maintains an archive site containing a
   large collection of neural network papers and public domain
   software. The files are available through the web interface at
   or through FTP from
   FTP users: see the file 01README for details. There is also a
   directory for non-neural net AI material, /pub/sci/ai. (Web
   service is still experimental as of 05/29/99). A list of mirrored
   ftp sites is in 04Neural_FTP_Sites. For further information,
   contact or Marko Gronroos <> (or <>).

OSU Neuroprose:

   [] This directory contains technical reports, mostly from the
   early 90's, as a public service to the connectionist and neural
   network scientific community which has an organized mailing list
   (for info:

NL Software Registry:

   [maintainer's note: links up to this point haven't been checked]

   The Natural Language Software Registry is a catalogue of software
   implementing core natural language processing techniques, whether
   available on a commercial or noncommercial basis. Some of the
   topics listed include speech signal processing, morphological
   analysis, parsers, natural language generation systems, and
   knowledge representation systems. The second edition of the
   catalog contains more than 100 descriptions of natural language
   processing software.

   The catalogue is available from the German Research Institute for
   Artificial Intelligence (DFKI) in Saarbruecken (Germany) at the URL
   The email contact for the site is

Essex ROBOTS Archive:

   Contains robotics-related information; it hasn't been updated
   since 1995 or so:

AI IN DESIGN WEBLIOGRAPHY

   These web pages contain links to pretty much everything concerned
   with the application of AI to Design.

Miscellaneous AI:

   Some miscellaneous AI programs may be found on
   Most are mirrors of programs available at other sites.

   AI_ATTIC is an anonymous ftp collection of classic AI programs and
   other information maintained by the University of Texas at Austin.
   It includes Parry, Adventure, Shrdlu, Doctor, Eliza, Animals,
   Trek, Zork, Babbler, Jive, and some AI-related programming
   languages. This archive is available by anonymous ftp from
   in the directory /pub/AI_ATTIC. For more information, contact

   The QWERTZ toolbox, a library of Standard ML modules with an
   emphasis on symbolic Artificial Intelligence programming
   (including implementations of heuristic search and an ATMS reason
   maintenance system), may be obtained by anonymous ftp from
   For more information, write to Tom Gordon <>.

Subject: [5-3] AI Bibliographies available by FTP

General:

   There are many recent papers at:
   You can both browse and search; the searching ranks papers based
   on how often they have been referenced.

Fuzzy Logic:

   A BibTeX database of references addressing neuro-fuzzy issues can
   be obtained by anonymous ftp from [] as the (ascii) file
   fuzzy-nn.bib.

Genetic Algorithms:

Logic Programming, Constraints:

   A BibTeX bibliography for Constraint Logic Programming is
   available by anonymous ftp from
   in the bib/ and papers/ subdirectories.

NLP/CL:

   For information on a fairly complete bibliography of computational
   linguistics and natural language processing work from the 1980s,
   send mail to with the subject HELP.

   The CSLI linguistics bibliography contains 3,300 entries in
   bib/tib/refer format. The bibliography is heavily slanted towards
   phonetics and phonology but also includes a fair amount of
   computational morphology, syntax, semantics, and
   psycholinguistics. The bibliography can be used with James
   Alexander's tib bibliography system, which is available from []
   among other places. The bibliography itself is available by
   anonymous ftp from
   Contributions are welcome, but should be in tib format. For more
   information, contact Andras Kornai <>

NLG:

   Robert Dale's Natural Language Generation (NLG) bibliography is
   available by anonymous ftp from []
   Note that it is formatted for A4 paper. To print on 8.5 x 11
   paper, insert the line
      .94 .94 scale
   after the %! line. For further information, write to Robert Dale,
   University of Edinburgh, Centre for Cognitive Science,
   2 Buccleuch Place, Edinburgh EH8 9LW Scotland, or <> or <>.

   Mark Kantrowitz's Natural Language Generation (NLG) bibliography
   is available by anonymous ftp from []
   In addition to the tech report, the BibTeX file containing the
   bibliography is also available. The bibliography contains more
   than 1,200 entries. A searchable index to the bibliography is
   available via the URL
   Additions and corrections should be sent to

Neural Nets, Learning:

   A bibliography of over 1000 entries about Self-Organizing Map
   (SOM) and Learning Vector Quantization (LVQ) studies is available
   by anonymous ftp from as the files references.bib.Z (BibTeX file)
   and (PostScript file). Please send additions and corrections to

   An extensive collection of references on Principal Component
   Analysis (PCA) neural networks and learning algorithms is
   available by anonymous ftp from
   in LaTeX and PostScript formats. The list was compiled by Liu-Yue
   Wang, a graduate student of Erkki Oja, and updated by Juha
   Karhunen, all from Helsinki University of Technology, Finland. For
   more information, contact Erkki Oja <>.

   A bibliography of PCA algorithms is available by anonymous ftp
   from as pca.bib. For more information, contact Terry Sanger <>.

   A 36-page bibliography of connectionist models with symbolic
   processing is available by anonymous ftp from Neuroprose [] as the
   file
   For more information, contact Ron Sun <>.

Nonmonotonic Logic, Belief Revision:

   A bibliography on belief revision and nonmonotonic logics with
   about 2,000 items is available by anonymous ftp from [] as
   nonmono.bib or nonmono.bib.Z. The file is also available by WAIS
   as wais:// and by gopher/WWW. Please send additions and
   corrections to Raymundo Morado <>.

Speech:

   A bibliography of papers on Silicon Auditory Models (VLSI
   implementations of auditory representations) is available by
   anonymous ftp from
   For more information, write to John Lazzaro <>

Multi-agent Systems

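The A4-to-letter tip in the NLG entry above (add a ".94 .94 scale"
line after the %! line) can be applied with a few lines of Python.
This is only a sketch: the function name scale_for_letter is made up
for illustration, and it assumes the file's header line starts with
"%!" as the tip describes.

```python
def scale_for_letter(ps_text, factor=".94"):
    """Insert '<factor> <factor> scale' right after the first %! line.

    The function name and the default factor are illustrative; the
    factor simply follows the '.94 .94 scale' tip in the FAQ text.
    """
    lines = ps_text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("%!"):
            # Make sure the header line is terminated before inserting.
            if not line.endswith("\n"):
                lines[i] = line + "\n"
            lines.insert(i + 1, f"{factor} {factor} scale\n")
            break  # only the first %! line is treated as the header
    # If no %! line is found, the text is returned unchanged.
    return "".join(lines)
```

Read the PostScript file as text, pass it through the function, and
write the result to a new file before printing.
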
Subject: [5-4] Technical Reports available by WWW/FTP

This section lists the anonymous ftp sites for technical reports from
several universities and other organizations. Some of the sites
provide only an online catalog of technical reports, while the rest
make the actual reports available online. The email address listed is
that of the appropriate person to contact with questions about
ordering technical reports.

The main source of tech reports is now the Networked Computer Science
Technical Reference Library, or NCSTRL (pronounced "ancestral"). Its
home page is:

If that is a problem, you can go directly to:

Other general locations for technical reports from several
universities include:

   [] (see Index for an index)
   AKA []

The uwaterloo archive includes tech reports from the Logic
Programming and Artificial Intelligence Group (LPAIG) of the
University of Waterloo.

There is also a WAIS server containing tech report abstracts that can
be searched. To use, create the file
   ~/wais-sources/cs-techreport-abstracts.src
containing
   (:source
      :version 3
      :ip-address ""
      :ip-name ""
      :tcp-port 210
      :database-name "cs-techreport-abstracts"
      :cost 0.00
      :cost-unit :free
      :maintainer "")
and invoke your local wais client. To add to it, email abstracts of
your papers to in the following format:
   %TI Title
   %AU Author (use multiple %AU lines for multiple authors)
   %PU Published In (citation information)
   %AV Availability (e.g., ftp
   %OR Organization (see cs-techreport-archives.src for institution codes)
   %LT Local title (e.g., tech report number)
   %DA Date (and, if you want, %MN Month, %YR Year)
   %AB Abstract
If your papers are not available by FTP, you can use a %AV line such
as:
   %AV mail
Further instructions are available from
[Based on a post by Ashwin Ram.]

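The tagged abstract format above is easy to generate from a script.
The sketch below builds one record from the field codes listed (%TI,
%AU, %PU, %AV, %OR, %LT, %DA, %AB). The function name
format_abstract, its parameter names, and the sample values are all
made up for illustration; the exact whitespace the server expects is
an assumption.

```python
def format_abstract(title, authors, published_in, availability,
                    organization, local_title, date, abstract):
    """Return one tech-report abstract record in the tagged format.

    Only the %XX field codes come from the FAQ text; everything else
    here (names, one field per line) is an illustrative assumption.
    """
    lines = [f"%TI {title}"]
    # One %AU line per author, as the format description asks.
    lines += [f"%AU {author}" for author in authors]
    lines += [
        f"%PU {published_in}",
        f"%AV {availability}",
        f"%OR {organization}",
        f"%LT {local_title}",
        f"%DA {date}",
        # Collapse line breaks so the abstract stays on its %AB line.
        f"%AB {' '.join(abstract.split())}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values only; use your own paper's details.
    print(format_abstract(
        title="An Example Report",
        authors=["A. Author", "B. Author"],
        published_in="Example Workshop Proceedings",
        availability="mail",
        organization="EXAMPLE",
        local_title="TR-00-01",
        date="1994",
        abstract="A short summary\nof the report.",
    ))
```

For papers that are not available by FTP, passing "mail" as the
availability value gives the "%AV mail" line mentioned above.
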
Also see the Unified Computer Science Technical Report Index
[this archive appears to be out of date -ed]

A list of FTP sites for technical reports and papers can be found in

A list of more than 230 sites publishing CS tech reports may be
obtained by anonymous ftp from
To receive notification of new tech report sites, send mail to
to join the mailing list.

An archive of linguistics papers and preprints is available from
Contact John Lawler ( or for more information.

The Concurrent Engineering Research Center (CERC) at West Virginia
University has placed ASCII versions of the concurrent
engineering-related abstracts (over 500) that were on CERCnet, ASCII
back issues of the Concurrent Engineering Research in Review journal
(now discontinued), and Postscript copies of CERC technical reports
in the gopher server
In addition, many of the CERC technical reports, including journal
articles, symposium papers, theses, dissertations, and issues of the
Concurrent Engineering Research in Review journal, are available as
Postscript versions via anonymous ftp from []
An index to all the reports, including some that are available only
in hardcopy, is contained in the file "CERC-TR-INDEX". If you need
additional information, contact Mary Carriger, CERC Office of
Information Services, at

The newsgroup comp.doc.techreports is devoted to distributing lists
of tech reports and their abstracts.

MIT Artificial Intelligence Laboratory:
   ftp   --
   email --
   www   --
   A full catalog of MIT AI Lab technical reports (and a listing of
   recent updates) may be obtained from the above location, by
   writing to Publications, Room NE43-818, M.I.T. Artificial
   Intelligence Laboratory, 545 Technology Square, Cambridge, MA
   02139, USA, or by calling 1-617-253-6773. The catalog lists the
   technical reports ("AI Memos") with a short abstract and their
   current prices. There is also a charge for shipping. Some recent
   tech reports (since 1991) are available in the ai-publications/
   subdirectory; older technical reports are NOT available by ftp. A
   bibliography is in the bibliography/ directory.

CMU School of Computer Science:
   ftp   --
   email --
   www   --

CMU Software Engineering Institute:
   ftp   --
   email --
   www   --

Yale:
   ftp   --

University of Washington CSE Tech Reports:
   ftp   --
   email --

================

AT&T Bell Laboratories:
   ftp   --
   bib.Z contains short bibliography, including all the technical
   reports contained in this directory.
   ftp   --
   [Maintainer's note: I assume these have been moved over to
   Lucent's domain?]

Argonne National Laboratory:
   ftp   --
   email --
   Contains MCS Division preprints and technical memoranda, available
   as either .dvi or .ps files. For descriptions of the contents, see
   the subdirectory pub/tech_reports/abstracts; for the files
   themselves see the subdirectory pub/tech_reports/reports.

Boston University:
   ftp   --
   email --

Brown University:
   ftp   --
   email --

Cambridge University: Speech, Vision & Robotics Group
   ftp   --

Columbia University:
   ftp   --
   email --

DEC Cambridge Research Lab:
   ftp   --

DEC Paris Research Lab:
   email --
   Put commands in Subject: line of the message.
   To get a list of articles, use
      send index articles
   To get a list of tech reports, use
      send index reports

DEC WRL:
   email --
   To get a helpfile, send a message with help in the subject line.

DFKI:
   ftp   --
   email -- Martin Henz (

Duke University:
   ftp   --
   email -- [unknown user, 7/7/93]

Edinburgh:
   A list of available reports can be sent via email. Send requests
   for information about reports from the Center for Cognitive
   Science to, and from the Human Communication Research Center to

Electrotechnical Laboratory, Japan:
   Reports from the Cooperative Architecture project (half AI, half
   software engineering).
   ftp   -- [] See file Index.English.
   email -- Hideyuki Nakashima <>.

Georgia Tech College of Computing, AI Group:
   ftp   -- (
   email -- Professor Ashwin Ram <>

HCRC (Human Communication Research Centre):
   ftp   --
   mail  -- Fiona-Anne Malcolm
            Human Communication Research Centre
            2 Buccleuch Place, Edinburgh, UK

Illinois:
   email -- Erna Amerman <>

Illinois Genetic Algorithms Laboratory (IlliGAL):
   email -- Eric Thompson <>
   phone -- 217-333-2346 (9AM to 5PM CT, M-F)
   mail  -- Illinois Genetic Algorithms Laboratory
            Department of General Engineering
            117 Transportation Building
            104 South Mathews Avenue
            Urbana, IL 61801-2996
   ftp   --
   Includes the GA bibliography and the Messy GA code in C (in
   /pub/src/) and preprints (in /pub/papers/Publications)
   www   --

Indiana:
   ftp   -- []
   ftp   -- []

INRIA, France:
   ftp   --

Institute for Learning Sciences at Northwestern University:
   ftp   --
   phone -- 708-491-3500

Mechanized Reasoning Group (MRG):
   ftp   --
   email -- Fausto Giunchiglia <>
            Mechanized Reasoning Group, IRST
            38050 Povo Trento, Italy
            Tel: +39 461-314444 (secr.)
                 +39 461-314436 (office)
            Fax: +39 461-302040 / 314591

National University of Singapore:
   ftp   --

New York University (NYU):
   ftp   --

OGI:
   ftp   --
   email --

Ohio State University, Laboratory for AI Research
   ftp   --
   email --

OSU Neuroprose:
   ftp   -- (
   This directory contains technical reports as a public service to
   the connectionist and neural network scientific community which
   has an organized mailing list (for info:
   Includes several bibliographies.

Stanford:
   ftp   --
   Very spotty collection.

SRI:
   email -- Donna O'Neal,

SUNY Buffalo:
   ftp   --

SUNY at Stony Brook:
   ftp   --
   email -- or
   The /pub/sunysb directory contains the SB-Prolog implementation of
   the Prolog language. Contact for more information.

TCGA (The Clearinghouse for Genetic Algorithms):
   email -- Robert Elliott Smith <>
            Department of Engineering of Mechanics
            Room 210 Hardaway Hall
            The University of Alabama
            PO Box 870278
            Tuscaloosa, AL 35487
            205-348-1618, fax 205-348-6419

Thinking Machines:
   ftp   --
   This file contains a list of Thinking Machines technical reports.
   Orders may be placed by email (limit 5) to, or by US Mail to
   Thinking Machines Corporation, Attn: Technical reports, 245 First
   Street, Cambridge, MA 01241. In addition, the directories
   cm/starlisp and cm/starlogo contain code for the *Lisp and *Logo
   simulators.

Tulane University:
   ftp   -- []

University of Alabama:
   ftp   --

University of Arizona:
   ftp   --
   email --
   The directory /japan/kahaner.reports contains reports on AI in
   Japan, among other things, written by Dr. David Kahaner, a
   numerical analyst on sabbatical to the Office of Naval
   Research-Asia (ONR Asia) in Tokyo from NIST. The reports are not
   written in any sort of official capacity, but are quite
   interesting.

University of California/Los Angeles:
   ftp   --

University of California/Santa Cruz:
   ftp   --
   email --

University of Cambridge Computer Lab:
   email --

University of Colorado:
   ftp   --

University of Florida:
   ftp   --

University of Genoa, Mechanized Reasoning Group:
   ftp   --
   email -- Fausto Giunchiglia <>

University of Georgia:
   ftp   --

University of Illinois at Urbana:
   ftp   --
   email --

University of Indiana, Center for Research on Concepts and Cognition:
   ftp   --
   email --

University of Kaiserslautern, Germany:
   ftp   --

University of Kentucky:
   ftp   --

University of Massachusetts at Amherst:
   email --

University of Melbourne, Australia, Computer Vision and Pattern
Recognition Laboratory (CVPRL):
   ftp   --

University of Michigan:
   ftp   --

University of North Carolina:
   ftp   --

University of Pennsylvania:
   ftp   --
   email -- [email bounced 7/7/93]

USC/Information Sciences Institute:
   email -- Sheila Coyazo <> is the contact. [email bounced 7/7/93]

University of Toronto:
   ftp   -- (Cognitive Robotics)
   email --

University of Virginia:
   ftp   --

University of Western Australia:
   ftp   --
   Centre for Intelligent Information Processing Systems (CIIPS)
   EE Engineering Department

University of Wisconsin:
   ftp   --
   email --

Some AI authors have set up repositories of their own papers:

   Matthew Ginsberg:

Subject: [5-5] Technical resources for/by undergraduate students

Brainsciences

   A group of students at Brown University have created a web site to
   "provide a forum for undergraduates to publish their work. We
   feature reports of original research, book reviews, term papers,
   and other work in a similar vein."

Subject: [5-6] Where can I get a machine readable dictionary, thesaurus, and other text corpora?

Free:

   /usr/dict/words

   Roget's 1911 Thesaurus is available by anonymous FTP from the
   Consortium for Lexical Research []
   It is also available from
   An old Webster's dictionary is in /text/dict/{DICT.Z,DICT.INDEX.Z}.

   Project Gutenberg also has Roget's 1911 Thesaurus. The Project
   Gutenberg archive is at
   The Project Gutenberg archive collects public domain electronic
   books. For more information, write to Michael S. Hart, Professor
   of Electronic Text, Executive Director of Project Gutenberg Etext,
   Illinois Benedictine College, 5700 College Road, Lisle, IL 60532
   or send email to

   For people without FTP, Austin Code Works sells floppy disks
   containing Roget's 1911 Thesaurus for $40.00. This money helps
   support the production of other useful texts, such as the 1913
   Webster's dictionary.

   The Online Book Initiative maintains a text repository on (a
   public access UNIX system, 617-739-WRLD). See the README file on
   For more information, send email to, write to Software Tool & Die,
   1330 Beacon Street, Brookline, MA 02146, or call 617-739-0202.

   The CHILDES project at Carnegie Mellon University has a lot of
   data of children speaking to adults, as well as the adult written
   and adult spoken corpora from the CORNELL project. Contact Brian
   MacWhinney <> for more information.

   The Association for Computational Linguistics (ACL) has a Data
   Collection Initiative. For more information, contact Donald Walker
   at Bellcore,

   Two lists of common female first names (4967 names) and male first
   names (2924 names) are available for anonymous ftp from
   Read the file README first. Send mail to for more information.

   A list of 110,000 English words (one per line, in ASCII) is
   available in the PD1:<MSDOS.LINGUISTICS> directory on SIMTEL20 as
   the files WORDS1.ZIP, WORDS2.ZIP, WORDS3.ZIP, and WORDS4.ZIP.
   Although the list is in MS-DOS files, it can easily be used on
   other machines (but first you'll have to unzip the files on a DOS
   machine). The list includes inflected forms of the words, such as
   plural nouns and the -s, -ed, and -ing forms of verbs; thus the
   number of lexical stems in the list is considerably smaller than
   the total number of word forms. These files are available via FTP
   from WSMR-SIMTEL20.ARMY.MIL []. SIMTEL20 files are mirrored on

   The Collins English Dictionary encoded as a Prolog fact base is
   available from the Oxford Text Archive by anonymous ftp from []
   The Oxford Text Archive includes many other texts, dictionaries,
   thesauri, word lists, and so on, most of which are available for
   scholarly use and research only. See the files for more
   information, or write to, Oxford Text Archive, Oxford University
   Computing Services, 13 Banbury Road, Oxford OX2 6NN, UK, call
   44-865-273238 or fax 44-865-273275.

   Chuck Wooters <> has extracted the most likely pronunciation for
   each of about 6100 words in the hand-labeled TIMIT database, and
   made them available by anonymous ftp from

   A list of homophones from general American English is available by
   anonymous ftp from as the file homophones-1.01.txt. To receive the
   list by email, send mail to
   The list was compiled by Tony Robinson.

   Sigurd P. Crossland <> has been compiling a dictionary of English
   words, including most common American words, abbreviations,
   hyphenations, and even incorrect spellings. The most recent
   version is available by anonymous ftp from
   The tar file includes 31 text files, one for each word-length from
   2 to 32. The compressed tar file takes up just over 4 MB of space,
   and includes approximately 870,000 words.

   WordNet is an English lexical reference system based on current
   psycholinguistic theories of human lexical memory. It organizes
   nouns, verbs and adjectives into synonym sets corresponding to
   lexical concepts. The sets are linked by a variety of relations.
   Besides being of scientific interest, it makes a handy thesaurus.
   WordNet is available by anonymous ftp from
   If you retrieve a copy of WordNet by ftp, please send mail to

Commercial:

   Illumind publishes the Moby Thesaurus (25,000 roots/1.2 million
   synonyms), Moby Words (560,000 entries), Moby Hyphenator (155,000
   entries), Moby Part-of-Speech (214,000 entries), Moby Pronunciator
   (167,000 entries with IPA encoding, syllabification, and primary,
   secondary, and tertiary stress marks), and Moby Language
   (100,000-word word lists in five major world languages) lexical
   databases. All databases are supplied in pure ASCII, royalty-free,
   in both Macintosh and MS-DOS disk formats (also in .Z file
   formats). Both commercial (to resell derived structures as part of
   commercial applications) and educational/research licenses are
   available. Samples of each of the lexical databases are available
   by anonymous ftp from []. For more information, write to Illumind,
   Attn: Grady Ward, 3449 Martha Court, Arcata, CA 95521, call/fax
   707-826-7715, or send email to
   [Maintainer's note: This contact information is no longer valid.
   We're working on finding a current address.]

   The Oxford Text Archive has hundreds of online texts in a wide
   variety of languages, including a few dictionaries (the OED,
   Collins, etc.). The Lancaster-Oslo-Bergen (LOB), Brown, and
   London-Lund corpora are also available from them. For more
   information, write to Oxford Electronic Publishing, Oxford
   University Press, 200 Madison Avenue, New York, NY 10016, call
   212-889-0206, or send mail to
   (Their contact information in England is Oxford Text Archive,
   Oxford University Computing Service, 13 Banbury Road, Oxford
   OX2 6NN, UK, +44 (865) 273238.)

Mailing Lists:

   CORPORA is a mailing list for Text Corpora. It welcomes
   information and questions about text corpora such as availability,
   aspects of compiling and using corpora, software, tagging,
   parsing, and bibliography. To be added to the list, send a message
   to
   Contributions should be sent to

Linguistic Data Consortium:

   The Linguistic Data Consortium was established to broaden the
   collection and distribution of speech and natural language data
   bases for the purposes of research and technology development in
   automatic speech recognition, natural language processing, and
   other areas where large amounts of linguistic data are needed.
   Information about the LDC is available by anonymous ftp from [].
   Documents available in this directory include a paper on the
   background, rationale and goals of the LDC, a brief list of
   available data bases, and some tables summarizing these corpora.
   For further information, contact Elizabeth Hodas, <>, Mark
   Liberman <>, or Jack Godfrey <>.

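Several of the free word lists above are plain ASCII with one word
per line, and one of them is distributed as separate files per word
length. As a rough sketch of working with such a list, the Python
below groups words by length. The function name bucket_by_length is
made up for illustration, and the /usr/dict/words path is the one
given above; it may live elsewhere (or be absent) on a given system.

```python
import os
from collections import defaultdict


def bucket_by_length(lines):
    """Group words (one per line) into a dict keyed by word length."""
    buckets = defaultdict(list)
    for line in lines:
        word = line.strip()
        if word:  # skip blank lines
            buckets[len(word)].append(word)
    return dict(buckets)


if __name__ == "__main__":
    # Path taken from the FAQ text; adjust it for your own system.
    path = "/usr/dict/words"
    if os.path.exists(path):
        with open(path, encoding="ascii", errors="replace") as handle:
            buckets = bucket_by_length(handle)
        for length in sorted(buckets):
            print(length, len(buckets[length]))
    else:
        print("word list not found at", path)
```

The same function works on any iterable of lines, so it can be fed an
unzipped WORDS file or a name list just as well as the system
dictionary.
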
Subject: [5-7] Where can I get training sets for machine learning algorithms?

UC/Irvine (UCI) AI/Machine Learning Repository:

   has a variety of AI-related materials, with a special focus on
   machine learning. For example, contains over 80 benchmark data
   sets for classifier systems (30mb).

MLnet Machine Learning Archive
MLnet Online Information Service

   In 1988 the Special Interest Group on Machine Learning of the
   German Society for Computer Science (GI e.V.) decided to establish
   a library of PROLOG implementations of Machine Learning
   algorithms. By 1994 the library had a sizable collection of GPLed
   PROLOG software. The site has grown, and now, according to the
   webpage, it "offers a growing collection of ML information,
   datasets, software and pointers to other ML resources."

   The homepage is at:

   Send your contributions to Mathias Kirsten ( at the GMD - German
   National Research Center, or use the contribution facilities
   within the MLnet OiS.

Subject: [5-8] What on-line journals are there?

[this question is still in progress]

Journal of Artificial Intelligence Research. See [3-2a].

Journal of Machine Learning Research. See [3-2n].

---
[ is moderated.  To submit, just post and be patient, or if ]
[ that fails mail your article to <>, and ]
[ ask your news administrator to fix the problems with your system. ]

Send corrections/additions to the FAQ Maintainer:

Last Update March 27 2014 @ 02:11 PM