This is a follow-up to my whining about the need for more consistency in
news.answers header usage. Now I can tell you more about the 'why' side.
For quite a while now, I have been playing with HTML conversion utilities.
They started out being written for very specific needs and were not very
useful from a "general purpose" perspective. I have recently corrected
that situation... ;-)
I have created an automatically generated archive of Usenet postings to
the *.answers newsgroups in hypertext format. The archive does not require
human intervention at all.
I have not made this public yet but plan on it. I'd like to ask your help
by taking a moment to review your specific FAQs. I've tried _very_ hard to
get things right but I'm sure there is something I've missed. The archive
is located at
At this time I am only announcing it to the faq-maintainers list. I am open
to all suggestions and promise you I will correct any problems in the
html conversion or index generation. Content problems are your concern. ;-)
I am not rushing this out. I'd like it to be totally correct and as usable
as possible before general availability. I've had this up and running for
the last couple weeks and have been incorporating ideas I've received.
As an FYI: A quick way to see if the listed references in your FAQs are valid
is to try to follow them. Some are out-of-date and you can use this facility
to quickly check and see. I'm already using it that way for my FAQs. So by
verifying that my software is doing its job properly, you are also improving
your FAQ for your readership. Can't beat that huh ? ;-)
In keeping with the spirit of FAQs, here are answers to some questions I've
been asked (as well as some of my own).
How is the archive built ?
----------------------------
The entire WWW HTML news.answers archive is rebuilt each day from a
mirrored copy of RTFM's master news.answers archive.
Each FAQ is converted into a single hypertext document. All FAQs are scanned
for various references, and have hypertext links automatically inserted when
such references are found.
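
As an illustration of the link insertion step, here is a minimal sketch in
Python. It is not the archive's actual code (which is written in C, Perl and
Shell); the pattern, the scheme list and the function name linkify are
assumptions made up for this example.

```python
import html
import re

# Hypothetical pattern: a few "scheme://" URL types followed by non-space text.
URL_PATTERN = re.compile(r'\b(?:https?|ftp|gopher)://[^\s<>"]+')

def linkify(text):
    """Escape plain text for HTML and wrap anything that looks like a URL in a link."""
    parts = []
    last = 0
    for match in URL_PATTERN.finditer(text):
        # Trailing punctuation usually belongs to the sentence, not the URL.
        url = match.group(0).rstrip('.,;:!?)')
        parts.append(html.escape(text[last:match.start()]))
        escaped = html.escape(url, quote=True)
        parts.append('<a href="%s">%s</a>' % (escaped, escaped))
        last = match.start() + len(url)
    parts.append(html.escape(text[last:]))
    return ''.join(parts)
```

Note that a sketch like this links anything that matches the pattern without
checking that the target exists, which is the same limitation described under
"What problems may I encounter reading the pages?".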
The build procedure completely rebuilds the articles after it has updated
the mirrored ftp copy locally. Both the original text versions and the
HTML versions are available here. The Mirror software is set up to "do
deletes" so if an FAQ is removed from rtfm.mit.edu it will be removed from
my copy as well prior to the next HTML build. Rebuilding everything each day
is a bit inefficient, but in exchange I never have to deal with removals
directly. ;-)
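
The "do deletes" idea can be sketched as a simple set comparison. This is
only an illustration in Python, not how the Mirror software itself works; the
function name plan_sync and the file names in the usage line are hypothetical,
and the sketch ignores files that exist on both sides but have changed.

```python
def plan_sync(local_names, remote_names):
    """Compare two listings and report what a mirror with deletes would do."""
    local, remote = set(local_names), set(remote_names)
    return {
        "delete": sorted(local - remote),  # gone from the master, so drop locally
        "fetch": sorted(remote - local),   # new on the master, so copy down
    }
```

For example, plan_sync(["a/faq", "b/faq"], ["b/faq", "c/faq"]) marks "a/faq"
for deletion and "c/faq" for fetching, so a full rebuild afterwards never sees
the removed FAQ.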
How is the FAQ archive organized?
----------------------------------
The FAQ archive here is really two different archives: the HTML based FAQs
and an FTP mirrored archive. Take your pick as to which best suits your needs.
There are two ways to access the HTML based FAQs: via the Index files or via
the full text search capabilities. The Index files come in three types: the
top level index file, by-archive-name.html, and the by-newsgroup.html index
files.
* The FAQs by Category Index - This is the default top level index.
* The FAQs by Archive-name Index - This lists the FAQs by the
  Archive-name header that was assigned by the *.answers moderation team.
* The FAQs by Newsgroup Index - This lists each newsgroup and the
  associated FAQs posted to the newsgroup. !!! CAUTION !!! This is a HUGE
  index and will not be nice to slow links... I'm presently redoing
  this completely.
What was used to create the HTML FAQs?
--------------------------------------
The FAQs were converted to HTML with software written in C, Perl and Shell.
I wrote all of it from scratch. The entire conversion
process is under a second-generation review with a focus on cleaner, quicker
generation and archive performance. This will not affect how they are
accessed or appear. It is purely a behind the scenes efficiency concern.
How can I get my FAQ to appear in the archive?
----------------------------------------------
This archive is updated automatically from a local ftp mirror of the master
FAQ archive site at MIT. If your FAQ is distributed via the news.answers
newsgroup then it will automatically appear in this archive.
What if there is an error in my FAQ ?
--------------------------------------
If there has been some mistake in the conversion to HTML, please let me know
immediately at faq-admin@landfield.com. I will make sure your FAQ is converted
correctly.
My FAQ has a home page where the community can get the
most current copy. Will that be supported here ?
--------------------------------------------------
Warning - Auxiliary Header Whining!!! - Warning :)
If you have an HTML version of your FAQ and would like us to refer to it, then
simply use the URL: auxiliary header documented in the *.answers submission
guidelines.
If you specify the URL for your FAQ's home page in the URL: header, it
will be included automatically in the index files and labeled as the
FAQ's Home Page.
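
To show roughly how a URL: header could end up in an index entry, here is a
small Python sketch. It is not the archive's real code. It assumes the headers
sit in one block at the top of the text, terminated by a blank line, and the
names parse_headers, index_entry and the dictionary keys are hypothetical.

```python
def parse_headers(posting):
    """Return a dict of lower-cased header names to values from the header block."""
    headers = {}
    for line in posting.splitlines():
        if not line.strip():
            break  # a blank line ends the header block
        if line[0] in ' \t':
            continue  # continuation lines are ignored in this sketch
        name, sep, value = line.partition(':')
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers

def index_entry(posting):
    """Build an index record, adding a home page only when a URL: header exists."""
    headers = parse_headers(posting)
    entry = {'archive_name': headers.get('archive-name')}
    if 'url' in headers:
        entry['home_page'] = headers['url']
    return entry
```

A posting without a URL: header simply gets an entry with no home page, so
nothing breaks for FAQs that do not use the header.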
What about FAQs that aren't posted to news.answers?
----------------------------------------------------
If you are unable to post your FAQ via news.answers and would like your FAQ
to appear in this archive, then please send it to faq-admin@landfield.com and
let me know of your request. I will insert it, and any later changes, into
the archive manually.
PLEASE NOTE: I strongly encourage you to go through the minimal process
of getting it posted to a *.answers newsgroup. You are taking your time to
write and maintain it. Make sure you are getting the widest possible exposure
for your work, thus allowing more people to benefit from it.
Can I provide you with a hypertext copy ?
-------------------------------------------
As stated above... it is far better for all if the process is automated. That
way there are no concerns about timely updates. If you have a URL: Auxiliary
header in your *.answers posted FAQ, there is no need to supply hypertext as
the index will point to the location specified in the URL: header.
If the FAQ is not being posted to news.answers (see the encouragement
above), then hypertext is preferred.
Will I have trouble deleting my FAQ when I quit posting it ?
-------------------------------------------------------------
The Mirror software used here to keep an exact copy of the RTFM master
FAQ archive is set up to "do deletes". In other words, if the *.answers
moderation team removes your FAQ from the RTFM FAQ archive, it will be
removed from my copy of the FAQ archive prior to the next HTML archive
build.
(The answer to your question is NO, as long as you notify the *.answers team
that you want it deleted.)
Also note, if you do not wish to post to news.answers and you send me a
document, you need to let me know... (Again, see the encouragement above.)
Is it ok to make links to the FAQs in this archive?
---------------------------------------------------
Encouraged ;-), but totally up to you. There is no problem on this end.
What problems may I encounter reading the pages?
------------------------------------------------
The hypertext links to URLs are added without verification, so some
text may be turned into bogus links.
If the information is wrong or out of date in a specific FAQ on RTFM
then it will be here as well.
I will be trying to add more verification to the process but...
What is on the TODO list for the WWW FAQ archive ?
--------------------------------------------------
The following is a brief list of things being considered for enhancing
the WWW FAQ archive. If you have a pet favorite that is not here, please let
us know and we'll consider it. The focus here is on usability.
* The Index files (by-newsgroup, by-archive-name) are too large. Need to
  have a better way to view them.
* Create a by-subject Index file
* Automated creation of a statistics page to keep track of
  - How many FAQ files exist
  - How many newsgroups they are posted in
  - How many bytes of raw text
  - How many bytes of HTML formatted text
  - How many URL references are encountered
    + ftp
    + gopher
    + http
    + https
    + mailto
    + wais
    + telnet
    + news
    + nntp
    + ...
* Replace known Usenet Info postings everywhere they are referenced. For
  example:
  - Emily Postnews Answers Your Questions on Netiquette
  - Rules for posting to Usenet
  - A Primer on How to Work With the Usenet Community
  - Answers to Frequently Asked Questions about Usenet
  - Hints on writing style for Usenet
  - What is Usenet?
* Ability to limit search to
  - RFC1036 + auxiliary header information
  - FAQ file names
  - Full text
  - All the above
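
For the statistics page item above, counting URL references per type might
look something like the following Python sketch. It is only an illustration:
the scheme list is taken from the TODO entry, while the pattern and the name
count_url_schemes are assumptions. The match is a heuristic (a scheme name,
a colon, then a non-space character), so it can still miscount ordinary text.

```python
import re
from collections import Counter

# Hypothetical pattern built from the scheme list in the TODO item.
SCHEME_PATTERN = re.compile(
    r'\b(ftp|gopher|https?|mailto|wais|telnet|news|nntp):(?=\S)',
    re.IGNORECASE,
)

def count_url_schemes(text):
    """Return a dict mapping each URL scheme seen in text to its number of hits."""
    counts = Counter(m.group(1).lower() for m in SCHEME_PATTERN.finditer(text))
    return dict(counts)
```

Summing these per-FAQ dictionaries across the whole archive would give the
totals for the statistics page.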
Now I have a question for you...
---------------------------------
Some FAQs are formatted in the Digest Message Format specified in RFC1153.
Others support the Minimal Digest Format, while others of you use a format
of your own personal creation.
How would you like to see the FAQs formatted that are in one of the digest
formats ? Should they be split up as was done at Ohio State or as a single
file as I currently have ? Is splitting up the files into section pages
valuable ? As in more pleasant to read ? Or is it not worth the effort ?
Is there a specific type of index or method of access that you see as
missing and would like to see added ?
If you have an opinion on any of these questions or other suggestions,
please let me know. I am trying to determine what people would like to
see. Thanks!
-----------------------------------------
Suggestions that have been incorporated:
-----------------------------------------
1. Strip out the Approved: headers. It just might keep a few unapproved
   articles from showing up places they don't belong, and it will almost
   certainly cut down on the number of people who write the moderators,
   thinking we know anything at all about the contents of any given FAQ.
2. For multi-part postings which follow the usual Archive-name format of
   partN, links at the top and bottom of the page to the next and
   previous pages would be nice, and a link up to the previous index
   level would, too.
3. If a posting has a URL: header, point there or maybe we could set up a
   reference page with a pointer to the local copy and a pointer to the
   remote copy and the user can select which they want to view.
4. Full Text search capabilities.
-----------------------------------------
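
For suggestion 2, working out the previous and next pages of a multi-part
posting can be sketched like this in Python. This is an illustration only,
not the archive's code. It assumes Archive-names ending in partN with no zero
padding, and the name neighbours and the sample names are hypothetical.

```python
import re

# Hypothetical pattern: everything up to "part", then the part number.
PART_PATTERN = re.compile(r'^(.*/part)(\d+)$')

def neighbours(archive_name, known_names):
    """Return (previous, next) Archive-names, using None where no such part exists."""
    match = PART_PATTERN.match(archive_name)
    if not match:
        return None, None  # not a multi-part posting
    prefix, number = match.group(1), int(match.group(2))
    prev_name = '%s%d' % (prefix, number - 1)
    next_name = '%s%d' % (prefix, number + 1)
    return (prev_name if prev_name in known_names else None,
            next_name if next_name in known_names else None)
```

Checking against the set of known names keeps the first part from linking to
a "part0" and the last part from linking past the end.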
-------------------------
Current work in-progress
-------------------------
1. By-* index rewrite so they are not so large.
2. Syncing the dates on the files with the RTFM file dates and times
so the searching can be limited to FAQs modified in the last X days.
-------------------------
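
Once the file dates match the RTFM dates, limiting a search to recently
modified FAQs comes down to a date comparison. A minimal Python sketch,
assuming a hypothetical mapping of FAQ names to modification times (the name
modified_within is also made up for this example):

```python
from datetime import datetime, timedelta

def modified_within(faqs, days, now):
    """Return the sorted names of FAQs modified in the last `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return sorted(name for name, mtime in faqs.items() if mtime >= cutoff)
```

Passing the current time in as an argument, rather than reading the clock
inside the function, keeps the result repeatable for a given archive snapshot.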
-Kent+
--
Kent Landfield               Phone: 1-817-545-2502
The Landfield Group          FAX:   1-817-545-7650
Email: kent@landfield.com    http://www.landfield.com/
Please send comp.sources.misc related mail to kent@uunet.uu.net.
© Copyright The Landfield Group, 1997
All rights reserved