Re: Maintenance for both flat text and HTML?

---------

Dan Wallach (dwallach@CS.Princeton.EDU)
Thu, 18 May 1995 09:20:31 -0400


I used to use lynx to convert HTML to text, but it had some serious
bugs when I got fancy with markup tags. Instead, I'm using the new
netscape -remote mechanism. I'm attaching the conversion rule from
my Makefile below. I edit .xhtml files, which can have conditional
text (parsed by my simple-pp program).

If you're interested in simple-pp, check this out:

http://www.cs.princeton.edu/~dwallach/simple-pp/

The Netscape Remote API is still a little flaky, even with the
sleep commands between the two -remote calls. Be sure to check
your ASCII output before shipping it. Netscape must already be
running when you type make.
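
A sketch of what such an output check could look like in shell. The
function name check_ascii and the list of tags and entities are
illustrative assumptions, not part of the Makefile quoted here; adjust
the pattern to whatever markup your own documents use.

```shell
# check_ascii FILE
# Fails if FILE is missing or empty, or if any line still contains one of a
# few common HTML tags or character entities. The tag list is illustrative.
check_ascii() {
    file=$1
    if [ ! -s "$file" ]; then
        echo "check_ascii: $file is missing or empty" >&2
        return 1
    fi
    # Offending lines are printed with line numbers on stderr.
    if grep -n -i -E '</?(html|head|body|title|a|p|br|hr|h[1-6]|ul|ol|li|dl|dt|dd|pre|b|i|em|strong)([[:space:]][^>]*)?>|&(amp|lt|gt|quot);' "$file" >&2; then
        echo "check_ascii: leftover markup in $file" >&2
        return 1
    fi
    return 0
}
```

Run as "check_ascii foo" after make has built foo. It only catches
leftover markup and empty output, so it does not replace reading the
text yourself.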

Lastly, I should include a plug for weblint. It's amazing. It's
caught all kinds of bugs which I've tended to introduce when I
do my monthly FAQ edits. 'make lint' is all it takes with this
Makefile.

NOTE: If you're not a programmer, you should be able to get one
to set up a Makefile for you. If you don't know HTML, you can
use one of the HTML editing tools on a Mac or PC, although that's
likely to make it hard to include the simple-pp directives.
If you're not using Unix, I'm afraid I can't help you.

Enjoy!

Dan (Typing Injury FAQ)
http://www.cs.princeton.edu/~dwallach/

P.S. The stuff below is only a section of the Makefile. If somebody
wants to see the whole thing, send me mail.

P.P.S. If I don't get that mail today, I won't see it for nearly two
weeks because I'm going on vacation.

P^3.S. simple-pp sometimes causes perl5.001 to dump core. perl5.000
and perl4.036 work fine.

HTML_PP_DEFS = -define TIFTP=ftp://ftp.csua.berkeley.edu/pub/typing-injury \
	-define TIFAQ=http://www.cs.princeton.edu/~dwallach/tifaq

HTML_PP = simple-pp $(HTML_PP_DEFS)

.SUFFIXES: .html .xhtml .hlint

##
## This rule can convert "foo.xhtml" to "foo.html"
##
.xhtml.html:
	rm -f $@
	$(HTML_PP) -define html $< > $@
	chmod 444 $@

##
## This rule is a phony rule but it lets you say "make lint" and all
## HTML documents are checked -- line numbers are usually wrong because
## of pre-processor jumbling things around.
##
.html.hlint:
	weblint $<

lint: $(HTML_FILES:.html=.hlint)

##
## This rule can convert "foo.xhtml" to "foo" (plain ASCII).
##
## note that we take two passes through the pre-processor -- one before
## Netscape and one afterward
##
.xhtml:
	rm -f $@
	$(HTML_PP) -define news $< > $@-tmp.html
	netscape -remote "openFile($(PWD)/$@-tmp.html)"
	@sleep 2
	netscape -remote "saveAs($(PWD)/tmp.txt, Text)"
	@sleep 2
	$(HTML_PP) < tmp.txt > $@
	chmod 444 $@
	rm -f $@-tmp.html tmp.txt $@.hdr
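
The fixed two-second sleeps in the rule above are a guess at how long
Netscape needs. One possible alternative, sketched here in shell, is to
poll until the saved file exists and has stopped growing. The name
wait_for_file, the 30-second default, and the "same size on two polls"
test are all assumptions made for illustration; nothing here comes from
the original Makefile or from Netscape's documentation.

```shell
# wait_for_file PATH [TIMEOUT_SECONDS]
# Succeeds once PATH exists, is non-empty, and has the same size on two
# consecutive one-second polls. Fails after TIMEOUT_SECONDS (default 30).
wait_for_file() {
    path=$1
    timeout=${2:-30}
    last=-1
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if [ -s "$path" ]; then
            # wc pads its output on some systems, so strip the spaces.
            size=$(wc -c < "$path" | tr -d ' ')
            if [ "$size" = "$last" ]; then
                return 0
            fi
            last=$size
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "wait_for_file: gave up on $path after ${timeout}s" >&2
    return 1
}
```

Kept in a small script that the recipe sources, it could stand in for
the second sleep as "wait_for_file tmp.txt 30". A stable size is only a
heuristic, so checking the final text by eye is still worthwhile.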



