Perl HTML conversion script

---------

Frederic Albrecht (fred@calvacom.fr)
Wed, 24 May 1995 17:04:37 +0200 (MET DST)


Here is outline2faq....

A few details about it, since there are a few things the program doesn't
cope with:

o The table of contents must be pasted in by hand (in both the text and
the HTML versions). I couldn't find a simple way of automating this.
o A section (or subsection) title must fit on a single line. Any extra
line is not treated as part of the title; it is handled like any other
text line.
o Only two levels are supported (section and subsection). The script
dies with an error and the number of the offending line if there are
more. This is trivial to change.
o URLs must have whitespace before and after them. A URL directly
followed by punctuation will produce an invalid link, and a URL at
the very beginning of a line will be ignored.
o URLs are detected by scanning for the string "://". If that string
appears in something that isn't a URL, a weird link will be generated.
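
A minimal sketch of one way around the URL limitations listed above,
assuming Perl 5 (the script itself targets Perl 4). The helper name
linkify and the example.org addresses are hypothetical and not part of
outline2faq. It requires a scheme name before "://", accepts a URL at the
start of a line, and moves trailing punctuation outside the link. It does
not escape HTML special characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# linkify() is a hypothetical helper, not part of outline2faq.
# It wraps every URL in the line in an <A HREF> tag.
sub linkify {
    my ($line) = @_;
    # A scheme name, "://", then characters that are not whitespace, <, > or ".
    $line =~ s{([A-Za-z][A-Za-z0-9+.-]*://[^\s<>"]+)}{
        my $url   = $1;
        my $trail = '';
        # Move trailing punctuation out of the link target.
        $trail = $1 if $url =~ s/([.,;:!?)]+)$//;
        qq{<A HREF="$url">$url</A>$trail};
    }ge;
    return $line;
}

print linkify("See ftp://ftp.example.org/pub/faq.txt, or http://www.example.org/.\n");
```

Because a scheme name is required in front of "://", a stray "://" in
ordinary text is left alone rather than turned into a link.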

Any feedback would be appreciated, since I'm not a very good programmer
and I'm just beginning to learn Perl... If someone besides me actually
finds a use for this program, I'll put the current version on an FTP site
I have access to...
Oh, and this was written for Perl 4.036; I have no idea if it runs with
5.x...

------- outline2faq -----
#!/usr/bin/perl

# outline2faq version 1.0
# Comments : fred@asgard.calvacom.fr

$ext1=".faq"; # extension for document
$ext2=".toc"; # extension for table of contents
$ext3=".html"; # extension for HTML version
$ext4=".html.toc"; # extension for HTML-ized table of contents

$level = -1; # current numbering for TOC level (we cheat to start at 0)
$sublevel = 0; # current numbering for TOC sublevel

$foo = 0; # count of lines in the input file (tourist info)

$nbparam = $#ARGV + 1;

if ($nbparam != 1) # exactly one parameter is required: the outline file
{
    die "outline2doc: error: expected exactly one parameter (the outline file)\n\n";
}

$outline = $ARGV[0];

$docfile = $outline.$ext1;
$tocfile = $outline.$ext2;
$htmlfile = $outline.$ext3;
$htmltocfile= $outline.$ext4;

# IN is input file (outline format)
# DOC is output file w/ numbered sections
# TOC is table of contents to paste in the DOC file
# HTM is HTML version of file
# HTT is HTML table of Contents (to be included in html file)
open (IN, $outline) || die "outline2doc: ERROR: couldn't open $outline\n";
open (DOC, ">$docfile") || die "outline2doc: ERROR: can't create $docfile\n";
open (TOC, ">$tocfile") || die "outline2doc: ERROR: can't create $tocfile\n";
open (HTM, ">$htmlfile") || die "outline2doc: ERROR: can't create $htmlfile\n";
open (HTT, ">$htmltocfile") || die "outline2doc: ERROR: can't create $htmltocfile\n";

print "txt output : $docfile \n";
print "toc output : $tocfile \n";
print "HTML output : $htmlfile\n";
print "HTML toc output : $htmltocfile\n";

print HTM "<HTML>\n<HEAD>\n<TITLE>The PC Clone Operating Systems List</TITLE>\n</HEAD>\n\n<BODY>\n<PRE>\n";

while ($line_in = <IN>)
{
    ++$foo;
    if ($line_in =~ m/^\*+/) # any *'s at the beginning of the line ?
    {
        $stars = $&;     # save the match before the next regex overwrites it
        $bar = $';
        $bar =~ s/\n$//; # strip the newline only (chop would eat a letter if the last line has none)
        if ($stars eq "*")
        {
            $level++;
            $sublevel = 0;
            print TOC "$level.$sublevel. $bar\n";
            print DOC "$level.$sublevel. $bar\n";
            print HTM "<H1><A NAME=\"$level.$sublevel\">$level.$sublevel. $bar</A></H1>\n";
            print HTT "<A HREF=\"#$level.$sublevel\">$level.$sublevel. $bar</A>\n";
        }
        elsif ($stars eq "**")
        {
            ++$sublevel;
            print TOC "\t$level.$sublevel. $bar\n";
            print DOC "$level.$sublevel. $bar\n";
            print HTM "<H2><A NAME=\"$level.$sublevel\">$level.$sublevel. $bar</A></H2>\n";
            print HTT "<A HREF=\"#$level.$sublevel\">$level.$sublevel. $bar</A>\n";
        }
        else # There were more than two *'s on one line
        {
            die "outline2doc: ERROR: Too many sublevels on line $foo of $outline\n";
        }
    }
    else # not a title line
    {
        print DOC $line_in;
        # Capture the surrounding whitespace separately so it stays outside the link
        if ($line_in =~ m/(\s)(\S*:\/\/\S*)(\s)/) # There is a URL in the line
        {
            print HTM "$`$1<A HREF=\"$2\">$2</A>$3$'";
        }
        else
        {
            print HTM $line_in;
        }
    }
}

print HTM "</PRE>\n</BODY>\n</HTML>\n";

close (IN); # close takes the filehandle itself; <IN> would read a line instead
close (DOC);
close (TOC);
close (HTM);
close (HTT);

print "\nDone parsing \"$outline\" ($foo lines)\n";

---- end of outline2faq -----
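
The script above keeps two fixed counters ($level and $sublevel). A
minimal sketch of how the numbering could be generalised to any depth,
assuming Perl 5: the helper number_heading and the sample titles are
hypothetical, not part of outline2faq. Note that this sketch numbers
sections from 1 and omits the trailing ".0", so its output differs from
the script's "0.0." style.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One running counter per outline depth; index 0 is the top level.
my @counters;

# number_heading() is a hypothetical helper, not part of outline2faq.
# $depth is the number of leading asterisks (1 for "*", 2 for "**", ...).
sub number_heading {
    my ($depth) = @_;
    $#counters = $depth - 1;    # drop counters deeper than this heading
    $counters[$depth - 1]++;    # advance the counter at this depth
    # A skipped level has no counter yet; show it as 0.
    return join('.', map { defined $_ ? $_ : 0 } @counters);
}

for my $line ("* Intro", "** Scope", "** History", "*** Early days", "* Systems") {
    if ($line =~ /^(\*+)\s*(.*)/) {
        print number_heading(length $1), ". $2\n";
    }
}
# Prints:
# 1. Intro
# 1.1. Scope
# 1.2. History
# 1.2.1. Early days
# 2. Systems
```

Truncating the array on every heading is what resets the deeper counters
when a new higher-level section starts.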

Fred.
-----------------------------------------------------------------------------
This text is entirely made of the freshest hand picked electrons
I speak for no one except me and my cat.
---------------------------------- Linux ------------------------------------