Welcome to rss2html version 0.2.3 (Stand and Deliver)
If you're unfamiliar with what's going on, here's the explanation:
Many weblogs and websites auto-generate RDF files, in a format called RSS,
which are metadata files that describe HTML documents. It's a really nifty
way for sites to transfer links and such to each other. However,
if you want to use the RSS style RDF files and don't want to install
all kinds of libraries, modules, or the like, it's hard to
parse these things easily.
rss2html comes in here by replacing the tags in the file with
HTML, so you can go ahead and paste it into your website or use
some script or whatnot to do that all for you. As it comes, it understands
the most basic RSS tags, such as <description>, <title>, <link> and the like,
and also their corresponding closing tags. It works well for RSS versions
before 1.0 (like the common 0.91).
Still confused? There's an example at
http://www.diablonet.net/~mercadal/common/derpsite.php, the output of
which you can most definitely put on your website.
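If you'd like to see the idea in code, here is a minimal Python sketch of
tag replacement. It is only an illustration, not rss2html's actual source,
and the TAG_MAP values are made up for the example; rss2html's built-in
output for each tag may well be different.

```python
import re

# Hypothetical mapping from RSS tag names to HTML snippets.
# These values are invented for illustration only.
TAG_MAP = {
    "title": "<b>",
    "/title": "</b>",
    "description": "<p>",
    "/description": "</p>",
    "item": "",
    "/item": "<br>",
}

def replace_tags(rss_text, tag_map=TAG_MAP):
    """Swap each <tag> or </tag> for its mapped HTML; unmapped tags are dropped."""
    def swap(match):
        name = match.group(1).lower()
        return tag_map.get(name, "")
    # Match <name> or </name>, ignoring any attributes after the name.
    return re.sub(r"<(/?[A-Za-z][\w:]*)[^>]*>", swap, rss_text)

print(replace_tags("<item><title>Hello</title><description>World</description></item>"))
# prints: <b>Hello</b><p>World</p><br>
```

The text between the tags passes through untouched; only the tags
themselves are swapped.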
How Do I Start:
On most every unix: make all
That will leave an executable named rss2html in the distribution's
directory. Copy that to /usr/local/bin or wherever you want it to
be. If you get errors, you may need to change the name of your
compiler in the makefile (probably either gcc or cc).
Now that you've got it installed and you want to parse that darned
RSS file that your friend's got on his site, all you need to do
is download the file and type (for example):
wget http://www.diablonet.net/~mercadal/common/derpsite.php -O - | rss2html > output.html
which would leave you with a snippet of HTML in the output file there.
Maybe I can incorporate part of libcurl or the like in an upcoming
distribution so rss2html would yank the RDF file off the web for you,
process it, and output the result.
If all you wanted was a basic HTML snippet from an RDF file, you can
stop reading here.
Are there any other options?
Sure, the options you can set for rss2html include:
-i Specify the name of an input file (instead of the standard input)
-o Specify the name of an output file (instead of the standard output)
-q Run quiet (don't print unknown tags to the standard error)
-u Specify the name of a user-defined tags file (see next section)
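If you end up calling rss2html from a script, a small wrapper like the
Python sketch below can put the options together for you. The function
names and the file names (feed.rdf, output.html) are hypothetical, and the
sketch assumes each option's value is passed as a separate argument right
after the flag.

```python
import subprocess

def build_command(input_path=None, output_path=None, quiet=False, user_tags=None):
    """Assemble an rss2html argument list from the options listed above."""
    cmd = ["rss2html"]
    if input_path:
        cmd += ["-i", input_path]    # read from a file instead of standard input
    if output_path:
        cmd += ["-o", output_path]   # write to a file instead of standard output
    if quiet:
        cmd.append("-q")             # don't report unknown tags on standard error
    if user_tags:
        cmd += ["-u", user_tags]     # user-defined tags file
    return cmd

def run_rss2html(**options):
    """Run rss2html with the given options; the binary must be on your PATH."""
    return subprocess.run(build_command(**options), check=True)

print(build_command(input_path="feed.rdf", output_path="output.html", quiet=True))
# prints: ['rss2html', '-i', 'feed.rdf', '-o', 'output.html', '-q']
```

Building the argument list separately from running it makes it easy to
check what would be executed before anything actually runs.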
User-defined Actions for RDF Tags:
If, for some reason, you don't like the HTML that rss2html puts out
for a tag, you can set an alternate tag file for your own use (by using
the -u option). For example, say every time an RSS <item> tag shows up you
want rss2html to display an asterisk in the output file rather than
the default nothing. Well, then, you could make a new file with the line
item*
in it, and feed that into rss2html. Voila, instant asterisks! Notice that
the RSS tag has no less-than and greater-than signs around it,
whereas an HTML element that replaces an RSS tag
must be written exactly as you want it in the output file, such as:
item<strong>
/item</strong>
You can replace any input tag you like.
If you, like me, often dislike the descriptions that people put
in an RSS file (they're often long and pointless), you
could do something like this:
description
/description
This will tell rss2html not to print out the information included between
these two tags. Some built-in RSS tags, such as <language>, <docs>,
and <guid>, are never put in the HTML output by default. You can,
of course, change this behavior by defining them in your own user-defined
file.
You should probably see the included myuser-defined file, which is a
pretty lousy example of all you can do with rss2html's user defined
capabilities.
Other than all the tags whose actions you can define (or re-define), you
can also define two things which you'd never find in any RSS file:
*RDFSTART* and *RDFEND*, which let you throw whatever HTML you
want at the beginning and at the end. So, say you wanted to make
everything some ugly font within the output file, you could add this
to your user-defined file:
*RDFSTART*<font face="Chicago,Chitown,Impact">
*RDFEND*</font>
As always, though, if you like the plain stuff that rss2html
usually puts out (and have no desire to learn anything about RDF or
its RSS subpart), you don't even have to worry about this feature.
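If you'd rather generate a user-defined file from a script than type it by
hand, something like the Python sketch below could work. The function name
format_tags_file is made up for this example, and the separator is passed
in as sep because you need to supply the actual separator character
rss2html expects; the "<SEP>" string in the example call is only a visible
stand-in, not the real character.

```python
def format_tags_file(actions, sep):
    """Return user-defined file text: one 'name + sep + replacement' line per entry."""
    lines = []
    for name, replacement in actions.items():
        # Tag names are written bare, with no angle brackets around them.
        if "<" in name or ">" in name:
            raise ValueError("tag names take no angle brackets: %r" % name)
        # An empty replacement leaves nothing after the separator.
        lines.append(name + sep + replacement)
    return "\n".join(lines) + "\n"

actions = {
    "*RDFSTART*": '<font face="Chicago,Chitown,Impact">',
    "item": "<strong>",
    "/item": "</strong>",
    "*RDFEND*": "</font>",
}

# "<SEP>" is a placeholder; pass the real separator character instead.
print(format_tags_file(actions, sep="<SEP>"), end="")
```

Write the returned text to a file and hand that file to rss2html with
the -u option.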
Why Did You Choose as a Separator?
Because I wanted something that most likely wouldn't show up in any
HTML tags anyone wants to embed. If, for some reason, this actually
does exist within some HTML (or RSS) tag, well, get in touch with
me and I'll find a more obscure separation symbol.
It Doesn't Work.
Well, if it simply doesn't work on a given RSS file because the tags
therein are not defined, feel free to re-define them yourself. However...
If you get Segmentation Faults, Bus Errors, Illegal Instructions,
gcc: unable to parse errors, kernel panics, an itching, burning sensation,
or find some other task at which rss2html is failing, feel free
to e-mail me at mercadal@khons.diablonet.net. Please, if you can,
paste the error into the mail program, and tell me what system you're running
it on, as well as anything else you think might help me (a backtrace would
be appreciated).
What Can I Expect in the Next Major Revisions?
Probably a lot better RSS 1.0 specification support, now that I've found
something approaching a white paper on this. Hitherto, I was just testing
it against whatever RDF output I could snag from friends, so make sure
to keep your eye out for the latest version.
Perhaps further support for running rss2html as a CGI binary. And,
further, perhaps the ability to have rss2html run as a daemon which
will snag a new RDF file from a given address, compare it against the older
version, and update if necessary.
Changelog (since 0.1):
Revision 0.2.3 (14 March 2004)
* Made output ordering correct in cases where description
tag may have come before link tag (patches thanks to
Lyle Hanson)
Revision 0.2.2 (19 February 2004)
* Added ability for users to blank out information
within a given tag (e.g. removing descriptions)
* Internally, handling user-defined tags in a reasonable
way now
Revision 0.2.1 (9 December 2003)
* Second revision today [two in one day, wow]
Offers greater support for a wider variety of
tags, and some minor internal cleanups
* This release may make your output files look
slightly different than previous versions.
Create a user-defined preference to change the
output should the new formatting be bothersome to you.
Revision 0.2.0 (9 December 2003)
* Noticed how poor my spelling was in much of this
README and actually bothered to edit mistakes
* Fixed a possible inconsistency within parser
* Fixed user-defined tags parsing support;
previous versions could have improperly handled
user-defined tags, or not have initialized the
file at all
* Added -q flag to make rss2html run without printing errors
(unless command line is improperly specified)
* Attempted to more closely adhere to style(9) guidelines
Revision 0.1.4 (20 July 2003)
* Fixed another segfault which only manifested
itself in FreeBSD
Revision 0.1.3 (19 July 2003)
* Removed segfaults/incorrect results on BSD
when standard input read on a pipe
Revision 0.1.2 (1 July 2003)
* Smarter malloc usage
* Looking out for segfaults
* Read/write to standard input or standard output
if no files are specified as options
Revision 0.1.1 (16 February 2002)
* Optimized file reading in BSD
* More tag support
TO-DO:
* Add the ability to take URLs on command line and automatically
retrieve and parse the associated rss file.
* Check for memory leaks
* Make parsing of user defined file more robust.
What License Is rss2html Under?
It's a BSD style license. I'm no lawyer; if you need to know more, though,
read LICENSE for the official explanation of all your rights, my rights, and
gosh knows what else.