PreNews version 0.02
--------------------
What it is
----------
PreNews is a program designed to assist in your handling of news. It
achieves this simply by processing your undebatched news file and scanning
for stuff you are interested in. That stuff is copied, the rest is junked.
Thus, you only see what you want to see.
IMPORTANT
---------
THIS IS A TEST VERSION OF PRENEWS. IT IS *NOT* COMPLETE NOR FINISHED.
THE NEWS SCANNING MECHANISM ONLY WORKS WITH DYNAMIC AREAS, SO YOU WILL NEED
A RISCPC (OR JUSTIN FLETCHER'S "DUMMY DYNAMIC AREAS" MODULE) AND ENOUGH
MEMORY TO CREATE A DYNAMIC AREA LARGE ENOUGH TO HOLD YOUR NEWS FILE(S).
IF YOU DO NOT HAVE THIS, YOU CANNOT USE PRENEWS. I WILL BE IMPLEMENTING THE
FILE->FILE SCANNING ROUTINE SOONISH.
PROBABLY SOME OTHER STUFF TOO... :-)
Setting it up
-------------
Using Voyager? Skip this bit and read the next section...
Please note, news MUST be in "rnews" format.
Open up your raw incoming news. If the first line is "#! rnews", then you
are okay. Otherwise, you aren't!
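If you would rather not eyeball the file, the check above can be sketched in a few lines of Python. This is only an illustration of the test described here, not part of PreNews; the function names are made up for the example.

```python
def looks_like_rnews(first_line: bytes) -> bool:
    """Return True if a news batch's first line starts with '#! rnews'."""
    return first_line.startswith(b"#! rnews")


def file_is_rnews(path: str) -> bool:
    """Open a raw news file and test only its first line."""
    with open(path, "rb") as handle:
        return looks_like_rnews(handle.readline())
```

Call file_is_rnews() with the path to your raw incoming news; a False result means the file is not in the format PreNews expects.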
This software is, principally, designed to work alongside Voyager's
!NewsAgent software. So if you use !NewsAgent, then don't worry - it will
all work nicely.
THERE ARE THREE MODES OF OPERATION THAT PRENEWS CAN SUPPLY.
These determine how you configure PreNews, and thus how you use it.
1. Original PreNews
A directory is scanned. Any text file in that directory will be
processed, the results written to the output file.
This assumes the output file does not exist, and your news has already
been backed up.
2. Best use scenario (aka 'backup mode')
The source news file will be backed up, processed, and written back to
the relevant location.
The source and output news files may be one and the same, and backing
up your news is irrelevant as this will do it for you.
3. Subject scanner
PreNews will not mess with the news articles. Instead, it will simply
whizz through them and make a note of groups and subjects, and then
output a nice little report for you.
All three may be used together, but my personal recommendation is to use
method 2 for general news processing and have a separate Obey file for
the subject scanner. I have provided "!RunSubj" for this very purpose.
1. Copy the "!PreNews" directory somewhere suitable.
If you are updating PreNews, simply copy the updated files over the top
of the old ones. Best you make a backup though, in case your download is
corrupt in any way.
2. Open up the "!PreNews.!Run" file.
PreNews$Incoming should be set to point to a DIRECTORY where your
incoming news is stored. By convention, I store mine there as numbered
files ("01", "02", "03"...), but it will search for any *text* file
present. Files that are not text (type &FFF) files will be skipped.
THIS IS IGNORED FOR BACKUP MODE
PreNews$Outgoing is the FILENAME of the file that your outgoing news
should be written to. The default is to replace the existing news file.
PreNews$Source is used by backup mode and the subject scanner to point
to the incoming news file. For Voyager users, this would be the system
variable <INNewsIn$File> which points to !Voyager.Apps.News.News.
With backup mode, do be aware that it is perfectly acceptable (and
wholly intentional) for the source and outgoing system variables to
point to the same thing.
PreNews$Backup is a pointer to the directory used for back ups. Refer
to the details further on in this document if you are interested in how
PreNews constructs its backup file names.
PreNews$Spool points to the outgoing news articles that you have
written, held in the spool directory. This is for a planned future feature.
At the line:
Run <PreNews$Dir>.!RunImage %*0
Add your selection of how you wish PreNews to work.
For backup mode, add "-backup".
For subject scanning mode, add "-subjects".
For other options, refer to the section on command line parameters.
Examples:
Backup mode: Run <PreNews$Dir>.!RunImage %*0 -backup
Subject scan: Run <PreNews$Dir>.!RunImage %*0 -subjects
If you should have both (-backup -subjects) then the subject scanning
will be invoked.
If you have neither of these parameters, then PreNews will behave in
its old way.
Save the file when you are happy with it.
3. Open the "!PreNews.Data" directory.
authors:
These entries are matched with the "From: " line in each article.
miscellany:
These entries are matched with *all* lines in each article. They are
useful for including stuff that isn't included in any other way, such
as "egin 644" to include binaries.
Be sure to include your email address and/or your domain name so that
replies to you in subjects that are not included can be retained.
It is suggested that you don't go overboard with entries here, as this
is matched always; and many matches all the time will put a fair lag
on news processing.
newsgroups:
These entries are matched with the "Newsgroups: " line in each
article.
WARNING: This will cause entire newsgroups to be copied.
subjects:
The most useful option, this filters out subjects you are interested
in.
Some clever manipulation will allow you to hold a core set of subjects
for things you are likely to be interested in.
For example, "RISC", "risc" and "Risc" will match RiscPCs and RISC OS
and even risc processor concepts.
It can also be useful to drop off the first letter.
"oyager" can match "Voyager" and "voyager".
Edit these as necessary, then save them.
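To see why tricks like "oyager" work, here is a minimal sketch of the matching as this document describes it: a plain, case-sensitive "does the entry appear anywhere in the line" test. The function names are invented for the example, and the real PreNews code (which is in C) may well do it differently.

```python
def entry_matches(entry: str, line: str) -> bool:
    """Case-sensitive substring test: the entry must appear in the line."""
    return entry in line


def any_match(entries, line: str) -> bool:
    """True if at least one entry from a data file matches the line."""
    return any(entry_matches(entry, line) for entry in entries)


# Example entries, taken from the suggestions above.
subjects = ["RISC", "risc", "Risc", "oyager"]
```

With these entries, "Subject: RiscPC memory" and "Subject: voyager update" both match, while "Subject: RiSC chips" does not, because no entry has that exact mix of upper and lower case.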
Your PreNews should now be set up.
For more possibilities, refer to the section "Command line options".
If you come to edit your data files after you have used PreNews, you'll
see PreNews has added some rubbish to the file, it'll look like this:
Risc€263,20010528
CGI€0,X
New Product Details€18,20010528
16+16€5,20010528
On Colours in Buffy€10,20010528
On names in Buffy€11,20010528
To add an entry, simply go to the bottom of the file and type it in. Do not
try to add the status data, PreNews will sort it out automatically.
If you wish to delete an entry, simply delete the entire line.
Setting it up the most sensible way for use with Argo Voyager
-------------------------------------------------------------
In this release of the software, only Dynamic Area support is provided for
news processing.
In simple terms - you'll need a RiscPC.
It is possible to use dynamic areas on older machines, but we are talking
about various degrees of pain.
Back to the original help text...
Now you've read the above, your head is probably spinning. You're thinking
"what the ....?" and wondering if maybe you shouldn't bother with PreNews.
It is, actually, really simple to set it up and use. However there is a
lot of flexibility provided. It is rather hard to explain in general terms.
So now we shall be specific. You are using NNTPfetch and NewsAgent. Don't
worry if you don't know exactly what those are, if you are using an
unmodified copy of Voyager (ie, you are not using Pluto or Messenger) then
you will be using NNTPfetch and NewsAgent.
So here is how to set up PreNews for your system.
1. Install PreNews some place.
I suggest in the same directory as !Voyager, but not within !Voyager
(unless you plan to make it a VIXen?)
2. Open the !PreNews.!Run file.
Set your system variables to be the following:
Set PreNews$Outgoing <INNewsIn$File>
Set PreNews$Source <INNewsIn$File>
Set PreNews$Backup <IN$Dir>.newsbackup
Set PreNews$Spool <INNewsOut$Dir>
The variable PreNews$Incoming is not required.
Ensure the text "-backup" (with the minus sign) is somewhere in the
Run <PreNews... line.
If you are unsure, set it to be:
Run <PreNews$Dir>.!RunImage %*0 -backup -fast
Save and close the !Run file.
3. Open !Voyager, then Apps, then News.
You should see:
!NewsAgent
!Run
!Vixen
Index <-- only if you've got some you have not yet debatched
News <-- ditto the above
In this directory, create a directory called "newsbackup".
Close the directory viewers you've just opened.
4. Open !PreNews, and then the Data directory.
Refer to the section above to see what the four text files are there
for. Set them up as you desire, then save them.
PLEASE NOTE, PRENEWS SCANS EXCLUSIVELY. That means, anything that is NOT
matched to your selections will be skipped. Four empty files will result
in ALL of your news being weeded out.
If you are looking for software to remove messages from specific people,
and keep the rest, then I'm sorry but PreNews does not provide this
facility. PreNews is for more severe news file pruning.
If you'd like PreNews to operate as a 'killfile', then please email me
with your request.
5. That should be it. :-)
You have just set up PreNews according to "the backup method", so when I am
talking about the backup method, I'm talking about what you've just done.
Using it - method one - the original way
----------------------------------------
If you are using the backup method, ignore this section.
Before using PreNews, you should have MOVED your incoming news file to
your backup directory (it is suggested you maintain a backup of around a
week, thus you can catch up on stuff if necessary).
Then, COPY your backup into the processing directory; ie, the place where
you have specified in <PreNews$Incoming>.
The above could be automated. A simple BASIC program called in !PreNews'
!Run file would do the trick. I leave it to you to create such a program,
as different people do things in different ways...
When that has all been done, run !PreNews.
PreNews will check to see if the destination news file exists. If it does,
then PreNews will refuse to run. This is so raw or already processed news
does not get overwritten. The only solution to this is to either move the
destination news file into the process directory and process it again, or
to debatch the news. The option you choose depends on whether it has been
processed or not, and whether you wish to debatch now or not.
If we get thus far, PreNews will check for an "Index" data file in the same
directory as the destination news. If it exists and it is a data (type =
&FFD) file, it will be deleted. If it exists and is not data, PreNews will
give a warning message and stop.
The Index file is used by NewsAgent to speed up debatching, each word in
the index points to the beginning of a message so that NewsAgent doesn't
need to search for each article itself.
PreNews does not, currently, generate any such file, but if one from before
exists, your newly processed news would be debatched according to the
index. Big mess.
Now, PreNews will count up the size of all the news waiting to be
processed. It will then try to allocate a dynamic area of that size to
store the news in. If your machine is not capable of creating dynamic areas
(ie, pre-RiscPC) or if you do not have sufficient memory, it will revert to
the slower file-read/file-write method.
[remember: in this release, only dynamic area support is available]
Using it - method two - the backup method
-----------------------------------------
If you are using the original method, read the section above, not this one.
If you have set up PreNews correctly...
Simply double-click on the !PreNews icon.
That's all. :-)
PreNews will copy the specified news file, which should be your raw
incoming news (ie, where your news fetcher puts it), into the backup
directory.
Then your backup news file will be read from, either line by line or loaded
into a dynamic area.
From this, your news will be processed and written back.
Any "Index" data files associated with the original news, or the outgoing
file, will be deleted. NewsAgent uses the Index file to speed up its
debatching. PreNews does not yet generate Index files, so the one that
would be left there would not match the processed news file. Big mess!
It is entirely possible to process the same news file several times. Doing
so is not really recommended (because it is wasteful of resources, all
those similar backups and the time taken to process). But if you choose to
process the same file forty times, feel free to. :-)
[remember: in this release, only dynamic area support is available]
Using it - method three - the subject scanner
---------------------------------------------
This method does not involve ANY news processing. It simply records the
subjects and newsgroups, and tells you what you have to hand.
DO NOT DOUBLE-CLICK ON !PRENEWS, that is likely to start a news process.
You should have created a link to run PreNews in scan mode.
If not, you can open the !PreNews directory and double-click on the file
called "!RunSubj".
PreNews will either load the file into a dynamic area, or scan directly
from file.
[note: this, amazingly enough, DOES work in either mode in this version!]
When processing is complete, a data file will be created. The contents will
be an alphabetically sorted list of newsgroups, containing within them an
alphabetically sorted list of subjects. The numbering is for tidiness.
Crossposted subjects will ONLY appear in the newsgroup that is first
matched. It will, however, be flagged as having been crossposted by the
inclusion of a little 'X' between the number and the subject title.
Subjects are clipped so the display is tidy. So if you see three tiny dots
in a row (ISO 8859/1 Latin1; or an 'I' with a '^' over it in MS-DOS font)
it simply means the title was trimmed to fit nicely.
In this mode, the command line options "-onlyuseda" and "-noresults" are
ignored.
Command line options
--------------------
You can control how PreNews behaves with command line options. These can
be set up in the !Run file.
-backup
Switch PreNews to backup mode. THIS IS RECOMMENDED.
-noresults
Don't pop up the results file when finished.
-noupdate
Don't update the data files with hit counter information. This is
really only useful if you wish to preprocess a small news file without
messing up the results already held, or if you want to totally disable
the results reporting stuff.
-onlyuseda
This sets PreNews to only use a Dynamic Area. If there is insufficient
memory, PreNews will abort with a message saying how much memory is
required.
-pollless
This will alter PreNews' responsiveness so it pays more attention to
the task of processing your news, and less attention to the other
software running. In geeky terms, it will 'poll' every ten articles
instead of after every article.
This can have dramatic results when you are processing data held in a
Dynamic Area.
For an example, pretty much a worst case scenario for scanning subjects
on my machine (RiscPC700) with a 2491Kb news file, is when scanning off
of disc, with no speed tweaks. It took 99 seconds, processing at a
positively lethargic 25Kb/sec.
With the "-pollless" tweak, this became 31 seconds (80Kb/sec).
Using Dynamic Areas, the timings are 38 seconds (65Kb/sec) and 13
seconds (191Kb/sec) respectively.
There is a command line option to kick PreNews into a mega-turbo mode
where it'll munch its way through a 2491Kb file in a mere 7 seconds (355Kb/sec).
Unfortunately, this option is classed as "undocumented".
Fortunately, another of the supplied information files mentions it. As
you only have two text files supplied in !PreNews, I guess that narrows
down your options somewhat.
Moral: Read ALL of the documentation. You might learn a few little
tricks and tips.
-subjects
Switch PreNews to simply scan for subjects.
If this is passed accidentally with "-backup", the subjects will be
scanned. This is because PreNews pays attention to "-subjects" before
it comes to the news processor.
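The speeds quoted under "-pollless" above are simply the file size divided by the time taken, rounded down. A quick sketch of that arithmetic, using only the figures given in this document (the labels are mine):

```python
FILE_SIZE_KB = 2491  # size of the test news file quoted above

# Seconds taken in each configuration, as quoted in this document.
timings = {
    "disc, no tweaks": 99,
    "disc, -pollless": 31,
    "Dynamic Area, no tweaks": 38,
    "Dynamic Area, -pollless": 13,
    "undocumented turbo mode": 7,
}


def rate_kb_per_sec(seconds: int) -> int:
    """Throughput in Kb/sec, rounded down to a whole number."""
    return FILE_SIZE_KB // seconds


for label, seconds in timings.items():
    print(f"{label}: {seconds}s = {rate_kb_per_sec(seconds)}Kb/sec")
```

The rates come out as 25, 80, 65, 191 and 355Kb/sec, matching the figures quoted.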
Results (of news process)
-------------------------
The top of the results file is self-explanatory.
The part after the "---" separator is a breakdown of all of the possible
matches, and how often they were matched. The value in brackets is the value
from the previous run. This is to help you decide if any subjects are now
out of date.
If both the value in brackets and the hit value are zero, then the date of
last match will be output so you can best decide what is no longer
necessary.
Keeping loads of subjects is not a problem, as PreNews dynamically
allocates memory for all the entries. However it may cause a lag in
processing as there are that many more entries to be matched.
Note that matching is made in the FOLLOWING order:
* miscellany?
* newsgroup?
* subject?
* author?
So if the subject "Polar bears" is raging in argonet.zfc, and you have
chosen to keep all of argonet.zfc; polar bears will have no results as the
checking will not get that far.
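The matching order above can be sketched as a "first match wins" loop. This is a rough Python illustration under my own assumptions (an article is a list of text lines, headers end at the first blank line, and matching is a case-sensitive substring test); the names are invented and it is not the actual PreNews code.

```python
# Order in which the four data files are consulted, as listed above.
ORDER = ("miscellany", "newsgroups", "subjects", "authors")


def header(lines, name):
    """Return the first header line starting with 'name: ', or ''."""
    prefix = name + ": "
    for line in lines:
        if line == "":
            break  # assumption: a blank line ends the headers
        if line.startswith(prefix):
            return line
    return ""


def first_match(lines, rules):
    """Return (kind, entry) for the first matching entry, else None."""
    targets = {
        "miscellany": lines,  # matched against every line
        "newsgroups": [header(lines, "Newsgroups")],
        "subjects": [header(lines, "Subject")],
        "authors": [header(lines, "From")],
    }
    for kind in ORDER:
        for entry in rules.get(kind, []):
            if any(entry in line for line in targets[kind]):
                return kind, entry  # checking stops here
    return None
```

Given an article in argonet.zfc with the subject "Polar bears", and rules holding both "argonet.zfc" and "Polar bears", the result is ("newsgroups", "argonet.zfc"); the subject entry is never reached, so its hit count stays at zero.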
Geekisms (your history lesson)
------------------------------
When you have a so-called job, wading through several hundred news postings
just isn't an option. I figured the best way to sort that mess out would be
to write a little program to pull out the news that I wanted while just
losing the rest to the proverbial bit-bucket. Then, I don't even need to
debatch it. Of course, it is easy to miss things you might have been
interested in, but that isn't a big worry. News is time, and the time spent
reading/writing news is shocking. Who says I don't have a social life? :-)
PreNews started off life as a simple kludge called "extractor", which was
set up specifically to my system. Written in BASIC, it was a bit slow. To
give an example: 756681 bytes of news (369 articles), scanning that lot
and copying across 170 articles took 42 seconds (~17K/sec).
Using the exact same settings, PreNews whizzed through that lot in only
6 seconds (-pollless) achieving 123K/sec.
I decided extractor was due a rewrite. It was slow. On a machine with 33Mb of
memory, it was hard to justify a file-file scanning system. I could drop
the whole lot into a Dynamic Area and work from memory.
Also, it should be written in C, as compiled code is faster than
interpreted. This wasn't a port of 'extractor'. In fact, the only part of
extractor that was retained was the list of subjects to scan for, and the
basic concept.
On the morning (yes, I got up while it was still AM!) of 14/05/2001 I began
to code a command-line utility. No sooner had I written half a screenful of
code, I destroyed that and began again with PreNews fitting into the
multitasking model. I figured, even with the overheads of polling, there
would be a nice speed gain. Just to be sure, I included the "-pollless"
command line option. Using that, it should be possible to whizz our way
through a megabyte in around ten seconds (!).
Possibly the biggest change (as if being completely rewritten in another
language without reference to the original source is a minor triviality) is
that the program now multitasks. This allows life to carry on in the
background, and also a nifty little slider-window thingy.
Multitasking brings with it obvious dangers. But, working on the principle
of Schroedinger's cockroach (the thing that killed the cat, all that rubbish
about vials of arsenic is to keep the pro-cockroach lobby from protesting),
PreNews will trap you trying to load NewsAgent's debatcher while your news
is being processed. It will tell you off accordingly.
The file "_results" is the textual version of the results, opened at the
end of scanning (unless you've said otherwise).
The statistics are encoded into the original data files in the following
manner (again, unless you've said otherwise):
Justin Fletcher
being what you type.
After being executed, this line would be written back as:
Justin Fletcher€27,20010514
The number following the € is the hit count, and the number following the
comma is the system date in loose ISO format (CCYYMMDD). If a run gives
no matches, then this date will be preserved, else it will be updated. So
you get to see how long it has been since something was matched.
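For the curious, here is a rough sketch of how a line in that format could be read and written back. It rests on a few assumptions of mine: that the stored number is the hit count from the latest run, that an "X" after the comma means "never matched", and that the marker really is the character shown in this document (the actual byte in your files may display differently). The function names are invented.

```python
MARK = "€"  # marker character as it appears in this document


def parse_entry(line: str):
    """Split a data file line into (text, hit count, last match date)."""
    text, sep, stats = line.partition(MARK)
    if not sep:
        return text, 0, "X"  # freshly typed entry, no statistics yet
    count, _, date = stats.partition(",")
    return text, int(count), date


def update_entry(line: str, hits: int, today: str) -> str:
    """Write the line back; the date changes only if there were hits."""
    text, _old_count, last_date = parse_entry(line)
    date = today if hits > 0 else last_date
    return f"{text}{MARK}{hits},{date}"
```

So update_entry("Justin Fletcher", 27, "20010514") gives the line shown above, and an entry with no hits in a later run keeps its old date.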
The match entries are constrained to 63 characters, and will be clipped if
longer. Matching IS case sensitive.
The maximum line length that can be processed is 4096 bytes. Like BASIC,
if a line is longer it will be read as the next line.
This behaviour does not affect the original news articles, as these are
copied from start and finish offsets, not line by line.
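The line length behaviour can be pictured like this: anything beyond the limit simply spills over and is treated as a new line. A minimal sketch, assuming the limit is exactly 4096 bytes inclusive (the function name and the exact boundary are my assumptions, not taken from the PreNews source):

```python
MAX_LINE = 4096  # maximum line length quoted above


def split_lines(data: bytes, limit: int = MAX_LINE):
    """Yield lines from data; over-long lines continue as the next line."""
    for raw in data.splitlines():
        while len(raw) > limit:
            yield raw[:limit]   # first chunk is handled as one line
            raw = raw[limit:]   # the remainder is read as the next line
        yield raw
```

A 5000 byte line would therefore be seen as one line of 4096 bytes followed by another of 904 bytes.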
A thread started in one of the argonet newsgroups as a result of my posting
this program to ...binaries.
The ideas are basically:
* Forget PreNews and write a nice front end to Newshound
* Make PreNews a news fetcher that can do all this processing
The first is unlikely to happen. Sorry.
The latter will NOT happen in PreNews as PreNews is NOT a fetcher. That
doesn't mean I'm not looking into making a better fetcher than the
bog-standard NNTPfetch.
I decided that wading through the news file on my days off looking for
stuff that might be interesting isn't my idea of a way to spend a day. So
I made PreNews whizz through the incoming news file and provide me a list
of subjects. I can pick the ones that interest me.
Time saving - MUCH, as PreNews can do it in around seven seconds for two
and a half megabytes!
Then, I decided I did not want to keep manually backing up my news. I'm
really quite lazy y'see. So I implemented a backup system within PreNews so
I just double-click the icon and it's all done for me.
Then I fixed a few bugs. Somewhere in the middle of all of this I had to go
to work. But considering that I work as a care assistant in nursing homes
(elderly dependant), I'm sure you'll appreciate me not going into any
detail whatsoever.
Next time - your next exciting installment!
[and if you believe that, I've got this old clock I can sell you for four
grand. It's in London, and offers you choice views of the houses of
parliament... ]
Where your news is backed up to
-------------------------------
The international standard date notation is YYYY-MM-DD. If you don't
believe me, it is ISO-8601, and you can find some nice information at:
http://www.cl.cam.ac.uk/~mgk25/iso-time.html
This is offered by way of introducing the way that PreNews determines
where backups are stored and what they are named.
On a system before RISC OS 4, there are some inbuilt limits, namely:
1. You can only store 77 news files in any given month. This is quite
sufficient for two debatches a day.
2. You can only store 77 months. That's six and a half years!
I suspect your problems may relate more to disc space than running out of
directory space!
RISC OS 4 E+/F+ formatted discs allow long file names and more than 77
entries in a directory. If you can afford RISC OS 4, you can probably
afford a disc large enough to hold all that news! :-)
<PreNews$Source> points to the 'source' news file. This is possibly the
same as <PreNews$Outgoing>, ie, <INNewsIn$File>, but using a different
variable provides more flexibility.
<PreNews$Backup> points to the base directory. For the purposes of our
example, we shall assume it points to the directory
"IDEFS::Willow.$.newsbackup".
Within this directory, will be a subdirectory constructed from a partial
ISO-8601 date, in the format:
YYYY-MM
Within that subdirectory will be the news file, named as:
DD-HHMMSS
So, the instant that we run PreNews is Monday, 28th May. It is half three
in the morning. You can expect your news to be backed up as:
IDEFS::Willow.$.newsbackup.2001-05.28-032804
or something along those lines.
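The naming scheme above can be expressed in a few lines. This is just a sketch of the layout described in this section, using RISC OS style "." directory separators; the function name is made up.

```python
from datetime import datetime


def backup_path(base: str, when: datetime) -> str:
    """Build base.YYYY-MM.DD-HHMMSS from a base directory and a time."""
    directory = when.strftime("%Y-%m")   # subdirectory: partial ISO date
    leaf = when.strftime("%d-%H%M%S")    # file name: day and time
    return f"{base}.{directory}.{leaf}"


example = backup_path("IDEFS::Willow.$.newsbackup",
                      datetime(2001, 5, 28, 3, 28, 4))
print(example)
```

For the date and time in the example above, this gives IDEFS::Willow.$.newsbackup.2001-05.28-032804.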
Development
-----------
PreNews was written by Richard Murray in Zap 1.40.
This was compiled with Norcroft C version 4.00, and linked with the
Norcroft linker version 4.00.
The libraries are the standard Norcroft C libraries (using SharedCLibrary)
and DeskLib version 2.30.
The templates were designed with TemplEd version 1.26.
The code was developed on a RiscPC700 running RISC OS 3.7 on an ARM710
processor.
14/05/2001
On the radio was a wide variety of stuff, my current favourites being
pretty much anything by Dido; and Geri Halliwell's "It's raining men"
because it is a lively song, unlike a lot of the other stuff played.
I was naughty and took a break to watch "Stepsister from planet weird" on
Disney (analogue) because I liked the title of the movie! :-)
28/05/2001
For the entirety of today's work, I've been listening to a tape copy of
the first two Britney Spears albums (forgive me) and The Pet Shop Boys
"Discography" (side 2 is far superior to side 1); in alternation. I meant
to stop at half twelve and go to bed early. It's half three, the sky is
starting to get light. Oh well. :-)
Legal stuff
-----------
The lone programmer, Richard Murray, retains all copyright and interests in
this software.
He (or should that be 'I'?) grants you a licence to use the software with
the following provisos:
1. This software is provided as-is.
Note that early versions are NOT complete.
2. You use this software at your own risk. Richard Murray shall not be
held liable for any problems or damage that may be caused by the use
of this software, whether it trashes your news or starts barking and
eats your homework.
3. You may only disassemble and examine/modify this code as is permitted
by law. You may not redistribute such modified versions.
4. The maximum amount that Richard Murray will be liable for in any case
is the amount that you paid me for this software - exactly £0.00.
5. You may not use this software if you are a supporter of UCITA. If you
don't know what UCITA is, count yourself lucky.
6. This software shall be governed by the law of the United Kingdom.
And the obvious... This software has not had a full logic trace analysis
and neither have the libraries used and, quite likely, the Operating System.
Therefore this software is not to be used in mission-critical applications
or life-support systems (though, to be honest, my mind boggles as to why a
program for sorting out news articles would be used in a life-support
system - unless you are a serious net-head...).
Contacts
--------
PreNews was written by Richard Murray.
Email : support@heyrick.co.uk
Web : http://www.heyrick.co.uk/voyager/prenews/
Copyright © Y2K1 Richard Murray