                             PreNews version 0.02
                             --------------------


What it is
----------
  PreNews is a program designed to assist in your handling of news. It
  achieves this simply by processing your undebatched news file and scanning
  for stuff you are interested in. That stuff is copied, the rest is junked.
  Thus, you only see what you want to see.



IMPORTANT
---------
  THIS IS A TEST VERSION OF PRENEWS. IT IS *NOT* COMPLETE NOR FINISHED.

  THE NEWS SCANNING MECHANISM ONLY WORKS WITH DYNAMIC AREAS, SO YOU WILL NEED
  A RISCPC (OR JUSTIN FLETCHER'S "DUMMY DYNAMIC AREAS" MODULE) AND ENOUGH
  MEMORY TO CREATE A DYNAMIC AREA LARGE ENOUGH TO HOLD YOUR NEWS FILE(S).

  IF YOU DO NOT HAVE THIS, YOU CANNOT USE PRENEWS. I WILL BE IMPLEMENTING THE
  FILE->FILE SCANNING ROUTINE SOONISH.

  PROBABLY SOME OTHER STUFF TOO... :-)



Setting it up
-------------
  Using Voyager? Skip this bit and read the next section...

  Please note, news MUST be in "rnews" format.
  Open up your raw incoming news. If the first line is "#! rnews", then you
  are okay. Otherwise, you aren't!
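
  If you would rather not eyeball the file yourself, the check above can be
  sketched in a few lines of Python. This is only an illustration of the test
  described here, not code taken from PreNews; the function name and the sample
  data are made up, and the byte count after "rnews" in the sample is an
  assumption about what a typical batch looks like.

```python
import io


def looks_like_rnews(stream):
    """Return True if the first line of the stream starts with '#! rnews'."""
    first_line = stream.readline()
    return first_line.startswith(b"#! rnews")


# Hypothetical sample data standing in for a raw incoming news file.
good = io.BytesIO(b"#! rnews 1234\nPath: example\n")
bad = io.BytesIO(b"From: someone@example.com\n")

print(looks_like_rnews(good))
print(looks_like_rnews(bad))
```

  For a real file, you would open it in binary mode and pass the file object
  in place of the sample data.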

  This software is, principally, designed to work alongside Voyager's
  !NewsAgent software. So if you use !NewsAgent, then don't worry - it will
  all work nicely.


  THERE ARE THREE MODES OF OPERATION THAT PRENEWS CAN SUPPLY.
  These determine how you configure PreNews, and thus how you use it.


  1. Original PreNews

       A directory is scanned. Any text file in that directory will be
       processed, the results written to the output file.
       This assumes the output file does not exist, and your news has already
       been backed up.


  2. Best use scenario (aka 'backup mode')

       The source news file will be backed up, processed, and written back to
       the relevant location.
       The source and output news files may be one and the same, and backing
       up your news is irrelevant as this will do it for you.


  3. Subject scanner

        PreNews will not mess with the news articles. Instead, it will simply
        whizz through them and make a note of groups and subjects, and then
        output a nice little report for you.


  All three may be used together, but my personal recommendation is to use
  method 2 for general news processing and have a separate Obey file for
  the subject scanner. I have provided "!RunSubj" for this very purpose.



  1. Copy the "!PreNews" directory somewhere suitable.
     If you are updating PreNews, simply copy the updated files over the top
     of the old ones. Best you make a backup though, in case your download is
     corrupt in any way.

  2. Open up the "!PreNews.!Run" file.

     PreNews$Incoming should be set to point to a DIRECTORY where your
     incoming news is stored. By convention, I store mine there as numbered
     files ("01", "02", "03"...), but it will search for any *text* file
     present. Files that are not text (type &FFF) files will be skipped.
     THIS IS IGNORED FOR BACKUP MODE

     PreNews$Outgoing is the FILENAME of the file that your outgoing news
     should be written to. The default is to replace the existing news file.

     PreNews$Source is used by backup mode and the subject scanner to point
     to the incoming news file. For Voyager users, this would be the system
     variable <INNewsIn$File> which points to !Voyager.Apps.News.News.
     With backup mode, do be aware that it is perfectly acceptable (and
     wholly intentional) for the source and outgoing system variables to
     point to the same thing.

     PreNews$Backup is a pointer to the directory used for back ups. Refer
     to the details further on in this document if you are interested in how
     PreNews constructs its backup file names.

     PreNews$Spool points to the outgoing news articles that you have
     written, held in a spool directory. This is for a planned future feature.

     At the line:
       Run <PreNews$Dir>.!RunImage %*0

     Add your selection of how you wish PreNews to work.
     For backup mode, add "-backup".
     For subject scanning mode, add "-subjects".
     For other options, refer to the section on command line parameters.

     Examples:
       Backup mode:    Run <PreNews$Dir>.!RunImage %*0 -backup
       Subject scan:   Run <PreNews$Dir>.!RunImage %*0 -subjects

     If you should have both (-backup -subjects) then the subject scanning
     will be invoked.

     If you have neither of these parameters, then PreNews will behave in
     its old way.
     
     Save the file when you are happy with it.


  3. Open the "!PreNews.Data" directory.

     authors:
       These entries are matched with the "From: " line in each article.

     miscellany:
       These entries are matched with *all* lines in each article. They are
       useful for including stuff that isn't included in any other way, such
       as "egin 644" to include binaries.
       Be sure to include your email address and/or your domain name so that
       replies to you in subjects that are not included can be retained.
       It is suggested that you don't go overboard with entries here, as this
       is matched always; and many matches all the time will put a fair lag
       on news processing.

     newsgroups:
       These entries are matched with the "Newsgroups: " line in each
       article.
       WARNING: This will cause entire newsgroups to be copied.

     subjects:
       The most useful option, this filters out subjects you are interested
       in.
       Some clever manipulation will allow you to hold a core set of subjects
       for things you are likely to be interested in.
       For example, "RISC", "risc" and "Risc" will match RiscPCs and RISC OS
       and even risc processor concepts.
       It can also be useful to drop off the first letter.
       "oyager" can match "Voyager" and "voyager".

     Edit these as necessary, then save them.
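
  To see why dropping the first letter helps, here is a rough Python sketch of
  the kind of matching described for the data files. It assumes an entry counts
  as matched when it appears anywhere in the line being checked, with case
  being significant. The function name and the sample lines are invented for
  the example; this is not how PreNews itself is written.

```python
def matches(entry, line):
    """Case-sensitive substring test: the entry must appear in the line as typed."""
    return entry in line


# Hypothetical subject lines.
print(matches("oyager", "Subject: Voyager setup"))    # entry without first letter
print(matches("oyager", "Subject: voyager setup"))
print(matches("Voyager", "Subject: voyager setup"))   # case matters

# Several entries covering the different spellings of one word.
entries = ["RISC", "risc", "Risc"]
print(any(matches(e, "Subject: RiscPC memory") for e in entries))
```

  One entry with the first letter dropped does the work of two, which keeps
  the list of entries (and so the processing time) down.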


  Your PreNews should now be set up.

  For more possibilities, refer to the section "Command line options".


  If you come to edit your data files after you have used PreNews, you'll
  see PreNews has added some rubbish to the file. It'll look like this:
    Risc€263,20010528
    CGI€0,X
    New Product Details€18,20010528
    16+16€5,20010528
    On Colours in Buffy€10,20010528
    On names in Buffy€11,20010528

  To add an entry, simply go to the bottom of the file and type it in. Do not
  try to add the status data, PreNews will sort it out automatically.
  If you wish to delete an entry, simply delete the entire line.



Setting it up the most sensible way for use with Argo Voyager
-------------------------------------------------------------
  In this release of the software, only Dynamic Area support is provided for
  news processing.
  In simple terms - you'll need a RiscPC.

  It is possible to use dynamic areas on older machines, but we are talking
  about various degrees of pain.

  Back to the original help text...


  Now you've read the above, your head is probably spinning. You're thinking
  "what the ....?" and wondering if maybe you shouldn't bother with PreNews.

  It is, actually, really simple to set it up and use. However there is a
  lot of flexibility provided. It is rather hard to explain in general terms.
  So now we shall be specific. You are using NNTPfetch and NewsAgent. Don't
  worry if you don't know exactly what those are: if you are using an
  unmodified copy of Voyager (ie, you are not using Pluto or Messenger) then
  you will be using NNTPfetch and NewsAgent.

  So here is how to set up PreNews for your system.

  1. Install PreNews some place.

     I suggest in the same directory as !Voyager, but not within !Voyager
     (unless you plan to make it a VIXen?)

  2. Open the !PreNews.!Run file.

     Set your system variables to be the following:
       Set PreNews$Outgoing  <INNewsIn$File>
       Set PreNews$Source    <INNewsIn$File>
       Set PreNews$Backup    <IN$Dir>.newsbackup
       Set PreNews$Spool     <INNewsOut$Dir>
     The variable PreNews$Incoming is not required.

     Ensure the text "-backup" (with the minus sign) is somewhere in the
     Run <PreNews... line.
     If you are unsure, set it to be:
       Run <PreNews$Dir>.!RunImage %*0 -backup -fast

     Save and close the !Run file.

  3. Open !Voyager, then Apps, then News.
     You should see:
       !NewsAgent
       !Run
       !Vixen
       Index     <-- only if you've got some you have not yet debatched
       News      <-- ditto the above

     In this directory, create a directory called "newsbackup".

     Close the directory viewers you've just opened.

  4. Open !PreNews, and then the Data directory.
     Refer to the section above to see what the four text files are there
     for. Set them up as you desire, then save them.

     PLEASE NOTE, PRENEWS SCANS EXCLUSIVELY. That means, anything that is NOT
     matched to your selections will be skipped. Four empty files will result
     in ALL of your news being weeded out.

     If you are looking for software to remove messages from specific people,
     and keep the rest, then I'm sorry but PreNews does not provide this
     facility. PreNews is for more severe news file pruning.
     If you'd like PreNews to operate as a 'killfile', then please email me
     with your request.

  5. That should be it. :-)

  You have just set up PreNews according to "the backup method", so when I am
  talking about the backup method, I'm talking about what you've just done.



Using it - method one - the original way
----------------------------------------
  If you are using the backup method, ignore this section.

  Before using PreNews, you should have MOVED your incoming news file to
  your backup directory (it is suggested you maintain a backup of around a
  week, thus you can catch up on stuff if necessary).
  Then, COPY your backup into the processing directory; ie, the place where
  you have specified in <PreNews$Incoming>.

  The above could be automated. A simple BASIC program called in !PreNews'
  !Run file would do the trick. I leave it to you to create such a program,
  as different people do things in different ways...

  When that has all been done, run !PreNews.


  PreNews will check to see if the destination news file exists. If it does,
  then PreNews will refuse to run. This is so raw or already processed news
  does not get overwritten. The only solution to this is to either move the
  destination news file into the process directory and process it again, or
  to debatch the news. The option you choose depends on whether it has been
  processed or not, and whether you wish to debatch now or not.

  If we get thus far, PreNews will check for an "Index" data file in the same
  directory as the destination news. If it exists and it is a data (type =
  &FFD) file, it will be deleted. If it exists and is not data, PreNews will
  give a warning message and stop.

  The Index file is used by NewsAgent to speed up debatching; each word in
  the index points to the beginning of a message so that NewsAgent doesn't
  need to search for each article itself.
  PreNews does not, currently, generate any such file, but if one from before
  exists, your newly processed news would be debatched according to the
  index. Big mess.

  Now, PreNews will count up the size of all the news awaiting being
  processed. It will then try to allocate a dynamic area of that size to
  store the news in. If your machine is not capable of creating dynamic areas
  (ie, pre-RiscPC) or if you do not have sufficient memory, it will revert to
  the slower file-read/file-write method.
  [remember: in this release, only dynamic area support is available]



Using it - method two - the backup method
-----------------------------------------
  If you are using the original method, read the section above, not this one.

  If you have set up PreNews correctly...

  Simply double-click on the !PreNews icon.
  That's all. :-)


  PreNews will copy the specified news file, which should be your raw
  incoming news (ie, where your news fetcher puts it), into the backup
  directory.
  Then your backup news file will be read from, either line by line or loaded
  into a dynamic area.
  From this, your news will be processed and written back.

  Any "Index" data files associated with the original news, or the outgoing
  file, will be deleted. NewsAgent uses the Index file to speed up its
  debatching. PreNews does not yet generate Index files, so the one that
  would be left there would not match the processed news file. Big mess!

  It is entirely possible to process the same news file several times. Doing
  so is not really recommended (because it is wasteful of resources, all
  those similar backups and the time taken to process). But if you choose to
  process the same file forty times, feel free to. :-)

  [remember: in this release, only dynamic area support is available]



Using it - method three - the subject scanner
---------------------------------------------
  This method does not involve ANY news processing. It simply records the
  subjects and newsgroups, and tells you what you have to hand.

  DO NOT DOUBLE-CLICK ON !PRENEWS, that is likely to start a news process.
  You should have created a link to run PreNews in scan mode.

  If not, you can open the !PreNews directory and double-click on the file
  called "!RunSubj".

  PreNews will either load the file into a dynamic area, or scan directly
  from file.
  [note: this, amazingly enough, DOES work in either mode in this version!]

  When processing is complete, a data file will be created. The contents will
  be an alphabetically sorted list of newsgroups, containing within them an
  alphabetically sorted list of subjects. The numbering is for tidiness.

  Crossposted subjects will ONLY appear in the newsgroup that is first
  matched. It will, however, be flagged as having been crossposted by the
  inclusion of a little 'X' between the number and the subject title.

  Subjects are clipped so the display is tidy. So if you see three tiny dots
  in a row (ISO 8859/1 Latin1; or an 'I' with a '^' over it in MS-DOS font)
  it simply means the title was trimmed to fit nicely.
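
  The shape of the report can be sketched roughly as below. This is a guess at
  the logic from the description above, not the real layout: the column
  widths, the clip width and the function name are invented, and it assumes
  "first matched" means the first group listed on the Newsgroups line.

```python
def build_report(articles, width=40):
    """articles: list of (newsgroups, subject) pairs; newsgroups is a list of names."""
    groups = {}
    for newsgroups, subject in articles:
        first = newsgroups[0]                 # crossposts are filed under one group only
        crossposted = len(newsgroups) > 1
        subjects = groups.setdefault(first, {})
        subjects[subject] = subjects.get(subject, False) or crossposted

    lines = []
    for group in sorted(groups):              # newsgroups in alphabetical order
        lines.append(group)
        for number, subject in enumerate(sorted(groups[group]), 1):
            flag = "X" if groups[group][subject] else " "
            if len(subject) > width:          # clip long titles, mark with an ellipsis
                subject = subject[:width - 1] + "…"
            lines.append(f"  {number:3d} {flag} {subject}")
    return lines


# Hypothetical articles.
sample = [
    (["comp.sys.acorn.misc"], "Zap tips"),
    (["argonet.zfc", "comp.sys.acorn.misc"], "Polar bears"),
    (["argonet.zfc"], "Buffy"),
]
print("\n".join(build_report(sample)))
```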

  In this mode, the command line options "-onlyuseda" and "-noresults" are
  ignored.



Command line options
--------------------
  You can control how PreNews behaves with command line options. These can
  be set up in the !Run file.

    -backup
      Switch PreNews to backup mode. THIS IS RECOMMENDED.


    -noresults
      Don't pop up the results file when finished.

    -noupdate
      Don't update the data files with hit counter information. This is
      really only useful if you wish to preprocess a small news file without
      messing up the results already held, or if you want to totally disable
      the results reporting stuff.

    -onlyuseda
      This sets PreNews to only use a Dynamic Area. If there is insufficient
      memory, PreNews will abort with a message saying how much memory is
      required.

    -pollless
      This will alter PreNews' responsiveness so it pays more attention to
      the task of processing your news, and less attention to the other
      software running. In geeky terms, it will 'poll' every ten articles
      instead of after every article.
      This can have dramatic results when you are processing data held in a
      Dynamic Area.
      For an example, pretty much a worst case scenario for scanning subjects
      on my machine (RiscPC700) with a 2491Kb news file, is when scanning off
      of disc, with no speed tweaks. It took 99 seconds, processing at a
      positively lethargic 25Kb/sec.
      With the "-pollless" tweak, this became 31 seconds (80Kb/sec).
      Using Dynamic Areas, the timings are 38 seconds (65Kb/sec) and 13
      seconds (191Kb/sec) respectively.

      There is a command line option to kick PreNews into a mega-turbo mode
      where it'll munch its way through a 2491Kb file in a mere 7 seconds
      (355Kb/sec).
      Unfortunately, this option is classed as "undocumented".
      Fortunately, another of the supplied information files mentions it. As
      you only have two text files supplied in !PreNews, I guess that narrows
      down your options somewhat.
      Moral: Read ALL of the documentation. You might learn a few little
             tricks and tips.

    -subjects
      Switch PreNews to simply scan for subjects.
      If this is passed accidentally with "-backup", the subjects will be
      scanned. This is because PreNews pays attention to "-subjects" before
      it comes to the news processor.
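
  The way the mode is picked from the command line can be summed up in a few
  lines of Python. This is just a sketch of the precedence described in this
  document ("-subjects" wins over "-backup", and neither means the original
  behaviour); the function name and the returned labels are invented.

```python
def select_mode(args):
    """Pick the mode of operation from a list of command line options."""
    if "-subjects" in args:      # checked first, so it wins over -backup
        return "subjects"
    if "-backup" in args:
        return "backup"
    return "original"            # neither option given: the old behaviour


print(select_mode(["-backup", "-pollless"]))
print(select_mode(["-backup", "-subjects"]))
print(select_mode([]))
```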



Results (of news process)
-------------------------
  The top of the results file is self-explanatory.

  The part after the "---" separator is a breakdown of all of the possible
  matches, and how often they were matched. The value in brackets is the value
  from the previous run. This is to help you decide if any subjects are now
  out of date.
  If both the value in brackets and the hit value are zero, then the date of
  last match will be output so you can best decide what is no longer
  necessary.
  Keeping loads of subjects is not a problem, as PreNews dynamically
  allocates memory for all the entries. However it may cause a lag in
  processing as there are that many more entries to be matched.

  Note that matching is made in the FOLLOWING order:
    * miscellany?
    * newsgroup?
    * subject?
    * author?

  So if the subject "Polar bears" is raging in argonet.zfc, and you have
  chosen to keep all of argonet.zfc; polar bears will have no results as the
  checking will not get that far.
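
  The checking order, and the reason the polar bears never get counted, can be
  sketched like this. It is an illustration only: the function, the data
  layout and the sample article are made up, and it assumes subject entries
  are tested against the "Subject: " line in the same way that the other
  files are tested against their own header lines.

```python
CHECK_ORDER = ("miscellany", "newsgroups", "subjects", "authors")


def first_match(article_lines, entries):
    """Return (category, entry) for the first hit, or None if the article is junked."""
    def header(prefix):
        for line in article_lines:
            if line == "":                   # a blank line ends the headers
                break
            if line.startswith(prefix):
                return [line]
        return []

    targets = {
        "miscellany": article_lines,         # matched against every line
        "newsgroups": header("Newsgroups: "),
        "subjects": header("Subject: "),
        "authors": header("From: "),
    }
    for category in CHECK_ORDER:
        for entry in entries.get(category, []):
            if any(entry in line for line in targets[category]):
                return category, entry       # checking stops at the first hit
    return None


# Hypothetical article and selections.
article = [
    "From: someone@example.com",
    "Newsgroups: argonet.zfc",
    "Subject: Polar bears",
    "",
    "Body text",
]
entries = {"newsgroups": ["argonet.zfc"], "subjects": ["Polar bears"]}
print(first_match(article, entries))
print(first_match(article, {}))
```

  The second call shows the exclusive nature of the scan: with nothing to
  match against, the article is not kept.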



Geekisms (your history lesson)
------------------------------
  When you have a so-called job, wading through several hundred news postings
  just isn't an option. I figured the best way to sort that mess out would be
  to write a little program to pull out the news that I wanted while just
  losing the rest to the proverbial bit-bucket. Then, I don't even need to
  debatch it. Of course, it is easy to miss things you might have been
  interested in, but that isn't a big worry. News is time, and the time spent
  reading/writing news is shocking. Who says I don't have a social life? :-)

  PreNews started off life as a simple kludge called "extractor", which was
  set up specifically to my system. Written in BASIC, it was a bit slow. To
  give an example: 756681 bytes of news (369 articles), scanning that lot
  and copying across 170 articles took 42 seconds (~17K/sec).
  Using the exact same settings, PreNews whizzed through that lot in only
  6 seconds (-pollless) achieving 123K/sec.
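
  For anybody wanting to check the sums, the rates quoted in this document
  agree with taking 1K as 1024 bytes and rounding down. A small Python sketch
  of the arithmetic follows; the function name is made up, and the rounding
  rule is inferred from the figures rather than known.

```python
def kb_per_second(size_bytes, seconds):
    """Throughput in whole K per second, with 1K taken as 1024 bytes."""
    return int(size_bytes / 1024 / seconds)


print(kb_per_second(756681, 42))       # extractor
print(kb_per_second(756681, 6))        # PreNews with -pollless
print(kb_per_second(2491 * 1024, 99))  # subject scan from disc, no tweaks
print(kb_per_second(2491 * 1024, 7))   # the undocumented fast mode
```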


  I decided PreNews was due a rewrite. It was slow. On a machine with 33Mb of
  memory, it was hard to justify a file-file scanning system. I could drop
  the whole lot into a Dynamic Area and work from memory.
  Also, it should be written in C, as compiled code is faster than
  interpreted. This wasn't a port of 'extractor'. In fact, the only part of
  extractor that was retained was the list of subjects to scan for, and the
  basic concept.

  On the morning (yes, I got up while it was still AM!) of 14/05/2001 I began
  to code a command-line utility. No sooner had I written half a screenful of
  code, I destroyed that and began again with PreNews fitting into the
  multitasking model. I figured, even with the overheads of polling, there
  would be a nice speed gain. Just to be sure, I included the "-pollless"
  command line option. Using that, it should be possible to whizz our way
  through a megabyte in around ten seconds (!).

  Possibly the biggest change (as if being completely rewritten in another
  language without reference to the original source is a minor triviality) is
  that the program now multitasks. This allows life to carry on in the
  background, and also a nifty little slider-window thingy.

  Multitasking brings with it obvious dangers. But, working on the principle
  of Schroedinger's cockroach (the thing that killed the cat, all that rubbish
  about vials of arsenic is to keep the pro-cockroach lobby from protesting),
  PreNews will trap you trying to load NewsAgent's debatcher while your news
  is being processed. It will tell you off accordingly.

  The file "_results" is the textual version of the results, opened at the
  end of scanning (unless you've said otherwise).

  The statistics are encoded into the original data files in the following
  manner (again, unless you've said otherwise):

     Justin Fletcher
  being what you type.
  After being executed, this line would be written back as:
     Justin Fletcher€27,20010514
  The number following the € is the hit count, and the number following the
  comma is the system date in loose ISO format (CCYYMMDD). If a hit gives
  no matches, then this date will be preserved, else it will be updated. So
  you get to see how long it has been since something was matched.
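
  If you want to read the statistics with your own software, a line can be
  taken apart as sketched below. This is my reading of the format described
  above, not the PreNews source: the function names are invented, the
  separator is written as it appears in this document, the stored count is
  assumed to be the count from the latest run, and "X" is assumed to be the
  date given to an entry that has never matched.

```python
SEPARATOR = "€"   # as shown in this document


def parse_entry(line):
    """Split a data file line into (text, hits, date); hits and date may be None."""
    line = line.rstrip("\n")
    text, sep, status = line.rpartition(SEPARATOR)
    if not sep:                       # a freshly typed entry with no status yet
        return line, None, None
    hits, _, date = status.partition(",")
    return text, int(hits), date


def update_entry(line, new_hits, today):
    """Rebuild the line after a run; the date only moves on when there were hits."""
    text, _, old_date = parse_entry(line)
    if new_hits > 0:
        date = today
    else:
        date = old_date if old_date is not None else "X"
    return f"{text}{SEPARATOR}{new_hits},{date}"


print(parse_entry("Justin Fletcher€27,20010514"))
print(update_entry("Justin Fletcher", 27, "20010514"))
print(update_entry("Risc€263,20010528", 0, "20010601"))
```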


  The match entries are constrained to 63 characters, and will be clipped if
  longer. Matching IS case sensitive.

  The maximum line length that can be processed is 4096 bytes. Like BASIC,
  if a line is longer it will be read as the next line.
  This behaviour does not affect the original news articles, as these are
  copied from start and finish offsets, not line by line.
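
  The two limits can be pictured with a little Python. This is only a sketch
  of the behaviour described above, using Python's own readline() with a size
  limit to stand in for whatever PreNews does internally; the names are made
  up, and whether the 4096 bytes include the line terminator is not stated
  here, so treat that detail as an assumption.

```python
import io

MAX_LINE = 4096    # longest line that is read in one go
MAX_ENTRY = 63     # longest match entry


def read_lines(stream, limit=MAX_LINE):
    """Yield lines; the tail of an over-long line arrives as the next 'line'."""
    while True:
        chunk = stream.readline(limit)
        if not chunk:
            break
        yield chunk


def clip_entry(entry, limit=MAX_ENTRY):
    """Match entries longer than the limit are simply cut short."""
    return entry[:limit]


# Hypothetical data: one 5000 character line followed by a short one.
data = io.BytesIO(b"a" * 5000 + b"\nshort\n")
print([len(chunk) for chunk in read_lines(data)])
print(len(clip_entry("x" * 100)))
```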


  A thread started in one of the argonet newsgroups as a result of my posting
  this program to ...binaries.
  The ideas are basically:
    * Forget PreNews and write a nice front end to Newshound
    * Make PreNews a news fetcher that can do all this processing

  The first is unlikely to happen. Sorry.
  The latter will NOT happen in PreNews as PreNews is NOT a fetcher. That
  doesn't mean I'm not looking into making a better fetcher than the
  bog-standard NNTPfetch.

  I decided that wading through the news file on my days off looking for
  stuff that might be interesting isn't my idea of a way to spend a day. So
  I made PreNews whizz through the incoming news file and provide me a list
  of subjects. I can pick the ones that interest me.
  Time saving - MUCH, as PreNews can do it in around seven seconds for two
  and a half megabytes!

  Then, I decided I did not want to keep manually backing up my news. I'm
  really quite lazy y'see. So I implemented a backup system within PreNews so
  I just double-click the icon and it's all done for me.

  Then I fixed a few bugs. Somewhere in the middle of all of this I had to go
  to work. But considering that I work as a care assistant in nursing homes
  (elderly dependant), I'm sure you'll appreciate me not going into any
  detail whatsoever. 

  Next time - your next exciting installment!
  [and if you believe that, I've got this old clock I can sell you for four
   grand. It's in London, and offers you choice views of the houses of
   parliament... ]



Where your news is backed up to
-------------------------------
  The international standard date notation is YYYY-MM-DD. If you don't
  believe me, it is ISO-8601, and you can find some nice information at:
    http://www.cl.cam.ac.uk/~mgk25/iso-time.html
  This is offered by way of introducing the way that PreNews determines
  where backups are stored and what they are named.

  On a system before RISC OS 4, there are some inbuilt limits, namely:
    1. You can only store 77 news files in any given month. This is quite
       sufficient for two debatches a day.
    2. You can only store 77 months. That's six and a half years!

  I suspect your problems may relate more to disc space than running out of
  directory space!
  RISC OS 4 E+/F+ formatted discs allow long file names and more than 77
  entries in a directory. If you can afford RISC OS 4, you can probably
  afford a disc large enough to hold all that news! :-)

  <PreNews$Source> points to the 'source' news file. This is possibly the
  same as <PreNews$Outgoing>, ie, <INNewsIn$File>, but using a different
  variable provides more flexibility.

  <PreNews$Backup> points to the base directory. For the purposes of our
  example, we shall assume it points to the directory
  "IDEFS::Willow.$.newsbackup".

  Within this directory, will be a subdirectory constructed from a partial
  ISO-8601 date, in the format:
    YYYY-MM

  Within that subdirectory will be the news file, named as:
    DD-HHMMSS

  So, the instant that we run PreNews is Monday, 28th May. It is half three
  in the morning. You can expect your news to be backed up as:
           IDEFS::Willow.$.newsbackup.2001-05.28-032804
  or something along those lines.
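
  Should you want to find a backup from your own scripts, the naming scheme
  above is easy to reproduce. A minimal Python sketch, assuming the parts are
  joined with the usual RISC OS "." directory separator; the function name is
  invented and this is not the code PreNews uses.

```python
from datetime import datetime


def backup_path(base_dir, when):
    """Build <base>.YYYY-MM.DD-HHMMSS from a base directory and a timestamp."""
    return f"{base_dir}.{when:%Y-%m}.{when:%d-%H%M%S}"


# The example from the text: Monday 28th May 2001, at 03:28:04.
print(backup_path("IDEFS::Willow.$.newsbackup", datetime(2001, 5, 28, 3, 28, 4)))
```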



Development
-----------
  PreNews was written by Richard Murray in Zap 1.40.
  This was compiled with Norcroft C version 4.00, and linked with the
  Norcroft linker version 4.00.
  The libraries are the standard Norcroft C libraries (using SharedCLibrary)
  and DeskLib version 2.30.

  The templates were designed with TemplEd version 1.26.

  The code was developed on a RiscPC700 running RISC OS 3.7 on an ARM710
  processor.

  14/05/2001
  On the radio was a wide variety of stuff, my current favourites being
  pretty much anything by Dido; and Geri Halliwell's "It's raining men"
  because it is a lively song, unlike a lot of the other stuff played.
  I was naughty and took a break to watch "Stepsister from planet weird" on
  Disney (analogue) because I liked the title of the movie! :-)

  28/05/2001
  For the entirety of today's work, I've been listening to a tape copy of
  the first two Britney Spears albums (forgive me) and The Pet Shop Boys
  "Discography" (side 2 is far superior to side 1); in alternation. I meant
  to stop at half twelve and go to bed early. It's half three, the sky is
  starting to get light. Oh well. :-)



Legal stuff
-----------
  The lone programmer, Richard Murray, retains all copyright and interests in
  this software.

  He (or should that be 'I'?) grants you a licence to use the software with
  the following provisos:

    1. This software is provided as-is.
       Note that early versions are NOT complete.

    2. You use this software at your own risk. Richard Murray shall not be
       held liable for any problems or damage that may be caused by the use
       of this software, whether it trashes your news or starts barking and
       eats your homework.

    3. You may only disassemble and examine/modify this code as is permitted
       by law. You may not redistribute such modified versions.

    4. The maximum amount that Richard Murray will be liable for in any case
       is the amount that you paid me for this software - exactly £0.00.

    5. You may not use this software if you are a supporter of UCITA. If you
       don't know what UCITA is, count yourself lucky.

    6. This software shall be governed by the law of the United Kingdom.

  And the obvious... This software has not had a full logic trace analysis
  and neither have the libraries used and, quite likely, the Operating System.
  Therefore this software is not to be used in mission-critical applications
  or life-support systems (though, to be honest, my mind boggles as to why a
  program for sorting out news articles would be used in a life-support
  system - unless you are a serious net-head...).



Contacts
--------
  PreNews was written by Richard Murray.

  Email  : support@heyrick.co.uk
  Web    : http://www.heyrick.co.uk/voyager/prenews/


Copyright © Y2K1 Richard Murray