• *.MSG vs. packed help?

    From Mike Tripp@1:382/61 to Mike Luther on Monday, July 09, 2001 02:16:47
    Hello Mike!

    08 Jul 01 19:54, Mike Luther wrote to Bob Jones:

    (3) Use an existing tool to produce the ASCII text straight out
    of the squish database.... I'm sure I've seen such, just don't
    remember where....

    I'm looking more or less around .. The problem is that in order to do
    that it looks like I'll have to extract the whole message scholm, then
    and so on and so forth. That's what I thought I was trying to avoid.


    TopicX, which has already been suggested, will work if you can handle one TXT file that contains multiple messages that meet the filter criteria. If you need message-by-message processing, then you might try to locate an old copy of
    SQFILTER, by Raphaël Vanney. His SQTOOL was a rewrite of SQFILTER, after he lost the source to SQFILTER in a disk crash. SQTOOL got new features, but unfortunately didn't retain =all= of the old ones from SQFILTER.




    Mike

    --- GoldED 2.50+
    * Origin: -=( The TechnoDrome )=- Austin,TX 512-327-8598 33.6k (1:382/61)
  • From Mike Luther@1:117/3001 to Peter Knapper on Monday, July 09, 2001 02:32:42
    OK Peter ..

    I use all Squish message base format here, and I archive the Echo's I want to keep using TOPICX (Topic Extract). It will
    extract ALL or selected messages using a wonderful
    array of criteria, and you can dump the entire message
    into a plain text file. It keeps a log file and uses
    that to find the next starting point for each Area you
    process, only NEW messages are added. Available in DOS
    and OS/2 versions, TOPX_110 and TOPXP111. It's old
    (1992) and reports this year 2001 as 101, but
    otherwise it's pretty good for me. Shouldn't be too
    hard to post process the date into the correct format
    with a couple of lines of Rexx.....;-)

    Let me know if you can't find it and want a copy.

    They are both in house now. Found the 110 version and then got a hit on TOPXP111 on Google and got it.

    Thank you. I'll see what I can do. I'd really like to bust off this *.MSG format. It takes too much time and disk thrashing.

    Mike @ 1:117/2001

    --- Maximus/2 3.01
    * Origin: Ziplog Public Port (1:117/3001)
  • From Mike Luther@1:117/3001 to All on Sunday, July 08, 2001 07:54:12
    Looking for suggestions here.

    Long ago and far away I gave a small amount of money for a DOS database analysis tool called askSam. The askSam, at least the professional version, is written totally in assembler and is VERY fast and good. Later on the askSam utility was re-written for Win 3.1 and has, I think, also been ported to WIN-9x,
    but not OS/2. As well, it was merged into a full CGI script driven Web page creation and hosting tool that uses the core capabilities of the askSam engine as well.

    What's neat about askSam? Well, for one thing, you just read the whole mess for any old text into the database. Then *AFTER* you have any database created, you can create fields, manipulate all the fields, and whatever, without
    ever messing with the actual database! Intriguing indeed. The tool has, so said, a fair niche market in law enforcement, for munching in huge piles of information. Then you go in and research patterns in the information which will not, perhaps, be apparent as to how you can prove how this or that record is related to this or that other record until askSam goes through all the needles in the haystack and shows you.

    The US DEA broke the Manuel Noriega case with it, for example. They crunched in all the surveillance traffic and used it to ferret out who talked to whom and
    about what over megabytes of text .. ;)

    In essence, for about $65 back then on special, I got, all those years back, a full text search engine for all the Fido traffic as well. Preposterous? Not at
    all! For a smaller FidoNet net like Net 117 here, all the local traffic for the last ten years isn't any more than about 36MB or so! That's because the engine is a full relational database engine!

    What I've been doing all these years is simple -- as long as I use a *.MSG format. All I do is take the inbound messages, and export them into a pure ASCII text base dupe file! Then I call askSam as a command line input function.
    I then punch in the entire message .. seen by's .. origin lines, the whole 9 yards into the master database. From that point on I can tell you, for example, the identity of every node that ever used this or that word of profanity .. or whatever. E.v.e.r.y W.o.r.d.

    Interesting. Blows the Hades out of Fido elections and complaints at times.

    Anyway.

    Problem is that as I and a few folks have gotten more interested in other things and higher traffic volume echos, changes are necessary. It matters not what BBS system is involved, the principle is the same. One of the reasons we all moved away from *.MSG format is because it takes too long to scan and manipulate the message base in that format as the number of message areas and messages in them grows up!

    I've reached that point now where the system still works fine, but the maintenance operations for the utilities take too long in the *.MSG format for the system.

    What do I do now?

    I need a *.MSG format for the inbound traffic, keyboard or toss produced, it doesn't matter. I need that, or at least I think I do, to export that into the
    askSam ghoul and utilities I hand wrote to convert the *.MSG format to what I want for Uncle Sambo!

    It doesn't really matter if the results are not done in synchronization with the BBS system or not. In-as-much as OS/2 is threaded, if I just have the pile
    of files to smunch into the ghoul, that part of the process can go on irrespective of whether or not the BBS has been returned to service.

    But what is missing, if I go to, for example SQUISH and a SQUISH message base, is the fact that in the toss process, we don't get a new cap and a new stack of
    *.MSG's in a directory to clue the askSam thread task, to "Aha! traffic! Munch
    in all from the old cap to the new one!" That's how I wrote the interface code
    hard coded utility which is called now to do the task I wrote as MSG2ASK.EXE all these years ago.

    At 36 MB of file askSam isn't even cooking yet. It can handle a 2 GB full relational database and 4 GB under some file systems! Split across an OS/2 LVM
    oriented drive, who knows where the limit is! I suspect it's actually larger still, but at the time it was written, getting to 4GB was a feat!

    Advice?

    Suggestions?

    Mike @ 1:117/3001

    --- Maximus/2 3.01
    * Origin: Ziplog Public Port (1:117/3001)
  • From Bob Jones@1:343/41 to Mike Luther on Sunday, July 08, 2001 07:50:26
    I've reached that point now where the system still
    works fine, but the maintenance operations for the
    utilities take too long in the *.MSG format for the system.

    What do I do now?
    ...
    But what is missing, if I go to, for example SQUISH and
    a SQUISH message base, is the fact that in the toss
    process, we don't get a new cap and a new stack of
    *.MSG's in a directory to clue the askSam thread task,
    to "Aha! traffic! Munch in all from the old cap to the
    new one!" That's how I wrote the interface code hard
    coded utility which is called now to do the task I
    wrote as MSG2ASK.EXE all these years ago.

    Ah..... Do you code your own some times?

    Suggestions:

    (1) Pick up the MSGAPI documentation for squish and roll a new program for MSG2ASK.EXE.

    (2) Toss all areas of interest to a point, and have the tool munch the point data.

    (3) Use an existing tool to produce the ASCII text straight out of the squish database.... I'm sure I've seen such, just don't remember where....

    Advice?

    Suggestions?

    See above....

    Good luck.....

    Max when used with squish databases has a fairly good search engine. I take it
    you want more of a relational database you can manipulate....

    Bob Jones, 1:343/41


    --- Maximus/2 3.01
    * Origin: Top Hat 2 BBS (1:343/41)
  • From Mike Luther@1:117/3001 to Bob Jones on Sunday, July 08, 2001 12:54:10
    Thanks .. Bob

    Ah..... Do you code your own some times?

    Unfortunately .. yes. I spend most of my life at it .. gloom.

    Suggestions:

    (1) Pick up the MSGAPI documentation for squish and
    roll a new program for MSG2ASK.EXE.

    I was hoping to avoid that by something like ..

    (2) Toss all areas of interest to a point, and have
    the tool munch the point data.

    what you have just suggested above! Thanks! Maybe I can toss it to a point like you say, then on completion of the scan .. just clean the whole area involved in the toss! Good idea.

    (3) Use an existing tool to produce the ASCII text straight out of the squish database.... I'm sure I've seen such, just
    don't remember where....

    I'm looking more or less around .. The problem is that in order to do that it looks like I'll have to extract the whole message scholm, then and so on and so
    forth. That's what I thought I was trying to avoid.

    Max when used with squish databases has a fairly good
    search engine. I take it you want more of a
    relational database you can manipulate....

    Yes .. I want to keep it in the same setup that's been here all these years.

    Thanks for the thoughts.


    Mike @ 1:117/3001



    --- Maximus/2 3.01
    * Origin: Ziplog Public Port (1:117/3001)
  • From Peter Knapper@3:772/1.10 to Mike Luther on Monday, July 09, 2001 14:36:30
    Hi Mike,

    Long ago and far away I gave a small amount of money
    for a DOS database analysis tool called askSam. The
    askSam, at least the professional version,is written
    totally in assembler and is VERY fast and good.

    I use all Squish message base format here, and I archive the Echo's I want to keep using TOPICX (Topic Extract). It will extract ALL or selected messages using a wonderful array of criteria, and you can dump the entire message into a
    plain text file. It keeps a log file and uses that to find the next starting point for each Area you process, only NEW messages are added. Available in DOS and OS/2 versions, TOPX_110 and TOPXP111. It's old (1992) and reports this year 2001 as 101, but otherwise it's pretty good for me. Shouldn't be too hard to post process the date into the correct format with a couple of lines of Rexx.....;-)
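The date clean-up mentioned above might look roughly like this. It is a minimal sketch in Python rather than Rexx, and it assumes the export writes dates as "DD Mon YYY" with the year printed as years since 1900 (so 2001 shows up as 101). The thread does not show the real TopicX layout, so the pattern and the `fix_year` name are illustrative only and would need adjusting to the actual output.

```python
import re

# Assumed layout: "DD Mon YYY", where YYY is years since 1900 (e.g. "09 Jul 101").
# The real TopicX date layout is not shown in the thread; adjust the pattern to match.
_BAD_YEAR = re.compile(
    r"\b(\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) )(1\d\d)\b"
)


def fix_year(line: str) -> str:
    """Rewrite a three-digit 'years since 1900' value as a four-digit year."""
    return _BAD_YEAR.sub(
        lambda m: m.group(1) + str(1900 + int(m.group(2))), line
    )
```

Lines that already carry a two-digit or four-digit year do not match the pattern and pass through unchanged.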

    Let me know if you can't find it and want a copy.

    Cheers..............pk.


    --- Maximus/2 3.01
    * Origin: Another Good Point About OS/2 (3:772/1.10)
  • From Jonathan de Boyne Pollard@2:440/4.3 to Mike Luther on Tuesday, July 10, 2001 00:35:50
    But what is missing, if I go to, for example SQUISH and a SQUISH
    message base, is the fact that in the toss process, we don't get a
    new cap and a new stack of *.MSG's in a directory to clue the askSam thread task, to "Aha! traffic! Munch in all from the old cap to the
    new one!" That's how I wrote the interface code hard coded utility
    which is called now to do the task I wrote as MSG2ASK.EXE all these
    years ago.

    Am I to gather that the problem is not the SQUISH messagebase format, but the triggering of the scan itself ?

    Why not trigger it with the same mechanism that you use to trigger SQUISH ? At
    some point you know when to run the tosser. So just before you run the tosser,
    make copies of the ARCmail and *.PKT files, decompress them, convert the contents to "askSam" format, and import them into the database.
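For the packet route, the reading side could be sketched as follows. This assumes plain type-2 packets as laid out in FTS-0001 (a 58-byte packet header, then packed messages with a 34-byte fixed part followed by four NUL-terminated strings) and that any ARCmail bundle has already been decompressed. The `parse_pkt` name and the returned dictionary keys are made up for illustration, and the conversion to askSam's import format is left out because the thread does not describe it.

```python
import struct
from typing import Dict, List, Tuple

PKT_HEADER_LEN = 58  # fixed-size type-2 packet header (FTS-0001)
MSG_HEADER_LEN = 34  # 7 little-endian words + 20-byte date field


def _cstring(data: bytes, pos: int) -> Tuple[str, int]:
    """Read a NUL-terminated string starting at pos; return (text, next_pos)."""
    end = data.index(b"\x00", pos)
    return data[pos:end].decode("cp437"), end + 1


def parse_pkt(data: bytes) -> List[Dict[str, str]]:
    """Return the packed messages found in an uncompressed type-2 .PKT image."""
    messages = []
    pos = PKT_HEADER_LEN
    while pos + 2 <= len(data):
        (msg_type,) = struct.unpack_from("<H", data, pos)
        if msg_type != 2:  # a zero word marks the end of the packet
            break
        orig_node, dest_node, orig_net, dest_net, _attr, _cost = (
            struct.unpack_from("<6H", data, pos + 2)
        )
        date = data[pos + 14:pos + 34].rstrip(b"\x00").decode("ascii", "replace")
        pos += MSG_HEADER_LEN
        to_name, pos = _cstring(data, pos)
        from_name, pos = _cstring(data, pos)
        subject, pos = _cstring(data, pos)
        body, pos = _cstring(data, pos)
        messages.append({
            "from": from_name,
            "to": to_name,
            "subject": subject,
            "date": date,
            "orig": f"{orig_net}/{orig_node}",
            "dest": f"{dest_net}/{dest_node}",
            # Packet text uses CR as the line terminator.
            "body": body.replace("\r\n", "\n").replace("\r", "\n"),
        })
    return messages
```

A truncated packet raises `ValueError` from `_cstring`; a production tool would probably want to catch that and set the file aside.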

    An alternative, and very different, approach is to essentially have a purpose designed "scanner", that maintains its own set of "last read" pointers for each
    area in your messagebase. When run, it scans each area, updating the last read
    pointers as it goes, exporting newly entered messages from the messagebase into
    the "askSam" database. All that you then need do is trigger it whenever something new is added to your messagebase, or simply run it at regular intervals.
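The "last read pointer" scanner described above can be sketched independently of any particular messagebase API. In this Python sketch the messagebase is stood in for by a plain mapping of area name to (message number, text) pairs, and the pointers live in a JSON file. The function names, the JSON file and the `export` callback are all assumptions for illustration; a real tool would read the areas through the messagebase API and would need an identifier that survives renumbering.

```python
import json
from pathlib import Path


def load_pointers(path):
    """Load the per-area last-read pointers, or start empty on first run."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}


def save_pointers(path, pointers):
    """Persist the per-area pointers for the next run."""
    Path(path).write_text(json.dumps(pointers, indent=2, sort_keys=True))


def export_new(areas, pointers, export):
    """Export every message numbered above its area's stored pointer.

    areas    -- {area_name: iterable of (msg_number, text)}
    pointers -- {area_name: last exported msg_number}, updated in place
    export   -- callable(area_name, msg_number, text)
    Returns the number of messages exported.
    """
    count = 0
    for area, msgs in areas.items():
        last = pointers.get(area, 0)
        for number, text in sorted(msgs):
            if number > last:
                export(area, number, text)
                last = number
                count += 1
        pointers[area] = last
    return count
```

Running it twice in a row exports nothing the second time, which is what makes it safe to trigger after every toss or on a timer.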

    » JdeBP «

    --- FleetStreet 1.22 NR
    * Origin: JdeBP's point, using Squish <yuk!> (2:440/4.3)
  • From Mike Luther@1:117/3001 to Jonathan de Boyne Pollard on Tuesday, July 17, 2001 03:24:24
    Yes and No, John ..

    JdBP> Am I to gather than the problem is not the SQUISH
    JdBP> messagebase format, but the triggering of the scan
    JdBP> itself ?

    The problem is because my custom executable that I wrote to do this called MSG2ASK.EXE was written to go through the new message directory and cull off only the new *.MSG files. It carefully extracted each of the *.MSG files in order sequence into the pure ASCII format which is used to import them into the askSam.

    It suffers from the same seizure mode that *.MSG suffers from as the message base gets bigger and bigger. OS/2 is a WONDERFUL game with HPFS. And, you really can get away with murder compared to what you would see on a pure DOS 386 BBS box! But as the number of message bases goes way up and the number of messages in the directory goes way up, even HPFS bogs.

    Which isn't as bad with SQUISH .. obviously.

    JdBP> Why not trigger it with the same mechanism that you
    JdBP> use to trigger SQUISH ? At some point you know when
    JdBP> to run the tosser. So just before you run the
    JdBP> tosser, make copies of the ARCmail and *.PKT files,
    JdBP> decompress them, convert the contents to "askSam"
    JdBP> format, and import them into the database.

    That was the substance of a couple suggestions here. It's probably the easiest of the solutions. In that OS/2 is threaded, we could do that extraction run at precisely the point you suggest. Then we can start a separate threaded session doing the askSam import, which takes time. While that is going on, we let the BBS system go back on line for the next use, even while the askSam import is still running.
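The "start the import and let the BBS go back on line" step amounts to launching a process without waiting for it. A minimal Python sketch of that idea follows. The askSam command line is not given in the thread, so the command used in the demonstration is a stand-in that just sleeps, and `start_import` is a hypothetical helper name.

```python
import subprocess
import sys


def start_import(command):
    """Launch the import as a separate process and return at once.

    The caller can bring the BBS back online immediately; the returned
    Popen object can be polled or waited on later.
    """
    return subprocess.Popen(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )


if __name__ == "__main__":
    # Stand-in for the real import command, which the thread does not show.
    proc = start_import([sys.executable, "-c", "import time; time.sleep(0.2)"])
    print("import running in background, pid", proc.pid)
    proc.wait()
```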

    I'm working at how best to do this now.

    JdBP> An alternative, and very different, approach is to essentially have a
    JdBP> purpose designed "scanner", that maintains its own
    JdBP> set of "last read" pointers for each area in your
    JdBP> messagebase. When run, it scans each area, updating
    JdBP> the last read pointers as it goes, exporting newly
    JdBP> entered messages from the messagebase into the
    JdBP> "askSam" database. All that you then need do is
    JdBP> trigger it whenever something new is added to your
    JdBP> messagebase, or simply run it at regular intervals.

    Curiously, I already have that exact code dormant in the .CMD file (was an old .BAT file!) that runs BINK/MAX/SQUISH! Currently, the part of it which is still active in re askSam, is only the part which keeps the message cap pointers for the askSam MSG2ASK.EXE control file updated! What is already necessary is to make sure that after each MR call in the MAX operations, we massage the file MSG2ASK.CTL to reset the cap number for the re-numbered message areas we are capturing to askSam. The askSam doesn't, at least here, capture NETMAIL, and can be selectively employed just like configuring SQUISH or AREAS.BBS or MSGAREA.CTL for example.

    This update is done as part of the DAILY housekeeping segment.

    I hadn't thought about making the actual import only a daily game until you posted this.

    Now I'm sitting here pondering if it would be better to do that, and suffer the
    delay time in being able to find what all the dog did today in a near-real-time
    deal, or just chain Fido to the doghouse once a day until he sings for his dog bowl!

    Your way would only disable the system for the time needed for that at some normally un-used time anyway. Maybe that's an acceptable answer.

    Thanks!

    Sleep well; OS/2's still awake! ;)

    Mike @ 1:117/3001



    --- Maximus/2 3.01
    * Origin: Ziplog Public Port (1:117/3001)
  • From Mike Tripp@1:382/61 to Mike Luther on Monday, July 09, 2001 04:03:35
    Hello Mike!

    08 Jul 01 14:54, Mike Luther wrote to All:

    But what is missing, if I go to, for example SQUISH and a SQUISH
    message base, is the fact that in the toss process, we don't get a new
    cap and a new stack of *.MSG's in a directory to clue the askSam
    thread task, to "Aha! traffic! Munch in all from the old cap to the
    new one!" That's how I wrote the interface code hard coded utility
    which is called now to do the task I wrote as MSG2ASK.EXE all these
    years ago.

    You could use something like NETMGR or SQTOOL to dupe messages from their Squishbases to *.MSG copies for your utility to process. Depending on your code, you could probably eliminate the watermark test conditions in your util so that it always processes all .MSGs and then nuke them afterwards, eliminating the expire/renumber maintenance that you're switching to avoid.
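The "process everything, then nuke it" variant could be sketched like this in Python. It assumes the FTS-0001 stored-message layout (a 190-byte header starting with the from, to, subject and date fields, followed by NUL-terminated text) and files named by message number. `read_msg`, `drain_directory` and the `export` callback are illustrative names, not the interface of MSG2ASK.EXE, which the thread does not show.

```python
from pathlib import Path

MSG_HEADER_LEN = 190  # FTS-0001 stored-message header


def read_msg(path):
    """Parse one *.MSG file into a dict of header fields plus body text."""
    raw = Path(path).read_bytes()

    def field(start, length):
        # Fixed-width, NUL-padded text field.
        return raw[start:start + length].split(b"\x00", 1)[0].decode("cp437")

    body = raw[MSG_HEADER_LEN:].split(b"\x00", 1)[0].decode("cp437")
    return {
        "from": field(0, 36),
        "to": field(36, 36),
        "subject": field(72, 72),
        "date": field(144, 20),
        "body": body.replace("\r\n", "\n").replace("\r", "\n"),
    }


def drain_directory(directory, export):
    """Export every *.MSG in numeric order, deleting each file once exported."""
    files = [
        p for p in Path(directory).iterdir()
        if p.suffix.lower() == ".msg" and p.stem.isdigit()
    ]
    for path in sorted(files, key=lambda p: int(p.stem)):
        export(read_msg(path))
        path.unlink()  # no watermark needed: whatever is left is unprocessed
    return len(files)
```

Because each file is removed only after `export` returns, an interrupted run leaves the unprocessed messages in place for the next pass.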

    You're still going to suffer from load with volume, generating one file per message, though. You could eliminate the rest of the drawbacks of *.MSG (all those file open/close/find-next ops) if you rewrote the util to handle a multi-message ASCII dump (like you could get from the previously suggested TopicX) as input. If you so desire, this extract could contain all the new messages from all areas, in one file. Of course, that would involve swapping human time for computer time...and most of us have more of the latter than the former to burn. :)

    However, a *.MSG-free life contains more spare time of both types. I gave up on them back when the BBS box was 4.77 MHz XT w/2 x 20 MB ST225's. The differences in overhead were = p a i n f u l l y = obvious on that gear. ;)

    Many thanks to Scott Dudley for writing code efficiently enough to run on that box and for providing the incentive to explore OS/2 (and Netware, for that matter) as soon as I could get my hands on 386's.

    .\\ike

    --- GoldED 2.50+
    * Origin: -=( The TechnoDrome )=- Austin,TX 512-327-8598 33.6k (1:382/61)