• mis poll gets 'stuck'

    From Biite@21:3/120 to All on Thursday, July 22, 2021 23:34:29
    Hi all,

    Got the weirdest thing here:
    - Mystic 1.12 A46 running on Ubuntu 20.04.2
    - Experienced a filesystem full on the /mystic filesystem
    - Now sometimes a 'mis poll' gets stuck:
    mbbs@svr1:/mystic/semaphore$ ps fax|grep mis
    182272 ? SLsl 0:57 /mystic/mis daemon
    183902 ? S 0:00 \_/bin/sh -c ./mis poll 21:3/100 1> /dev/null
    /dev/null
    183903 ? RLl 1201:23 | \_ ./mis poll 21:3/100

    When killing the 'mis poll' (e.g. kill 183903) the whole process runs again.

    Any ideas?

    Martien

    --- Mystic BBS v1.12 A46 2020/08/26 (Linux/64)
    * Origin: Westland BBS (21:3/120)
  • From Zip@21:1/202 to Biite on Friday, July 23, 2021 10:11:21
    Hello Biite!

    On 22 Jul 2021, Biite said the following...
    - Experienced a filesystem full on the /mystic filesystem
    - Now sometimes a 'mis poll' gets stuck:

    Really strange (shouldn't be affected by the earlier disk full problem, one would think)...

    I previously had problems with mis poll hanging (and consuming lots of CPU); when I traced the process, it was looping in gettimeofday(), I think.

    You might want to use the timeout command to safeguard against hangs, e.g.:

    timeout -k 300 --preserve-status -v 300 ./mis poll 21:3/100

    ...which lets mis run for 5 minutes, then sends it a TERM signal, waits up to 5 more minutes for it to exit, and finally sends a KILL signal.
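
    If you want the logs to show how each poll ended, a small wrapper around timeout might help. This is only a sketch: run_guarded is a made-up function name, and the status values rely on GNU coreutils behaviour (143 = 128+TERM when --preserve-status is used, 137 = 128+KILL if the grace period also runs out).

```shell
#!/bin/sh
# run_guarded LIMIT GRACE COMMAND [ARGS...]
# Hypothetical helper (not part of Mystic): run COMMAND under GNU timeout
# and print a short note on how it ended.
run_guarded() {
    limit=$1
    grace=$2
    shift 2
    timeout -k "$grace" --preserve-status "$limit" "$@"
    status=$?
    case $status in
        0)   echo "finished normally" ;;
        143) echo "ended by TERM (probably the ${limit}s limit)" ;;
        137) echo "ended by KILL (probably the ${grace}s grace period ran out)" ;;
        *)   echo "exited with status $status" ;;
    esac
    return "$status"
}
```

    It would be called as, for example, run_guarded 300 300 ./mis poll 21:3/100. Note that 143 can also mean someone else sent the TERM, so the message is a best guess.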

    Best regards
    Zip

    --- Mystic BBS v1.12 A47 2021/07/18 (Linux/64)
    * Origin: Star Collision BBS, Uppsala, Sweden (21:1/202)
  • From Biite@21:3/120 to Zip on Friday, July 23, 2021 23:34:06
    Hi Zip,

    You might want to use the timeout command to safeguard against hangs,

    Set up the timeout command as you suggested and will keep monitoring the
    logs.
    Thanks for the suggestions!

    Regards,
    Martien

    --- Mystic BBS v1.12 A46 2020/08/26 (Linux/64)
    * Origin: Westland BBS (21:3/120)
  • From Zip@21:1/202 to Biite on Saturday, July 24, 2021 08:06:52
    Hello Biite!

    On 23 Jul 2021, Biite said the following...
    Set up the timeout command as you suggested and will keep monitoring the logs.
    Thanks for the suggestions!

    You're very welcome!

    Best regards
    Zip

    --- Mystic BBS v1.12 A47 2021/07/18 (Linux/64)
    * Origin: Star Collision BBS, Uppsala, Sweden (21:1/202)
  • From Biite@21:3/120 to Zip on Sunday, July 25, 2021 23:01:54
    On 24 Jul 2021, Zip said the following...

    You're very welcome!

    Looks like your timeout command did the trick, haven't had a 'hang' in a few days!

    Regards,
    Biite

    --- Mystic BBS v1.12 A46 2020/08/26 (Linux/64)
    * Origin: Westland BBS (21:3/120)
  • From Zip@21:1/202 to Biite on Sunday, July 25, 2021 23:33:00
    Hello Biite!

    On 25 Jul 2021, Biite said the following...
    Looks like your timeout command did the trick, haven't had a 'hang' in a few days!

    Glad to hear that! :)

    Still wondering what could be the cause of the hangs -- the strace output showed that it was waiting for some kind of monotonic timer event if I recall correctly...

    Thinking if it could be that time went backwards or something (but that's usually not the way time adjustments are made, I think, except *maybe* during boot when syncing with the hardware clock, and if so, hopefully *before* all services are started)...

    Best regards
    Zip

    --- Mystic BBS v1.12 A47 2021/07/23 (Linux/64)
    * Origin: Star Collision BBS, Uppsala, Sweden (21:1/202)
  • From Biite@21:3/120 to Zip on Monday, July 26, 2021 17:35:27
    On 25 Jul 2021, Zip said the following...

    Still wondering what could be the cause of the hangs

    Same here, I haven't checked here with strace (not really familiar with how to do that ;) )

    Thinking if it could be that time went backwards or something

    My mis poll got stuck sometime after a logrotate at midnight. This logrotate stops and starts the mis server. It got stuck a few hours (about 3) after that.
    It only started getting stuck after my filesystem filled up. I did not upgrade versions or change anything else, and I had been running 1.12 A46 for more than a year without issues before this started.

    Maybe I can check whether I have the same issue if you send some explanation of how to 'strace' mis :)

    Regards,
    Biite

    --- Mystic BBS v1.12 A46 2020/08/26 (Linux/64)
    * Origin: Westland BBS (21:3/120)
  • From Zip@21:1/202 to Biite on Wednesday, September 15, 2021 23:21:08
    Hello Biite!

    I only got this now -- or my Mystic's message pointers are acting up. =)

    Anyway...

    On 26 Jul 2021, Biite said the following...
    Same here, haven't checked here with strace (not really familiar on how to do that ;) )

    To run mis poll manually from the command line and trace what it's doing, you could use something like:

    strace -f -s32768 -vvv ./mis poll 2>&1 | tee /tmp/trace.txt

    ...and then break with Ctrl+C and look at /tmp/trace.txt.

    But it's not a good "solution" for intermittent errors.
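
    For intermittent hangs it may be more practical to attach to the process once it is already stuck. A rough sketch, with assumptions: filter_long_running is a made-up helper name, the 600-second threshold is arbitrary, and ps needs to support the etimes column (procps-ng does).

```shell
#!/bin/sh
# filter_long_running PATTERN MAX_SECONDS
# Hypothetical helper: reads "PID ELAPSED_SECONDS COMMAND..." lines on stdin
# (the format of `ps -eo pid=,etimes=,args=`) and prints the PID of every
# process whose command line contains PATTERN and has run longer than
# MAX_SECONDS.
filter_long_running() {
    awk -v pat="$1" -v max="$2" '
        $2 + 0 > max + 0 {
            cmd = $0
            # Drop the PID and elapsed-time columns, keep the command line.
            sub(/^[ ]*[0-9]+[ ]+[0-9]+[ ]+/, "", cmd)
            if (index(cmd, pat) > 0) print $1
        }'
}

# Example use (commented out so nothing gets attached by accident):
# ps -eo pid=,etimes=,args= | filter_long_running 'mis poll' 600 |
# while read -r pid; do
#     # Trace for 10 seconds, then let timeout stop strace, which detaches.
#     timeout 10 strace -f -p "$pid" -o "/tmp/mis-$pid.trace"
# done
```

    Attaching to a running process normally has to be done as the same user, and on Ubuntu the default ptrace restrictions may mean it needs sudo.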

    As for logrotate and stopping/starting Mystic, I use the "copytruncate" logrotate option instead, so that the "rotated" files are copies of the current logs and the current logs are truncated (and Mystic continues to write to them, so no need for stopping/starting Mystic).

    Something like:

    /mystic/logs/*.log {
        daily
        notifempty
        missingok
        rotate 7
        compress
        delaycompress
        copytruncate
        su bbsuser bbsgroup
    }
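
    In case it helps to see what copytruncate amounts to, here is a rough shell equivalent. It's an illustration only: copytruncate_demo is a made-up name, logrotate's real implementation differs, and whether Mystic opens its logs in append mode is an assumption I haven't checked.

```shell
#!/bin/sh
# copytruncate_demo LOGFILE
# Hypothetical illustration of the copytruncate idea: copy the log,
# then empty the original in place.
copytruncate_demo() {
    log=$1
    cp -p "$log" "$log.1"   # the "rotated" file is a copy of the current log
    : > "$log"              # truncate in place; the file itself stays the same
}
```

    A writer that keeps the file open in append mode simply carries on in the now-empty file. Lines written between the copy and the truncate can be lost; the logrotate man page mentions this small window.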

    Best regards
    Zip

    --- Mystic BBS v1.12 A47 2021/09/07 (Linux/64)
    * Origin: Star Collision BBS, Uppsala, Sweden (21:1/202)