• binkd crashes when reloading after file change detection

    From mark lewis@1:3634/12.73 to all on Wednesday, January 19, 2022 10:06:58

    + 10:05 [19951] [path]/binkd-networks.conf changed!
    + 10:05 [19951] Reloading configuration...
    10:05 [19951] previous config is no longer in use, unloading
    - 10:05 [19951] servmgr listen on *:24554
    + 10:05 [19952] [path]/binkd-networks.conf changed!
    + 10:05 [19952] Reloading configuration...
    10:05 [19952] previous config is no longer in use, unloading
    free(): double free detected in tcache 2
    ! 10:05 [19951] client manager (pid=19952) exited by signal 6

    $ binkd -vv
    Binkd 1.1a-113 (Dec 7 2021 07:17:01/Linux)
    Compilation flags: gcc, zlib, bzlib2, https, ntlm, bwlim.
    Facilities: fts5004 ipv6


    )\/(ark

    "The soul of a small kitten in the body of a mighty dragon. Look on my majesty, ye mighty, and despair! Or bring me catnip. Your choice. Oooh, a shiny thing!"
    ... Memories...tucked between the pages of your mind... ;*)
    ---
    * Origin: (1:3634/12.73)
  • From Michiel van der Vlist@2:280/5555 to mark lewis on Wednesday, January 19, 2022 16:59:43
    Hello mark,

    On Wednesday January 19 2022 10:06, you wrote to all:

    10:05 [19952] previous config is no longer in use, unloading
    free(): double free detected in tcache 2
    ! 10:05 [19951] client manager (pid=19952) exited by signal 6

    $ binkd -vv
    Binkd 1.1a-113 (Dec 7 2021 07:17:01/Linux)
    Compilation flags: gcc, zlib, bzlib2, https, ntlm, bwlim.
    Facilities: fts5004 ipv6

    I have reported this issue several times YEARS ago. No response. I gave up and just stopped using this "feature".


    Cheers, Michiel

    --- GoldED+/W32-MSVC 1.1.5-b20170303
    * Origin: http://www.vlist.eu (2:280/5555)
  • From mark lewis@1:3634/12.73 to Michiel van der Vlist on Wednesday, January 19, 2022 11:48:32

    On 2022 Jan 19 16:59:42, you wrote to me:

    10:05 [19952] previous config is no longer in use, unloading
    free(): double free detected in tcache 2
    ! 10:05 [19951] client manager (pid=19952) exited by signal 6

    $ binkd -vv
    Binkd 1.1a-113 (Dec 7 2021 07:17:01/Linux)
    Compilation flags: gcc, zlib, bzlib2, https, ntlm, bwlim.
    Facilities: fts5004 ipv6

    I have reported this issue several times YEARS ago. No response. I
    gave up and just stopped using this "feature".

    i know... it can't hurt to keep reporting it, though... i might even go looking and see if i can find a repository and file an issue there... that might be a more "official" method that would gain some traction...

    at least this time it seems to have told me where the problem is... i think previously all i saw was "double free"...

    )\/(ark

    ... Unicorns aren't mythical - virgins are!
    ---
    * Origin: (1:3634/12.73)
  • From Alexander Kruglikov@2:5053/58 to mark lewis on Thursday, January 20, 2022 11:11:26
    Good ${greeting_time}, mark!

    19 Jan 22 10:06, you wrote to all:

    + 10:05 [19951] [path]/binkd-networks.conf changed!
    + 10:05 [19951] Reloading configuration...
    10:05 [19951] previous config is no longer in use, unloading
    - 10:05 [19951] servmgr listen on *:24554
    + 10:05 [19952] [path]/binkd-networks.conf changed!
    + 10:05 [19952] Reloading configuration...
    10:05 [19952] previous config is no longer in use, unloading
    free(): double free detected in tcache 2
    ! 10:05 [19951] client manager (pid=19952) exited by signal 6

    It helped me to add the following line to the binkd config

    rescan-delay 10

    With best regards,
    Alexander.

    --- "GoldED+/OSX 1.1.5-b20180707" ---
    * Origin: 24 hours in a day, 24 beers in a case, Hmmm... (2:5053/58)
  • From mark lewis@1:3634/12.73 to Alexander Kruglikov on Thursday, January 20, 2022 04:41:14

    On 2022 Jan 20 11:11:26, you wrote to me:

    10:05 [19951] previous config is no longer in use, unloading
    - 10:05 [19951] servmgr listen on *:24554
    + 10:05 [19952] [path]/binkd-networks.conf changed!
    + 10:05 [19952] Reloading configuration...
    10:05 [19952] previous config is no longer in use, unloading
    free(): double free detected in tcache 2
    ! 10:05 [19951] client manager (pid=19952) exited by signal 6

    It helped me to add the following line to the binkd config

    rescan-delay 10

    sysname "SouthEast Star (binkd)"
    location "central North Carolina, USA"
    sysop "waldo kitty"
    nodeinfo 115200,CM,IBN
    address 1:3634/12@fidonet 1:3634/0@fidonet 1:123/0@fidonet 1:18/0@fidonet 1:1/120@fidonet 432:1/139@vkradio
    #######################################
    # include all known FTN networks list #
    #######################################
    include /mybbs/ftn/nodelist/binkd-networks.conf
    pid-file /mybbs/ftn/binkd-console.pid
    iport 24554
    oport 24554
    oblksize 4096
    maxservers 1
    maxclients 1
    timeout 1m
    connect-timeout 10s
    call-delay 1s
    rescan-delay 10s
    try 1
    hold 10m
    log /mybbs/ftn/binkd-console.log
    loglevel 4
    conlog 4
    percents
    printq
    prescan
    inbound /mybbs/ftn/in/secure
    inbound-nonsecure /mybbs/ftn/in/unsecure
    temp-inbound /mybbs/ftn/tmp
    minfree 2048
    minfree-nonsecure 2048
    kill-dup-partial-files
    kill-old-partial-files 12h
    kill-old-bsy 2h
    flag /mybbs/sema4s/fidoin.now *.[pP][kK][tT]
    flag /mybbs/sema4s/fidoin.now *.[sS][uU]? *.[mM][oO]? *.[tT][uU]? *.[wW][eE]?
    flag /mybbs/sema4s/fidoin.now *.[tT][hH]? *.[fF][rR]? *.[sS][aA]?
    flag /mybbs/sema4s/tickit.now *.[tT][iI][cC]
    #######################################
    # set default node opts #
    # and include locally built nodelists #
    # fidonet.binkd.txt #
    # vkradio.binkd.txt #
    # plus our connections lists #
    # fidonet.connections.txt #
    # vkradio.connections.txt #
    #######################################
    # default to nd (No Dupes) mode and IPv4 only
    defnode -nd -4 *
    include /mybbs/ftn/nodelist/fidonet.binkd.txt
    include /mybbs/ftn/nodelist/vkradio.binkd.txt
    include /mybbs/ftn/nodelist/fidonet.connections.txt
    include /mybbs/ftn/nodelist/vkradio.connections.txt
    #[EOF]

    )\/(ark

    ... The closest I can come to a brainstorm is a drizzle.
    ---
    * Origin: (1:3634/12.73)
  • From Oli@2:280/464.47 to mark lewis on Thursday, January 20, 2022 11:27:15
    mark wrote (2022-01-20):

    On 2022 Jan 20 11:11:26, you wrote to me:

    10:05 [19951] previous config is no longer in use, unloading
    - 10:05 [19951] servmgr listen on *:24554
    + 10:05 [19952] [path]/binkd-networks.conf changed!
    + 10:05 [19952] Reloading configuration...
    10:05 [19952] previous config is no longer in use, unloading
    free(): double free detected in tcache 2
    ! 10:05 [19951] client manager (pid=19952) exited by signal 6

    It helped me to add the following line to the binkd config

    rescan-delay 10

    sysname "SouthEast Star (binkd)"
    location "central North Carolina, USA"
    sysop "waldo kitty"
    [...]
    timeout 1m
    connect-timeout 10s
    call-delay 1s
    rescan-delay 10s

    You are saying the proposed rescan-delay workaround isn't working for you?

    Someone has to fix the bug or the only reliable workaround is to disable rescan. Does someone know, if this bug also affects binkd running in client-only mode?

    ---
    * Origin: Birds aren't real (2:280/464.47)
  • From mark lewis@1:3634/12.73 to Oli on Thursday, January 20, 2022 07:10:54

    On 2022 Jan 20 11:27:14, you wrote to me:

    It helped me to add the following line to the binkd config

    rescan-delay 10

    sysname "SouthEast Star (binkd)"
    location "central North Carolina, USA"
    sysop "waldo kitty"
    [...]
    timeout 1m
    connect-timeout 10s
    call-delay 1s
    rescan-delay 10s

    You are saying the proposed rescan-delay workaround isn't working for you?

    yes... it has been in my configs for decades through quite a few different binkd versions and flavors... i had it in place on my old OS/2 system and simply copied that config over to this new linux setup with a few minor modifications (mainly for directory paths)...

    the automatic configs reloading feature (-C command line option) used to work just fine... i don't know what code changed or when that caused it to break but it has been a defect for a long while now... when mvdv first reported it, i was able to replicate it instantly which also led me to realize that my automated update process was flawed at the time because each new nodelist arrival should have been triggering it at least once a week... then the daily nodelists came out and i switched to them which means the defect is or could be triggered every day now depending on some factors i've not fully sussed yet...

    i can trigger the defect by editing one of my configs with mcedit and saving the changes but it does not seem to be triggered when my script updates the fidonet.binkd.txt file every night about local midnight when the Z1C's system delivers it... all three dates and possibly the file's size change when it is updated but binkd doesn't seem to notice it... at least, i don't find any entries in the log when the nodelist is updated like i saw when i edited my binkd-networks.conf file yesterday to see if the defect still existed...

    [edit1]
    hummm... i see that the binkd.faq has changed a little bit for the "-C" command line option... note the last paragraph below...

    =====>8 snip 8<=====
    11. I Have Changed binkd Configuration File On-The-Fly. When Will It Be Reloaded?

    Starting with the version 0.9.1 binkd could feel that its configuration file changed. It exited with code 3 if it had been started with option -C. Modification time was checked after each ingoing session. Here is the batch file for starting binkd versions 0.9.1-0.9.3 and 0.9.4-0.9.6/w32:

    ====
    :aaa
    binkd -C binkd.cfg
    if errorlevel 4 goto end
    if errorlevel 3 goto aaa
    :end
    ====

    In the versions 0.9.4/unix and /os2-emx (and in these ones only) binkd restarts automatically if it is started with -C command line option.
    Besides that starting with version 0.9.4 the files included into the configuration file with the help of 'include' keyword are tested not only
    on incoming sessions but also in every 'rescan-delay' seconds.

    If you install binkd 0.9.4/w32 as a Windows NT service you should use it with -C command line option. Then binkd re-reads its configuration file.

    Before version 0.9.4 changes in the configuration file were not tested if binkd was started in client-only mode (-c command line option).

    In the unix versions configuration file is re-read on SIGHUP signal
    by the command
    kill -HUP `cat /var/run/binkd.pid`

    In the version 1.0 configuration file is re-read automatically if
    changed. binkd tests on changes at every 'rescan-delay' seconds.
    =====>8 snip 8<=====
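    The batch file in the FAQ excerpt restarts binkd only when it exits with code 3 and stops on any other code. A minimal Python sketch of the same loop follows; the `supervise` function and its `run` parameter are hypothetical names introduced here for illustration, and the command line is the one from the FAQ.

```python
import subprocess

# Exit code that, per the FAQ excerpt, old binkd versions used
# to signal "configuration changed, please restart me".
RESTART_CODE = 3


def supervise(cmd, run=subprocess.call):
    """Run cmd again and again while it exits with RESTART_CODE.

    `run` is injectable so the loop can be exercised without binkd.
    Returns the list of exit codes seen; the last one ended the loop.
    """
    codes = []
    while True:
        code = run(cmd)
        codes.append(code)
        if code != RESTART_CODE:
            return codes
```

    It would be called as `supervise(["binkd", "-C", "binkd.cfg"])`. As in the batch file, exit code 4 or higher (and anything below 3) ends the loop.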

    so that last paragraph now makes me wonder if my continued use of the "-C" command line option is why i'm seeing this double free defect being triggered... in other words, is binkd noticing it automatically and then the "-C" is causing it to do the reload again? idk :thinking: it seems to only happen when the client side reloads when running as both server and client at the same time... the server part reloading seems fine... are they sharing the same in-memory config states and the client one is not noticing that the server side has already reloaded the state and dropped the old pointer?
    [/edit1]
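    The "previous config is no longer in use, unloading" log line suggests some kind of usage counting on the old config. The toy model below only illustrates that idea and why releasing the same config twice is the kind of mistake that shows up as a double free in C. It is an assumption for illustration, not binkd's actual code, and `SharedConfig` is a made-up name.

```python
class SharedConfig:
    """Toy stand-in for a config block with a usage counter (not binkd code)."""

    def __init__(self, name):
        self.name = name
        self.usage = 0
        self.unloaded = False

    def acquire(self):
        """Register one more user of this config."""
        if self.unloaded:
            raise RuntimeError(f"{self.name}: acquire after unload")
        self.usage += 1
        return self

    def release(self):
        """Drop one user; unload when the last user is gone."""
        if self.unloaded or self.usage == 0:
            # In C this would be a second free() of the same memory.
            raise RuntimeError(f"{self.name}: released more often than acquired")
        self.usage -= 1
        if self.usage == 0:
            self.unloaded = True  # "no longer in use, unloading"
```

    In this model the second `release()` fails loudly, whereas an unguarded C implementation would call `free()` again and abort the way the log shows.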

    Someone has to fix the bug or the only reliable workaround is to
    disable rescan.

    that option seems to be only(?)/mainly(?) for rescanning the outbound directories... at least from what binkd.conf-dist says...

    =====>8 snip 8<=====
    # Delay of calls and outbound rescans in seconds
    #
    #call-delay 1m
    #rescan-delay 1m
    =====>8 snip 8<=====

    [edit2]
    but in light of what the FAQ now states (above), this has me wondering about using "-C" any more... at least on my 64bit native linux builds... hummm :thinking:
    [/edit2]

    [edit3]
    as a test i stopped my binkd, edited my startup script to remove the "-C" option, and restarted it... then i edited my binkd-networks.conf file which is included as previously shown... 5 minutes later and binkd still has not noticed the change contrary to what the FAQ's last paragraph above states about the configs being re-read automatically...

    i /HAD/ to use the -HUP method to trigger binkd to notice the changes... i tested this a few times and the -HUP method is the only way it works without the "-C" command line option... ugh... that's really not how automated updates should be handled :( but at least binkd doesn't crash so i guess there is a bit of a wry smile needed, too :?
    [/edit3]
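    The `kill -HUP` command from the FAQ can be scripted so an update job asks binkd to re-read its config after writing new files. This is a minimal sketch for Unix systems; `read_pid` and `request_reload` are hypothetical helper names, and it assumes the pid file contains just the process ID.

```python
import os
import signal


def read_pid(pid_file):
    """Return the integer PID stored in pid_file."""
    with open(pid_file, encoding="ascii") as fh:
        return int(fh.read().strip())


def request_reload(pid_file):
    """Send SIGHUP to the process named in pid_file (Unix only)."""
    pid = read_pid(pid_file)
    os.kill(pid, signal.SIGHUP)
    return pid
```

    With the config shown earlier in the thread, the call would be `request_reload("/mybbs/ftn/binkd-console.pid")`.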

    Does someone know, if this bug also affects binkd running in
    client-only mode?

    idk...

    )\/(ark

    ... You're scary. Why are you so "Out There"? Take a deep breath.
    ---
    * Origin: (1:3634/12.73)
  • From Wilfred van Velzen@2:280/464 to mark lewis on Thursday, January 20, 2022 13:51:20
    Hi mark,

    On 2022-01-20 07:10:54, you wrote to Oli:

    as a test i stopped my binkd, edited my startup script to remove the
    "-C" option, and restarted it... then i edited my binkd-networks.conf
    file which is included as previously shown... 5 minutes later and
    binkd still has not noticed the change contrary to what the FAQ's last paragraph above states about the configs being re-read
    automatically...

    I use a self compiled binkd:

    # binkd -vv
    Binkd 1.1a-111 (May 27 2020 16:10:25/Linux)
    Compilation flags: gcc, zlib, bzlib2.
    Facilities: fts5004 ipv6

    I start it with '-Cq' options, and a 'rescan-delay 3' in the config (not '3s' btw!). It never crashes when I edit one of the config files, and never did so with older versions...


    Bye, Wilfred.

    --- FMail-lnx64 2.1.0.18-B20170815
    * Origin: FMail development HQ (2:280/464)
  • From Michiel van der Vlist@2:280/5555 to mark lewis on Thursday, January 20, 2022 14:18:08
    Hello mark,

    Let me add my 2 cts.

    In an attempt to provoke the crashes I created an empty include file for my binkd config file. With the Windows task manager I created an event that touched the date/time stamp of this empty file every five minutes. The rescan delay for my binkd is set to 60.

    This was about five years ago. I let it run for a month or so. Binkd reloaded its config every 5 minutes and that did not cause a crash.

    So I thought "hmmm this version of binkd seems to have fixed the problem". But alas when I made changes in the actual config files the crashes reappeared. So it looks like it is not just reloading the config files as such; some of the config files must actually change to make it crash.

    I gave up and stopped using the -C option.

    I did not formally document it at the time, so I am relying on memory.

    Just my 2 cts.

    Cheers, Michiel


    --- GoldED+/W32-MSVC 1.1.5-b20170303
    * Origin: http://www.vlist.eu (2:280/5555)
  • From mark lewis@1:3634/12.73 to Wilfred van Velzen on Thursday, January 20, 2022 08:23:16

    On 2022 Jan 20 13:51:20, you wrote to me:

    as a test i stopped my binkd, edited my startup script to remove the
    "-C" option, and restarted it... then i edited my binkd-networks.conf
    file which is included as previously shown... 5 minutes later and
    binkd still has not noticed the change contrary to what the FAQ's last
    paragraph above states about the configs being re-read
    automatically...

    I use a self compiled binkd:

    same here...

    # binkd -vv
    Binkd 1.1a-111 (May 27 2020 16:10:25/Linux)
    Compilation flags: gcc, zlib, bzlib2.
    Facilities: fts5004 ipv6

    $ binkd -vv
    Binkd 1.1a-113 (Dec 7 2021 07:17:01/Linux)
    Compilation flags: gcc, zlib, bzlib2, https, ntlm, bwlim.
    Facilities: fts5004 ipv6

    I start it with '-Cq' options,

    'q'?? hummm... ahhh... quiet mode... ok... i run in a console terminal so i want to see all of that :)

    -C is not working for me and is what my original message was showing with the "double free" defect...

    and a 'rescan-delay 3' in the config (not '3s' btw!).

    [quote=binkd.conf-dist]
    # Suffixes for time intervals are w for weeks, d for days,
    # h for hours, m for minutes, s or no suffix for seconds.
    # You can mix the suffixes, i.e. 1d12h is the same as 36h.
    [/quote]

    ;)
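    The suffix rules quoted from binkd.conf-dist can be checked with a few lines of Python. This is a sketch of those rules as quoted, not binkd's own parser, and `parse_interval` is a hypothetical name.

```python
import re

# Seconds per suffix; an empty suffix means seconds, as in the quote.
_UNITS = {"w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1, "": 1}
_TOKEN = re.compile(r"(\d+)([wdhms]?)")


def parse_interval(text):
    """Convert strings such as '10', '10s' or '1d12h' to seconds."""
    text = text.strip()
    if not text:
        raise ValueError("empty interval")
    total = 0
    pos = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"bad interval: {text!r}")
        total += int(match.group(1)) * _UNITS[match.group(2)]
        pos = match.end()
    return total
```

    Under these rules `rescan-delay 10` and `rescan-delay 10s` give the same value, and `1d12h` equals `36h`.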

    It never crashes when I edit one of the config files, and never did so with older versions...

    that's interesting... what is your linux and compiler, please?

    $ uname -a; echo; gcc --version
    Linux sestar 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

    gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
    Copyright (C) 2017 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    my configure and compiling output appear to be ok for the most part... configure doesn't complain loudly about anything and compiling shows a few warnings about format-overflow where some things may possibly write more than 12 bytes to a region or destination of only 12 bytes in readcfg.c... the only other warning is a misleading indentation in https.c where an if statement is apparently missing the "{}" around the two Log() lines it appears to be protecting... lines 372-374...

    just to be sure, i've just pulled and built again but there were no changes since my last pull and update...

    [remote "upstream"]
    url = https://github.com/pgul/binkd.git
    fetch = +refs/heads/*:refs/remotes/upstream/*


    )\/(ark

    ... Everyone has the right to act like an idiot. Some abuse the privilege.
    ---
    * Origin: (1:3634/12.73)
  • From mark lewis@1:3634/12.73 to Michiel van der Vlist on Thursday, January 20, 2022 08:55:16

    On 2022 Jan 20 14:18:08, you wrote to me:

    In an attempt to provoke the crashes I created an empty include file for my
    binkd config file. With the Windows task manager I created an event that touched the date/time stamp of this empty file every five minutes. The rescan delay for my binkd is set to 60.

    i remember you posting about this... it is one of the things that led to me attempting to reproduce the defect here on my OS/2 setup at that time... now i'm on linux and the defect still exists...

    This was about five years ago. I let it run for a month or so. Binkd reloaded its config every 5 minutes and that did not cause a crash.

    So I thought "hmmm this version of binkd seems to have fixed the
    problem". But alas when I made changes in the actual config files the crashes reappeared. So it looks like it is not just reloading the config files as such; some of the config files must actually change to make it crash.

    I gave up and stopped using the -C option.

    i've continued to use it but have always stopped my binkd manually before making and saving edits... at this time, though, i wanted to test for this defect specifically because i had been talking with some other operators about binkd and had noted that it can automatically reload its configs when changes have been made to them... since i had not been seeing crashes every day when my fidonet.binkd.txt nodelist is updated, i wondered about updating the other config files and if things were working there... as it happens, i triggered the defect and here we are with this current thread about it...

    i must say that i find it quite strange that updating the contents of one included file will trigger the defect but updating another included file isn't even noticed at all...

    I did not formally document it at the time, so I am relying on memory.

    my memory of your original posting appears to match what you write here :)

    )\/(ark

    ... Margarine? I only use it recreationally. I wouldn't eat it!
    ---
    * Origin: (1:3634/12.73)
  • From mark lewis@1:3634/12.73 to Michiel van der Vlist on Thursday, January 20, 2022 09:12:42

    On 2022 Jan 20 08:55:17, I wrote to you:

    I gave up and stopped using the -C option.

    i've continued to use it but have always stopped my binkd manually before making and saving edits... [...]

    earlier i had removed the "-C" to test the automatic reload that the FAQ seems to indicate is done with binkd v1.xx... i've just put it back and run a short but similar test to yours using touch on my main configuration file as well as each of the various included files... in every case, binkd did detect the timestamp updates and said it reloaded the config for both server and client instances... there was no crash...
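    The touch test relies on change detection by file timestamp. The sketch below shows that idea in isolation: it remembers each file's modification time and reports the files whose timestamp differs on the next check. `ConfigWatcher` is a hypothetical name, and nothing here is taken from binkd's source.

```python
import os


class ConfigWatcher:
    """Report which of the watched files changed mtime since the last check."""

    def __init__(self, paths):
        self.paths = list(paths)
        self._seen = {p: self._stamp(p) for p in self.paths}

    @staticmethod
    def _stamp(path):
        """Return the mtime in nanoseconds, or None if the file is missing."""
        try:
            return os.stat(path).st_mtime_ns
        except FileNotFoundError:
            return None

    def changed(self):
        """Return the changed paths and remember their new timestamps."""
        hits = []
        for path in self.paths:
            now = self._stamp(path)
            if now != self._seen[path]:
                self._seen[path] = now
                hits.append(path)
        return hits
```

    A poller would call `changed()` once per rescan interval and reload when the list is not empty. A detector like this reacts to `touch` exactly as it does to a real edit, because it never looks at file contents.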

    ok, so that works... let's try changing some content...

    first test is manually changing the case of a comment in my main binkd.conf file... both instances reloaded just fine... tested this several times changing only the case of some comment text... no problems... hummm...

    second test is manually changing the case of a comment in the first include file, my binkd-networks.conf... again, both instances reloaded just fine... tested this several times, too... no problems... hummm...

    third test is manually changing the case of a comment in another include file... same as the two previous tests... no crash... ok... hummm...

    4th test... now we'll add a blank line to the end of the main config file... no crash... ok, remove that blank line... no crash... alright...

    5th test... just like 4th, we'll add a blank line to the end of the first included file... no crash... ok... remove that blank line... and again, no crash...

    this is getting curiouser and curiouser...

    6th test... we'll change the data somewhere... we'll change "timeout 1m" to "timeout 45s" in the main conf file... that'll add one byte to the file, itself... no crash... ok, change it back... again, no crash...

    7th test... now we'll change a data line in the first include file... that's the file that triggered my initial post in this thread... we'll comment out a domain line for a network known to no longer exist... all we're doing is adding a '#' to the beginning of that domain line... that'll add one byte and remove some data when binkd processes the contents... damn! again no crash... remove the '#' so that domain line is valid again and no crash... WTAF is going on??

    8th test... instead of commenting out that domain line, i've completely removed it... no crash... put it back by copying a backup copy of the file back over the one we've edited... again no crash...

    i don't get it...

    [time passes]

    ahhh... if i'm reading the output of "ps aux" properly, there is a bit of a memory leak going on... perhaps the crashes i was able to trigger are the result of a combination of some memory leakage over a "long" running time coupled with changing a config or included file? i'll try to set something up to test this possibility further but i'm done manually beating on it by editing files for now ;)

    clean start
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    sbbs 17629 0.6 0.2 13980 2932 pts/1 S+ 09:48 0:00 binkd: server manager (listen 24554)
    sbbs 17630 0.0 0.2 14028 2244 pts/1 S+ 09:48 0:00 binkd: client manager

    remove domain line as in 8th test
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    sbbs 17629 0.1 0.3 14648 3684 pts/1 S+ 09:48 0:00 binkd: server manager (listen 24554)
    sbbs 17630 0.1 0.2 14708 2988 pts/1 S+ 09:48 0:00 binkd: client manager

    copy back file over edited file restoring domain line as in 8th test
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    sbbs 17629 0.1 0.3 14784 3844 pts/1 S+ 09:48 0:00 binkd: server manager (listen 24554)
    sbbs 17630 0.2 0.3 14868 3204 pts/1 S+ 09:48 0:00 binkd: client manager


    the above may also not be quite scientific as there was at least one mail connection at one stage... i don't recall if there was any actual files moved, though... so i've whipped up a quick script that will monitor and log binkd's memory usage... it'll take a reading as above once every 5 minutes... let's see what we find...
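    A monitoring script of the kind described could look roughly like the sketch below. It is not the script used here, just one possible shape: `parse_ps_aux` and `sample` are hypothetical names, and it assumes the `ps aux` column layout shown in the readings above, with VSZ and RSS reported in KiB.

```python
import subprocess


def parse_ps_aux(text, needle="binkd"):
    """Extract PID, VSZ and RSS for ps aux lines whose command mentions needle."""
    rows = []
    for line in text.splitlines():
        # USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something unexpected
        if needle not in fields[10]:
            continue
        rows.append({
            "pid": int(fields[1]),
            "vsz_kib": int(fields[4]),
            "rss_kib": int(fields[5]),
            "command": fields[10],
        })
    return rows


def sample():
    """Take one reading from the live system."""
    out = subprocess.run(["ps", "aux"], capture_output=True,
                         text=True, check=True).stdout
    return parse_ps_aux(out)
```

    Calling `sample()` from cron or a sleep loop every 5 minutes and appending the rows to a log would give a series comparable to the three readings above, where the server manager's RSS went from 2932 to 3684 to 3844.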

    where's my aspirin?

    )\/(ark

    ... There is no such thing as arugula yet many gourmet recipes call for it.
    ---
    * Origin: (1:3634/12.73)
  • From Wilfred van Velzen@2:280/464 to mark lewis on Thursday, January 20, 2022 16:49:31
    Hi mark,

    On 2022-01-20 08:23:16, you wrote to me:

    I start it with '-Cq' options,

    'q'?? hummm... ahhh... quiet mode... ok... i run in a console terminal so i
    want to see all of that :)

    Mine is run on system startup as a systemd.service. So there is no console to look at. If I want to see what's going on I have a look at the log...

    It never crashes when I edit one of the config files, and never did
    so with older versions...

    that's interesting... what is your linux and compiler, please?

    $ uname -a; echo; gcc --version
    Linux sestar 4.15.0-166-generic #174-Ubuntu SMP Wed Dec 8 19:07:44 UTC 2021
    x86_64 x86_64 x86_64 GNU/Linux

    gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

    # uname -a; echo; gcc --version
    Linux wilnux5 4.7.5-1.gc7aed11-default #1 SMP PREEMPT Sat Sep 24 11:41:43 UTC 2016 (c7aed11) x86_64 x86_64 x86_64 GNU/Linux

    gcc (SUSE Linux) 4.8.5

    my configure and compiling output appear to be ok for the most part... configure doesn't complain loudly about anything and compiling shows a
    few warnings about format-overflow where some things may possibly
    write more than 12 bytes to a region or destination of only 12 bytes
    in readcfg.c... the only other warning is a misleading indention in https.c where an if statement is apparently missing the "{}" around
    the two Log() lines it appears to be protecting... lines 372-374...

    That was a long time ago in my case, and I didn't save the output. ;)

    just to be sure, i've just pulled and built again but there were no changes since my last pull and update...

    I might try that later...


    Bye, Wilfred.

    --- FMail-lnx64 2.1.0.18-B20170815
    * Origin: FMail development HQ (2:280/464)
  • From Björn Felten@2:203/2 to mark lewis on Thursday, January 20, 2022 17:10:06
    i remember you posting about this... it is one of the things that led
    to me attempting to reproduce the defect here on my OS/2 setup at that time... now i'm on linux and the defect still exists...

    I don't know if it's the same, but I run a *very* old version directly on a WinXP, and a few times the binkd has exited after a new nodelist compile. But considering that it's worked perfectly for more than seven years now, I didn't bother to spend any time on it.


    Binkd 1.1a-65 (Sep 21 2014 18:59:33/Win32)
    Compilation flags: mingw32, perldl, https, ntlm, amiga_4d_outbound, bwlim, ipv6, fsp1035.
    Facilities: fsp1035 ipv6




    ..

    --- Mozilla/5.0 (Windows; U; Windows NT 5.1; sv-SE; rv:1.9.1.16) Gecko/20101125
    * Origin: news://eljaco.se:4119 (2:203/2)
  • From Oli@2:280/464.47 to mark lewis on Thursday, January 20, 2022 18:15:43
    mark wrote (2022-01-20):

    ahhh... if i'm reading the output of "ps aux" properly, there is a bit of a memory leak going on... perhaps the crashes i was able to trigger are the result of a combination of some memory leakage over a "long" running time coupled with changing a config or included file? i'll try to set something up to test this possibility further but i'm done manually beating on it by editing files for now ;)

    IIRC the problem with non-standard ports in the nodelist / node line (e.g. 24555) was also not reproducible every time or on every system. For reference:

    https://github.com/pgul/binkd/issues/15

    (There was more discussion somewhere else, but I cannot find it).

    Interestingly, I believe the commit that fixed this bug also fixed another bug with the node -pipe option as a side effect. Weird things happening seems to be not unusual for binkd.

    ---
    * Origin: Birds aren't real (2:280/464.47)
  • From mark lewis@1:3634/12.73 to Wilfred van Velzen on Thursday, January 20, 2022 16:28:16

    On 2022 Jan 20 16:49:30, you wrote to me:

    I start it with '-Cq' options,

    'q'?? hummm... ahhh... quiet mode... ok... i run in a console terminal
    so i want to see all of that :)

    Mine is run on system startup as a systemd.service. So there is no console to look at. If I want to see what's going on I have a look at the log...

    i can understand that... i have both going on... when i'm really digging into something, i set the conlog level higher and make liberal use of my scrollback buffer ;)

    that's interesting... what is your linux and compiler, please?
    [...]
    gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

    gcc (SUSE Linux) 4.8.5

    ahh... i know there are a lot of changes between gcc4 and gcc7...

    my configure and compiling output appear to be ok for the most
    part... configure doesn't complain loudly about anything and
    compiling shows a few warnings about format-overflow where some
    things may possibly write more than 12 bytes to a region or
    destination of only 12 bytes in readcfg.c... the only other warning
    is a misleading indention in https.c where an if statement is
    apparently missing the "{}" around the two Log() lines it appears to
    be protecting... lines 372-374...

    That was a long time ago in my case, and I didn't save the output. ;)

    my build script logs all output via tee... makes it easy for me to see things as they happen and enables me to look back at previous builds' output :)

    )\/(ark

    ... <smack> <smack> <whip> <whip> Oh! Ms. Co-Moderator!
    ---
    * Origin: (1:3634/12.73)
  • From Wilfred van Velzen@2:280/464 to mark lewis on Thursday, January 20, 2022 22:50:07
    Hi mark,

    On 2022-01-20 16:49:31, I wrote to you:

    just to be sure, i've just pulled and built again but there were no
    changes since my last pull and update...

    I might try that later...

    I just did:

    # binkd -vv
    Binkd 1.1a-113 (Jan 20 2022 22:42:34/Linux)
    Compilation flags: gcc, zlib, bzlib2.
    Facilities: fts5004 ipv6

    And did a couple of config file change tests: No crash occurred...

    Bye, Wilfred.

    --- FMail-lnx64 2.1.0.18-B20170815
    * Origin: FMail development HQ (2:280/464)
  • From Wilfred van Velzen@2:280/464 to mark lewis on Thursday, January 20, 2022 23:00:32
    Hi mark,

    On 2022-01-20 16:28:16, you wrote to me:

    my configure and compiling output appear to be ok for the most
    part... configure doesn't complain loudly about anything and
    compiling shows a few warnings about format-overflow where some
    things may possibly write more than 12 bytes to a region or
    destination of only 12 bytes in readcfg.c... the only other warning
    is a misleading indentation in https.c where an if statement is
    apparently missing the "{}" around the two Log() lines it appears to
    be protecting... lines 372-374...

    That was a long time ago in my case, and I didn't save the output. ;)

    my build script logs all output via tee... makes it easy for me to see things as they happen and enables me to look back at previous builds' output :)

    The compile I just did wasn't a full build, just a make on the files changed from 111 to 113. I only got a couple of warnings:

    Compiling md5b.c...
    md5b.c: In function 'MD5Update':
    md5b.c:134:4: warning: call to function 'MD5_memcpy' without a real prototype [-Wunprototyped-calls]

    And then a couple of dozen more on other functions in the md5b.c file. So nothing related to the reloading-config-crash problem...

    Bye, Wilfred.

    --- FMail-lnx64 2.1.0.18-B20170815
    * Origin: FMail development HQ (2:280/464)
  • From Alan Ianson@1:153/757 to Wilfred van Velzen on Thursday, January 20, 2022 14:32:26
    # binkd -vv
    Binkd 1.1a-113 (Jan 20 2022 22:42:34/Linux)
    Compilation flags: gcc, zlib, bzlib2.
    Facilities: fts5004 ipv6

    And did a couple of config file change tests: No crash occurred...

    I have also never experienced a crash. I start binkd with the -DC options and binkd has always happily reloaded the config if it changes.

    I don't include any files in my binkd.conf and I notice that others do. The crash could be related to these included files somehow?

    --- BBBS/Li6 v4.10 Toy-5
    * Origin: The Rusty MailBox - Penticton, BC Canada (1:153/757)
  • From Wilfred van Velzen@2:280/464 to Alan Ianson on Friday, January 21, 2022 00:10:21
    Hi Alan,

    On 2022-01-20 14:32:26, you wrote to me:

    # binkd -vv
    Binkd 1.1a-113 (Jan 20 2022 22:42:34/Linux)
    Compilation flags: gcc, zlib, bzlib2.
    Facilities: fts5004 ipv6

    And did a couple of config file change tests: No crash occurred...

    I have also never experienced a crash. I start binkd with the -DC options and binkd has always happily reloaded the config if it changes.

    I don't include any files in my binkd.conf and I notice that others do. The
    crash could be related to these included files somehow?

    I include some files from my main config, and did the tests on all...

    Bye, Wilfred.

    --- FMail-lnx64 2.1.0.18-B20170815
    * Origin: FMail development HQ (2:280/464)
  • From deon@3:633/509 to mark lewis on Friday, January 21, 2022 09:59:19
    Re: binkd crashes when reloading after file change detection
    By: mark lewis to Michiel van der Vlist on Thu Jan 20 2022 08:55 am

    In an attempt to provoke the crashes I created an empty include file for my
    binkd config file. With the Windows task manager I created an event that touched the date/time stamp of this empty file every five minutes. The rescan delay for my binkd is set to 60.
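
    For anyone wanting to reproduce this kind of experiment, here is a minimal Python sketch of timestamp-based change detection. It only illustrates the general technique and is not binkd's actual implementation; the names `snapshot_mtimes` and `changed_files` are hypothetical.

```python
import os


def snapshot_mtimes(paths):
    """Return {path: st_mtime_ns}, with None recorded for a missing path."""
    stamps = {}
    for path in paths:
        try:
            stamps[path] = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            # A file that vanished is treated as a change on the next compare.
            stamps[path] = None
    return stamps


def changed_files(previous, current):
    """List the paths whose recorded timestamp differs between two snapshots."""
    return sorted(p for p in current if previous.get(p) != current[p])
```

    Taking one snapshot per rescan interval and comparing it with the previous one is enough to notice a touched include file, even when its content is unchanged.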

    i remember you posting about this... it is one of the things that led to me attempting to reproduce the defect here on my OS/2 setup at that time... now i'm on linux and the defect still exists...

    FWIW, I'm running binkd in linux (in docker) - and I edit my config files all the time, and I've never once seen that crash or the double-free error message.

    Wonder what's different with my binkd...?


    ...δεσ∩
    --- SBBSecho 3.14-Linux
    * Origin: I'm playing with ANSI+videotex - wanna play too? (3:633/509)
  • From Dan Cross@3:770/100 to mark lewis on Friday, January 21, 2022 12:57:30
    On 20 Jan 2022 at 09:12a, mark lewis pondered and said...

    earlier i had removed the "-C" to test the automatic reload that the FAQ seems to indicate is done with binkd v1.xx... i've just put it back and run a short but similar test to yours using touch on my main configuration file as well as each of the various included files... in every case, binkd did detect the timestamp updates and said it reloaded the configuration for both server and client instances... there was no crash...

    ok, so that works... let's try changing some content...

    [snip]

    that'll add one byte and remove some data when binkd processes the contents... damn! again no crash... remove the content so that domain
    line is valid again and no crash... WTAF is going on??

    Based on your experiments, I imagine what's going on is
    a serialization failure when accessing heap-allocated
    memory in a multi-threaded context. You're not seeing
    crashes because the crashes are non-deterministic.

    These sort of annoying "heisenbugs" are the worst. But
    fortunately there are tools that can help out. Perhaps
    compile with something like TSAN enabled and try running
    that way?

    --- Mystic BBS v1.12 A47 2021/11/06 (Linux/64)
    * Origin: Agency BBS | Dunedin, New Zealand | agency.bbs.nz (3:770/100)
  • From Dan Cross@3:770/100 to Wilfred van Velzen on Friday, January 21, 2022 12:58:40
    On 20 Jan 2022 at 04:49p, Wilfred van Velzen pondered and said...

    Mine is run on system startup as a systemd.service. So there is no
    console to look at. If I want to see what's going on I have a look at
    the log...

    Does `journalctl` not show useful data from the output
    of binkd?

    --- Mystic BBS v1.12 A47 2021/11/06 (Linux/64)
    * Origin: Agency BBS | Dunedin, New Zealand | agency.bbs.nz (3:770/100)
  • From Rob Swindell to Dan Cross on Thursday, January 20, 2022 18:11:32
    Re: Re: binkd crashes when reloading after file change detection
    By: Dan Cross to mark lewis on Fri Jan 21 2022 12:57 pm

    These sort of annoying "heisenbugs" are the worst. But
    fortunately there are tools that can help out. Perhaps
    compile with something like TSAN enabled and try running
    that way?

    Or run it under valgrind.
    --
    digital man (rob)

    Rush quote #48:
    The point of the journey is not to arrive. Anything can happen.
    Norco, CA WX: 65.4°F, 36.0% humidity, 7 mph NNW wind, 0.00 inches rain/24hrs
  • From Wilfred van Velzen@2:280/464 to Dan Cross on Friday, January 21, 2022 10:50:40
    Hi Dan,

    On 2022-01-21 12:58:40, you wrote to me:

    Mine is run on system startup as a systemd.service. So there is no
    console to look at. If I want to see what's going on I have a look at
    the log...

    Does `journalctl` not show useful data from the output
    of binkd?

    Forgot about that one. I never use it... But indeed you can have a look at it, but is it useful?

    Bye, Wilfred.

    --- FMail-lnx64 2.1.0.18-B20170815
    * Origin: FMail development HQ (2:280/464)
  • From Dan Cross@3:770/100 to Wilfred van Velzen on Saturday, January 22, 2022 02:13:59
    On 21 Jan 2022 at 10:50a, Wilfred van Velzen pondered and said...

    On 2022-01-21 12:58:40, you wrote to me:

    Mine is run on system startup as a systemd.service. So there is no
    console to look at. If I want to see what's going on I have a look at
    the log...

    Does `journalctl` not show useful data from the output
    of binkd?

    Forgot about that one. I never use it... But indeed you can have a look
    at it, but is it useful?

    I dunno...I don't run binkd. I've found it very useful for
    seeing what Direwolf is doing (or not doing) on my packet
    radio station, though. YMMV.

    --- Mystic BBS v1.12 A47 2021/11/06 (Linux/64)
    * Origin: Agency BBS | Dunedin, New Zealand | agency.bbs.nz (3:770/100)
  • From Dan Cross@3:770/100 to Rob Swindell on Saturday, January 22, 2022 02:14:40
    On 20 Jan 2022 at 06:11p, Rob Swindell pondered and said...

    Or run it under valgrind.

    Great idea! I forgot all about valgrind.

    --- Mystic BBS v1.12 A47 2021/11/06 (Linux/64)
    * Origin: Agency BBS | Dunedin, New Zealand | agency.bbs.nz (3:770/100)
  • From Wilfred van Velzen@2:280/464 to Dan Cross on Friday, January 21, 2022 14:18:10
    Hi Dan,

    On 2022-01-22 02:13:59, you wrote to me:

    On 2022-01-21 12:58:40, you wrote to me:

    Mine is run on system startup as a systemd.service. So there is no
    console to look at. If I want to see what's going on I have a look at
    the log...

    Does `journalctl` not show useful data from the output
    of binkd?

    Forgot about that one. I never use it... But indeed you can have a
    look at it, but is it useful?

    I dunno...I don't run binkd. I've found it very useful for
    seeing what Direwolf is doing (or not doing) on my packet
    radio station, though. YMMV.

    When you use the -q (quiet) option for binkd, like I do, it isn't very talkative on the console. But the things that still show up in journalctl, might be interesting in that case...

    Bye, Wilfred.

    --- FMail-lnx64 2.1.0.18-B20170815
    * Origin: FMail development HQ (2:280/464)
  • From mark lewis@1:3634/12.73 to Wilfred van Velzen on Friday, January 21, 2022 09:17:32

    On 2022 Jan 20 22:50:06, you wrote to me:

    I just did:

    # binkd -vv
    Binkd 1.1a-113 (Jan 20 2022 22:42:34/Linux)
    Compilation flags: gcc, zlib, bzlib2.
    Facilities: fts5004 ipv6

    noice!

    And did a couple of config file change tests: No crash occurred...

    even changing included files?

    i've not tried messing with mine since i stopped yesterday... i put a monitor in place running "ps aux" in a loop once every five minutes and logging the output... i've not looked at it yet, today... mail is still flowing so binkd must still be running ;) :lol:
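
    A monitor like the one described could be sketched in Python as follows. This is not mark's actual script, which was not posted; the function names, the log layout and the `binkd` filter are assumptions, and only the `ps aux` command and the five-minute interval come from the description.

```python
import subprocess
import time


def filter_lines(ps_output, needle="binkd"):
    """Keep the header line plus every later line that mentions the needle."""
    lines = ps_output.splitlines()
    if not lines:
        return []
    return lines[:1] + [ln for ln in lines[1:] if needle in ln]


def take_snapshot(command=("ps", "aux"), needle="binkd"):
    """Run the command once and return a timestamped block of matching lines."""
    result = subprocess.run(list(command), capture_output=True, text=True, check=True)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S %z")
    return "\n".join(["Current time: " + stamp] + filter_lines(result.stdout, needle))


def monitor(logfile, interval_seconds=300, rounds=None):
    """Append one snapshot to logfile per interval; rounds=None loops forever."""
    done = 0
    while rounds is None or done < rounds:
        with open(logfile, "a") as log:
            log.write(take_snapshot() + "\n\n")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_seconds)
```

    Calling `monitor("binkd-ps.log")` (a hypothetical file name) would append one block every five minutes until interrupted.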

    )\/(ark

    "The soul of a small kitten in the body of a mighty dragon. Look on my majesty, ye mighty, and despair! Or bring me catnip. Your choice. Oooh, a shiny thing!"
    ... URA True Northerner if you and your wife have matching 4X4s.
    ---
    * Origin: (1:3634/12.73)
  • From mark lewis@1:3634/12.73 to Dan Cross on Friday, January 21, 2022 10:40:14

    On 2022 Jan 21 12:57:30, you wrote to me:

    These sort of annoying "heisenbugs" are the worst.

    you can say that again! :lol:

    But fortunately there are tools that can help out. Perhaps compile
    with something like TSAN enabled and try running that way?

    i have thought about attempting to build a debug version of binkd and running the core file through gdb to get a backtrace when the fecal matter gets flung about... unfortunately my basic knowledge and use of the make system gets in the way of doing that... i've also thought about doing the build and running it with valgrind... but again, my basic knowledge of make gets in the way...

    i don't have a problem providing backtraces to the devs as long as getting them is easy to do and doesn't require much in the way of gymnastics... i do this now with several other projects... most of them use cmake which i also know next to nothing about...


    [edit]

    FWIW: it looks like there is some sort of memory consumption in the client section of the code... whether it is a leak or not, i don't really know... the client side is the one that has crashed on me when reloading the configs in the past...

    the following is from my monitoring script i whipped up yesterday... the first reading is immediately after loading binkd... the bottom one is after binkd has been running for just over 24 hours servicing some 40 or 50 links and whatever random connections came in... it stays quite busy transferring mail and files... in the config i have set for 1 server and 1 client which makes it easier on me to keep track of sessions in the log...

    ********** Program start at 2022-01-20 10:22:39 -0500 **********
    Current time: 2022-01-20 10:22:39 -0500
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    sbbs 23128 1.0 0.3 13980 3104 pts/1 S+ 10:22 0:00 binkd: server manager (listen 24554)
    sbbs 23129 0.0 0.2 14028 2412 pts/1 S+ 10:22 0:00 binkd: client manager

    [...]

    Current time: 2022-01-21 10:27:46 -0500
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    sbbs 23128 0.0 0.1 13980 1696 pts/1 S+ Jan20 0:01 binkd: server manager (listen 24554)
    sbbs 23129 0.1 1.1 25496 11752 pts/1 S+ Jan20 2:44 binkd: client manager
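
    The growth in the client manager's resident set size can be read straight off the two samples. A small Python sketch of the arithmetic, using the two `ps` lines exactly as posted; the helper names are hypothetical, and the RSS column of `ps aux` is taken to be in KiB.

```python
def parse_ps_line(line):
    """Split one `ps aux` line into a dict; COMMAND keeps its embedded spaces."""
    fields = line.split(None, 10)
    return {
        "pid": int(fields[1]),
        "vsz_kib": int(fields[4]),
        "rss_kib": int(fields[5]),
        "command": fields[10],
    }


def rss_growth(first_line, last_line):
    """Return (absolute growth in KiB, growth factor) for one process."""
    first, last = parse_ps_line(first_line), parse_ps_line(last_line)
    if first["pid"] != last["pid"]:
        raise ValueError("lines describe different processes")
    delta = last["rss_kib"] - first["rss_kib"]
    return delta, last["rss_kib"] / first["rss_kib"]


start = "sbbs 23129 0.0 0.2 14028 2412 pts/1 S+ 10:22 0:00 binkd: client manager"
end = "sbbs 23129 0.1 1.1 25496 11752 pts/1 S+ Jan20 2:44 binkd: client manager"
delta, ratio = rss_growth(start, end)
print(f"client manager RSS grew by {delta} KiB ({ratio:.1f}x)")
```

    For these two samples the difference is 11752 - 2412 = 9340 KiB, roughly 4.9 times the starting value, while the server manager's RSS went down from 3104 to 1696 KiB over the same period.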


    from another terminal, i manually triggered a compile of the nodelist that arrived just after local midnight and Boom! we triggered the crash in the client manager again... the server manager didn't even get a chance to notice the change and update its config this time...

    + 10:30 [23129] /sbbs/ftn/nodelist/fidonet.binkd.txt changed!
    + 10:30 [23129] Reloading configuration...
    10:30 [23129] previous config is no longer in use, unloading
    free(): double free detected in tcache 2
    ! 10:30 [23128] client manager (pid=23129) exited by signal 6

    this seems to add more credence to the problem being somehow related to long running times and possibly some sort of memory leak or corruption...
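
    For reference, signal 6 on Linux is SIGABRT, which is what glibc raises through abort() after printing the double free message. A hedged Python sketch for pulling such exits out of a binkd log follows; the regular expression is inferred only from the log lines quoted in this thread, so other binkd versions or log settings may need a different pattern, and `parse_exit` is a hypothetical name.

```python
import re
import signal

# Pattern inferred from: "! 10:30 [23128] client manager (pid=23129) exited by signal 6"
EXIT_RE = re.compile(r"\[(\d+)\] (.+?) \(pid=(\d+)\) exited by signal (\d+)")


def parse_exit(line):
    """Return details of an 'exited by signal' log line, or None if it is not one."""
    match = EXIT_RE.search(line)
    if match is None:
        return None
    reporter, role, pid, signo = match.groups()
    signo = int(signo)
    try:
        name = signal.Signals(signo).name
    except ValueError:
        name = "unknown"  # number not defined as a signal on this platform
    return {
        "reporter_pid": int(reporter),
        "role": role,
        "pid": int(pid),
        "signal": signo,
        "signal_name": name,
    }
```

    Running every log line through `parse_exit` and keeping the non-None results gives a quick list of which manager died, when it was reported, and with which signal.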

    )\/(ark

    "The soul of a small kitten in the body of a mighty dragon. Look on my majesty, ye mighty, and despair! Or bring me catnip. Your choice. Oooh, a shiny thing!"
    ... I multitask... I read in the bathroom.
    ---
    * Origin: (1:3634/12.73)
  • From Wilfred van Velzen@2:280/464 to mark lewis on Friday, January 21, 2022 18:21:36
    Hi mark,

    On 2022-01-21 09:17:32, you wrote to me:

    I just did:

    # binkd -vv
    Binkd 1.1a-113 (Jan 20 2022 22:42:34/Linux)
    Compilation flags: gcc, zlib, bzlib2.
    Facilities: fts5004 ipv6

    noice!

    I will read that without the 'o'! ;-)

    And did a couple of config file change tests: No crash occurred...

    even changing included files?

    Yes.

    And this also happens automatically when the daily nodelist comes in and is converted to binkd readable format.

    Bye, Wilfred.

    --- FMail-lnx64 2.1.0.18-B20170815
    * Origin: FMail development HQ (2:280/464)
  • From mark lewis@1:3634/12.73 to Wilfred van Velzen on Friday, January 21, 2022 12:59:42

    On 2022 Jan 21 18:21:36, you wrote to me:

    Binkd 1.1a-113 (Jan 20 2022 22:42:34/Linux)
    Compilation flags: gcc, zlib, bzlib2.
    Facilities: fts5004 ipv6

    noice!

    I will read that without the 'o'! ;-)

    does it mean something other than "nice" in other languages you speak?

    And did a couple of config file change tests: No crash occurred...

    even changing included files?

    Yes.

    And this also happens automatically when the daily nodelist comes in and is
    converted to binkd readable format.

    ok... i'm still keeping an eye on mine, here...

    )\/(ark

    "The soul of a small kitten in the body of a mighty dragon. Look on my majesty, ye mighty, and despair! Or bring me catnip. Your choice. Oooh, a shiny thing!"
    ... In the beginning there was nothing, which exploded.
    ---
    * Origin: (1:3634/12.73)
  • From Wilfred van Velzen@2:280/464 to mark lewis on Friday, January 21, 2022 23:46:58
    Hi mark,

    On 2022-01-21 12:59:42, you wrote to me:

    Binkd 1.1a-113 (Jan 20 2022 22:42:34/Linux)
    Compilation flags: gcc, zlib, bzlib2.
    Facilities: fts5004 ipv6

    noice!

    I will read that without the 'o'! ;-)

    does it mean something other than "nice" in other languages you speak?

    Nope, but it "sounds" like noise...

    Bye, Wilfred.

    --- FMail-lnx64 2.1.0.18-B20170815
    * Origin: FMail development HQ (2:280/464)
  • From Paul Quinn@3:640/1384 to Wilfred van Velzen on Saturday, January 22, 2022 10:59:56
    Hi! Wilfred,

    On 21 Jan 2022, Wilfred van Velzen said the following...
    Hi mark,

    On 2022-01-21 12:59:42, you wrote to me:
    noice!
    I will read that without the 'o'! ;-)
    does it mean something other than "nice" in other languages you speak?

    Nope, but it "sounds" like noise...

    I think it's supposed to reproduce a New Yorker accented 'nice'. I've used it at least once in either email or net/echo-mail. I deserve a smack!

    Cheers,
    Paul.

    ... Top secret! Burn before reading!
    --- Mystic BBS v1.12 A47 2021/12/24 (Linux/32)
    * Origin: Quinn's Rock vBox - sunny-side up on the desktop (3:640/1384)
  • From mark lewis@1:3634/12.73 to Wilfred van Velzen on Saturday, January 22, 2022 04:46:22

    On 2022 Jan 21 23:46:58, you wrote to me:

    noice!

    I will read that without the 'o'! ;-)

    does it mean something other than "nice" in other languages you speak?

    Nope, but it "sounds" like noise...

    because you're using 'z' where 's' should be used ;)

    nice - /nis/
    noice - /nois/

    noise - /noiz/


    https://www.urbandictionary.com/define.php?term=noice

    Beyond the boundaries and exceeding the limits of nice. Spoken with emphasis when describing something particularly awesome.

    "That freshly waxed fire truck that was rolling down the street looked really noice!"


    https://knowyourmeme.com/memes/noice

    *About*
    Noice also spelled Nooice, is an accented version of the word "nice", used online as enthusiastic, exclamatory internet slang to declare approval or sarcastic approval of a topic or achievement. It is often associated with the Australian or English accents or bros.

    *Origin*
    While the origin of this phrase is unknown, it was first defined on Wiktionary on Feb 1, 2001, as being an English dialectical version of nice. It was first defined on Urban Dictionary by user "Cracka-B A.K.A Billy Blam" on March 16, 2003, who claimed that it meant "To be beyond the regular limits of nice. To be nice, and then exceed the status." The user claimed that the term was introduced by the Beastie Boys song "Three MCs & One DJ", which may or may not be the first recorded usage.

    However, this definition is not the most popular of the 36 definitions submitted for noice. According to users, the term is more likely to mean "Beyond the boundaries and exceeding the limits of nice. Spoken with emphasis when describing something particularly awesome."

    [/OT] ;)

    )\/(ark

    "The soul of a small kitten in the body of a mighty dragon. Look on my majesty, ye mighty, and despair! Or bring me catnip. Your choice. Oooh, a shiny thing!"
    ... Outlaw junk mail, and save the trees!
    ---
    * Origin: (1:3634/12.73)