• pesky moose and squirrel

    From Maurice Kinal@2:280/464.113 to Natasha Fatale on Friday, July 05, 2019 00:41:21
    Hallo Natasha!

    Unicode Character hex name
    U+040D Ѝ d0 8d CYRILLIC CAPITAL LETTER I WITH GRAVE
    U+044D э d1 8d CYRILLIC SMALL LETTER E
    U+048D ҍ d2 8d CYRILLIC SMALL LETTER SEMISOFT SIGN
    U+04CD Ӎ d3 8d CYRILLIC CAPITAL LETTER EM WITH TAIL

    Please not the "CYRILLIC CAPITAL LETTER EM WITH TAIL"!?!?!?!?!? Oh the humanity.
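
The byte pairs in the table can be reproduced from the code points. The following is a minimal Python sketch, not anything the poster ran; the names `CODE_POINTS` and `describe` are made up for illustration. It shows that all four characters encode to two UTF-8 bytes ending in 0x8d.

```python
import unicodedata

# Code points copied from the table above.
CODE_POINTS = [0x040D, 0x044D, 0x048D, 0x04CD]

def describe(cp: int) -> str:
    """Return 'U+XXXX <utf-8 bytes in hex> <Unicode name>' for one code point."""
    ch = chr(cp)
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    return f"U+{cp:04X} {hex_bytes} {unicodedata.name(ch)}"

for cp in CODE_POINTS:
    print(describe(cp))
```

Each printed row should agree with the corresponding row of the table, minus the character column.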

    Het leven is goed,
    Maurice

    ... Huil niet om mij, ik heb vi.
    --- GNU bash, version 5.0.7(1)-release (x86_64-pc-linux-gnu)
    * Origin: Little Mikey's EuroPoint - Ladysmith BC, Canada (2:280/464.113)
  • From mark lewis@1:3634/12.73 to Maurice Kinal on Thursday, July 04, 2019 20:57:52

    On 2019 Jul 05 00:41:20, you wrote to Natasha Fatale:

    @MSGID: 2:280/464.113 5d1e9cb1
    Hallo Natasha!

    Unicode Character hex name
    U+040D Ѝ d0 8d CYRILLIC CAPITAL LETTER I WITH GRAVE
    U+044D э d1 8d CYRILLIC SMALL LETTER E
    U+048D ҍ d2 8d CYRILLIC SMALL LETTER SEMISOFT SIGN
    U+04CD Ӎ d3 8d CYRILLIC CAPITAL LETTER EM WITH TAIL

    Please not the "CYRILLIC CAPITAL LETTER EM WITH TAIL"!?!?!?!?!? Oh the humanity.

    these all arrived here intact via the following path...

    @PATH: 280/464 103/705 218/700 261/38 3634/12

    )\/(ark

    And to this end they built themselves a stupendous super-computer which was
    so amazingly intelligent that even before its data banks had been connected
    up it had started from "I think therefore I am" and got as far as deducing
    the existence of rice pudding and income tax before anyone managed to turn
    it off.
    ... A smoking section in a restaurant is like a peeing section in a pool!
    ---
    * Origin: (1:3634/12.73)
  • From Rob Swindell to Maurice Kinal on Friday, July 05, 2019 01:31:48
    Re: pesky moose and squirrel
    By: Maurice Kinal to Natasha Fatale on Fri Jul 05 2019 12:41 am

    Hallo Natasha!

    Unicode Character hex name
    U+040D Ѝ d0 8d CYRILLIC CAPITAL LETTER I WITH GRAVE
    U+044D э d1 8d CYRILLIC SMALL LETTER E
    U+048D ҍ d2 8d CYRILLIC SMALL LETTER SEMISOFT SIGN
    U+04CD Ӎ d3 8d CYRILLIC CAPITAL LETTER EM WITH TAIL

    Please not the "CYRILLIC CAPITAL LETTER EM WITH TAIL"!?!?!?!?!? Oh the humanity.

    Can you send again with the CHRS: UTF-8 control paragraph? Thanks,

    -Rob
  • From Maurice Kinal@2:280/464.113 to Rob Swindell on Friday, July 05, 2019 19:02:52
    Hallo Rob!

    Can you send again with the CHRS: UTF-8 control paragraph?

    Yes I can. I'll insert it in this reply below;

    Hallo Natasha!

    Unicode Character hex name
    U+040D Ѝ d0 8d CYRILLIC CAPITAL LETTER I WITH GRAVE
    U+044D э d1 8d CYRILLIC SMALL LETTER E
    U+048D ҍ d2 8d CYRILLIC SMALL LETTER SEMISOFT SIGN
    U+04CD Ӎ d3 8d CYRILLIC CAPITAL LETTER EM WITH TAIL

    Please not the "CYRILLIC CAPITAL LETTER EM WITH TAIL"!?!?!?!?!? Oh the humanity.

    Het leven is goed,
    Maurice

    ... Huil niet om mij, ik heb vi.
    --- GNU bash, version 5.0.7(1)-release (x86_64-pc-linux-gnu)
    * Origin: Little Mikey's EuroPoint - Ladysmith BC, Canada (2:280/464.113)
  • From mark lewis@1:3634/12.73 to Maurice Kinal on Friday, July 05, 2019 16:26:14

    On 2019 Jul 05 19:02:52, you wrote to Rob Swindell:

    @REPLY: 30261.ftsc_pub@1:103/705 2184b729
    @MSGID: 2:280/464.113 5d1f9edc
    @CHRS: UTF-8 4
    Hallo Rob!

    Can you send again with the CHRS: UTF-8 control paragraph?

    Yes I can. I'll insert it in this reply below;


    just so you know and can verify... here's a screen shot of your message i'm replying to...

    https://www.dropbox.com/s/whqhf5xtle1opkx/20190705-fidonet-ftsc_public-maurice-utf8-testing-01.jpg?dl=1

    https://tinyurl.com/y6lqhovs
    https://preview.tinyurl.com/y6lqhovs

    you can easily see the multibyte codes in the hex table even though the character is not formed properly in the text side... that is because it is a multibyte character being viewed in a CP437 terminal... if i type in characters like o diaeresis and similar, they are displayed in the same manner... some byte followed by a tilde followed by a character... generally a capital one in what i've seen...
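
The effect described here, one UTF-8 character appearing as two unrelated glyphs, can be imitated by decoding the UTF-8 bytes with a single-byte codec. This is only a sketch under the assumption that the terminal maps every byte to one CP437 glyph; `view_as_cp437` is a hypothetical helper, and the glyphs a particular reader or terminal shows may differ from what Python's `cp437` codec reports.

```python
import unicodedata

def view_as_cp437(text: str) -> str:
    """Encode text as UTF-8, then misread each byte as one CP437 glyph."""
    return text.encode("utf-8").decode("cp437")

# The four Cyrillic letters from the table, plus o with diaeresis.
for ch in "\u040d\u044d\u048d\u04cd\u00f6":
    glyphs = view_as_cp437(ch)
    names = ", ".join(unicodedata.name(g) for g in glyphs)
    print(f"U+{ord(ch):04X} shows as {len(glyphs)} glyphs: {names}")
```

Because the misreading is one glyph per byte, encoding the glyphs back to CP437 recovers the original UTF-8 bytes, which is why the message can look wrong on screen and still be intact.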

    Hallo Natasha!

    Unicode Character hex name
    U+040D Ѝ d0 8d CYRILLIC CAPITAL LETTER I WITH GRAVE
    U+044D э d1 8d CYRILLIC SMALL LETTER E
    U+048D ҍ d2 8d CYRILLIC SMALL LETTER SEMISOFT SIGN
    U+04CD Ӎ d3 8d CYRILLIC CAPITAL LETTER EM WITH TAIL

    Please not the "CYRILLIC CAPITAL LETTER EM WITH TAIL"!?!?!?!?!? Oh the humanity.

    Het leven is goed,
    Maurice

    ... Huil niet om mij, ik heb vi.
    --- GNU bash, version 5.0.7(1)-release (x86_64-pc-linux-gnu)
    * Origin: Little Mikey's EuroPoint - Ladysmith BC, Canada (2:280/464.113)
    SEEN-BY: 103/705 153/7001 154/10 30 40 700 203/0 221/0 6 227/201 400 229/310
    SEEN-BY: 229/426 240/5832 280/464 464 5003 5006 5555 292/854 310/31 320/219
    SEEN-BY: 340/800 396/45 423/120 712/848 770/1 2452/250 3634/12 5020/545 123/25
    SEEN-BY: 123/50 150 755 135/300 153/7715 261/38 3634/15 24 27 50 119 123/115
    SEEN-BY: 14/6 3634/0 18/0 123/0 1/120
    @PATH: 280/464 154/10 3634/12



    )\/(ark

    And to this end they built themselves a stupendous super-computer which was
    so amazingly intelligent that even before its data banks had been connected
    up it had started from "I think therefore I am" and got as far as deducing
    the existence of rice pudding and income tax before anyone managed to turn
    it off.
    ... We do not recognize our souls until they are in pain.
    ---
    * Origin: (1:3634/12.73)
  • From Maurice Kinal@2:280/464.113 to mark lewis on Saturday, July 06, 2019 00:19:33
    Hallo mark!

    just so you know and can verify... here's a screen shot of your
    message i'm replying to

    Got it. From what I see it is intact. It would be nicer and more practical to
    have a text based hexdump instead. For example if I dump the msg_body to a file called temp.data and farm that out to xxd I get this output, suitable for insertion into a reply, which I am now doing;

    -={ xxd temp.data starts }=-
    00000000: 4861 6c6c 6f20 526f 6221 0a0a 2052 533e  Hallo Rob!.. RS>
    00000010: 2043 616e 2079 6f75 2073 656e 6420 6167   Can you send ag
    00000020: 6169 6e20 7769 7468 2074 6865 2043 4852  ain with the CHR
    00000030: 533a 2055 5446 2d38 2063 6f6e 7472 6f6c  S: UTF-8 control
    00000040: 2070 6172 6167 7261 7068 3f0a 0a59 6573   paragraph?..Yes
    00000050: 2049 2063 616e 2e20 2049 276c 6c20 696e   I can.  I'll in
    00000060: 7365 7274 2069 7420 696e 2074 6869 7320  sert it in this 
    00000070: 7265 706c 7920 6265 6c6f 773b 0a0a 4861  reply below;..Ha
    00000080: 6c6c 6f20 4e61 7461 7368 6121 0a0a 556e  llo Natasha!..Un
    00000090: 6963 6f64 6520 2043 6861 7261 6374 6572  icode  Character
    000000a0: 2020 2020 6865 7820 2020 2020 2020 2020      hex         
    000000b0: 2020 2020 206e 616d 650a 552b 3034 3044       name.U+040D
    000000c0: 2020 2020 20d0 8d20 2020 2020 2020 2020       ..         
    000000d0: 6430 2038 6420 2043 5952 494c 4c49 4320  d0 8d  CYRILLIC 
    000000e0: 4341 5049 5441 4c20 4c45 5454 4552 2049  CAPITAL LETTER I
    000000f0: 2057 4954 4820 4752 4156 450a 552b 3034   WITH GRAVE.U+04
    00000100: 3444 2020 2020 20d1 8d20 2020 2020 2020  4D     ..       
    00000110: 2020 6431 2038 6420 2043 5952 494c 4c49    d1 8d  CYRILLI
    00000120: 4320 534d 414c 4c20 4c45 5454 4552 2045  C SMALL LETTER E
    00000130: 0a55 2b30 3438 4420 2020 2020 d28d 2020  .U+048D     ..  
    00000140: 2020 2020 2020 2064 3220 3864 2020 4359         d2 8d  CY
    00000150: 5249 4c4c 4943 2053 4d41 4c4c 204c 4554  RILLIC SMALL LET
    00000160: 5445 5220 5345 4d49 534f 4654 2053 4947  TER SEMISOFT SIG
    00000170: 4e0a 552b 3034 4344 2020 2020 20d3 8d20  N.U+04CD     .. 
    00000180: 2020 2020 2020 2020 6433 2038 6420 2043          d3 8d  C
    00000190: 5952 494c 4c49 4320 4341 5049 5441 4c20  YRILLIC CAPITAL 
    000001a0: 4c45 5454 4552 2045 4d20 5749 5448 2054  LETTER EM WITH T
    000001b0: 4149 4c0a 0a50 6c65 6173 6520 6e6f 7420  AIL..Please not 
    000001c0: 7468 6520 2243 5952 494c 4c49 4320 4341  the "CYRILLIC CA
    000001d0: 5049 5441 4c20 4c45 5454 4552 2045 4d20  PITAL LETTER EM 
    000001e0: 5749 5448 2054 4149 4c22 213f 213f 213f  WITH TAIL"!?!?!?
    000001f0: 213f 213f 2020 4f68 2074 6865 2068 756d  !?!?  Oh the hum
    00000200: 616e 6974 792e 0a0a 4865 7420 6c65 7665  anity...Het leve
    00000210: 6e20 6973 2067 6f65 642c 0a4d 6175 7269  n is goed,.Mauri
    00000220: 6365 0a0a 2e2e 2e20 4875 696c 206e 6965  ce..... Huil nie
    00000230: 7420 6f6d 206d 696a 2c20 696b 2068 6562  t om mij, ik heb
    00000240: 2076 692e 0a2d 2d2d 2047 4e55 2062 6173   vi..--- GNU bas
    00000250: 682c 2076 6572 7369 6f6e 2035 2e30 2e37  h, version 5.0.7
    00000260: 2831 292d 7265 6c65 6173 6520 2878 3836  (1)-release (x86
    00000270: 5f36 342d 7063 2d6c 696e 7578 2d67 6e75  _64-pc-linux-gnu
    00000280: 290a 202a 204f 7269 6769 6e3a 204c 6974  ). * Origin: Lit
    00000290: 746c 6520 4d69 6b65 7927 7320 4575 726f  tle Mikey's Euro
    000002a0: 506f 696e 7420 2d20 4c61 6479 736d 6974  Point - Ladysmit
    000002b0: 6820 4243 2c20 4361 6e61 6461 2028 323a  h BC, Canada (2:
    000002c0: 3238 302f 3436 342e 3131 3329 0a         280/464.113).
    -={ xxd temp.data ends }=-
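
For readers without xxd to hand, a dump in the same layout can be approximated in a few lines of Python. This is a rough sketch that assumes xxd's default format (16 bytes per row, hex digits grouped two bytes at a time, non-printable bytes shown as periods); `xxd_lines` is a hypothetical helper, not part of any tool mentioned in the thread.

```python
def xxd_lines(data: bytes, width: int = 16):
    """Yield rows resembling the default output of `xxd`."""
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Group the hex digits two bytes at a time, as in the dump above.
        pairs = [chunk[i:i + 2].hex() for i in range(0, len(chunk), 2)]
        # Pad short rows so the text column stays aligned.
        hex_col = " ".join(pairs).ljust(width // 2 * 5 - 1)
        # Printable ASCII is shown as-is; everything else becomes a period.
        text_col = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in chunk)
        yield f"{offset:08x}: {hex_col}  {text_col}"

for row in xxd_lines("U+040D     \u040d\n".encode("utf-8")):
    print(row)
```

The text column is deliberately lossy, so the hex column is the only place where bytes such as d0 8d can be verified.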

    Note that the actual characters in the text side of the output are periods but if you match them up with the hex you get d0 8d, d1 8d, d2 8d, and d3 8d, which
    is perfect. Also for a quick check this should work on the console;

    Or even easier and vastly fewer bytes in the output would be to delete all ascii
    and see what's left;

    -={ tr -d "\0-\177" < temp.data | xxd - }=-
    00000000: d08d d18d d28d d38d ........
    -={ end of stream }=-
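
The same "delete all ASCII and see what is left" check can be sketched in Python. The sample below is an assumption: it rebuilds just the four table rows rather than reading the real temp.data, and `non_ascii_bytes` is a made-up name.

```python
def non_ascii_bytes(data: bytes) -> bytes:
    """Keep only bytes 0x80 and above, i.e. drop octal 000 through 177."""
    return bytes(b for b in data if b >= 0x80)

# Stand-in for temp.data: only the four characters from the table.
sample = "U+040D \u040d\nU+044D \u044d\nU+048D \u048d\nU+04CD \u04cd\n".encode("utf-8")

print(non_ascii_bytes(sample).hex())
```

For an intact message the result is the four two-byte sequences back to back, the same eight bytes as in the xxd output above.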

    As you can see above all of the characters in question have survived intact and
    the integrity of the "CHRS: UTF-8 4" kludge has been preserved, not that it matters any as it is a totally useless entity. Obviously the work of DOS-think
    FTN types. :-)
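
For reference, the kludge under discussion is a control line of the form "CHRS: <identifier> <level>", preceded in the raw message by an SOH byte (0x01) that this archive renders as "@". A minimal parsing sketch follows, assuming exactly that two-field layout as it appears in the thread; `parse_chrs` is a hypothetical helper and real tossers or readers may be stricter or looser.

```python
from typing import Optional, Tuple

def parse_chrs(line: str) -> Optional[Tuple[str, int]]:
    """Return (identifier, level) for a CHRS kludge line, else None."""
    # Kludge lines start with SOH (0x01); this thread shows it as "@".
    body = line.strip().lstrip("\x01@")
    if not body.upper().startswith("CHRS:"):
        return None
    fields = body[len("CHRS:"):].split()
    if len(fields) != 2 or not fields[1].isdigit():
        return None
    return fields[0], int(fields[1])

print(parse_chrs("@CHRS: UTF-8 4"))
print(parse_chrs("@MSGID: 2:280/464.113 5d1f9edc"))
```

A reader that honours the kludge would use the identifier to pick a decoder; one that ignores it falls back to its native codepage.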

    The above all aside, everything looks fine. I seem to have better luck posting
    from the EuroPoint but once it reaches here all bets are off. I think it is someone close to here.

    Het leven is goed,
    Maurice

    ... Huil niet om mij, ik heb vi.
    --- GNU bash, version 5.0.7(1)-release (x86_64-pc-linux-gnu)
    * Origin: Little Mikey's EuroPoint - Ladysmith BC, Canada (2:280/464.113)
  • From Maurice Kinal@1:153/7001 to mark lewis on Saturday, July 06, 2019 01:04:38
    Hey mark!

    Your latest screenshot message to me did not survive intact. Note that I am on
    the node 1:153/7001 and replying to "MSGID: 1:3634/12.73 5d1fb3d0" with "PATH:
    3634/12 154/10 280/464 770/1 153/250 757".

    Using the divide and conquer method yields;

    -={ tr -d "\0-\177" | xxd - }=-
    00000000: d0d1 d2d3 ....
    -={ stream ends }=-

    As you can see those are no longer utf8 since all the trailing bytes were deleted.
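
Whether a leftover byte string is still UTF-8 can be checked mechanically with a strict decode. The sketch below uses the two byte strings from the dumps in this thread; `is_valid_utf8` is a hypothetical helper name.

```python
def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

intact = bytes.fromhex("d08dd18dd28dd38d")   # non-ASCII bytes as sent
damaged = bytes.fromhex("d0d1d2d3")          # non-ASCII bytes as received here

print(is_valid_utf8(intact), is_valid_utf8(damaged))
```

The damaged string fails because d0 through d3 are lead bytes that each require one continuation byte in the range 0x80 to 0xbf, and none follows.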

    Life is good,
    Maurice

    ... Don't cry for me I have vi.
    --- GNU bash, version 5.0.7(1)-release (x86_64-pc-linux-gnu)
    * Origin: Little Mikey's Brain - Ladysmith BC, Canada (1:153/7001)
  • From Paul Quinn@3:640/1384.125 to Maurice Kinal on Saturday, July 06, 2019 16:53:14
    Hi! Maurice,

    On 07/05/2019 07:02 PM, you wrote to Rob Swindell:

    Can you send again with the CHRS: UTF-8 control paragraph?

    Yes I can. I'll insert it in this reply below;
    [ ...trimmed... ]
    Please not the "CYRILLIC CAPITAL LETTER EM WITH TAIL"!?!?!?!?!? Oh the humanity.

    I don't know if this may help at all...

    I had an experience with your note to Bob that could not be repeated with the original to Mme Fatale. The version to Bob in my Thunderbird browser (an old one) showed garbage characters however by viewing the underlying source (with CTRL+U), the Cyrillic characters were displayed. *Surprise*.

    Does this make any sense to you? I can produce *.png happy snaps of these screens on demand, if it'll help.

    Cheers,
    Paul.

    --- Mozilla/5.0 (X11; Linux i686; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
    * Origin: Insert new disk for drive C: Press ENTER when ready. (3:640/1384.125)
  • From Paul Quinn@3:640/1384.125 to Maurice Kinal on Saturday, July 06, 2019 18:28:17
    Hi! Maurice,

    On 07/06/2019 04:53 PM, I wrote to you:

    The version to Bob in my Thunderbird browser (an old one) showed
    garbage characters however by viewing the underlying source (with
    CTRL+U), the Cyrillic characters were displayed.

    On second thought, don't bother. I re-read your descriptions of the intended characters and what I saw doesn't match, now, not even remotely.

    Greatcoats off.

    Cheers,
    Paul.

    --- Mozilla/5.0 (X11; Linux i686; rv:31.0) Gecko/20100101 Thunderbird/31.4.0
    * Origin: Who killed Laura Palmer? (3:640/1384.125)
  • From mark lewis@1:3634/12.73 to Maurice Kinal on Saturday, July 06, 2019 10:28:26

    On 2019 Jul 06 00:19:32, you wrote to me:

    just so you know and can verify... here's a screen shot of your
    message i'm replying to

    Got it. From what I see it is intact. It would be nicer and more practical to have a text based hexdump instead. For example if I dump
    the msg_body to a file called temp.data and farm that out to xxd I get this output, suitable for insertion into a reply, which I am now
    doing;

    yeah, if i had the original PKT, i might have done the same... the screenshot is how it is in my message base, though... i don't want to try to dump it like that and pick it out of the hundreds of other messages ;) ;) ;)

    The above all aside, everything looks fine. I seem to have better
    luck posting from the EuroPoint but once it reaches here all bets are
    off. I think it is someone close to here.

    yes, at least one is... there are others throughout the network, though...
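
One way to narrow down where the bytes are being stripped is to compare the PATH lines quoted in this thread. The sketch below is only illustrative: it assumes a bare node number in a PATH line reuses the net of the previous entry, it ignores that the damaged message travelled in the opposite direction from the intact ones, and `expand_path` is a made-up helper.

```python
def expand_path(path: str) -> list:
    """Expand a PATH line into net/node entries, carrying the net forward."""
    nodes, net = [], None
    for token in path.split():
        if "/" in token:
            net, node = token.split("/")
        else:
            node = token  # bare node number: reuse the previous net
        nodes.append(f"{net}/{node}")
    return nodes

# PATH lines of the two messages reported as arriving intact.
intact_paths = [
    "280/464 103/705 218/700 261/38 3634/12",
    "280/464 154/10 3634/12",
]
# PATH line of the message that lost its 0x8d bytes.
damaged_path = "3634/12 154/10 280/464 770/1 153/250 757"

seen_intact = {n for p in intact_paths for n in expand_path(p)}
suspects = [n for n in expand_path(damaged_path) if n not in seen_intact]
print(suspects)
```

Systems that appear only on the damaged route are candidates, not proven culprits; the receiving system itself is not listed in PATH and would need to be checked as well.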

    )\/(ark

    And to this end they built themselves a stupendous super-computer which was
    so amazingly intelligent that even before its data banks had been connected
    up it had started from "I think therefore I am" and got as far as deducing
    the existence of rice pudding and income tax before anyone managed to turn
    it off.
    ... Aliens Invaded Los Angeles (And No One Noticed)
    ---
    * Origin: (1:3634/12.73)
  • From Maurice Kinal@2:280/464.113 to Paul Quinn on Saturday, July 06, 2019 17:37:01
    Hallo Paul!

    Does this make any sense to you?

    Sort of. However in this case the most important thing is the retention of the trailing 0x8d byte. Once that has been stripped then none of the four multibyte characters are multibyte anymore and only the four leading bytes remain.

    I can produce *.png happy snaps of these screens on demand, if
    it'll help.

    I suppose it could possibly help but again, in this case, there are only four bytes that *need* to be there in order for it to be a true utf8 message. They can be readily searched for in a text file. In fact for a regular 8-bit codepage the 0x8d character will also get deleted. This is not unique to utf8.
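
The damage described here can be imitated with a filter that drops every 0x8d byte. The thread does not identify the software doing the stripping, so `strip_8d` below is a hypothetical stand-in rather than a description of any real tosser; the CP437 word is likewise just an illustrative choice of an 8-bit text containing 0x8d.

```python
def strip_8d(data: bytes) -> bytes:
    """Imitate a system that deletes every 0x8d byte from a message body."""
    return data.replace(b"\x8d", b"")

# UTF-8 case: the four characters from the table.
utf8_original = "\u040d\u044d\u048d\u04cd".encode("utf-8")
utf8_damaged = strip_8d(utf8_original)
print(utf8_original.hex(), "->", utf8_damaged.hex())

# 8-bit case: in CP437 the byte 0x8d is i with grave, so a word loses a letter.
cp437_original = "cos\u00ec".encode("cp437")
cp437_damaged = strip_8d(cp437_original)
print(cp437_original.hex(), "->", cp437_damaged.hex())
```

In the UTF-8 case only the four lead bytes survive; in the single-byte case a whole character silently disappears, which is the sense in which the problem is not unique to utf8.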

    Het leven is goed,
    Maurice

    ... Huil niet om mij, ik heb vi.
    --- GNU bash, version 5.0.7(1)-release (x86_64-pc-linux-gnu)
    * Origin: Little Mikey's EuroPoint - Ladysmith BC, Canada (2:280/464.113)
  • From Maurice Kinal@2:280/464.113 to mark lewis on Saturday, July 06, 2019 18:09:50
    Hallo mark!

    the screenshot is how it is in my message base, though

    Understood. I realize that many (most?) will see the characters in question in
    their two byte forms, with the characters presented to them within their native
    codepage, cp437 in your case. The important thing is that whenever there is a 0x8d character it shouldn't ever get deleted.

    Het leven is goed,
    Maurice

    ... Huil niet om mij, ik heb vi.
    --- GNU bash, version 5.0.7(1)-release (x86_64-pc-linux-gnu)
    * Origin: Little Mikey's EuroPoint - Ladysmith BC, Canada (2:280/464.113)
  • From Rob Swindell to Maurice Kinal on Saturday, July 06, 2019 18:16:53
    Re: pesky moose and squirrel
    By: Maurice Kinal to Rob Swindell on Fri Jul 05 2019 07:02 pm

    Hallo Rob!

    Can you send again with the CHRS: UTF-8 control paragraph?

    Yes I can. I'll insert it in this reply below;

    Hallo Natasha!

    Unicode Character hex name
    U+040D Ѝ d0 8d CYRILLIC CAPITAL LETTER I WITH GRAVE
    U+044D э d1 8d CYRILLIC SMALL LETTER E
    U+048D ҍ d2 8d CYRILLIC SMALL LETTER SEMISOFT SIGN
    U+04CD Ӎ d3 8d CYRILLIC CAPITAL LETTER EM WITH TAIL

    Please not the "CYRILLIC CAPITAL LETTER EM WITH TAIL"!?!?!?!?!? Oh the humanity.

    On topic:
    https://youtu.be/ZlHG86U3IHs
  • From mark lewis@1:3634/12.73 to Rob Swindell on Sunday, July 07, 2019 06:15:00

    On 2019 Jul 06 18:16:52, you wrote to Maurice Kinal:

    Please not the "CYRILLIC CAPITAL LETTER EM WITH TAIL"!?!?!?!?!? Oh
    the humanity.

    On topic:
    https://youtu.be/ZlHG86U3IHs

    noice! can't wait to see it in action... might have to start using the shell terminal since i don't do much uploading or downloading any more :)

    )\/(ark

    And to this end they built themselves a stupendous super-computer which was
    so amazingly intelligent that even before its data banks had been connected
    up it had started from "I think therefore I am" and got as far as deducing
    the existence of rice pudding and income tax before anyone managed to turn
    it off.
    ... That's a bad idea. But then again I'm all about bad ideas.
    ---
    * Origin: (1:3634/12.73)