16 bytes? Some kind of internal header? I'm not sure I fully
understand what Tops-20 is doing with the buffer; I'll have to
investigate further to see whether my suspicion about the use of
meta-data at the beginning is correct.
Would it help to see more of the messages that I grabbed with tcpdump?
I did a tcpdump -e -i en0 --immediate-mode -n decnet and have about 92K
of traffic that I could send off-list. I'll let Bob get you traces.
When fixing this, could you put in some kind of configuration command to
change the fixed behavior back to the current behavior? I might need this
when chasing down my suspicion, above.
This is probably happening to _all_ 20's on HECnet that are getting
messages from PyDECnet. I'm checking further into my logs, which has
been tedious. Beyond the DTEKPA crashes, the size of the error log can
cause SPEAR to trigger a problem with PA1050; you get an Illegal
reference to address 301000 at user 547217. And the extracts are so huge
that not even the Tops-20 gnuemacs can hold them. You have to transfer
them to some other system. If you're not running my Extended mode FTP
server, that will take days.
I only have access to one (TWENEX::), but when I checked, its
SERR:ERROR.SYS file was over 70K pages, which is what you'd expect. I
didn't have time yesterday to run SPEAR to actually pull and now I can't
seem to get to it. TWENEX::'s DECnet configuration does not appear to be
up to date. When I looked at available DECnet nodes, it didn't know
any. It also didn't know what node my CTERM was coming in on.
Does anybody have a Tops-20 node that is talking to PyDECnet that can
either check or give me a guest account? It would be instructive to
double check.
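
A minimal sketch of how a saved text capture from the tcpdump command above
could be scanned for oversized routing messages. The function name and the
default of 1476 (the NI-0 receive buffer size quoted further down the thread)
are illustrative assumptions, and the check takes tcpdump's reported length at
face value, the same way the thread compares 1478 against 1476; whether both
numbers count the same bytes is still an open question here.

```python
import re

# Matches the fields of interest in "tcpdump -e" DECnet text output, e.g.
# "... ethertype DN (0x6003), length 1478: lev-1-routing src 2.1023 {ids ..."
ROUTING_RE = re.compile(r"length (\d+): lev-1-routing src (\d+\.\d+)")


def oversized_routing_frames(lines, receive_buffer_size=1476):
    """Yield (source_node, length) for level 1 routing messages whose
    reported length exceeds receive_buffer_size (hypothetical helper)."""
    for line in lines:
        match = ROUTING_RE.search(line)
        if match is None:
            continue  # hellos and other traffic are ignored
        length = int(match.group(1))
        if length > receive_buffer_size:
            yield match.group(2), length
```

Fed the lines of a saved capture (for example the lines of a text file holding
the tcpdump output), it would list each offending source node with the length
tcpdump reported for it.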
------------------------------------------------------------------------
On 1/18/21 12:39 PM, Paul Koning wrote:
No, 1478 doesn't make any sense.
Looking at my code, I use a local buffer size on Ethernet of 591. But then I track what
buffer sizes are reported by the router neighbors in their hello messages, and limit the
routing message size to the smallest of all these numbers.
Then I noticed I subtract 16 from that when calculating the size of the update messages.
Why that is I don't remember.
So in any case, 1478 should never be a routing protocol message size coming out of
PyDECnet.
I'd like to see message traces. A trace level log from A2RTR would do the job.
Something is very strange here.
paul
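
The size rule Paul describes can be sketched roughly as follows. This is not
PyDECnet's actual code: the names are hypothetical, and only the values 591
and 16 and the "smallest of all these numbers" rule come from the message
above.

```python
LOCAL_ETHERNET_BUFFER_SIZE = 591  # local Ethernet buffer size quoted above
UPDATE_HEADROOM = 16              # the 16 bytes subtracted (reason not recalled)


def max_routing_update_size(neighbor_block_sizes,
                            local_size=LOCAL_ETHERNET_BUFFER_SIZE):
    """Return the routing update size limit under the described rule.

    neighbor_block_sizes holds the block sizes that router neighbors
    report in their hello messages (hypothetical parameter name).
    """
    # Take the smallest of the local size and every neighbor's size,
    # then subtract the fixed headroom.
    smallest = min([local_size, *neighbor_block_sizes])
    return smallest - UPDATE_HEADROOM
```

With neighbors advertising a block size of 1476, as the Tops-20 hellos quoted
below do, this rule gives 591 - 16 = 575, which is in line with the point that
a 1478-byte routing message should not be coming out.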
> ------------------------------------------------------------------------
> On Jan 18, 2021, at 1:54 AM, Johnny Billquist <bqt at softjar.se> wrote:
>
> Thomas, this is pretty much exactly what I expected (and I suspect Paul expected as
> well).
>
> The level 1 routing messages are (as we said) the ones that can grow big. And the
> advertised length is not used by the other side to limit what they send. It essentially
> hints how large messages you send.
>
> And Paul also noted that on ethernet the Python code is using a larger buffer size
> (essentially the size an ethernet frame can be) instead of putting any lower limit on it.
> While this is perfectly legal from a protocol point of view, both TOPS-20 and VMS, it
> would seem, can't really control the size of the low layer buffer, and therefore fail
> if you use large packets without also having a large DECnet segment buffer size.
>
> So Paul's PyDECnet works the same as I have managed to have RSX work here. And
> you get the same problem towards some OSes.
>
> The obvious, and easy fix is to just lower the buffer size used over ethernet to more
> closely match what the DECnet segment buffer size is.
>
> The sad thing with that is that, at least for RSX, it means you run the risk of
> hanging the ethernet when running TCP/IP. The best would be if all OSes could separate the
> two buffer sizes properly.
> But I just realized that I might just hack RSX DECnet here, to not use the large
> buffer size for the link messages... Hmm... Gotta look into this.
>
> Meanwhile, the fix that Paul already mentioned that he has prepared and ready should
> fix this for you.
>
> Alternatively, if you change that 1504-%RTEHS to instead actually say something like
> 1500, or 1504, you should probably also be good. (My guess would be 1500.)
>
> Johnny
>
> ------------------------------------------------------------------------
> On 2021-01-18 04:45, Thomas DeBellis wrote:
>> I think I may have finally gotten to the bottom of this. It's a level 1
>> routing message that I'm getting from 2.1023 (A2RTR) that does not appear to be
>> respecting lengths, viz:
>> *22:04:30*.749823 aa:00:04:00:ff:0b > ab:00:00:03:00:00, ethertype DN (0x6003), length *1478*: lev-1-routing src 2.1023 {ids 0-726 cost 0 hops 0
>> This is two (2) bytes over the maximum that Tops-20 can accept.
>> NCP>*SHOW LINE NI-0 CHARACTERISTICS *
>> NCP>
>> 22:16:04 NCP
>> Request # 23; Show Line Characteristics Completed
>> Line = NI-0
>> Receive Buffers = 6
>> Controller = Normal
>> Protocol = Ethernet
>> Hardware Address = 00 1F 16 EC CE 47
>> Receive buffer size = *1476*
>> It would appear that the 20's are advertising this length in their layer 1
>> hello messages:
>> 22:04:21.018507 aa:00:04:00:0a:0a > ab:00:00:03:00:00, ethertype DN (0x6003), length 60: router-hello l1rout vers 2 eco 0 ueco 0 src 2.522 blksize *1476* pri 5 hello 15
>> 22:04:21.082680 aa:00:04:00:08:0a > ab:00:00:03:00:00, ethertype DN (0x6003), length 60: router-hello l1rout vers 2 eco 0 ueco 0 src 2.520 blksize *1476* pri 5 hello 15
>> About two seconds after the message comes in from A2RTR, the following appears in
>> the error log:
>> ***********************************************
>> DECNET ENTRY
>> LOGGED ON 17-Jan-2021 *22:04:32*-EST MONITOR UPTIME WAS 1 day(s)
>> 1:17:54
>> DETECTED ON SYSTEM # 3691.
>> RECORD SEQUENCE NUMBER: 70952.
>> ***********************************************
>> DECNET Event type 5.15, Receive failed
>> From node 2.520 (TOMMYT), occurred 17-JAN-2021 22:04:08
>> Line NI-0-0
>> Failure reason = Frame too long
>> Ethernet header = AB 00 00 03 00 00 / AA 00 04 00 0A 0A
>> So... no way I can get around this without some /serious/ hacking of DNADLL and
>> ROUTER (see below), which would probably take me a few months to learn and debug. Of
>> course, then maybe I could put level 2 routing into Tops-20, which I've been daydreaming
>> about...
>> Paul, what does this suggest to you?
>>> ------------------------------------------------------------------------
>>> On 1/17/21 7:39 PM, Johnny Billquist wrote:
>>>> ------------------------------------------------------------------------
>>>> On 2021-01-18 00:17, Thomas DeBellis wrote:
>>>>
>>>> Well, the frames certainly won't be larger than 1,500 bytes, right?
>>>> So I'm guessing they'll be the maximum. Problem is, all of that stuff is hidden
>>>> under several layers of drivers, so I'm not sure how I'm going to get the overage
>>>> passed back. And I also need to put in some BUGINF logic to alert if I get more of these
>>>> than whatever I decide the interval to be.
>>> That depends on what they count. Like I said - ethernet payload is 1500. Then
>>> you have the ethernet header, which is 14 bytes, plus the crc trailer, which is 4 bytes.
>>> If you count them, you end up at 1518 bytes.
>>> Depends on the hardware I guess. I have no idea what the NIA-20 exposes.
>> I meant the maximum frame size; I suspect this is 1500 for the NI, but I
>> don't actually know. My speculation is that DECnet is using part of the buffer to
>> piggy back node and other information into it instead of holding this meta-data
>> separately. I don't know what Multinet does, but there you can configure the NI to
>> have a packet size of 1500.
>>>> If you are a DDP (LD.DDP), then you are not CPU dependent and you go
>>>> ahead always, otherwise, you have to be on the CPU that owns the device (.CPCPN). So
>>>> I'm not sure if it makes any difference, but DDP is not CPU dependent; not sure if
>>>> that is a synonym for 'shared'. If I stumble over something more, I'll report
>>>> it.
>>> It's actually the same in RSX. The DDCMP layer is sort of between the
>>> hardware driver and the higher level protocols, and it's not tied to any specific
>>> CPU.
>>>
>>> But that code would suggest that LD.DDP is just an indication of whether
>>> something is CPU dependent or not, and wouldn't have anything to do with DDCMP.
>> From looking at the routing code, it seems LD.DDP is used when something is getting
>> handed to the NSP to play with; I guess that would be going through some kind of layering.
> --
> Johnny Billquist || "I'm on a bus
> || on a psychedelic trip
> email: bqt at softjar.se || Reading murder books
> pdp is alive! || tryin' to stay hip" - B. Idol