When we were chasing another problem, we found that packets sometimes
"disappear" in the Ethernet fabric at Update.
I think you are referring to when we played with transfers and the
performance dropped to the floor. That is lost packets because of
interface speed differences. They are not really lost, but there is only
so much that can be queued up in the fabric between interfaces with
different speeds.
Maybe Johnny can make a small map of the current topology with all
involved things and the lan-speeds?
That would actually be useful either way, but I'm not sure I can do it
easily.
Johnny
If it was only ONE switch used for all the DECnet speaking things at
Update, it would be simple to make a drawing. And with only one device
per port, I think any reasonable switch has enough buffers to make it
work much better.
--P
(I might even find a switch that does it right and send it to Update)
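To put rough numbers on the speed-mismatch point, here is a back-of-the-envelope sketch; the link speeds and buffer size below are made up for illustration, not measurements from Update's network:

# Rough illustration of why bursts get dropped when a fast interface
# feeds a slow one through a switch with limited buffering.
# All figures are invented for the example.

fast_mbps = 1000        # ingress link speed (Mb/s)
slow_mbps = 10          # egress link speed (Mb/s)
buffer_kb = 128         # assumed per-port buffer in the switch (KB)
frame_bytes = 1500      # typical Ethernet frame payload

# The queue grows at the rate difference between the two links.
fill_rate_mbps = fast_mbps - slow_mbps

buffer_bits = buffer_kb * 1024 * 8
burst_ms = buffer_bits / (fill_rate_mbps * 1_000)   # Mb/s -> bits per ms
burst_frames = (buffer_kb * 1024) // frame_bytes

print(f"Buffer absorbs about {burst_ms:.2f} ms of full-rate burst "
      f"(~{burst_frames} frames); anything beyond that is dropped.")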
Area 44 is missing from the list.
Johnny, did you turn off the bridge link to my LAN?
Sent from my BlackBerry 10 smartphone.
Original message
From: Bob Armstrong
Sent: Tuesday, May 27, 2014 22:50
To: hecnet at Update.UU.SE
Reply-To: hecnet at Update.UU.SE
Subject: RE: [HECnet] "Dropped by adjacent node" ....
I appreciate the suggestions, but I'm not really interested in trying to
re-architect HECnet. That's a losing battle - I just want my node to work
:-)
I tried sending the USR1 signal to the bridge program and got this -
0: legato 0.0.0.0:0000 (Rx: 827 Tx:11017 (Drop rx: 6)) Active: 1
Throttle: 278(114)
1: psilo 130.238.19:4711 (Rx:11044 Tx: 821 (Drop rx: 27)) Active: 1
Throttle: 115(332)
Hash of known destinations:
aa0004000d04 -> 1 (1.13)
aa0004000f04 -> 1 (1.15)
aa0004001404 -> 1 (1.20)
aa0004001504 -> 1 (1.21)
aa0004005e05 -> 1 (1.350)
aa000400c205 -> 1 (1.450)
aa0004000108 -> 0 (2.1)
aa000400f810 -> 1 (4.248)
aa000400ff17 -> 1 (5.1023)
aa0004009021 -> 1 (8.400)
aa0004009121 -> 1 (8.401)
aa000400bc21 -> 1 (8.444)
aa000400f421 -> 1 (8.500)
aa000400022c -> 1 (11.2)
aa000400032c -> 1 (11.3)
aa000400642c -> 1 (11.100)
aa0004000138 -> 1 (14.1)
aa000400284c -> 1 (19.40)
aa000400294c -> 1 (19.41)
aa0004000470 -> 1 (28.4)
aa0004002970 -> 1 (28.41)
aa000400feab -> 1 (42.1022)
aa0004002fbc -> 1 (47.47)
aa0004004dbd -> 1 (47.333)
aa0004002bbe -> 1 (47.555)
aa0004002cbe -> 1 (47.556)
aa0004002fbe -> 1 (47.559)
aa000400bcbe -> 1 (47.700)
aa000400bdbe -> 1 (47.701)
aa000400d7be -> 1 (47.727)
aa0004000bec -> 1 (59.11)
aa00040005f8 -> 1 (62.5)
aa0004007dfa -> 1 (62.637)
Note that I restarted it about 30 minutes ago, so this data is just for
that period of time. My first observation is that's a lot of routing nodes,
but that's probably neither here nor there.
Also I'm struck by the asymmetry in the number of packets sent vs
received. I guess that makes sense, since I'm transmitting routing and
hello messages for only one node (LEGATO) and receiving them from beaucoup
nodes.
And I'm not sure what the throttling in the bridge is about...
Bob
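For reference, the hash keys above are DECnet Phase IV MAC addresses: AA-00-04-00 followed by the 16-bit node address in little-endian byte order, with the area in the top 6 bits and the node number in the low 10. A quick Python sketch of the decoding:

def decnet_mac_to_node(mac_hex: str) -> str:
    """Decode a DECnet Phase IV MAC address (e.g. 'aa0004000d04')
    into the familiar area.node form."""
    raw = bytes.fromhex(mac_hex)
    assert raw[:4] == bytes.fromhex("aa000400"), "not a DECnet Phase IV address"
    addr = raw[4] | (raw[5] << 8)          # 16-bit node address, little-endian
    return f"{addr >> 10}.{addr & 0x3FF}"  # area = top 6 bits, node = low 10

print(decnet_mac_to_node("aa0004000d04"))  # -> 1.13
print(decnet_mac_to_node("aa0004007dfa"))  # -> 62.637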
On May 28, 2014, at 12:17 AM, Peter Lothberg <roll at Stupi.SE> wrote:
On May 27, 2014, at 3:17 PM, Bob Armstrong <bob at jfcl.com> wrote:
in Phase IV the hello timer value is sent in the hello message
and that value times 3 (or 2) is used for the listen timer
That would explain why having LEGATO send hello messages more often
doesn't make any difference...
If you send hellos more often, you time out sooner; either way, you time
out if you lose 3 hellos in a row (or, probably just barely, if you lose
two in a row). The assumption is that no LAN is that badly broken.
The way to run DECnet over a flaky long distance network is to use point
to point mode with a data link layer that deals with packet loss. DDCMP
is an example of such a data link layer; X.25 is also (after a fashion).
That's what the DECnet design intended.
paul
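To make the hello/listen timer relationship concrete, here is a toy receive-side sketch; the multiplier of 3 follows Paul's description above, and the class and variable names are just illustrative:

import time

LISTEN_MULTIPLIER = 3   # listen timer = neighbor's advertised hello timer * 3

class Adjacency:
    def __init__(self, name: str):
        self.name = name
        self.hello_timer = None    # seconds, taken from the neighbor's hellos
        self.last_hello = None

    def on_hello(self, advertised_hello_timer: float) -> None:
        # Each hello refreshes the adjacency and (re)arms the listen timer.
        self.hello_timer = advertised_hello_timer
        self.last_hello = time.monotonic()

    def is_up(self) -> bool:
        # The adjacency drops once LISTEN_MULTIPLIER hello intervals pass
        # without a hello, no matter how short the advertised interval is.
        if self.last_hello is None:
            return False
        return time.monotonic() - self.last_hello < self.hello_timer * LISTEN_MULTIPLIER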
Putting a PTP DECnet link in TCP would do the same as DDCMP/X.25/Lapb
-P
The earlier DMC emulation in SIMH 3.9 (payload only, not DDCMP) was essentially point to point over TCP. Yes, that gives you what DDCMP does. It's better than X.25/LAPB because those suffer from having a two-way rather than three-way handshake.
paul
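For what it's worth, the simplest way to carry a point-to-point DECnet link over TCP is plain length-prefixed framing, since TCP already handles loss and ordering. The sketch below is only an illustration of the idea, not Multinet's or SIMH's actual wire format:

import socket
import struct

# Hypothetical length-prefixed framing for point-to-point packets over TCP.
# TCP provides retransmission and ordering, so no DDCMP-style sequence
# numbers or acks are needed on top of it.

def send_frame(sock: socket.socket, payload: bytes) -> None:
    sock.sendall(struct.pack(">H", len(payload)) + payload)

def recv_frame(sock: socket.socket) -> bytes:
    (length,) = struct.unpack(">H", _recv_exact(sock, 2))
    return _recv_exact(sock, length)

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the link")
        buf += chunk
    return buf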
On May 28, 2014, at 12:18 AM, Peter Lothberg <roll at Stupi.SE> wrote:
Does anyone run any links using TCP?
You can tell Multinet to use TCP instead of UDP.
That's the theory, but I have been unable to find any description of what the protocol would look like in that case. If anyone can describe the protocol I would be grateful; then I can implement it in DECnet/Python.
paul
On May 27, 2014, at 4:04 PM, Johnny Billquist <bqt at softjar.se> wrote:
On 2014-05-27 21:48, Bob Armstrong wrote:
The way to run DECnet over a flaky long distance network is to use point
to point mode with a data link layer that deals with packet loss.
Probably a good idea, but we don't have that option on HECnet.
Well, HECnet is not a static piece of equipment. Anything is possible...
My bridge emulates a simple Ethernet segment. Good enough many times, but if we have a link like yours, which sometimes seems to drop packets, then maybe some other alternative should be considered.
Now, the question then becomes: what can we do in this case?
As far as I understand, links using Multinet are more broken, and they still use UDP. The same would appear to be the case for Cisco as well?
Does anyone run any links using TCP?
That would work. DDCMP over UDP would work.
GRE bridging of Ethernet packets is reasonable because that bridges a datagram service over a datagram service. Usually the Internet is good enough for that to work. Multinet is a much worse option, because it tunnels what it pretends is a point to point link over UDP, without actually implementing any datalink protocol to cope with the mismatch.
DDCMP in a user mode router would not be all that big a job. In Python, even less. Time to accelerate that effort.
paul
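As a rough illustration of what a data link layer that "deals with packet loss" over UDP has to do, here is a stop-and-wait sketch with sequence numbers and acknowledgements. It is not DDCMP's real framing, just the mechanism it relies on:

import socket
import struct

# Minimal stop-and-wait data link over UDP: each data frame carries a
# sequence number and is retransmitted until the peer acknowledges it.
# seq is assumed to be a small modulo-256 counter.

DATA, ACK = 0, 1

def send_reliable(sock: socket.socket, peer, seq: int, payload: bytes,
                  timeout: float = 1.0, retries: int = 5) -> None:
    frame = struct.pack(">BB", DATA, seq) + payload
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(frame, peer)
        try:
            reply, _ = sock.recvfrom(16)
        except socket.timeout:
            continue                 # lost frame or lost ack: retransmit
        if len(reply) >= 2 and reply[0] == ACK and reply[1] == seq:
            return                   # peer confirmed this frame
    raise ConnectionError("link down: no acknowledgement after retries")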
On May 27, 2014, at 3:48 PM, Bob Armstrong <bob at jfcl.com> wrote:
The way to run DECnet over a flaky long distance network is to use point
to point mode with a data link layer that deals with packet loss.
Probably a good idea, but we don't have that option on HECnet.
Bob
The latest SIMH has this (DDCMP, in the DMC/DMR/DMP/DMV emulation).
paul