On Tue, May 27, 2014 at 11:13:05PM +0200, Peter Lothberg wrote:
(I might even find a switch that does it right and send it to Update)
If you do, we'll try not to procrastinate setting it up for a year :-)
/P
On 2014-05-27 23:13, Peter Lothberg wrote:
When we were chasing another problem, it was found that packets
"disappear" in the Ethernet fabric at Update, sometimes.
I think you are referring to when we played with transfers and the
performance drops to the floor. Those are packets lost because of
interface speed differences. They are not really lost, but there is only so
much that can be queued up in the fabric between interfaces with
different speeds.
Maybe Johnny can make a small map of the current topology with all
involved things and the LAN speeds?
That would actually be useful either way, but I'm not sure I can do it
easily.
Johnny
If it was only ONE switch used for all the DECnet-speaking things at
Update, it would be simple to make a drawing. And with only one device
per port, I think any reasonable switch has enough buffers to make it
work way better.
--P
(I might even find a switch that does it right and send it to Update)
Well, if it was only Update then the obvious other solution would be to just force everything to use 10 Mb/s. Other than that, I don't think changing the switch would help much. If a machine is throwing out DECnet packets on a 100 Mb/s interface, and they are received by a machine with a 10 Mb/s interface, DECnet will suck because of the way DECnet handles packet loss and so on. Probably made even worse by really slow physical machines that don't even read the packets out of the Ethernet interface at any good rate... (Hey, a PDP-11 isn't exactly super fast...)
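To put rough numbers on that mismatch (the figures below are assumptions for illustration, not measurements from Update's switch): a burst arriving at 100 Mb/s and draining at 10 Mb/s leaves about 90% of it sitting in the switch, or on the floor.
# Back-of-the-envelope sketch; burst size and rates are assumed, not measured.
burst_bytes = 64 * 1024            # a sustained burst from the fast sender
in_rate, out_rate = 100e6, 10e6    # bits per second
backlog = burst_bytes * (1 - out_rate / in_rate)
print(f"{backlog:.0f} bytes must be buffered or dropped")   # ~58982 bytes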
However, the same problem appears for anyone anywhere when they try to use DECnet locally. If you run it over my bridge, I actually reshape the traffic to make it ok (that is what the throttling is about, Bob).
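For illustration, the kind of shaping meant here is roughly a token bucket; a minimal Python sketch (not the actual bridge code, which may do it differently) might look like this:
# Minimal token-bucket throttle, for illustration only.
import time

class Throttle:
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0          # drain rate in bytes per second
        self.burst = burst_bytes            # bucket depth in bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, frame_len):
        """True if a frame of frame_len bytes may be forwarded right now."""
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= frame_len:
            self.tokens -= frame_len
            return True
        return False                        # caller delays or drops the frame
Shaping down to the slowest receiver's line rate, with only a few frames of burst allowance, is what keeps a 10 Mb/s interface from being overrun by a 100 Mb/s sender.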
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt at softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
On 2014-05-27 22:09, Paul_Koning at Dell.com wrote:
On May 27, 2014, at 4:04 PM, Johnny Billquist <bqt at softjar.se> wrote:
On 2014-05-27 21:48, Bob Armstrong wrote:
The way to run DECnet over a flaky long distance network is to use point
to point mode with a data link layer that deals with packet loss.
Probably a good idea, but we don't have that option on HECnet.
Well, HECnet is not a static piece of equipment. Anything is possible...
My bridge emulates a simple ethernet segment. Good enough many times, but if we have a link like yours, that sometimes seems to drop packets, then maybe some other alternative should be considered.
Now, the question then becomes, what can we do in this case.
As far as I understand, links using Multinet are more broken, and still use UDP. The same would appear to possibly be the case for Cisco as well?
Does anyone run any links using TCP?
That would work. DDCMP over UDP would work.
Really? UDP can cause packets to arrive in the wrong order, duplicated, or sometimes dropped. I was certain you wrote above "a data link layer that deals with packet loss". Or was that not meant to be read as the underlying transport having to deal with it?
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt at softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
Area 44 is missing from the list.
Johnny, did you turn off the bridge link to my LAN?
Sent from my BlackBerry 10 smartphone.
Original message
From: Bob Armstrong
Sent: Tuesday, 27 May 2014 22:50
To: hecnet at Update.UU.SE
Reply-To: hecnet at Update.UU.SE
Subject: RE: [HECnet] "Dropped by adjacent node" ....
I appreciate the suggestions, but I'm not really interested in trying to
re-architect HECnet. That's a losing battle - I just want my node to work
:-)
I tried sending the USR1 signal to the bridge program and got this -
0: legato 0.0.0.0:0000 (Rx: 827 Tx:11017 (Drop rx: 6)) Active: 1
Throttle: 278(114)
1: psilo 130.238.19:4711 (Rx:11044 Tx: 821 (Drop rx: 27)) Active: 1
Throttle: 115(332)
Hash of known destinations:
aa0004000d04 -> 1 (1.13)
aa0004000f04 -> 1 (1.15)
aa0004001404 -> 1 (1.20)
aa0004001504 -> 1 (1.21)
aa0004005e05 -> 1 (1.350)
aa000400c205 -> 1 (1.450)
aa0004000108 -> 0 (2.1)
aa000400f810 -> 1 (4.248)
aa000400ff17 -> 1 (5.1023)
aa0004009021 -> 1 (8.400)
aa0004009121 -> 1 (8.401)
aa000400bc21 -> 1 (8.444)
aa000400f421 -> 1 (8.500)
aa000400022c -> 1 (11.2)
aa000400032c -> 1 (11.3)
aa000400642c -> 1 (11.100)
aa0004000138 -> 1 (14.1)
aa000400284c -> 1 (19.40)
aa000400294c -> 1 (19.41)
aa0004000470 -> 1 (28.4)
aa0004002970 -> 1 (28.41)
aa000400feab -> 1 (42.1022)
aa0004002fbc -> 1 (47.47)
aa0004004dbd -> 1 (47.333)
aa0004002bbe -> 1 (47.555)
aa0004002cbe -> 1 (47.556)
aa0004002fbe -> 1 (47.559)
aa000400bcbe -> 1 (47.700)
aa000400bdbe -> 1 (47.701)
aa000400d7be -> 1 (47.727)
aa0004000bec -> 1 (59.11)
aa00040005f8 -> 1 (62.5)
aa0004007dfa -> 1 (62.637)
Note that I restarted it about 30 minutes ago, so this data is just for
that period of time. My first observation is that's a lot of routing nodes,
but that's probably neither here nor there.
Also I'm struck by the asymmetry in the number of packets sent vs
received. I guess that makes sense, since I'm transmitting routing and
hello messages for only one node (LEGATO) and receiving them from beaucoup
nodes.
And I'm not sure what the throttling in the bridge is about...
Bob
On May 28, 2014, at 12:17 AM, Peter Lothberg <roll at Stupi.SE> wrote:
On May 27, 2014, at 3:17 PM, Bob Armstrong <bob at jfcl.com> wrote:
in Phase IV the hello timer value is sent in the hello message
and that value times 3 (or 2) is used for the listen timer
That would explain why having LEGATO send hello messages more often
doesn't make any difference...
If you send hellos more often, you time out sooner; either way, you time out if you lose 3 hellos in a row (or, probably just barely, if you lose two in a row). The assumption is that no LAN is that badly broken.
The way to run DECnet over a flaky long distance network is to use point to point mode with a data link layer that deals with packet loss. DDCMP is an example of such a data link layer; X.25 is also (after a fashion). That's what the DECnet design intended.
paul
Putting a PTP DECnet link in TCP would do the same as DDCMP/X.25/Lapb
-P
The earlier DMC emulation in SIMH 3.9 (payload only, not DDCMP) was essentially point to point over TCP. Yes, that gives you what DDCMP does. It's better than X.25/LAPB because those suffer from having a two-way rather than a three-way handshake.
paul
On May 28, 2014, at 12:18 AM, Peter Lothberg <roll at Stupi.SE> wrote:
Does anyone run any links using TCP?
You can tell Multinet to use TCP instead of UDP.
That's the theory, but I have been unable to find any description of what the protocol would look like in that case. If anyone can describe the protocol I would be grateful; then I can implement it in DECnet/Python.
paul
On May 27, 2014, at 4:04 PM, Johnny Billquist <bqt at softjar.se> wrote:
On 2014-05-27 21:48, Bob Armstrong wrote:
The way to run DECnet over a flaky long distance network is to use point
to point mode with a data link layer that deals with packet loss.
Probably a good idea, but we don't have that option on HECnet.
Well, HECnet is not a static piece of equipment. Anything is possible...
My bridge emulates a simple ethernet segment. Good enough many times, but if we have a link like yours, that sometimes seems to drop packets, then maybe some other alternative should be considered.
Now, the question then becomes, what can we do in this case.
As far as I understand, links using Multinet are more broken, and still use UDP. The same would appear to possibly be the case for Cisco as well?
Does anyone run any links using TCP?
That would work. DDCMP over UDP would work.
GRE bridging of Ethernet packets is reasonable because that bridges a datagram service over a datagram service. Usually the Internet is good enough for that to work. Multinet is a much worse option, because it tunnels what it pretends is a point to point link over UDP, without actually implementing any datalink protocol to cope with the mismatch.
DDCMP in a user mode router would not be all that big a job. In Python, even less. Time to accelerate that effort.
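To give a sense of the size of the job, here is a minimal sketch of a reliable point-to-point link over UDP (sequence numbers, acknowledgements, retransmission). It is not real DDCMP framing, which has its own headers, CRCs and control messages; it only shows the flavor of what such a data link provides.
# Sketch only: a DDCMP-flavored reliable point-to-point link over UDP.
import socket, struct

HDR = struct.Struct("!BBB")          # type, seq, ack; seq and ack modulo 256
DATA, ACK = 1, 2

class SimpleLink:
    """Half-duplex stop-and-wait sketch.  A real DDCMP implementation
    pipelines frames and carries data and acknowledgements both ways."""

    def __init__(self, local, remote, timeout=1.0):
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.sock.bind(local)            # e.g. ("0.0.0.0", 4712)
        self.sock.settimeout(timeout)
        self.remote = remote             # peer (host, port)
        self.tx_seq = 0                  # last sequence number sent
        self.rx_seq = 0                  # last in-order sequence number received

    def send(self, payload):
        """Send one frame; retransmit until the peer acknowledges it."""
        self.tx_seq = (self.tx_seq + 1) % 256
        frame = HDR.pack(DATA, self.tx_seq, self.rx_seq) + payload
        while True:
            self.sock.sendto(frame, self.remote)
            try:
                reply, _ = self.sock.recvfrom(2048)
            except socket.timeout:
                continue                 # frame or ack lost: resend
            _, _, ack = HDR.unpack_from(reply)
            if ack == self.tx_seq:
                return

    def recv(self):
        """Return the next in-order payload; drop and re-ack duplicates."""
        while True:
            try:
                frame, _ = self.sock.recvfrom(2048)
            except socket.timeout:
                continue
            kind, seq, _ = HDR.unpack_from(frame)
            if kind != DATA:
                continue
            if seq == (self.rx_seq + 1) % 256:
                self.rx_seq = seq
                self.sock.sendto(HDR.pack(ACK, 0, self.rx_seq), self.remote)
                return frame[HDR.size:]
            # duplicate or out-of-order frame: re-acknowledge what we have
            self.sock.sendto(HDR.pack(ACK, 0, self.rx_seq), self.remote)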
paul
On May 27, 2014, at 3:48 PM, Bob Armstrong <bob at jfcl.com> wrote:
The way to run DECnet over a flaky long distance network is to use point
to point mode with a data link layer that deals with packet loss.
Probably a good idea, but we don't have that option on HECnet.
Bob
The latest SIMH has this (DDCMP, in the DMC/DMR/DMP/DMV emulation).
paul
On May 27, 2014, at 3:17 PM, Bob Armstrong <bob at jfcl.com> wrote:
in Phase IV the hello timer value is sent in the hello message
and that value times 3 (or 2) is used for the listen timer
That would explain why having LEGATO send hello messages more often
doesn't make any difference
If you send hellos more often, you time out sooner; either way, you time out if you lose 3 hellos in a row (or, probably just barely, if you lose two in a row). The assumption is that no LAN is that badly broken.
The way to run DECnet over a flaky long distance network is to use point to point mode with a data link layer that deals with packet loss. DDCMP is an example of such a data link layer; X.25 is also (after a fashion). That's what the DECnet design intended.
paul
On May 27, 2014, at 3:00 PM, Bob Armstrong <bob at jfcl.com> wrote:
... The hello timer on circuit QNA-1 on LEGATO
(which is what I assume to be the relevant parameter here) is set to 15
seconds. I don't know what the corresponding timeout is on the MIM side,
but maybe Johnny will tell us. In any case, changing the QNA-1 hello timer
to 5 seconds just for fun didn't seem to make much difference, so the
problem is likely that packets are being lost somewhere.
Also, in Phase IV the hello timer value is sent in the hello message, and that value times 3 (or 2) is used for the listen timer. So the timeout at MIM is caused by no hellos from LEGATO for 45 seconds.
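Spelled out as a calculation, using the rule just described and LEGATO's 15-second hello timer:
# Listen timer as described above: the receiver multiplies the hello timer
# carried in the hello message by 3 on a LAN, or 2 on a point-to-point link.
def listen_timeout(hello_timer_seconds, point_to_point=False):
    return hello_timer_seconds * (2 if point_to_point else 3)

print(listen_timeout(15))   # LEGATO's 15 s QNA-1 hello timer -> 45 s at MIM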
paul
Here LEGATO gives adjacent node listener receive timeouts too.
Probably not surprising - all HECnet Ethernet traffic from LEGATO goes
thru Johnny's bridge and psilo, so nobody on HECnet would see any better
results than MIM.
I don't exactly know how to go about figuring out where the bottleneck is.
"ping Psilo.Update.UU.SE" and traceroute give trip times of about 190ms,
which doesn't seem outrageous. The hello timer on circuit QNA-1 on LEGATO
(which is what I assume to be the relevant parameter here) is set to 15
seconds. I don't know what the corresponding timeout is on the MIM side,
but maybe Johnny will tell us. In any case, changing the QNA-1 hello timer
to 5 seconds just for fun didn't seem to make much difference, so the
problem is likely that packets are being lost somewhere.
Bob
On May 27, 2014, at 7:23 PM, Peter Lothberg <roll at Stupi.SE> wrote:
Hum...
I dunno if this matters, but some nodes, like 1.13, always send two
hellos back to back... (I deleted the hello-sending messages for this box.)
...
May 27 15:17:16: DNET-ADJ: Level 2 hello from 1.13
May 27 15:17:16: DNET-ADJ: Level 2 hello from 1.13
May 27 15:17:16: DNET-ADJ: Level 2 hello from 47.556
May 27 15:17:16: DNET-ADJ: Level 2 hello from 47.556
Two possible reasons.
1. Router hellos from the designated router go to both the All Endnodes and the All Routers address.
2. Router hellos from Phase IV+ L2 routers go to both the All Routers address and the (new in IV+) All L2 Routers address.
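For reference, the multicast addresses involved (quoted from memory of DEC's assignments, so treat them as assumptions rather than something stated in this thread):
# Assumed values, not taken from this thread: DEC's assigned multicast
# addresses for the hello destinations mentioned above.
DECNET_MULTICAST = {
    "all_end_nodes":  "AB-00-00-03-00-00",   # Phase IV All End Nodes
    "all_routers":    "AB-00-00-04-00-00",   # Phase IV All Routers
    "all_l2_routers": "09-00-2B-02-00-00",   # Phase IV+ All Level 2 Routers (assumed)
}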
paul
Here LEGATO gives adjacent node listener receive timeouts too.
I've just tried dnping under Linux (just because I remembered it
existed) and found RTTs to MIM of min/avg/max = 93.6/95.0/96.1 ms
and to LEGATO of 256.0/487.5/981.9 ms. Perhaps this info is of help,
Erik
On Tue, May 27, 2014 at 05:11:23PM +0200, Johnny Billquist wrote:
On 2014-05-27 17:06, Johnny Billquist wrote:
On 2014-05-27 16:48, Paul_Koning at Dell.com wrote:
On May 27, 2014, at 10:35 AM, Bob Armstrong <bob at jfcl.com> wrote:
On 2014-05-27 07:07, Paul_Koning at Dell.com wrote:
In other words, the listen timeout for
Ethernet is 3 * hello time, for point to point it is 2 * hello time.
But can you clarify who is dropping who (i.e. which end is timing
out)? The message on LEGATO says "dropped by adjacent node" not
"dropping adjacent node", which makes it sound as if MIM is timing
out, not LEGATO (and then somehow telling LEGATO that, which is
another mystery).
Or is this just a poorly worded message?
No, the message is very precise.
If you see a message on LEGATO which reports adjacency MIM down for
that reason, it means LEGATO received an Ethernet router hello message
from MIM that no longer lists LEGATO as one of the routers that MIM
can see.
MIM will have a reason for not reporting LEGATO any longer. It should
have logged a message stating that reason.
Right. And MIM says "Adjacency listener receive timeout."
So it would seem packets are lost from Bob/LEGATO...
By the way, note that it is only LEGATO that MIM had this problem
with yesterday. All other nodes were stable. So it's not something
local to the Ethernet at Update, nor to the bridge as such.
It's either the network between Bob and Update, or the bridge on
Bob's end, or something on the network at his end...
Johnny
Hum...
I dunno if this matters, but some nodes, like 1.13, always send two
hellos back to back... (I deleted the hello-sending messages for this box.)
May 27 15:17:16: DNET-ADJ: Level 2 hello from 62.637
May 27 15:17:16: DNET-ADJ: Level 2 hello from 1.13
May 27 15:17:16: DNET-ADJ: Level 2 hello from 1.13
May 27 15:17:16: DNET-ADJ: Level 2 hello from 47.556
May 27 15:17:16: DNET-ADJ: Level 2 hello from 47.556
May 27 15:17:16: DNET-ADJ: Level 2 hello from 28.41
May 27 15:17:16: DNET-ADJ: Level 2 hello from 11.2
May 27 15:17:17: DNET-ADJ: Level 2 hello from 19.41
May 27 15:17:17: DNET-ADJ: Level 2 hello from 8.400
May 27 15:17:17: DNET-ADJ: Level 2 hello from 28.41
May 27 15:17:17: DNET-ADJ: Level 2 hello from 11.2
May 27 15:17:17: DNET-ADJ: Level 2 hello from 5.1023
May 27 15:17:17: DNET-ADJ: Level 2 hello from 4.248
May 27 15:17:18: DNET-ADJ: Level 2 hello from 19.41
May 27 15:17:18: DNET-ADJ: Level 2 hello from 8.400
May 27 15:17:18: DNET-ADJ: Level 2 hello from 28.41
May 27 15:17:18: DNET-ADJ: Level 2 hello from 11.2
May 27 15:17:18: DNET-ADJ: Level 2 hello from 4.248
May 27 15:17:19: DNET-ADJ: Level 2 hello from 19.41
May 27 15:17:19: DNET-ADJ: Level 2 hello from 8.400
May 27 15:17:19: DNET-ADJ: Level 2 hello from 4.248
May 27 15:17:20: DNET-ADJ: Level 2 hello from 42.1022
May 27 15:17:20: DNET-ADJ: Level 2 (IV+) hello from 42.1022
May 27 15:17:20: DNET-ADJ: Level 2 hello from 42.1022
May 27 15:17:21: DNET-ADJ: Level 2 hello from 23.1023
May 27 15:17:21: DNET-ADJ: Level 2 (IV+) hello from 23.1023
May 27 15:17:22: DNET-ADJ: Level 2 hello from 19.40
May 27 15:17:23: DNET-ADJ: Level 2 hello from 19.40
May 27 15:17:25: DNET-ADJ: Level 2 hello from 59.57
May 27 15:17:25: DNET-ADJ: Level 2 (IV+) hello from 59.57
May 27 15:17:27: DNET-ADJ: Level 2 hello from 59.60
May 27 15:17:27: DNET-ADJ: Level 2 (IV+) hello from 59.60
May 27 15:17:28: DNET-ADJ: Level 2 hello from 5.1023
May 27 15:17:30: DNET-ADJ: Level 2 hello from 1.15
May 27 15:17:31: DNET-ADJ: Level 2 hello from 62.637
May 27 15:17:31: DNET-ADJ: Level 2 hello from 62.637
May 27 15:17:31: DNET-ADJ: Level 2 hello from 1.13
May 27 15:17:31: DNET-ADJ: Level 2 hello from 1.13