On 2014-05-27 07:07, Paul_Koning at Dell.com wrote:
In other words, the listen timeout for
Ethernet is 3 * hello time, for point to point it is 2 * hello time.
But can you clarify who is dropping who (i.e. which end is timing out)? The message on LEGATO says "dropped by adjacent node" not "dropping adjacent node", which makes it sound as if MIM is timing out, not LEGATO (and then somehow telling LEGATO that, which is another mystery).
Or is this just a poorly worded message?
Thanks
Bob
On May 25, 2014, at 10:20 PM, Johnny Billquist <bqt at softjar.se> wrote:
On 2014-05-26 04:17, Johnny Billquist wrote:
On 2014-05-26 03:47, Bob Armstrong wrote:
Increased it to 32, ...
That's probably a good idea, but I don't actually think that's the
problem. Like I said, I see this happening with several nodes, all on the
bridge QNA - MIM, PONDUS, A5RTR, SGC, etc.
Yeah. I checked some more, and I actually only have 18 adjacent nodes at
MIM:: so this is definitely not the problem. From my point of view, it
only seems to be LEGATO:: that is currently acting like a yo-yo...
I would guess that either some packets are lost, or else the packet trip times vary a *lot*. We should investigate more, but it's really getting late for me...
Varying round trip time itself doesn't matter. Routing layer hellos are sent out periodically; the round trip time is not considered. What is necessary is that they must be delivered reliably enough. For Ethernet, two lost packets are allowed; for point to point links, only one. (Beware of braindead tunnel protocols that look like point to point links but run over UDP.) In other words, the listen timeout for Ethernet is 3 * hello time, for point to point it is 2 * hello time.
paul
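For readers following along, the timeout rule described above can be sketched in a few lines of Python. This is only an illustration: the function and dictionary names are made up here, and the 15-second hello time in the usage note is an example value, not something stated in the thread.

```python
# Illustrative sketch only; names are hypothetical and not taken from
# any DECnet implementation.

# Number of consecutive hellos that may be lost before the adjacency drops,
# as described in the message above.
ALLOWED_LOST_HELLOS = {
    "ethernet": 2,
    "point_to_point": 1,
}


def listen_timeout(hello_time_s, circuit_type):
    """Seconds of silence after which the neighbour is declared down."""
    lost = ALLOWED_LOST_HELLOS[circuit_type]
    # Allowing N lost hellos means waiting for N + 1 hello intervals.
    return (lost + 1) * hello_time_s


def adjacency_alive(last_hello_s, now_s, hello_time_s, circuit_type):
    """True while the last hello is still inside the listen timeout."""
    return (now_s - last_hello_s) < listen_timeout(hello_time_s, circuit_type)
```

With an assumed hello time of 15 seconds, listen_timeout(15, "ethernet") gives 45 and listen_timeout(15, "point_to_point") gives 30. Note that round trip time does not appear anywhere in the calculation.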
On May 25, 2014, at 9:33 PM, Johnny Billquist <bqt at softjar.se> wrote:
On 2014-05-26 03:06, Bob Armstrong wrote:
Do you have a max routers set too low on your machine maybe?
We have quite a few routers on the bridge segment.
$ NCP SHOW EXEC CHAR
...
Max broadcast nonrouters = 512
Max broadcast routers = 128
...
$ NCP TELL MIM SHOW EXEC CHAR
...
Max broadcast nonrouters = 64
Max broadcast routers = 20
...
Maybe it's too low on MIM?? The message actually makes it sound like MIM
is dropping LEGATO, not the other way around.
Could be... In fact you are probably right. Just checked, and MIM has a MAX of 20 right now, which I believe is too low here. Increased it to 32, but I need to reboot for it to take effect.
In DECnet, we do not assume that "I can hear A" means "A can hear me". Instead, the protocol explicitly tests for that (in the case of routers). If the test fails, you don't get an adjacency. If there was an adjacency before, and then the test fails, that adjacency goes down.
The way this is done is that the Ethernet router hello message contains a list of routers the sender has heard. If a router doesn't see itself listed, it doesn't bring up the adjacency with the sender of that router hello. If it was there and goes away, you get the "dropped by adjacency router" event.
To find out why the adjacent router stopped mentioning you, you need to look in its event log. If the reason is "too many routers", there should be an event that says so. If the reason is something else, the event should say what else it is. For example, it might be "adjacency listener timeout", meaning no hello messages were seen in 3 * hello time.
paul
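The two-way check described above can be sketched roughly as follows. This is a simplified illustration, not the actual routing layer: the function name, the use of node names as identifiers, and the event strings are all assumptions made for the example, and real implementations track more adjacency states than "up" and "not up".

```python
# Simplified sketch of the two-way check on Ethernet router hellos.
# All names and event strings here are hypothetical.

def process_router_hello(my_id, sender_id, heard_list, adjacencies):
    """Update `adjacencies` (a set of neighbour ids) for one received hello.

    heard_list is the list of routers the sender says it has heard.
    Returns an event string when the adjacency changes state, else None.
    """
    listed = my_id in heard_list
    was_up = sender_id in adjacencies

    if listed and not was_up:
        # The sender hears us and we hear the sender: two-way check passes.
        adjacencies.add(sender_id)
        return "adjacency up"
    if not listed and was_up:
        # The sender stopped listing us, so the adjacency goes down.
        adjacencies.discard(sender_id)
        return "adjacency down: dropped by adjacent node"
    return None
```

For example, if LEGATO receives a hello from MIM whose list no longer contains LEGATO, the sketch reports the adjacency as dropped on LEGATO's side, even though the underlying cause has to be found in MIM's event log.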
Sent from mobile device that advertises itself for no good reason
On 26 May 2014, at 08:48, "Jerome H. Fine" <jhfinedp3k at compsys.to> wrote:
Cory Smelosky wrote:
On Mon, 19 May 2014, Jerome H. Fine wrote:
Since you are obviously using either V05.06 or V05.07
Yup. 5.07. It has Mentec branding!
(only the last two versions of RT-11 have the RT11ZM
Monitor), you can use the VRUN command to support
giving LINK all 64 KB of memory. Naturally, you
will probably need at least 256 KB of total physical
memory on even a PDP-11/23 (to run RT11XM as well as
to provide the needed extended memory to provide
LINK with the full 64 KB to run in) although 128 KB
might do in a pinch depending on which device you use
for the system device.
Thanks. I'll look in to VRUN.
Any good results (or bad) to report?
Got sidetracked with other projects.
How much physical memory do you have?
256 kilo words. Had more but half the board is bad. :(
Jerome Fine
[ Summary : File transfers between two simh PDP hang,
DECnet reports Data overruns and Response timeouts ]
At 2:14 AM +0200 26/5/14, Johnny Billquist wrote:
This is a problem inside of DECnet on the simulated host. It gets packets
faster than it can process them, so some packets are dropped.
Unfortunately DECnet deals very badly with systematic packet loss like this.
You get retransmissions, and after a while the retransmission timeout backs
off until you have more than a minute between retransmission attempts.
Anyway, if you can get simh to throttle the ethernet interface, that might help you.
(I don't remember offhand if it does support such functionality.)
The service polling timer can be adjusted
SET XQ POLL={DEFAULT|4..2500}
Set to 100 by default.
Changing the polling timer makes a huge difference. Have a look at:
http://pastebin.com/AZ1U6bh3
Although it still hangs sometimes, reliability has vastly improved upon the erratic behavior of the beginning. Remember, the completion time was about 3 minutes.
We're almost there :)
This turns into an interesting challenge: tune the XQ service timer to keep overruns as low as possible. That depends on many factors, among them the bandwidth of the data sink.
You may have a flawless copy to TI:, but it will fail to disk. The terminal is actually throttling the transfer. Disks are faster, and emulated disks are orders of magnitude faster than the original ones. Emulation is pushing DECnet to speeds it was never designed for.
I'm running here as low as 10 polls/sec. Maybe 50 would be optimal, and what about 500? I need metrics. And tools. Here, I am using AT. to time the transfer of a 100.-block file. Overruns and timeouts still rise slowly, but DECnet recovers happily most of the time.
--
Jean-Yves Bernier
On 2014-05-26 02:52, Paul_Koning at Dell.com wrote:
On May 24, 2014, at 10:15 AM, Johnny Billquist <bqt at softjar.se> wrote:
...
Do I need to spell it out? :-)
The hardware address is the address the card has from the factory. The physical address is the address the software has programmed the card to have. Since DECnet uses specific addresses, the address is changed from the hardware address, since you do not want/need that when running DECnet.
Not necessarily exactly that way.
Well, we are talking about old hardware here, Paul... :-)
The hardware address is the default physical address. It is supposed to be globally unique (not just unique on each LAN). If you have virtual devices, like in SIMH, chances are you're responsible for this (you're in essence the manufacturer). Pedantically, if you administer MAC addresses, they should be from the locally administered address space, i.e., second bit set in the 1st byte. In practice that doesn't matter, but it avoids conflict with real hardware addresses.
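As a small illustration of the "second bit set in the 1st byte" rule, the check can be written like this. The function name is made up for the example, and the second address in the usage note is an invented one using a manufacturer-assigned prefix.

```python
# Sketch: test the locally-administered bit (mask 0x02) of a MAC address.
# The function name is hypothetical.

def is_locally_administered(mac):
    """True if the second-lowest bit of the first byte is set.

    Accepts addresses written with either '-' or ':' separators.
    """
    first_byte = int(mac.replace(":", "-").split("-")[0], 16)
    return bool(first_byte & 0x02)
```

An address starting with aa (binary 10101010) has the bit set, so is_locally_administered("aa-00-04-00-01-04") is True, while is_locally_administered("08-00-2b-01-02-03") is False.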
Right. But you will be really unlucky if you manage to hit an address that you also happen to have some real hardware using, unless you explicitly set it so. But even more, once DECnet starts up, it becomes irrelevant again, since DECnet changes the MAC address, and does not even consider retaining the ability to use the original MAC address.
DECnet Phase IV uses a physical address it supplies rather than the default. Other protocols (including DECnet Phase V) don't. If your NIC type (or its driver) supports only a single physical address, the physical address changes for all protocols when you turn on DECnet Phase IV. That's why you have to turn on DECnet before LAT.
Right.
However... if your NIC and driver allow per-protocol physical address, then only DECnet Phase IV uses the aa-00-04-00 address and the others continue to use the hardware address. For such systems, you have to be careful that the hardware address is unique even if DECnet is used.
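For reference, here is a sketch of how the aa-00-04-00 address is formed from a Phase IV node address. The thread itself only mentions the prefix; the rest (16-bit address equal to area * 1024 + node, stored low byte first) is general Phase IV background rather than something stated in these messages, and the function name is made up.

```python
# Sketch: derive the DECnet Phase IV physical address for node area.node.
# Background assumption (not from the thread): the 16-bit node address is
# area * 1024 + node and follows the aa-00-04-00 prefix, low byte first.

def phase_iv_mac(area, node):
    """Return the Phase IV physical address as a dash-separated string."""
    if not (1 <= area <= 63 and 1 <= node <= 1023):
        raise ValueError("area must be 1..63 and node 1..1023")
    addr = area * 1024 + node
    return "aa-00-04-00-%02x-%02x" % (addr & 0xFF, addr >> 8)
```

For node 1.1 this yields aa-00-04-00-01-04. Since every node derives its address the same way, two interfaces running Phase IV for the same node number on one LAN would collide, which is one reason the uniqueness of the underlying hardware address still matters on systems that keep using it for other protocols.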
DECnet on neither PDP-11 nor VAX tries any such tricks. They just assume you will only have one hardware address per interface, and set it to what DECnet thinks it should be, and that's it. I have not checked Alpha, but I suspect it never does such a thing either. And I know that DECnet under Linux also doesn't play this way (or didn't last I looked). So while you are right that it could do this in theory, it is not done by anything as far as I know, and the additional complexity, without any real gains, outweighs the potential use of such a behavior.
Most newer DEC NICs (Tulip and beyond) support multiple physical addresses, as does QNA. UNA and LANCE do not. Whether a particular OS/driver implements that is another matter.
simh and other things normally use libpcap, and that does not add extra addresses, but just puts the device into promiscuous mode, and then deals with it in software.
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt at softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol