On Thursday, June 05, 2014 at 4:33 PM, Johnny Billquist wrote:
On 2014-06-06 01:16, Mark Pizzolato - Info Comm wrote:
OK. So you sound like you really need a throttling option for the LAN
devices. I'll look over the throttling logic in the bridge and fold in something
similar.
Well, I'm not so sure that code should be a model to pattern anything after.
Everything about my bridge is just a hack. Done just to fix an immediate
problem without any proper design at any corner. :-)
It definitely has more going for it than my first thoughts. Before looking at your bridge code, I was merely going to create an option to measure the time between successive packets. Your model allows for some bursting but then starts throttling.
I'll make it an option and the control variables (TIMEWINDOW, BURSTSIZE, DELAY) configurable. Default will be no throttle.
- Mark
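To illustrate the burst-then-throttle idea described above, here is a minimal sketch in Python. It is not the bridge code and not what simh implements. The class name and the exact meaning given to each control variable are assumptions for illustration: TIMEWINDOW is taken as the largest gap that still counts as part of a burst, BURSTSIZE as the number of packets let through unthrottled, and DELAY as the pause applied to each further packet in the burst.

```python
class BurstThrottle:
    """Toy model of a burst-then-throttle transmit limiter.

    The parameter meanings are assumptions made for illustration only.
    """

    def __init__(self, time_window, burst_size, delay):
        self.time_window = time_window  # max gap (s) for packets to count as one burst
        self.burst_size = burst_size    # packets allowed through before throttling
        self.delay = delay              # pause (s) applied to each further packet
        self.last_send = None           # time the previous packet actually left
        self.burst_count = 0            # packets seen in the current burst

    def delay_for(self, now):
        """Return the pause to apply before sending a packet offered at `now`."""
        if self.last_send is None or now - self.last_send > self.time_window:
            self.burst_count = 0        # line went quiet: start a new burst
        self.burst_count += 1
        wait = self.delay if self.burst_count > self.burst_size else 0.0
        self.last_send = now + wait
        return wait


throttle = BurstThrottle(time_window=0.005, burst_size=3, delay=0.002)
burst = [throttle.delay_for(0.0) for _ in range(5)]   # five packets offered at once
after_pause = throttle.delay_for(1.0)                 # one packet after a quiet second
print(burst, after_pause)
```

With these made-up values the first three packets of the burst go out immediately, the next two are each held back by DELAY, and a packet offered after a long quiet period is again sent without delay.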
On 2014-06-06 01:16, Mark Pizzolato - Info Comm wrote:
On Thursday, June 05, 2014 at 12:04 PM, Johnny Billquist wrote:
On 2014-06-05 20:46, Mark Pizzolato - Info Comm wrote:
On Thursday, June 05, 2014 at 10:53 AM, Johnny Billquist wrote:
On 2014-06-05 19:23, Paul_Koning at Dell.com wrote:
On Jun 5, 2014, at 12:47 PM, Johnny Billquist <bqt at softjar.se> wrote:
Interesting. I still have real network issues with the latest simh
talking to a real PDP-11 sitting on the same physical network. So
you seem to have hit some other kind of limitation or something...
I wouldn't think that traffic between PDP11 systems would put so much
data in flight that all of the above issues would come into play.
Hmmm...Grind...Grind... I do seem to have some vague recollection of an
issue with some DEQNA devices not being able to handle back-to-back
packets coming in from the wire. This issue might have been behind DEC's
wholesale replacement/upgrading of every DEQNA in the field, and it also
may have had something to do with the DEQNA not being officially
supported as a cluster device...
Hey, just do:
sim> SET XQ TYPE=DELQA-T
and you're all good. :-) Too bad you can't just upgrade real hardware like
that.
Uh... It's the PDP-11 that has problems receiving packets, not simh.
I knew that. It was a joke. Notice ":-)"....
Well, it was ambiguous. :-)
Also, my PDP-11 already has a DELQA-T. :-) Finally, of course a simulated
PDP-11 running on a fast machine will be able to output data at a high rate.
Why would it not?
It would. I never argued that it wouldn't.
I read it as you were thinking that it would not. Maybe I'm being too literal tonight.
Two real PDP-11 systems do not get any problems.
Sending data from a real PDP-11 to the one in simh (or whatever) does not
have any problems either.
It is only when you send lots of data from a simulated machine (be that a
PDP-11 or a VAX) to a real PDP-11 that you get these issues. I would suspect
you should be able to see similar issues if the receiving end was physical VAX
as well.
I just tried to fire up the old VAXstation 4000 I've got on the shelf. This system hasn't been booted in more than 5 years (maybe 10). When it last booted, it didn't have any working disks, so I was planning to boot it into a cluster from my simh host. Without a disk, I'll have to create a RAM disk to test file copies from the simh side to the real side... I haven't had a monitor for the system for about 15 years, but the last booting activities worked fine with a cable to one of the serial ports. But today it doesn't work. :-(
You would have to boot VMS, since you don't have DECnet under much else. (I know you could have Ultrix with DECnet, but I somehow don't think that's the right way... :-) )
The problem is also partly DECnet. DECnet does not seem to keep packets that
arrive out of order. So if a packet in a sequence is lost, DECnet is going to
retransmit all packets from that point forward. Meaning that when the
session timer times out, the retransmission happens, and then you will yet
again drop a packet in the whole sequence of packets that are sent. Each
time the session timer times out, DECnet also does a backoff on the timeout
time of that timer, until the session timer is about 2 minutes. So after a while
you end up with DECnet sending a burst of packets, some of which are lost. It
then takes about 2 minutes before a retransmission happens, at which point
you get another 2 minute timeout. Thus, performance sucks.
TCP/IP is better (well, my TCP/IP anyway), in that when I lose a packet, I still
keep whatever later packets I get, so after a while I get to a stable mode
where TCP only sends one packet at a time, since the window is full. Only
actually lost packets need to be retransmitted, so I actually do get to this
stable point.
OK. So you sound like you really need a throttling option for the LAN devices. I'll look over the throttling logic in the bridge and fold in something similar.
Well, I'm not so sure that code should be a model to pattern anything after. Everything about my bridge is just a hack. Done just to fix an immediate problem without any proper design at any corner. :-)
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt at softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
On Thursday, June 05, 2014 at 12:04 PM, Johnny Billquist wrote:
On 2014-06-05 20:46, Mark Pizzolato - Info Comm wrote:
On Thursday, June 05, 2014 at 10:53 AM, Johnny Billquist wrote:
On 2014-06-05 19:23, Paul_Koning at Dell.com wrote:
On Jun 5, 2014, at 12:47 PM, Johnny Billquist <bqt at softjar.se> wrote:
Interesting. I still have real network issues with the latest simh
talking to a real PDP-11 sitting on the same physical network. So
you seem to have hit some other kind of limitation or something...
I wouldn't think that traffic between PDP11 systems would put so much
data in flight that all of the above issues would come into play.
Hmmm...Grind...Grind... I do seem to have some vague recollection of an
issue with some DEQNA devices not being able to handle back-to-back
packets coming in from the wire. This issue might have been behind DEC's
wholesale replacement/upgrading of every DEQNA in the field, and it also
may have had something to do with the DEQNA not being officially
supported as a cluster device...
Hey, just do:
sim> SET XQ TYPE=DELQA-T
and you're all good. :-) Too bad you can't just upgrade real hardware like
that.
Uh... It's the PDP-11 that has problems receiving packets, not simh.
I knew that. It was a joke. Notice ":-)"....
Also, my PDP-11 already has a DELQA-T. :-) Finally, of course a simulated
PDP-11 running on a fast machine will be able to output data at a high rate.
Why would it not?
It would. I never argued that it wouldn't.
Two real PDP-11 systems do not get any problems.
Sending data from a real PDP-11 to the one in simh (or whatever) does not
have any problems either.
It is only when you send lots of data from a simulated machine (be that a
PDP-11 or a VAX) to a real PDP-11 that you get these issues. I would suspect
you should be able to see similar issues if the receiving end was physical VAX
as well.
I just tried to fire up the old VAXstation 4000 I've got on the shelf. This system hasn't been booted in more than 5 years (maybe 10). When it last booted, it didn't have any working disks, so I was planning to boot it into a cluster from my simh host. Without a disk, I'll have to create a RAM disk to test file copies from the simh side to the real side... I haven't had a monitor for the system for about 15 years, but the last booting activities worked fine with a cable to one of the serial ports. But today it doesn't work. :-(
The problem is also partly DECnet. DECnet does not seem to keep packets that
arrive out of order. So if a packet in a sequence is lost, DECnet is going to
retransmit all packets from that point forward. Meaning that when the
session timer times out, the retransmission happens, and then you will yet
again drop a packet in the whole sequence of packets that are sent. Each
time the session timer times out, DECnet also does a backoff on the timeout
time of that timer, until the session timer is about 2 minutes. So after a while
you end up with DECnet sending a burst of packets, some of which are lost. It
then takes about 2 minutes before a retransmission happens, at which point
you get another 2 minute timeout. Thus, performance sucks.
TCP/IP is better (well, my TCP/IP anyway), in that when I lose a packet, I still
keep whatever later packets I get, so after a while I get to a stable mode
where TCP only sends one packet at a time, since the window is full. Only
actually lost packets need to be retransmitted, so I actually do get to this
stable point.
OK. So you sound like you really need a throttling option for the LAN devices. I'll look over the throttling logic in the bridge and fold in something similar.
- Mark
On Jun 5, 2014, at 3:24 PM, Johnny Billquist <bqt at softjar.se> wrote:
On 2014-06-05 21:12, Paul_Koning at Dell.com wrote:
...
True, provided congestion control is working. In the days of DECnet Phase IV, congestion control was a topic of active research, rather than a well understood problem. (Things like the TCP/IP DEC bit are an outcome of that work as well as a lot of other less obvious knowledge that made its way into other protocols.) So in Phase IV, you probably don't have effective congestion control, and scenarios with widely differing bandwidth points are likely to behave poorly. In Phase V, that should all be much better.
I don't even know what the "DEC bit" in TCP/IP is. Never heard of it. (Feel free to educate me.)
But TCP has the slow start control, the ICMP source quench, handling of out of order packets, and I'm sure a few more tricks to better deal with this kind of situation.
It's officially the Congestion Experienced bit. http://minnie.tuhs.org/PhD/th/2Existing_Congestion_Contro.html has a large amount of stuff on the topic; section 4.2 mentions the DEC Bit. In fact, that whole page is full of references to DECnet work on the subject.
paul
On 2014-06-05 21:12, Paul_Koning at Dell.com wrote:
On Jun 5, 2014, at 2:46 PM, Mark Pizzolato - Info Comm <Mark at infocomm.com> wrote:
...
All of this is absolutely true, but it would seem that no one is trying to push full wire speed traffic between systems. It would seem that given high quality signal levels on the wires in the data path (i.e. no excessive collisions due to speed/duplex mismatching), that the natural protocol on the wire (with acknowledgements, etc.) should be able to move data at the speed of the lowest component in the datapath. That may be either the Unibus or Qbus or the PDP11's CPU.
True, provided congestion control is working. In the days of DECnet Phase IV, congestion control was a topic of active research, rather than a well understood problem. (Things like the TCP/IP DEC bit are an outcome of that work as well as a lot of other less obvious knowledge that made its way into other protocols.) So in Phase IV, you probably don't have effective congestion control, and scenarios with widely differing bandwidth points are likely to behave poorly. In Phase V, that should all be much better.
I don't even know what the "DEC bit" in TCP/IP is. Never heard of it. (Feel free to educate me.)
But TCP has the slow start control, the ICMP source quench, handling of out of order packets, and I'm sure a few more tricks to better deal with this kind of situation.
...
Hmmm...Grind...Grind... I do seem to have some vague recollection of an issue with some DEQNA devices not being able to handle back-to-back packets coming in from the wire. This issue might have been behind DEC's wholesale replacement/upgrading of every DEQNA in the field, and it also may have had something to do with the DEQNA not being officially supported as a cluster device...
I'm not sure about that for QNA. It certainly was an incorrigible device, which is why VMS dropped it, but I don't remember back to back packets being its specific issue.
I do remember that the 3C901 had this issue, and DECnet/DOS (Pathworks) ran into big trouble with that. There was even a proposal to throttle sending speeds across all DECnet implementations as a workaround for that design error; that proposal went down in flames very quickly indeed. So at that point it was even more clearly understood that back to back packets at the wire end of a NIC must always be handled.
Yeah. And I do not think that it is actually back-to-back packets that is the issue.
Johnny
On Jun 5, 2014, at 2:46 PM, Mark Pizzolato - Info Comm <Mark at infocomm.com> wrote:
...
All of this is absolutely true, but it would seem that no one is trying to push full wire speed traffic between systems. It would seem that given high quality signal levels on the wires in the data path (i.e. no excessive collisions due to speed/duplex mismatching), that the natural protocol on the wire (with acknowledgements, etc.) should be able to move data at the speed of the lowest component in the datapath. That may be either the Unibus or Qbus or the PDP11's CPU.
True, provided congestion control is working. In the days of DECnet Phase IV, congestion control was a topic of active research, rather than a well understood problem. (Things like the TCP/IP DEC bit are an outcome of that work as well as a lot of other less obvious knowledge that made its way into other protocols.) So in Phase IV, you probably don't have effective congestion control, and scenarios with widely differing bandwidth points are likely to behave poorly. In Phase V, that should all be much better.
...
Hmmm...Grind...Grind... I do seem to have some vague recollection of an issue with some DEQNA devices not being able to handle back-to-back packets coming in from the wire. This issue might have been behind DEC's wholesale replacement/upgrading of every DEQNA in the field, and it also may have had something to do with the DEQNA not being officially supported as a cluster device...
I'm not sure about that for QNA. It certainly was an incorrigible device, which is why VMS dropped it, but I don't remember back to back packets being its specific issue.
I do remember that the 3C901 had this issue, and DECnet/DOS (Pathworks) ran into big trouble with that. There was even a proposal to throttle sending speeds across all DECnet implementations as a workaround for that design error; that proposal went down in flames very quickly indeed. So at that point it was even more clearly understood that back to back packets at the wire end of a NIC must always be handled.
paul
On 2014-06-05 20:46, Mark Pizzolato - Info Comm wrote:
On Thursday, June 05, 2014 at 10:53 AM, Johnny Billquist wrote:
On 2014-06-05 19:23, Paul_Koning at Dell.com wrote:
On Jun 5, 2014, at 12:47 PM, Johnny Billquist <bqt at softjar.se> wrote:
Interesting. I still have real network issues with the latest simh
talking to a real PDP-11 sitting on the same physical network. So
you seem to have hit some other kind of limitation or something...
I wouldn't think that traffic between PDP11 systems would put so much data in flight that all of the above issues would come into play.
Hmmm...Grind...Grind... I do seem to have some vague recollection of an issue with some DEQNA devices not being able to handle back-to-back packets coming in from the wire. This issue might have been behind DEC's wholesale replacement/upgrading of every DEQNA in the field, and it also may have had something to do with the DEQNA not being officially supported as a cluster device...
Hey, just do:
sim> SET XQ TYPE=DELQA-T
and you're all good. :-) Too bad you can't just upgrade real hardware like that.
Uh... It's the PDP-11 that has problems receiving packets, not simh. Also, my PDP-11 already has a DELQA-T. :-)
Finally, of course a simulated PDP-11 running on a fast machine will be able to output data at a high rate. Why would it not?
Two real PDP-11 systems do not get any problems.
Sending data from a real PDP-11 to the one in simh (or whatever) does not have any problems either.
It is only when you send lots of data from a simulated machine (be that a PDP-11 or a VAX) to a real PDP-11 that you get these issues. I would suspect you should be able to see similar issues if the receiving end was physical VAX as well.
The problem is also partly DECnet. DECnet does not seem to keep packets that arrive out of order. So if a packet in a sequence is lost, DECnet is going to retransmit all packets from that point forward. Meaning that when the session timer times out, the retransmission happens, and then you will yet again drop a packet in the whole sequence of packets that are sent. Each time the session timer times out, DECnet also does a backoff on the timeout time of that timer, until the session timer is about 2 minutes. So after a while you end up with DECnet sending a burst of packets, some of which are lost. It then takes about 2 minutes before a retransmission happens, at which point you get another 2 minute timeout. Thus, performance sucks.
TCP/IP is better (well, my TCP/IP anyway), in that when I lose a packet, I still keep whatever later packets I get, so after a while I get to a stable mode where TCP only sends one packet at a time, since the window is full. Only actually lost packets need to be retransmitted, so I actually do get to this stable point.
Johnny
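The difference between the two recovery styles described above can be sketched with a small counting model in Python. This is an illustration under stated assumptions, not a model of either real protocol stack: the function names are hypothetical, timers and backoff are ignored, and the receiver is assumed to lose every `drop_every`-th packet of each back-to-back burst.

```python
def go_back_n(total, drop_every):
    """Transmissions needed when the receiver discards everything after a gap."""
    if drop_every < 2:
        raise ValueError("drop_every must be at least 2")
    sent = 0
    base = 0  # first packet not yet delivered in order
    while base < total:
        burst = total - base   # sender resends everything from the gap onward
        sent += burst
        # Only the in-order prefix before the first lost packet is kept.
        base += min(burst, drop_every - 1)
    return sent


def keep_out_of_order(total, drop_every):
    """Transmissions needed when the receiver keeps packets that arrive after a gap."""
    if drop_every < 2:
        raise ValueError("drop_every must be at least 2")
    sent = 0
    missing = total
    while missing > 0:
        sent += missing                  # resend only what is still missing
        missing = missing // drop_every  # every drop_every-th packet is lost again
    return sent


print(go_back_n(10, 3), keep_out_of_order(10, 3))
```

For a 10-packet transfer with every third packet of a burst lost, the first style needs 30 transmissions and the second needs 14. In the situation described above the gap grows further, because each retransmission round in the first style also waits on a session timer that backs off toward about 2 minutes.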
On Thursday, June 05, 2014 at 10:53 AM, Johnny Billquist wrote:
On 2014-06-05 19:23, Paul_Koning at Dell.com wrote:
On Jun 5, 2014, at 12:47 PM, Johnny Billquist <bqt at softjar.se> wrote:
...
It doesn't. Believe me, I've seen this, and investigated it years ago.
When you have a real PDP-11 running on half-duplex 10Mb/s talking to
something on a 1Gb/s full duplex, the PDP-11 simply can't keep up.
Makes sense. The issue isn't the 10 Mb/s Ethernet. The switch deals with
that part. The issue is that a PDP-11 isn't fast enough to keep up with a 10
Mb/s Ethernet going flat out. If I remember right, a Unibus is slower than
Ethernet, and while a Q22 bus is slightly faster and could theoretically keep
up, a practical system cannot.
It's several things. The Unibus is definitely slower than the ethernet if I
remember right. The Qbus, while faster, is also slower than ethernet.
So there is definitely a bottleneck at that level.
However, there is also an issue in the switch. If one system is pumping out
packets on a 1Gb/s port, and the switch is forwarding them to a 10Mb/s port,
the switch needs to buffer, and might need to buffer a lot.
There are limitations at that level as well, and I would not be surprised if that
also can come into play here.
Thirdly, even given the limitations above, we then also have the software on
the PDP-11, which also needs to set up new buffers to receive packets into,
and the system itself will not be able to keep up here. So the ethernet
controller is probably running out of buffers to DMA data into as well.
All of this is absolutely true, but it would seem that no one is trying to push full wire speed traffic between systems. It would seem that given high quality signal levels on the wires in the data path (i.e. no excessive collisions due to speed/duplex mismatching), that the natural protocol on the wire (with acknowledgements, etc.) should be able to move data at the speed of the lowest component in the datapath. That may be either the Unibus or Qbus or the PDP11's CPU.
Clearly we can't make old hardware work any faster than it ever did, and I certainly didn't think you were raising an issue about that when you said:
Interesting. I still have real network issues with the latest simh
talking to a real PDP-11 sitting on the same physical network. So
you seem to have hit some other kind of limitation or something...
I wouldn't think that traffic between PDP11 systems would put so much data in flight that all of the above issues would come into play.
Hmmm...Grind...Grind... I do seem to have some vague recollection of an issue with some DEQNA devices not being able to handle back-to-back packets coming in from the wire. This issue might have been behind DEC's wholesale replacement/upgrading of every DEQNA in the field, and it also may have had something to do with the DEQNA not being officially supported as a cluster device...
Hey, just do:
sim> SET XQ TYPE=DELQA-T
and you're all good. :-) Too bad you can't just upgrade real hardware like that.
- Mark
On 2014-06-05 19:23, Paul_Koning at Dell.com wrote:
On Jun 5, 2014, at 12:47 PM, Johnny Billquist <bqt at softjar.se> wrote:
...
It doesn't. Believe me, I've seen this, and investigated it years ago.
When you have a real PDP-11 running on half-duplex 10Mb/s talking to something on a 1Gb/s full duplex, the PDP-11 simply can't keep up.
Makes sense. The issue isn't the 10 Mb/s Ethernet. The switch deals with that part. The issue is that a PDP-11 isn't fast enough to keep up with a 10 Mb/s Ethernet going flat out. If I remember right, a Unibus is slower than Ethernet, and while a Q22 bus is slightly faster and could theoretically keep up, a practical system cannot.
It's several things. The Unibus is definitely slower than the ethernet if I remember right. The Qbus, while faster, is also slower than ethernet.
So there is definitely a bottleneck at that level.
However, there is also an issue in the switch. If one system is pumping out packets on a 1Gb/s port, and the switch is forwarding them to a 10Mb/s port, the switch needs to buffer, and might need to buffer a lot. There are limitations at that level as well, and I would not be surprised if that also can come into play here.
Thirdly, even given the limitations above, we then also have the software on the PDP-11, which also needs to set up new buffers to receive packets into, and the system itself will not be able to keep up here. So the ethernet controller is probably running out of buffers to DMA data into as well.
Johnny
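The switch buffering point above can be put into rough numbers with a short Python sketch. It is a back-of-the-envelope estimate only: the function names are hypothetical, the 100-frame burst of 1518-byte frames is an invented example, and preamble, inter-frame gap and store-and-forward details are ignored.

```python
def peak_backlog_bytes(burst_bytes, in_bps, out_bps):
    """Bytes queued in the switch at the moment the incoming burst ends."""
    # While the burst arrives at in_bps, the slow port drains at out_bps,
    # so only the fraction (1 - out_bps / in_bps) of the burst piles up.
    return burst_bytes * (1 - out_bps / in_bps)


def drain_seconds(backlog_bytes, out_bps):
    """Time the slow port needs to empty the backlog once the burst has ended."""
    return backlog_bytes * 8 / out_bps


burst = 100 * 1518  # hypothetical burst: 100 full-sized Ethernet frames
backlog = peak_backlog_bytes(burst, in_bps=1e9, out_bps=10e6)
print(round(backlog), round(drain_seconds(backlog, 10e6), 3))
```

With a 1Gb/s sender and a 10Mb/s receiver, about 99% of such a burst has to sit in the switch, and draining it takes roughly 0.12 seconds. Any frame that arrives while the buffer is already full is dropped.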
Mark, a DEUNA couldn't even keep up with thick-wire Ethernet speed. IIRC it could read up to 3 Mb/s average over a period of time, with full-sized frames. Probably the same for the DEQNA.
Sent from my BlackBerry 10 smartphone.
Original message
From: Mark Pizzolato - Info Comm
Sent: Thursday, June 5, 2014 13:58
To: hecnet at Update.UU.SE
Reply-To: hecnet at Update.UU.SE
Subject: RE: [HECnet] Emulated XQ polling timer setting and data overrun
On Thursday, June 05, 2014 at 1:27 AM, Johnny Billquist wrote:
On 2014-06-05 02:38, Jean-Yves Bernier wrote:
[...]
I have tested commit 753e4dc9 on Mac OS 10.6, no virtualization. Both
simh instances run on the same hardware. XQ set to different MAC
addresses, since this is now enforced.
Asynchronous network/disk IO may explain the uncommon transfer speed
(I have filled a RM03 in seconds).
Interesting. I still have real network issues with the latest simh
talking to a real PDP-11 sitting on the same physical network. So you
seem to have hit some other kind of limitation or something...
Can you elaborate on these 'real network issues'?
This is the first I've heard of anything like this.
I have not had any experience with a real PDP11 talking to a simh PDP or VAX, but in the past there were no issues with multiple simh VAX simulators talking to real VAX systems on the same LAN.
Let me know.
Thanks.
- Mark