On Mar 29, 2021, at 2:25 AM, Johnny Billquist <bqt at softjar.se> wrote:

On 2021-03-28 23:08, Paul Koning wrote:

On Mar 28, 2021, at 4:40 PM, Johnny Billquist <bqt at softjar.se> wrote:
...
Technically, TCP should work just fine. But yes, Multinet even has some funny specific
behavior that makes even TCP a bit tricky.
No, technically Multinet TCP does NOT work
fine. The issue is that Multinet, whether over TCP or over UDP, fails several of the
requirements imposed on point-to-point datalinks by the DECnet routing spec. In
particular, it fails the requirement of "restart notification". In the TCP
case, it can be hacked to make it work most of the time, but architecturally speaking
it's flat out wrong.
The issue is that there is more to a point-to-point datalink than merely delivering a
packet stream from point A to point B. That's the main job, of course, but that by
itself is not sufficient in the DECnet architecture.
I'm not sure what you think the problem is here. This would be interesting to
explore.
Restart notification, to my ears, seems to be about informing DECnet if the link had to
be restarted. Which, I would say, obviously can be done with Multinet over TCP, because
this is detected/handled by TCP just fine. And if the TCP connection is lost, all you need
to do is inform DECnet that the link went down. And I do exactly that with TCP
connections.
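The tie-in described above can be sketched in a few lines. This is a hypothetical illustration, not code from any of the implementations discussed here: `notify_up`, `notify_down`, and `handle_packet` stand in for whatever circuit-state interface a given DECnet implementation provides, and the one-packet-per-recv assumption glosses over the framing question that comes up later in the thread.

```python
import socket

# Sketch: run a Multinet-style link over plain TCP and signal the
# routing layer whenever the connection is lost. TCP itself detects
# the loss; all we add is the notification DECnet requires.

def run_link(sock, notify_up, notify_down, handle_packet):
    notify_up()                      # connection established: circuit up
    try:
        while True:
            data = sock.recv(4096)
            if not data:             # peer closed the connection
                break
            handle_packet(data)
    except OSError:                  # reset, timeout, etc.
        pass
    notify_down()                    # tell DECnet the link went down
```

The point is only that the down notification is cheap to generate once the link runs over a connection-oriented transport; nothing here is specific to Multinet's wire format.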
If that were done by the other implementations things would indeed be ok.
The routing spec lists the requirements:
The required Data Link Layer guarantees for point-to-point links are:
1. Provision that both source and destination nodes complete
start-up before message exchange can occur
2. Detection of remote start-up
3. Provision that no old messages be received after start-up is
complete
4. Masking of transient errors in order to prevent packet data
corruption
5. Provision for not duplicating or corrupting packets
6. Packet sequentiality ensuring that, if a packet has been
received, all previously sent packets have been received
7. Reporting of failures and degraded circuit conditions
"just inform DECnet that the link went down" is a good summary of requirements
1, 2, and 3. But, as far as I can tell from observation, Multinet on VMS does not do
this. Instead, there is no link between the TCP connection state and the routing layer
protocol machinery. It's strange that this is so, given that DDCMP does implement
these things correctly (of course), which means the interfaces must exist in DECnet. One
wonders why the Multinet people did not make those calls. Perhaps they didn't know of
them; perhaps they didn't care enough to bother. Given that they created Multinet
over UDP my assumption is that they didn't care enough -- the existence of Multinet
over UDP is proof positive that they didn't know what they were doing and never
bothered to read and understand the DECnet routing protocol specification.
What else is there, that you think is a problem?
My problem, on the other hand, is that Multinet isn't just setting up the TCP
connection and then letting things run. Multinet also seems to explicitly want one side to be
the first to start communicating, and seems to drop packets if things don't happen in
the order Multinet thinks they should happen. Which is really annoying, since DECnet
should just be left alone to sort it out. So I've had to start trying to play more like
Multinet, designating one side as the active and the other as the passive, and at connection
time delaying signalling based on this, in order to play nicely with VMS.
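The active/passive workaround needs both ends to agree on who is which without negotiation. One way to do that, sketched here as an illustration (choosing by node address is an assumption on my part, not something the thread specifies), is a deterministic tie-break:

```python
# Hypothetical sketch of role selection for the workaround above: the
# "active" end opens the TCP connection and talks first; the "passive"
# end listens and delays signalling circuit-up until the active side's
# traffic arrives. Picking by DECnet node address guarantees the two
# ends never both choose the same role.

def choose_role(my_addr: int, peer_addr: int) -> str:
    """Pick a deterministic role so exactly one side connects."""
    return 'active' if my_addr > peer_addr else 'passive'
```

Any stable comparison would do; the only requirement is that both ends compute opposite answers from the same pair of addresses.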
Yes, that's one way to describe the observed misbehavior. It may be that it's an
explicit asymmetry (which is not authorized by the specification) or it may be a side
effect of the failure to implement the notification requirement.
...
Well, your restart mechanism is something I do not understand.
I would just run the TCP connection straight as it is. If the connection is closed, for
whatever reason, this should be signalled to DECnet just like DDCMP signals link errors,
and when the connection is established, data just flows again.
It should work just fine without any further stuff. This is pretty close to what I do
today, with the unfortunate extra fluff required since Multinet doesn't like it if the
"wrong" side talks first.
That's what I tried to do as well. The DDCMP mechanism is simple. There, the link
down notification tells the routing layer to do a circuit down, and the link up
notification tells it to begin running the routing layer initialization handshake (Init
message, Verify message if needed, and then things are ready).
The trouble is that if you do this when the other side's routing layer hasn't been told
about the down/up cycle, it sees an Init message when it thinks the circuit is
"up". That produces a "down", and that end sends an Init message.
But it expects an Init message in reply, which this side doesn't normally send because
it did so earlier. So this is why I call it a side effect of the failure to implement the
correct tie-in between data link and routing layer circuit states.
The workaround is to detect this pattern and send the Init message again.
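The state handling described in the last few paragraphs can be sketched as a small state machine. This is an illustrative toy, not PyDECnet's actual code: the state and event names are made up, and the real routing initialization also involves the Verify message, which is omitted here.

```python
# Minimal sketch of circuit state driven by data link notifications,
# per the DDCMP model above, including the workaround: receiving an
# Init while "up" means the peer restarted behind our back, so we go
# down and send our own Init again.

class Circuit:
    def __init__(self):
        self.state = 'halted'
        self.sent = []              # Init messages "sent" to the peer

    def link_down(self):            # data link lost: circuit down
        self.state = 'halted'

    def link_up(self):              # data link up: start the handshake
        self.sent.append('init')
        self.state = 'init-sent'

    def recv_init(self):
        if self.state == 'up':
            # Unexpected Init: treat as a down/up cycle and resend Init.
            self.link_down()
            self.link_up()
        elif self.state == 'init-sent':
            # Normal case: Inits exchanged, circuit is up.
            self.state = 'up'
```

With the correct data-link tie-in, `recv_init` while up never fires; the workaround branch exists only because the far end's data link failed to report the restart.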
The former
would be vaguely like the old "DDCMP emulation" in SIMH 2.9 that I once had in
PyDECnet but removed -- the one that sends DMC payloads over TCP. That might be worth
doing if people were interested in deploying it. I could dust off that old code to serve
as the template to follow.
I would not send DMC payloads. Seems like an unnecessary extra step/layer with no added
value.
I didn't explain it clearly. The old emulation simply sends the data handed down from
routing, with a count in front to do the packet framing in the TCP stream. It's
pretty much like Multinet except that, since it's a DMC emulation, the tie to the
routing layer state is correctly provided.
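The count-in-front framing mentioned above is simple enough to sketch. The 2-byte little-endian length field here is an assumption for illustration; the actual field width and byte order of the old emulation (or of Multinet) may differ.

```python
import struct

# Sketch of length-prefixed framing over a TCP byte stream: each
# routing-layer packet is preceded by its length, so packet boundaries
# survive TCP's lack of message framing.

def frame(packet: bytes) -> bytes:
    """Prefix a packet with a 2-byte length for transmission over TCP."""
    return struct.pack('<H', len(packet)) + packet

def deframe(stream: bytes):
    """Split a received byte stream back into individual packets."""
    packets = []
    offset = 0
    while offset + 2 <= len(stream):
        (length,) = struct.unpack_from('<H', stream, offset)
        offset += 2
        packets.append(stream[offset:offset + length])
        offset += length
    return packets
```

A real receiver would of course accumulate bytes until a complete frame is present rather than assume the whole stream is in hand.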
The latter
would be like GRE, but it wouldn't really do anything that GRE doesn't do just as
well, so there wouldn't be a whole lot of point in it.
GRE basically just encapsulates Ethernet packets anyway. (Yes, I know it can encapsulate
other things, but anyway...)
Yes, and as such it isn't really a good choice for any path that has more than a tiny
packet loss rate. The assumption DECnet makes for Ethernet is that it is a datagram
network (so it CAN lose packets) but that the loss rate is quite low -- the sort of
numbers you'd expect to see on a correctly constructed LAN. Bridges of course can
introduce loss due to congestion, but if that number is at all significant you're in
trouble and DECnet won't be all that happy. The biggest issue is that reaction to
route changes becomes slow if there is a more-than-tiny probability a routing update
packet is lost.
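A back-of-envelope calculation shows why even modest loss hurts here. Assuming (my simplification, not a figure from the spec) that routing updates are retransmitted periodically at interval T and each is lost independently with probability p, the expected extra propagation delay is the geometric-series sum T·p/(1−p):

```python
# Illustrative arithmetic for the point above: a lost periodic routing
# update is only repaired at the next interval, so with loss rate p and
# interval T the mean added delay is T * p / (1 - p).

def expected_extra_delay(p: float, T: float) -> float:
    """Mean added seconds before a periodic update finally gets through."""
    return T * p / (1 - p)
```

At p = 0.001 the penalty is negligible, but at p = 0.5 (a badly lossy tunnel) the average route change takes a full extra interval to propagate per hop, which is exactly the sluggishness described.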
paul