On Mar 27, 2021, at 11:06 AM, Mark Berryman <mark at theberrymans.com> wrote:
DDCMP was originally designed to run over intelligent synchronous controllers, such as
the DMC-11 or the DMR-11, although it could also be run over async serial lines. Either
of these could be local or remote. If remote, they were connected to a modem to talk over
a circuit provided by a common carrier, and async modems had built-in error correction.
From the DMR-11 user manual, describing its features: "DDCMP implementation which
handles message sequencing and error correction by automatic retransmission"
In other words, DDCMP expected the underlying hardware to provide guaranteed transmission
or be running on a line where the incidence of data loss was very low. UDP provides
neither of these.
DDCMP via UDP over the internet is a very poor choice and will result in exactly what you
are seeing. This particular connection choice should be limited to your local LAN where
UDP packets have a much higher chance of surviving.
GRE survives much better on the internet than UDP does, and TCP guarantees delivery. If
possible, I would recommend using one of these encapsulations for DECnet packets going to
any neighbors over the internet, rather than UDP.
GRE is a broadcast subtype, so it follows the Ethernet rules. That means an idle link
tolerates two consecutive packet losses but gets a hello timeout on three consecutive
losses. Also, and this is more serious, if a routing update packet is lost, that route
change is not seen by the other end until the background timer (BCT1) fires.
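
To put rough numbers on that (the timer values here are made up for illustration; the
3x listen multiplier is the broadcast-circuit rule):

    # Rough numbers for the two broadcast failure modes above.
    hello_timer = 10                  # seconds (made-up value)
    bct1 = 120                        # seconds, background update timer (made up)

    listen_timeout = 3 * hello_timer  # third consecutive lost hello -> timeout
    print(f"an idle link survives 2 lost hellos; adjacency drops after "
          f"{listen_timeout} s of silence")
    print(f"a lost routing update goes unrepaired for up to {bct1} s, "
          f"until BCT1 fires")
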
DDCMP is a point-to-point subtype. That means an outage lasting longer than twice (not
three times) the hello timer causes a listen timeout. On the other hand, if packets are
dropped they are retransmitted promptly (typically within about a second) at the datalink
level, and the drop is invisible to routing. In particular, routing packets will get
through unless you have a sustained outage. This is why T1 is by default far larger for
point-to-point links -- it exists only as a "self-stabilization" safety measure to deal
with software bugs, not as protection against packet drops.
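
A toy model makes the contrast concrete; the loss rate, hello timer, and one-second
retransmit timeout below are illustrative, not DDCMP defaults:

    import random

    def delivery_delay(loss_rate, retransmit_timeout=1.0):
        """Extra delay to get one frame through a lossy ARQ link."""
        delay = 0.0
        while random.random() < loss_rate:   # each drop costs one timeout
            delay += retransmit_timeout
        return delay

    hello_timer = 10.0
    listen_timeout = 2 * hello_timer         # point-to-point multiplier is 2

    worst = max(delivery_delay(0.05) for _ in range(100_000))
    print(f"worst delivery delay at 5% loss: {worst:.0f} s; "
          f"listen timeout is {listen_timeout:.0f} s away")

Even at 5 percent loss, the worst case is a few seconds of datalink retransmission,
nowhere near the listen timeout, so routing never notices.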
DDCMP does NOT expect the underlying hardware to run on a line with "very low"
data loss, let alone on a lossless link. Instead, like any ARQ protocol, it runs
correctly (delivers its promised guarantees) even at quite high error rates. However,
also in common with any other ARQ protocol, throughput drops sharply as the error rate
rises.
There is a classic result from the early days of the ARPAnet, when a "high speed
backbone" link ran at 56 kbps, that a 1 percent packet drop rate would produce a 50
percent drop in throughput. With modern links that ratio is likely to be worse. So if
you're running on a path with 1 percent packet drops, DDCMP will run, but rather
slowly. And, for the same reason, TCP will also run slowly; perhaps not quite as slowly,
because TCP implementations may use faster timeouts.
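
One way to see how a 1 percent drop rate can halve throughput: under a go-back-N style
ARQ, each lost frame forces the whole outstanding window to be resent. The formula below
is the standard textbook approximation, not an ARPAnet measurement, and the window sizes
are made up:

    def gbn_efficiency(p, window):
        """Fraction of raw link capacity that carries new data under
        go-back-N with loss probability p."""
        return (1 - p) / (1 + p * (window - 1))

    for window in (8, 32, 100):
        eff = gbn_efficiency(0.01, window)
        print(f"window={window:3d}, 1% loss -> {eff:.0%} of capacity")
    # window=  8, 1% loss -> 93% of capacity
    # window= 32, 1% loss -> 76% of capacity
    # window=100, 1% loss -> 50% of capacity  (the classic halving)
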
There are protocols specifically designed for lossy, high-latency links; deep-space
satellite links are an example. ARQ is not used in such cases; instead one uses FEC
(forward error correction), such that packets are delivered even in the presence of a
specified level of bit error or packet error. But those schemes are way outside what we
deal with.
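
For flavor, the simplest possible FEC scheme is a single XOR parity packet per group,
which lets the receiver rebuild any one lost packet with no retransmission. Real
deep-space codes (Reed-Solomon, LDPC) are far stronger; this sketch only shows the
principle:

    from functools import reduce

    def xor_packets(packets):
        """XOR equal-length packets together, byte by byte."""
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

    data = [b"pkt-0...", b"pkt-1...", b"pkt-2..."]
    parity = xor_packets(data)

    received = [data[0], None, data[2]]      # packet 1 lost in transit
    present = [p for p in received if p is not None]
    rebuilt = xor_packets(present + [parity])
    assert rebuilt == data[1]                # recovered without a resend
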
One of these days I will hack up a quick & dirty "network simulator" that
provides a pipe with specified error rates, then run DDCMP over it to see how it performs
under abusive conditions. Hopefully that will confirm my implementation is correct in
these areas; if not, I'll fix it.
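
Something along these lines would probably serve as a starting point: a UDP relay that
forwards datagrams between the two circuit endpoints, dropping each one with a given
probability. The addresses and drop rate are placeholders; each end of the
DDCMP-over-UDP circuit would point at the relay instead of at its peer.

    import random
    import socket

    DROP_RATE = 0.01                        # hypothetical 1% loss
    LISTEN = ("0.0.0.0", 7000)              # the relay's own port
    PEER = ("192.0.2.10", 7001)             # one real endpoint (example addr)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(LISTEN)
    other = None                            # learned from the first datagram

    while True:
        data, addr = sock.recvfrom(65535)
        if addr != PEER:
            other = addr                    # remember the non-PEER side
        if random.random() < DROP_RATE:
            continue                        # the "error rate": drop this one
        dest = PEER if addr != PEER else other
        if dest is not None:
            sock.sendto(data, dest)
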
paul