It's been a while since I sent out the T1.1 beta 2. I had collected a number of cleanups and bugfixes as well as some doc updates. I don't know how many are using this release; I see one in the mapper but there might be more for all I know.
Earlier today I committed those pending improvements and created a new "T1.1 third beta" kit. You can get that code from the Subversion server:
or alternatively from the Downloads link on the mapper website:
I'd like to do the V1.1 release reasonably soon, so I'd appreciate any feedback from people who have tried the beta (the earlier version or this one).
Area 60 is back online, things have cooled back down, and the max temp for the next 10 days is only supposed to be about 80F.
The bad news is, I lost a hard drive. The good news is, it was the drive with my backups.
Hail all you RSTS buffs!
One of my nodes (PIRSTS) is running RSTS V10.1L and DECnet V4.1.
Ever since I joined HECnet, I have observed that job slots fill up over
time with jobs under the DECnet account [29,206] in HB state, up to the
point that the job max is reached and I cannot log in anymore.
My working hypothesis is that the polling processes (for HECnet mapping and
other inquiring minds - you know who you are) keep creating new jobs,
instead of reusing old ones.
Is there a way to tell DECnet/E that it should not keep jobs in HB state,
but log them out after use? Or reuse existing jobs on incoming connection
requests? On DECnet/VMS there are timer and other logicals that steer this.
Running a daily kill job seems, well, overkill.
I've discovered that three of my four RA8x drives have failed due to bad
tachometer optical sensors. Apparently the material these guys were potted
with decades ago gradually turns opaque and, being as they're "optical",
that's Really Bad. Has anybody else had this issue? Anybody found a
replacement for them? It's just an IR LED and a phototransistor in a fancy
plastic housing, and those are still pretty common devices these days. I just
need one that's mechanically compatible with the RA8x HDA. A modern
production replacement seems like a better plan than a new old stock
replacement since any old production ones are just as likely to be bad.
Is anyone who's familiar with PyDECnet configuration available on a
communication system that's less async than email? I've got Matrix, IRC,
Discord, and Slack as well as WhatsApp and Signal available to me.
I thought I'd share an example of a non-optimal setup for people, so
that you can understand a little better what we currently have.
This is from area 1 to area 34. More specifically ANKE to area 34.
Now, physically, ANKE is in Stockholm, Sweden, while A34RTR is in
Båtbyggartorp, Sweden. They are actually not that far from each other,
physically, if you look on a map. Maybe 40 kilometers at the most.
However, in HECnet, it is 3 hops, and a cost of 20.
Now, when ANKE wants to talk to area 34, the next hops are:
PYTHON - New Boston, NH, USA (cost 8)
IMPRTR - Washington DC, USA (cost 4)
and then I *think* it must be A34RTR, since that should be the final
hop, but since both IMPRTR and A34RTR are Cisco boxes, I can't see.
My guess is that the cost of that last hop is 8.
But clearly, such a roundabout way to talk to such a close node is
kind of silly. :-) We should have reasonable links, in reasonable
directions, and with appropriate costs, so that we don't have things
like this. No good reason to. It's not like we need to pay money to have
physical cables installed between places.
(This must have been such fun work back in the day when you needed to
actually pay for the physical cables...)
If the link to PYTHON went down, the alternative route would be through
A39RTR(9), PYRTR(2), IMPRTR(4) and then A34RTR, with the costs in
parentheses. (A cost of 2 between A39RTR and PYRTR seems rather cheap,
but what do I know?)
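
To make the arithmetic above concrete, here is a small Python sketch that adds up the per-link costs for both routes. The link table only holds the numbers quoted in this message; the cost of 8 for the IMPRTR to A34RTR hop is the guess mentioned earlier, not something that was observed, and the function name is made up for illustration.

```python
# Link costs as quoted in the message. The IMPRTR -> A34RTR entry is a
# guess (that hop could not be inspected), so treat the totals as estimates.
LINK_COST = {
    ("ANKE", "PYTHON"): 8,
    ("PYTHON", "IMPRTR"): 4,
    ("IMPRTR", "A34RTR"): 8,   # guessed, not observed
    ("ANKE", "A39RTR"): 9,
    ("A39RTR", "PYRTR"): 2,
    ("PYRTR", "IMPRTR"): 4,
}


def path_metrics(path):
    """Return (total cost, hop count) for a list of node names."""
    hops = list(zip(path, path[1:]))
    return sum(LINK_COST[hop] for hop in hops), len(hops)


current = path_metrics(["ANKE", "PYTHON", "IMPRTR", "A34RTR"])
fallback = path_metrics(["ANKE", "A39RTR", "PYRTR", "IMPRTR", "A34RTR"])
print("current route (cost, hops):", current)
print("fallback route (cost, hops):", fallback)
```

With the guessed last hop, the current route matches the cost of 20 and 3 hops that ANKE reports, and the fallback would come out somewhat more expensive with one extra hop.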
A few suggestions on how to look at things:
If you have a machine that talks NICE, you can examine what the next hop
towards an area is; this works for VMS, RSX and PyDECnet. Giving examples on MIM:
.ncp tell anke sho area 34 stat
Area status as of 10-SEP-22 15:34:49
Area  State      Cost  Hops  Circuit  Node
  34  Reachable    20     3  DMC-15   41.1 (PYTHON)
.ncp tell anke sho cir dmc-15 cha
Circuit characteristics as of 10-SEP-22 15:36:02
Circuit = DMC-15
Level one cost = 8
Hello timer = 60, Listen timer = 630
This can then be repeated for node PYTHON and so on. But as noted, when
you get to a Cisco box, you can't do this. Cisco boxes do not speak NICE.
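
If you want to repeat that lookup for several nodes, the area status row is easy to pick apart in a script. The following Python sketch parses rows shaped like the one shown above. The column layout is assumed from that single sample, other NCP implementations may format things differently, and the function name is made up for illustration.

```python
import re

# Matches a data row such as "34 Reachable 20 3 DMC-15 41.1 (PYTHON)".
# The layout is assumed from one sample of RSX NCP output.
AREA_ROW = re.compile(
    r"^\s*(?P<area>\d+)\s+(?P<state>\S+)\s+(?P<cost>\d+)\s+(?P<hops>\d+)"
    r"\s+(?P<circuit>\S+)\s+(?P<addr>\d+\.\d+)(?:\s+\((?P<name>[^)]+)\))?\s*$"
)


def parse_area_status(text):
    """Return one dict per area row found in NCP 'show area status' text."""
    rows = []
    for line in text.splitlines():
        match = AREA_ROW.match(line)
        if match:
            row = match.groupdict()
            for key in ("area", "cost", "hops"):
                row[key] = int(row[key])
            rows.append(row)
    return rows


SAMPLE = """\
Area status as of 10-SEP-22 15:34:49
Area State Cost Hops Circuit Node
34 Reachable 20 3 DMC-15 41.1 (PYTHON)
"""

rows = parse_area_status(SAMPLE)
for row in rows:
    print(row["area"], "via", row["name"], "cost", row["cost"], "hops", row["hops"])
```

The next-hop name from one row is the node to ask in the next round, until you reach a router that does not answer NICE.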
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt(a)softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
I just had a conversation with one of the operators of a HECnet L2 router who has demoted it to L1 because of connectivity issues to other areas. That prompted some thinking about the DECnet requirements for fault tolerance.
The basic principle of DECnet Phase IV is that L1 routing (within an area) involves only routers of that area, and L2 routing (across areas) involves only L2 routers. Phase V changes that to some extent, but HECnet is not Phase V and isn't likely to be. :-) In addition, L1 routers send out of area traffic to some L2 router in their area, but without any awareness of the L2 topology.
This has several consequences:
1. If some of the L2 routers in your area can't see the destination area but others can, you may not be able to communicate even though it would seem that there is a way to get there from here.
2. If your area is split, i.e., some of its L2 routers can see one subset of the nodes in the area and other L2 routers can see a different subset, then out of area traffic inbound to that area may not reach its destination -- if it enters at the "wrong" L2 entry point.
I believe the issue I mentioned at the top was #1: one of the L2 routers went down and the remaining L2 routers of that area ended up at two sides of a partitioned L2 network.
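
Consequence #1 can be illustrated with a toy model. The Python sketch below uses an entirely hypothetical L2 topology (X1, X2, X3 are L2 routers of one area, DEST is an L2 router in the destination area, OTHER belongs to some third area) and checks which of the area's L2 routers can still reach DEST after X2 goes down. It only models L2 reachability; it is not a DECnet routing implementation.

```python
from collections import deque


def reachable(links, start):
    """Return the set of nodes reachable from start over undirected links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical L2 links; none of these names are real HECnet nodes.
l2_links = [("X1", "X2"), ("X2", "X3"), ("X3", "DEST"), ("X1", "OTHER")]

# X2 goes down, so every link that touches it disappears.
after_failure = [link for link in l2_links if "X2" not in link]

can_reach = {r: "DEST" in reachable(after_failure, r) for r in ("X1", "X3")}
print(can_reach)
```

In this model X3 still sees the destination area but X1 does not, so an L1 node that hands its out-of-area traffic to X1 loses connectivity even though a path via X3 exists.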
Obviously HECnet isn't a production network, but still it would be nice for it to be tolerant of outages. Especially since we can insert additional routers easily with PyDECnet or Robert Jarratt's C router. The HECnet map can be set to show just the L2 network (using the layers menu, accessible via the layers icon in the top right corner of the map). It's easy to see a number of L2 routers that have only one connection to the rest of HECnet. It's also clear that a large fraction of the connectivity is via Sweden, which certainly is a fine option but it's a bit odd for a node in, say, western Canada to have only that one connection and none to nodes much closer to it.
The map display doesn't give a visual clue about singly-connected area routers for which there is no location information in the database (the ones plotted at Inaccessible Island). The data is there in the map data table; it wouldn't be too hard to do some post-processing on that data to find cases of no redundancy.
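
As a rough idea of what that post-processing could look like, here is a Python sketch that takes a list of L2 router links and reports routers with only one neighbor, plus links whose loss would split the network. The input format (a plain list of name pairs) and the router names are assumptions for illustration; the actual map data table would need to be converted into that shape first.

```python
from collections import defaultdict


def build_adjacency(links):
    """Map each router to the set of routers it is directly linked to."""
    adjacency = defaultdict(set)
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def singly_connected(links):
    """Routers that have exactly one neighbor."""
    adjacency = build_adjacency(links)
    return sorted(n for n, peers in adjacency.items() if len(peers) == 1)


def is_connected(links, nodes):
    """True if every node in nodes can be reached from any other."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = build_adjacency(links)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for neighbor in adjacency[stack.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen >= nodes


def critical_links(links):
    """Links whose removal would partition the network (brute force)."""
    nodes = {n for link in links for n in link}
    return [
        link
        for i, link in enumerate(links)
        if not is_connected(links[:i] + links[i + 1:], nodes)
    ]


# Hypothetical sample data, not taken from the real map.
sample_links = [("R1", "R2"), ("R2", "R3"), ("R3", "R1"), ("R3", "R4")]
print("single connection:", singly_connected(sample_links))
print("critical links:", critical_links(sample_links))
```

The brute-force check is quadratic in the number of links, which should be fine for a network of HECnet's size.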
I'm curious if people would be interested in trying to make HECnet more fault tolerant. My router (PYTHON) can definitely help, especially for North American nodes, and I'm sure there are a number of others that feel the same.
I think I already sent this to all the people who are directly connected
to A2RTR, but just in case I missed anybody, here it is for general
I have moved A2RTR to an Amazon cloud server. The new IP is now
22.214.171.124, although I strongly recommend that you use the FQDN
decnet.jfcl.com instead. I've already updated the latter to point to the
Those of you with passive (listen) connections on your end who aren't
checking the source IP don't actually need to do anything. The new A2RTR
will just connect to you as before and you won't notice a difference.
Those who have active connections to A2RTR, or who have some kind of
source IP based filtering in place, will need to update the IP for A2RTR.
Once again, I really suggest that you use decnet.jfcl.com if at all
possible, but if not then the new IP is 126.96.36.199.
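
For anyone doing source IP filtering in their own scripts, one way to follow the host name instead of a hard-coded address is to resolve it at check time. This is only a Python sketch under that assumption; whether your router software accepts a host name in its configuration depends on the software, and the function names here are made up for illustration. The resolver is passed in as a parameter so the check can be tried without network access.

```python
import socket

A2RTR_HOST = "decnet.jfcl.com"  # FQDN given in the announcement


def allowed_sources(host, resolve=socket.gethostbyname_ex):
    """Return the set of IPv4 addresses the host name currently resolves to."""
    _name, _aliases, addresses = resolve(host)
    return set(addresses)


def is_allowed(source_ip, host=A2RTR_HOST, resolve=socket.gethostbyname_ex):
    """True if source_ip is one of the addresses the host resolves to."""
    return source_ip in allowed_sources(host, resolve)


def fake_resolver(host):
    """Stand-in resolver using a documentation-range address (192.0.2.0/24)."""
    return host, [], ["192.0.2.10"]


print(is_allowed("192.0.2.10", resolve=fake_resolver))
print(is_allowed("198.51.100.7", resolve=fake_resolver))
```

Leaving out the resolve argument makes the check use the system resolver, so it picks up any future address change without an edit on your side.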