On Thu, 15 Sep 2011 15:46:29 -0400, you wrote:
Unless you're running an NI cluster. If PEDRIVER is loaded, shutting the
DECnet circuit used by it must crash the node.
PEDRIVER uses DECnet? I thought that clustering used their own Ethernet
packets and that DECnet was merely used for some management functions
like SYSMAN.
PEDRIVER does NOT use DECnet (Ethertype 0x6003), but a specific SCA protocol
(0x6007). If for some reason an NI clustered node loses communication with
other cluster members, its activities are temporarily paused and a timer is
started; when the timer expires (after RECNXINTERVAL seconds, default 20),
the surviving cluster reconfigures itself and goes on without the lost node.
Meanwhile, if the lost node reestablishes cluster communications within that
interval, everything goes on as before; on the other hand, if the lost node
retries to connect after the remaining cluster has reconfigured itself, it
receives a sort of negative acknowledgement from the cluster connection manager
(CNXMAN) and is then forced to crash with a specific CLUEXIT bugcheck code.
This is to ensure integrity of the locking database and such.
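The timer behaviour described above can be sketched as a small decision function. This is only a toy model, not OpenVMS code: the function and enum names are made up for illustration, and treating a loss of exactly RECNXINTERVAL seconds as "expired" is an assumption.

```python
from enum import Enum


class Outcome(Enum):
    RESUMES = "resumes as a cluster member"
    CLUEXIT = "refused by CNXMAN, node bugchecks with CLUEXIT"


def reconnect_outcome(seconds_lost: float, recnxinterval: float = 20.0) -> Outcome:
    """Simplified model of what happens to a node that lost cluster contact.

    seconds_lost  -- how long the node was out of contact (hypothetical input)
    recnxinterval -- the RECNXINTERVAL timer in seconds (default 20, per the text)
    """
    if seconds_lost < recnxinterval:
        # Contact restored before the timer expired: activity simply continues.
        return Outcome.RESUMES
    # The remaining cluster has already reconfigured without this node
    # (assumption: exactly RECNXINTERVAL seconds counts as expired).
    return Outcome.CLUEXIT


for lost in (5, 19.9, 20, 60):
    print(f"{lost:>5} s -> {reconnect_outcome(lost).value}")
```

The point of the model is that the crash is a deliberate outcome of rejoining too late, which is different from a crash caused by a software fault.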
The OP had a different crash (SSRVEXCEPT), which is a symptom of a bug, not of
a "normal" cluster reconfiguration crash.
HTH, :-)
G.
P.S. The new IPCI interface to PEDRIVER should work in a similar way.
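To make the Ethertype distinction in this message concrete, here is a minimal sketch that labels an Ethernet II frame by its Ethertype field. The values 0x6003 and 0x6007 come from the message; the MOP and LAT entries are added from the usual DEC Ethertype assignments, and the function name and sample frames are hypothetical.

```python
import struct

# 0x6003 and 0x6007 are the values named in the message above; the MOP and
# LAT entries are added here from the standard DEC Ethertype assignments.
DEC_ETHERTYPES = {
    0x6001: "MOP dump/load",
    0x6002: "MOP remote console",
    0x6003: "DECnet Phase IV routing",
    0x6004: "LAT",
    0x6007: "SCA cluster traffic (PEDRIVER)",
}


def classify_frame(frame: bytes) -> str:
    """Return a label for an Ethernet II frame based on its Ethertype field."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    # Bytes 0-5: destination MAC, 6-11: source MAC, 12-13: Ethertype (big-endian).
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return DEC_ETHERTYPES.get(ethertype, f"other (0x{ethertype:04X})")


# Two made-up frames that differ only in their Ethertype.
header = bytes(6) + bytes(6)
print(classify_frame(header + b"\x60\x03" + b"payload"))
print(classify_frame(header + b"\x60\x07" + b"payload"))
```

Because the two protocols are told apart at this level, cluster traffic does not pass through the DECnet protocol stack at all.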
Hi,
I'm happy to report that the issues with the MOP configuration were fixed by Hans's suggestion to make the configuration changes in the permanent database and then reboot.
The satellite is now booting, although the system disk image I had created and transferred to the Alpha is not of the clustered node, so I have more work to do. The satellite node currently has its own system disk, which is booting, so it would make sense to get that configuration correct, then go through the pain of transferring the system disk image (via an intermediary VAX) back to a drive in the Alpha.
As always a learning experience.
Thanks for the help, Mark.
On 15/09/11 13:38, hvlems at zonnet.nl wrote:
Mark, there is a mistake in your first NCP command: it uses SET instead of DEFINE. The NCP DEFINE command operates on the permanent database; SET modifies the volatile database, somewhat similar to the way SYSGEN works, where you WRITE CURRENT (define) or WRITE ACTIVE (set).
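The split between the two databases can be illustrated with a toy model. This is not a real DECnet interface: the class and its methods are hypothetical and only mirror which database each NCP verb touches, with LIST reading the permanent database and SHOW the volatile one.

```python
class NcpModel:
    """Toy model of NCP's two databases (not a real DECnet interface)."""

    def __init__(self, permanent=None):
        self.permanent = dict(permanent or {})   # survives restarts
        self.volatile = dict(self.permanent)     # what the running network uses

    def define(self, key, value):   # NCP DEFINE: permanent database only
        self.permanent[key] = value

    def set(self, key, value):      # NCP SET: volatile database only
        self.volatile[key] = value

    def list(self, key):            # NCP LIST: read the permanent database
        return self.permanent.get(key)

    def show(self, key):            # NCP SHOW: read the volatile database
        return self.volatile.get(key)

    def restart(self):              # reboot or DECnet restart reloads volatile
        self.volatile = dict(self.permanent)


ncp = NcpModel({"service": "disabled"})
ncp.define("service", "enabled")
print(ncp.list("service"), ncp.show("service"))   # permanent changed, volatile not
ncp.restart()
print(ncp.show("service"))                        # now in effect
```

In this model a DEFINE never disturbs the running network, which is why it only takes effect after a restart.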
What is the output of these commands:
$ mc ncp show circ ewa-0 char
$ mc ncp list circ ewa-0 char
The example in the cluster manual tries to minimize DECnet downtime. That's why it uses a set circ state off and set circ state on sequence.
The state off command crashes your system for some reason. The only reason I can think of is that it is already a cluster member and a boot host.
The only way to modify circuit or line attributes is to add or modify them in the permanent database and then either reboot the entire system or shut down DECnet:
$ mc ncp set exec state off
Followed by
$ @sys$manager:startnet
However, if there are active cluster members (also in a SCSI cluster), then shutting down DECnet will (and must) crash the node.
-----Original Message-----
From: Mark Wickens<mark at wickensonline.co.uk>
Sender: owner-hecnet at Update.UU.SE
Date: Thu, 15 Sep 2011 11:55:36
To:<hecnet at Update.UU.SE>
Reply-To: hecnet at Update.UU.SE
Subject: Re: [HECnet] Satellite configuration/NCP crash
I tried the commands you gave, and this time it crashed again:
NCP>set circ ewa-0 service enabled
NCP>set circ ewa-0 state off
NCP>
%%%%%%%%%%% OPCOM 15-SEP-2011 11:39:48.02 %%%%%%%%%%%
Message from user DECNET on SLAVE
DECnet event 4.7, circuit down, circuit fault
Unless you're running an NI cluster. If PEDRIVER is loaded, shutting the DECnet circuit used by it must crash the node.
PEDRIVER uses DECnet? I thought that clustering used their own Ethernet packets and that DECnet was merely used for some management functions like SYSMAN.
--Marc
Unless you're running an NI cluster. If PEDRIVER is loaded, shutting the DECnet circuit used by it must crash the node.
------Original message------
From: Peter Coghlan
Sender: owner-hecnet at Update.UU.SE
To: hecnet at Update.UU.SE
Reply-To: hecnet at Update.UU.SE
Subject: Re: [HECnet] Satellite configuration/NCP crash
Sent: 15 September 2011 21:30
I'm following the procedure in the Cluster manual, pp10-9, to configure a
VAX satellite from an alpha boot node, and I'm getting a system crash.
My question is whether I need to run this procedure with the Alpha in
standalone mode, or whether this is a situation requiring a patch, or
whether I can get away with something different.
NCP commands should work fine when the system is running in normal
multiuser mode. The worst that should happen from entering something
incorrect is an error message or cutting yourself off if you happen
to be logged on via decnet. If a system crash results from doing
something in NCP, there is a software error and it is time to go
looking for patches or upgrading to a newer version or downgrading
to a stable version.
Regards,
Peter Coghlan.
I tried the commands you gave, and this time it crashed again:
NCP>set circ ewa-0 service enabled
NCP>set circ ewa-0 state off
NCP>
%%%%%%%%%%% OPCOM 15-SEP-2011 11:39:48.02 %%%%%%%%%%%
Message from user DECNET on SLAVE
DECnet event 4.7, circuit down, circuit fault
The output of
$ mc ncp show circ ewa-0 char
will tell you whether it worked: the attribute service must have the value enabled.
If that is not the case check:
$ mc ncp list circ ewa-0 char
The service enabled value must be present there; then reboot the system to make it take effect.
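The check described above could be scripted by parsing the captured command output. The following is a minimal sketch that assumes the circuit characteristics are printed as "Name = value" lines, in the same layout as the executor characteristics listing elsewhere in this thread; the sample text and function names are hypothetical.

```python
def parse_characteristics(text: str) -> dict:
    """Parse 'Name = value' lines from NCP SHOW/LIST output into a dict."""
    result = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # skip headings and blank lines that contain no '='
            result[name.strip().lower()] = value.strip()
    return result


def service_enabled(text: str) -> bool:
    """True if the output contains a Service attribute with value 'enabled'."""
    return parse_characteristics(text).get("service", "").lower() == "enabled"


# Hypothetical sample; real output contains more fields.
sample = """Circuit Volatile Characteristics as of 15-SEP-2011 12:00:00

Circuit = EWA-0

State                    = on
Service                  = enabled
"""
print(service_enabled(sample))
```

Running the same check against the SHOW output and the LIST output tells you whether the setting is active now, stored for the next restart, or both.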
You'll see DECnet OPCOM messages only on OPA0 or on a terminal where the command $ REPLY/ENABLE was entered.
Hans
On Thu, 15 Sep 2011, hvlems at zonnet.nl wrote:
Yes the NCP commands work on a standalone VMS (VAX and Alpha) system.
Possible reasons for the failure (that I can think of....):
- wrong circuit name
- DECnet not configured
- phase 5 running, not phase 4
- insufficient privileges
- wrong circuit name
NCP>show active circuits
Active Circuit Volatile Summary as of 15-SEP-2011 10:53:46
Circuit   State   Loopback   Adjacent
                  Name       Routing Node
EWA-0     on                 8.400 (GORVAX)
EWA-0                        11.2 (MAISA)
EWA-0                        20.1 (WOPR)
EWA-0                        1.300 (CTAKAH)
EWA-0                        1.13 (MIM)
- DECnet not configured
$$ dir maisa::
Directory MAISA::SYS$SPECIFIC:[FAL$SERVER]
INFO.TXT;3 NETSERVER.LOG;62 NETSERVER.LOG;61 NETSERVER.LOG;60
NETSERVER.LOG;59 NETSERVER.LOG;58 NETSERVER.LOG;57 NETSERVER.LOG;56
NETSERVER.LOG;55 NETSERVER.LOG;54 NETSERVER.LOG;53 TEST.EXE;2
Total of 12 files.
- phase 5 running, not phase 4
NCP>show executor characteristics
Node Volatile Characteristics as of 15-SEP-2011 10:55:56
Executor node = 4.249 (SLAVE)
Identification = HP DECnet for OpenVMS Alpha
Management version = V4.0.0
Incoming timer = 45
Outgoing timer = 60
Incoming Proxy = Enabled
Outgoing Proxy = Enabled
NSP version = V4.1.0
Maximum links = 32
Delay factor = 80
Delay weight = 5
Inactivity timer = 60
Retransmit factor = 10
Routing version = V2.0.0
Type = area
Routing timer = 600
Broadcast routing timer = 180
Maximum address = 1023
Maximum circuits = 16
Maximum cost = 1022
Maximum hops = 30
Maximum visits = 63
Maximum area = 63
Max broadcast nonrouters = 64
Max broadcast routers = 32
Maximum path splits = 1
Area maximum cost = 1022
Area maximum hops = 30
Maximum buffers = 100
Buffer size = 576
Nonprivileged user id = DECNET
Nonprivileged password = SALISPENTS
Default access = incoming and outgoing
Pipeline quota = 4032
Alias maximum links = 32
Path split policy = Normal
Maximum Declared Objects = 31
- insufficient privileges
$$ show proc/priv
15-SEP-2011 10:56:45.84 User: SYSTEM Process ID: 20200260
Node: SLAVE Process name: "SYSTEM"
Authorized privileges:
ACNT ALLSPOOL ALTPRI AUDIT BUGCHK BYPASS
CMEXEC CMKRNL DIAGNOSE DOWNGRADE EXQUOTA GROUP
GRPNAM GRPPRV IMPERSONATE IMPORT LOG_IO MOUNT
NETMBX OPER PFNMAP PHY_IO PRMCEB PRMGBL
PRMMBX PSWAPM READALL SECURITY SETPRV SHARE
SHMEM SYSGBL SYSLCK SYSNAM SYSPRV TMPMBX
UPGRADE VOLPRO WORLD
BTW you must use LAT, IP or a locally attached terminal to perform this trick.
Logged in via a DECserver providing the serial console:
$ telnet decserv 2005
%TELNET-I-TRYING, Trying ... 192.168.1.200
%TELNET-I-SESSION, Session 01, host decserv, port 2005
$$
What happens if you do this, assuming that EWA-0 is correct:
$ mc ncp def circ ewa-0 service enabled
$ mc ncp set circ ewa-0 state off
$ mc ncp set circ ewa-0 state on
$$ mcr ncp
NCP>define circuit ewa-0 service enabled
NCP>define circuit ewa-0 state off
NCP>define circuit ewa-0 state on
NCP>show active circuits
Active Circuit Volatile Summary as of 15-SEP-2011 10:53:46
Circuit   State   Loopback   Adjacent
                  Name       Routing Node
EWA-0     on                 8.400 (GORVAX)
EWA-0                        11.2 (MAISA)
EWA-0                        20.1 (WOPR)
EWA-0                        1.300 (CTAKAH)
EWA-0                        1.13 (MIM)
NCP>exit
That doesn't crash the system, although there is no indication that the state is dropped.
Should I proceed with the commands required based on the fact that the commands you gave me didn't cause a crash?
Thanks for the help, much appreciated
Mark.
Hans
-----Original Message-----
From: Mark Wickens <mark at wickensonline.co.uk>
Sender: owner-hecnet at Update.UU.SE
Date: Thu, 15 Sep 2011 10:14:28
To: <hecnet at Update.UU.SE>
Reply-To: hecnet at Update.UU.SE
Subject: [HECnet] Satellite configuration/NCP crash
Guys,
I'm following the procedure in the Cluster manual, pp10-9, to configure a
VAX satellite from an alpha boot node, and I'm getting a system crash.
My question is whether I need to run this procedure with the Alpha in
standalone mode, or whether this is a situation requiring a patch, or
whether I can get away with something different.
This is the first part of the procedure to enable the MOP server.
The second part is to configure the MOP service for the VAX satellite.
The Alpha is running OpenVMS 8.3
Thanks for the help, Mark.
The log is:
$$ mcr ncp
NCP>define circuit ewa-0 service enabled state on
NCP>set circuit ewa-0 state off
NCP>
%%%%%%%%%%% OPCOM 15-SEP-2011 10:03:37.64 %%%%%%%%%%%
Message from user DECNET on SLAVE
DECnet event 4.7, circuit down, circuit fault