Why will my clustered volume mount on all nodes once and then go comatose?

  • 7004523
  • 17-Apr-2012
  • 11-Jul-2012

Environment

Novell Open Enterprise Server 2 (OES 2) Linux Support Pack 2
Novell Open Enterprise Server 2 (OES 2) Linux Support Pack 3
Novell Open Enterprise Server 11 (OES 11) Linux

Situation

Cluster load script for the resource showed:

+ exit_on_error novcifs --add '--vserver=".cn=OU2CLS2ABCUSERS1.ou=SERVERS.ou=OU22.ou=OU1.o=MyORG.t=MY_TREE."' --ip-addr=10.20.30.44
+ eval novcifs --add '--vserver=".cn=OU2CLS2ABCUSERS1.ou=SERVERS.ou=OU22.ou=OU1.o=MyORG.t=MY_TREE."' --ip-addr=10.20.30.44
++ novcifs --add --vserver=.cn=OU2CLS2ABCUSERS1.ou=SERVERS.ou=OU22.ou=OU1.o=MyORG.t=MY_TREE. --ip-addr=10.20.30.44
Adding a Virtual Server is failed
Error : virtual server already exists , Error Number: 20
+ rc=1
+ date


and the cluster unload script showed:

++ novcifs --remove --vserver=.cn=OU2CLS2ABCUSERS1.ou=SERVERS.ou=OU22.ou=OU1.o=MyORG.t=MY_TREE. --ip-addr=10.20.30.44
Deleting a Virtual Server is failed
Error : virtual server doesn't exist , Error Number: 22


Finally, the file where CIFS messages are logged (/var/log/messages by default, or the file defined by auditing tools, e.g. /var/log/cifs/cifs.log) showed the following:

CIFS[18765]: EVENT: RPC: Recieved Mount Volume RPC: volumeNumber: 251, status: 1119879171, volumeName: ABCUSERS1, volumeGUID: 7cc2dc9c-8034-01e1-80-00-74e4a8bf7b8c, poolName: ABCUSERS1, volumeMountPoint: /media/nss/ABCUSERS1
CIFS[18765]: EVENT: CLI: AddServer : Adding virtual server FDN .cn=OU2CLS2ABCUSERS1.ou=SERVERS.ou=OU22.ou=OU1.o=MyORG.t=MY_TREE. with IP 0xb0004f0a
CIFS[18765]: ERROR: ENTRY: CIFSNDSGetCIFSServerInfo: isVirtualServer = FALSE, uServerVersion = %U, aServerVersion = N, expected = %U
CIFS[18765]: ERROR: CLI: AddServer: Server fdn .cn=OU2CLS2ABCUSERS1.ou=SERVERS.ou=OU22.ou=OU1.o=MyORG.t=MY_TREE. with netbios name has already been added

CIFS[18765]: EVENT: CLI: Removing virtual server FDN .cn=OU2CLS2ABCUSERS1.ou=SERVERS.ou=OU22.ou=OU1.o=MyORG.t=MY_TREE. from CIFS server list
CIFS[18765]: EVENT: CLI: RemoveServer : Finding virtual server FDN OU2CLS2ABCUSERS1.SERVERS.OU22.OU1.MyORG.MY_TREE with netbios name= CHICLS2_SSDUSERS in local server list
CIFS[18765]: WARNING: CLI: RemoveServer: Could not find virtual server fdn = OU2CLS2ABCUSERS1.SERVERS.OU22.OU1.MyORG.MY_TREE with netbios name OU2CLS2_ABCUSERS in local list, probably deleted already
CIFS[18765]: WARNING: CODIR: Failed to add share "ABCUSERS1" to eDirectoy. Error: 0
CIFS[18765]: EVENT: RPC: Dismount Volume NCP RPC received: volumeNumber: 251, status: 0, volumeName: ABCUSERS1, volumeGUID: , poolName: , volumeMountPoint: /media/nss/ABCUSERS1
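
To check whether a log file contains the same symptoms, the messages above can be searched for their distinguishing text. The following is only a minimal sketch: the function name find_symptoms and the choice of signature strings are illustrative assumptions taken from the output shown in this article, not part of any Novell tool.

```python
# Substrings copied from the error output shown in this article.
# Which messages count as "signatures" is an assumption made for illustration.
SIGNATURES = (
    "virtual server already exists",
    "has already been added",
    "Could not find virtual server",
)


def find_symptoms(lines):
    """Return (line_number, signature) pairs for lines that match a signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for signature in SIGNATURES:
            if signature in line:
                hits.append((number, signature))
                break  # report each line once
    return hits


if __name__ == "__main__":
    # The path is the default mentioned above; adjust it if CIFS logs elsewhere.
    with open("/var/log/messages", errors="replace") as log:
        for number, signature in find_symptoms(log):
            print(f"line {number}: {signature}")
```

A match only indicates that the log resembles the one in this article; the name length and uniqueness still need to be checked on the virtual server object.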

Resolution

Modify the "nfapCIFSServerName" attribute on the virtual server object for the clustered resource so that it is 15 characters or fewer and does not match the CIFS name of any other resource.  (NOTE: Exercise CARE when performing this operation so as not to corrupt any object data.)  One method of changing the nfapCIFSServerName is to:

  1. Log in to iManager
  2. Click the "Browse Objects" icon (next to the person behind the desk)
  3. In the left pane, navigate down and click on the container holding the virtual server object
  4. In the right pane, click the check box before the virtual server object in question, and click "Edit"
  5. In the "Modify Object" window, navigate to:
    1. General tab
    2. "Other" sub-tab  (this may take a few moments)
    3. Select "nfapCIFSServerName" and click the "Edit..." button
  6. Modify the nfapCIFSServerName value so that it is 15 characters or fewer, click OK, and then click Apply.
  7. Exit iManager
  8. Take the resource offline, then bring it back online.
Everything should now work properly.

Cause

SMB/CIFS names must be 15 characters or fewer.  When this volume was CIFS-enabled, its CIFS name was manually changed to one longer than 15 characters.  The first 15 characters of that name matched the CIFS name of another resource, so the add request was rejected because the virtual server "already exists".
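
The condition described above can be checked for a list of CIFS names before changing anything. The following is a minimal sketch under stated assumptions: the function names and the sample names are hypothetical, and comparing names case-insensitively is an assumption based on general NetBIOS behaviour rather than something stated in this article.

```python
from collections import defaultdict

NETBIOS_MAX = 15  # maximum name length stated in this article


def too_long(names, limit=NETBIOS_MAX):
    """Return the names that exceed the length limit."""
    return [name for name in names if len(name) > limit]


def prefix_collisions(names, limit=NETBIOS_MAX):
    """Group names whose first `limit` characters are identical.

    Names are upper-cased before comparison (assumption: case-insensitive).
    Only groups with more than one member are returned.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name[:limit].upper()].append(name)
    return {prefix: members for prefix, members in groups.items() if len(members) > 1}


# Hypothetical CIFS names, used only to illustrate the check.
names = ["CLS2_ABCUSERS_VOL1", "CLS2_ABCUSERS_V", "CLS2_DATA"]
print(too_long(names))
print(prefix_collisions(names))
```

In this sample, the first name is 18 characters long and its first 15 characters equal the second name, which is the same kind of clash that caused the resource in this article to go comatose.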