This document was created to help other people use Novell Clustering on a VMware ESX server without pulling their hair out.
There are two ways (that I have found) to do Novell Clustering on the ESX server. The first is to use a Raw LUN mapping for each Virtual Machine (VM). The second is to use VMware virtual disks.
Using Raw LUN mapping is a bit limiting in the sense that VMware only allows you one Raw LUN mapping per ESX server, so in order to cluster you need two physical servers. That is a good idea anyway, but there is a further issue: if you are using VMware to run multiple servers, only one of those servers can be clustered. In my case we have an IBM Blade server with 5 blades, each with 8GB of RAM and 2 quad-core CPUs. I can run many NetWare servers on this, and I want to cluster a few of them. With Raw LUN mapping I can only have one clustered server per ESX server, and the second clustered node has to be on another ESX server.
Using VMware virtual disks allows me to have multiple clustered servers on each blade, with their counterparts on other blades or even on the same blade. The catch is that virtual disks are harder to configure than Raw LUNs.
I will go over the Raw LUN configuration first because it is quick and easy.
RAW LUN Configuration
First, configure a LUN on the SAN. We are using a Pillar Data Systems SAN, so your interface may look different.
You may notice that the LUN is the same on both servers. You want this in case you need to migrate the VM to another physical machine, so make sure each cluster resource uses the same LUN number!
Once this is created you can add a disk to an already-running server. Once the LUNs are added to the different servers, go to Storage Adapters and press Rescan a couple of times for the LUNs to come up.
Select Edit Settings.
Select the Hard Disk.
From here select Raw Device Mapping.
Notice the LUN ID in the target; you want this to be the same on all servers so that when you migrate a server it will migrate without issue.
This can be stored with the VM or in another datastore; these are just the files that define the mapping to the LUN, not the LUN itself.
Here I found that it makes no difference, so choose either depending on your preference.
This one doesn't really matter either, so choose whichever you like.
As I said at the beginning, you only get one Raw LUN mapping per physical server, so the picture above is what you get if you try to add a second Raw LUN mapping on the same ESX server. MAJOR LIMITATION.
Now that you have your Raw LUN mapping, you just need to make it shareable.
From the NSSMU utility on NetWare, just select the device and press F6 to share it. Now you can install clustering and create a clustered volume and services.
Ok, that was the easy one!!!! Now on to the harder, more flexible one.
NCS with VMware Virtual Disks.
The benefit here is that you can have multiple members of the same or different clusters on one ESX server. Say I have a NetWare file server cluster and a GroupWise server cluster: I can put one node of the file server cluster on Server 1 alongside one node of the GroupWise cluster, run more than one node of each, or even put a whole two-node cluster on one physical server. It is really open.
The first thing you have to do is figure out what volume sizes you need for the server. You are limited to 2 terabytes per virtual disk. A virtual disk is basically a file system in a file, like an ISO image, and when you configure a datastore in VMware you specify the largest possible file size, the maximum being 2 terabytes.
Second is to create a Datastore area.
Select the Add Storage.
Select the Disk/LUN option.
To make sure you can move these servers to other ESX servers, keep the LUN numbers the same on all of them.
Here you will need to pick the maximum file size of the datastore; it ranges from 256GB to 2TB, and this also defines how large any single virtual disk can be. Of course, if you're savvy enough you have probably already realized that you could do a striped volume of multiple 256GB virtual disks.
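The 256GB-to-2TB range comes from the VMFS block size you pick when creating the datastore. As a rough sketch (the 1/2/4/8 MB block sizes, and each MB of block size buying 256GB of maximum file size, are my assumption for this ESX version; check your own), the limits work out like this:

```shell
# Assumed VMFS block sizes and the maximum file size each allows
# (each MB of block size corresponds to 256GB of max file size).
for BLOCK_MB in 1 2 4 8; do
  echo "${BLOCK_MB}MB block size -> $((BLOCK_MB * 256))GB max virtual disk"
done
```

So picking the smallest block size at datastore creation is what locks you into the 256GB ceiling mentioned above.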
Once this is done you now have a file system and the Linux fun BEGINS!!!!
The issue with virtual disks is that when you try to access one from more than one server, you get a disk-lock error when starting the second server. This is no good, seeing as clustering needs a quorum disk at minimum (well, not really; you can do it without one, but that is no fun, and I know because I did it). So you have to get VMware to allow one disk to be accessed from multiple servers, and to do this you have to create the virtual disk with the "thick" option. Do not ask me what that stands for, because I do not know.
The issue is that there is no way to do this from the Infrastructure Client; you have to use the ESX server command line itself.
There are two ways to access the server: go to the server console itself, or SSH into the server. If you are physically at the server you can use the root login, but if you are coming in over SSH you have to use an account that has shell access to gain a session into the box, then do a switch user (su) command to log in as root.
To gain SSH access, point your VMware Infrastructure Client at the physical ESX server; Virtual Center access does not have this option.
I assume that the adm account is good to edit, seeing as that is what VMWare tech support had me use for another issue.
Make sure you change the password to something you know as well as check the “Grant Shell Access to the User” check box.
Next, SSH into the box.
If you notice, I cd'd into the /vmfs/volumes directory; this is where your datastores are.
The next part is to create a virtual disk with the thick option set. I got this command from the Microsoft clustering setup, the only clustering VMware supports.
The command to create a virtual disk with the thick option is:
/usr/sbin/vmkfstools -c <Size of Volume in Bytes> -d thick -a lsilogic <path to Virtual disk>
So for cluster disk clstrdisk1 under the NW1Data datastore, which I want to be 10GB (10737418240 bytes), I would enter:
/usr/sbin/vmkfstools -c 10737418240 -d thick -a lsilogic /vmfs/volumes/NW1Data/clstrdisk1.vmdk
The -d is for the disk format and the -a is for the adapter. If you do not specify the -a option, the first time you start the server you will get a message saying the disk was made with a BusLogic adapter and asking whether you want to change it to LSI Logic. You will need the LSI Logic adapter for clustering anyway (that is what VMware supports).
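Since vmkfstools takes the size in bytes, it is easy to drop or add a zero. A small sketch for working out the byte count (treating 10GB as 10 × 1024³ bytes, and reusing the clstrdisk1.vmdk path from above; the actual vmkfstools call only runs on the ESX console, not here):

```shell
# Work out the byte count for a disk of SIZE_GB gigabytes (GiB here,
# i.e. 1024^3 bytes per GB; adjust if you prefer decimal gigabytes).
SIZE_GB=10
SIZE_BYTES=$((SIZE_GB * 1024 * 1024 * 1024))
echo "$SIZE_BYTES"   # prints 10737418240

# On the ESX console you would then pass it to vmkfstools:
#   /usr/sbin/vmkfstools -c $SIZE_BYTES -d thick -a lsilogic \
#       /vmfs/volumes/NW1Data/clstrdisk1.vmdk
```

Double-checking the zero count this way is cheap insurance against accidentally creating a 100GB quorum disk when you wanted 10GB.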
Once this is done you can proceed to create your servers; just make sure they point to the virtual disks you created. During or after creation you need to change the SCSI controller setting on the VM.
The default is None. Virtual, as the description says, allows you to share within the same ESX server, and Physical allows you to share across servers.
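For reference, that setting lands in the VM's .vmx file. A sketch of what the entries for a shared second SCSI controller might look like (the scsi1 device names and file path here are assumptions for illustration; check your own .vmx rather than copying this verbatim):

```
# Hypothetical .vmx fragment: second SCSI controller with physical
# bus sharing, pointing at the shared cluster disk.
scsi1.present = "true"
scsi1.virtualDev = "lsilogic"
scsi1.sharedBus = "physical"
scsi1:0.present = "true"
scsi1:0.fileName = "/vmfs/volumes/NW1Data/clstrdisk1.vmdk"
```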
Once this is all done you can start the VM and set the device to shareable, as described in the Raw mapping example. Then install Cluster Services.
How I did it myself.
We have a blade server with 5 blades and a SAN connected to them, giving us 5 ESX servers, plus Virtual Center with the VMotion option.
What I did was configure the servers on one ESX server and then migrate them to other ESX servers with VMotion. I am a bit new to ESX, so I don't know how to register a datastore already created on the SAN from another ESX server, but if I create it all on one server and then migrate the cluster nodes to other servers, Virtual Center seems to register it automatically. When I tried to point a datastore at one that another ESX server had created, it kept telling me all the information would be deleted, so I never did it. I suspect, however, that all that needs to be done is to create a linked file in the volumes directory pointing to the hardware device associated with the LUN, or that there is a command-line tool that will do this for you.
What I found with the LUNs not matching is that VMotion has a problem moving a server from one LUN number to another. If the LUN is the same, it seems to just move the references, which makes for a very fast move operation with no data loss.
Anyway, I hope this helps someone out.