This job demonstrates downloading a file from the datagrid.
> zos login --user vmmanager
Please enter current password for 'vmmanager':
Logged into grid as vmmanager
> zos jobinfo --detail dgtest
Jobname/Parameters    Attributes
------------------    ----------
dgtest                Desc: This job demonstrates downloading from the Datagrid
    multicast         Desc: Whether to download using multicast or unicast
                      Type: Boolean
                      Default: false
    filename          Desc: The filename to download from the Datagrid
                      Type: String
                      Default: None! Value must be specified
Demonstrates usage of the datagrid to download a file stored on the Grid Management Server (GMS) to a node. For additional background information, see Section 3.1, Defining the Datagrid.
Because it typically grows quite large, the physical location of the GMS root directory is important. Use the following procedure to determine the location of the datagrid in the Orchestrator server console:
Select the grid id on the left in the Orchestrator Explorer window >
Click the
tab. The read-only fact matrix.datagrid.root holds the location, which by default is:
/var/opt/novell/zenworks/zos/server
The top level directory name is dataGrid.
Contents of the GMS can be seen with the Console command:
> zos dir grid:///
    <DIR>    Dec-6-2007  6:55 installs
    <DIR>    Dec-6-2007  6:55 jobs
    <DIR>    Dec-6-2007 22:01 users
    <DIR>    Dec-6-2007  6:55 vms
    <DIR>    Dec-6-2007  6:56 warehouse
The files that make up the Dgtest job include:
dgtest                 # Total: 208 lines
|-- dgtest.jdl         #   158 lines
`-- dgtest.policy      #    50 lines
  1  """
  2  Example usage of DataGrid to download a file stored on the GMS Server to a node.
  3
  4  Setup:
  5  Before running the job, you must:
  6  (1) Create a dgtest resource group using the management console.
  7  (2) Copy a suitable file into the GMS Server datagrid
  8  (3) Modify the dgtest policy with the filename to download
  9  (to not use the default test file).
 10
 11  For example, use the following matrix command to copy the file 'suse-9-flat.vmdk'
 12  into the deployment area for the job 'dgtest'
 13  >matrix mkdir grid:///images
 14
 15  >matrix copy suse-9-flat.vmdk grid:///images/
 16
 17  To verify the file is there:
 18  >matrix dir grid:///images
 19
 20
 21  To start job, after the above setup steps are complete:
 22  >matrix run dgtest filename=suse-9-flat.vmdk
 23
 24  """
 25  import os,sys,time
 26
 27  #
 28  # Add to the 'examples' group on deployment
 29  #
 30  if __mode__ == "deploy":
 31      try:
 32          jobgroupname = "examples"
 33          jobgroup = getMatrix().getGroup(TYPE_JOB, jobgroupname)
 34          if jobgroup == None:
 35              jobgroup = getMatrix().createGroup(TYPE_JOB, jobgroupname)
 36          jobgroup.addMember(__jobname__)
 37      except:
 38          exc_type, exc_value, exc_traceback = sys.exc_info()
 39          print "Error adding %s to %s group: %s %s" % (__jobname__, jobgroupname, exc_type, exc_value)
 40
 41
 42  class test(Job):
 43
 44      def job_started_event(self):
 45          filename = self.getFact("jobargs.filename")
 46          print "Starting Datagrid Test Job."
 47          print "Filename: %s" % (filename)
 48
 49          rg = None
 50          try:
 51              rg = getMatrix().getGroup("resource","dgtest")
 52          except:
 53              # no such group
 54              pass
 55
 56          if rg == None:
 57              self.fail("The resource group 'dgtest' was not found. It is required for this job.")
 58              return
 59
 60          members = rg.getMembers()
 61          count = 0
 62          for resource in members:
 63              if resource.getFact("resource.online") == True and \
 64                 resource.getFact("resource.enabled") == True:
 65                  count += 1
 66
 67          memo = "Scheduling Datagrid Test on %d Joblets" % (count)
 68          self.setFact("jobinstance.memo",memo)
 69          print memo
 70          self.schedule(testnode,count)
 71
 72
 73  class testnode(Joblet):
 74
 75      def joblet_started_event(self):
 76          jobletnum = self.getFact("joblet.number")
 77          print "Running datagrid test joblet #%d" % (jobletnum)
 78          filename = self.getFact("jobargs.filename")
 79          multicast = self.getFact("jobargs.multicast")
 80
 81          # Test download a file from server job directory
 82          dg_url = "grid:///images/" + filename
 83
 84          # Create an instance of the JDL DataGrid object
 85          # This object is used to manage DataGrid operations
 86          dg = DataGrid()
 87
 88          # Set to always force a download.
 89          dg.setCache(False)
 90
 91          # Set whether to use multicast or unicast
 92          # If set to True, then the following 4 multicast
 93          # options are applicable
 94          dg.setMulticast(multicast)
 95
 96          # how long to wait for a quorum (milliseconds)
 97          #dg.setMulticastWait( 10000 )
 98
 99          # Number of receivers that constitute a quorum
100          #dg.setMulticastQuorum(4)
101
102          # Requested data rate in bytes per second. 0 means use default
103          #dg.setMulticastRate(0)
104
105          # Min number of receivers
106          #dg.setMulticastMin(1)
107
108          if multicast:
109              mode = "multicast"
110          else:
111              mode = "unicast"
112
113          memo = "Starting %s download of file: %s" % (mode,dg_url)
114          self.setFact("joblet.memo",memo)
115          print memo
116
117          # Destination defaults to Node's Joblet dir.
118          # Change this path to go to any other local filesystem.
119          # e.g. to store in /tmp:
120          # dest = "/tmp/" + filename
121          dest = filename
122          try:
123              dg.copy(dg_url,dest)
124          except:
125              exc_type, exc_value, exc_traceback = sys.exc_info()
126              retryUnicast = False
127              if multicast == True:
128                  # If node's OS, NIC does not fully support multicast,
129                  # then the node will timeout waiting for broadcasts.
130                  # Note the error and fallback to unicast
131                  if exc_type != None and len(str(exc_type)) > 0:
132                      msg = str(exc_type)
133                      index = msg.find("Multicast receive timed out")
134                      retryUnicast = index != -1
135
136              if retryUnicast:
137                  memo = "Multicast timeout. Fallback to unicast"
138                  self.setFact("joblet.memo",memo)
139                  print memo
140                  dg.setMulticast(False)
141                  dg.copy(dg_url,dest)
142              else:
143                  raise exc_type,exc_value
144
145          if os.path.exists(dest):
146              print dg_url + " downloaded successfully."
147
148              # Show directory listing of downloaded file to job log
149              if self.getFact("resource.os.family") == "windows":
150                  cmd = "dir %s" % (dest)
151              else:
152                  cmd = "ls -lsart %s" % (dest)
153
154              system(cmd)
155          else:
156              raise RuntimeError, "Datagrid copy() failed"
157
158          print "Datagrid test completed"
  1  <policy>
  2
  3      <jobargs>
  4
  5          <!--
  6             Name of file that is stored in the Datagrid area to
  7             download to the resource.
  8
  9             A value for this fact is assigned when
 10             using the 'zos run' command.
 11          -->
 12          <fact name="filename"
 13                type="String"
 14                description="The filename to download from the Datagrid"
 15                />
 16
 17          <fact name="multicast"
 18                type="Boolean"
 19                description="Whether to download using multicast or unicast"
 20                value="false" />
 21
 22      </jobargs>
 23
 24      <job>
 25          <fact name="description"
 26                type="String"
 27                value="This job demonstrates downloading from the Datagrid" />
 28
 29          <!-- limit to one per host -->
 30          <fact name="joblet.maxperresource"
 31                type="Integer"
 32                value="1" />
 33      </job>
 34
 35
 36      <!--
 37         This job will only run on resources in the "dgtest" resource group.
 38
 39         You must create a Resource Group named 'dgtest' using the management
 40         console and populate the new group with resources that you wish to have
 41         participate in the datagrid test.
 42      -->
 43      <constraint type="resource" reason="No resources are in the dgtest group" >
 44
 45          <contains fact="resource.groups" value="dgtest"
 46                    reason="Resource is not in the dgtest group" />
 47
 48      </constraint>
 49
 50  </policy>
Job: A representation of a running job instance.
Joblet: Defines execution on the resource.
MatrixInfo: A representation of the matrix grid object, which provides operations for retrieving and creating grid objects in the system. MatrixInfo is retrieved using the built-in getMatrix() function. Write capability is dependent on the context in which getMatrix() is called. For example, in a joblet process on a resource, creating new grid objects is not supported.
GroupInfo: A representation of Group grid objects. Operations include retrieving the group member lists, adding to and removing from the group member lists, and retrieving and setting facts on the group.
Class test (line 42 in dgtest.jdl) is derived from the Job class.
Class testnode (line 73 in dgtest.jdl) is derived from the Joblet class.
dgtest.job can be broken down into the following parts:
In addition to describing the filename and multicast jobargs and the default setting for multicast (lines 3-22) in the dgtest.policy file, there is the <job/> section (lines 24-33), which describes static facts (Section 5.1.2, Facts).
You must assign the filename argument when executing this example. This is only the name of the file in the “images” area of the GMS. For example, for grid:///images/disk.img, assign just disk.img to the argument. The file must already be in the GMS file system so that it can be fetched and delivered to the remote nodes used in this example.
To populate the GMS use the zos copy command. For example, for a file named suse-9-flat.vmd in the current directory, use the following command:
> zos mkdir grid:///images
> zos copy suse-9-flat.vmd grid:///images/
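The relationship between the filename argument and the datagrid URL can be sketched in plain Python. The concatenation mirrors line 82 of dgtest.jdl; the helper name build_dg_url and the input check are hypothetical additions for illustration and are not part of the JDL API.

```python
# Hypothetical helper (not part of JDL): builds the datagrid URL the joblet
# downloads, using the same concatenation as line 82 of dgtest.jdl.
IMAGES_AREA = "grid:///images/"


def build_dg_url(filename):
    """Return the grid URL for a file stored in the 'images' area."""
    if not filename or "/" in filename:
        # Illustrative check only: the job expects the bare file name,
        # not a path or a full grid URL.
        raise ValueError("pass only the file name, for example 'disk.img'")
    return IMAGES_AREA + filename


if __name__ == "__main__":
    print(build_dg_url("disk.img"))
```

Passing the full URL as the argument would produce a doubled prefix in the real job, which is why only the bare name is assigned.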
The multicast jobarg is a Boolean, defaulted to false so that unicast is used for transport. Set this value to true to use multicast transport for delivery of the file.
The policy also defines a resource.groups constraint (lines 36-48 in dgtest.policy), which sits outside the <job/> section. (For more information, see Constraints.) The constraint requires a resource group named dgtest, and that group should have member nodes. Consequently, you must create this resource group using the Orchestrator server console and assign it some nodes to run this example successfully.
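The effect of the contains constraint can be illustrated with a small plain-Python sketch. Resources are modelled here as dictionaries of facts, and the node named other and the group named all are made up for the example; this is not how the Orchestrator server evaluates policies internally.

```python
# Plain-Python illustration of the <contains> constraint in dgtest.policy.
# Each resource is a dict of facts; the data below is hypothetical.

def in_group(resource_facts, group="dgtest"):
    """True if the resource's resource.groups fact contains the group."""
    return group in resource_facts.get("resource.groups", [])


resources = [
    {"resource.id": "melt", "resource.groups": ["all", "dgtest"]},
    {"resource.id": "freeze", "resource.groups": ["all", "dgtest"]},
    {"resource.id": "other", "resource.groups": ["all"]},
]

# Only resources that satisfy the constraint may run the joblets.
eligible = [r["resource.id"] for r in resources if in_group(r)]
print(eligible)
```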
When the Orchestrator server deploys a job for the first time (see Section 7.5, Deploying Jobs), the job JDL files are executed in a special deploy mode. Looking at dgtest.jdl, you might notice that when the job is deployed (line 30), either via the Orchestrator console or the zosadmin deploy command, it attempts to find the examples jobgroup (lines 32-33), create it if missing (lines 34-35), and add the dgtest job to the group (line 36).
If this deployment step fails for some reason, the exception is caught (line 37) and the handler prints (line 39) the job name, group name, exception type, and value.
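The find-or-create pattern used during deployment can be sketched outside the Orchestrator runtime. FakeMatrix is a hypothetical stand-in for the object returned by getMatrix(), the value of TYPE_JOB is assumed, and the group is modelled as a plain set rather than a real group object.

```python
# Stand-in for the object returned by getMatrix(). Only the two calls used
# by the deploy block are modelled; storage is a plain dict (assumption).
TYPE_JOB = "job"  # assumed value, for illustration only


class FakeMatrix:
    def __init__(self):
        self.groups = {}

    def getGroup(self, group_type, name):
        # Returns None when the group does not exist.
        return self.groups.get((group_type, name))

    def createGroup(self, group_type, name):
        self.groups[(group_type, name)] = set()
        return self.groups[(group_type, name)]


def add_job_to_group(matrix, jobname, groupname="examples"):
    """Find the job group, create it if missing, then add the job."""
    group = matrix.getGroup(TYPE_JOB, groupname)
    if group is None:
        group = matrix.createGroup(TYPE_JOB, groupname)
    group.add(jobname)  # dgtest.jdl calls addMember() on the real group
    return group


matrix = FakeMatrix()
add_job_to_group(matrix, "dgtest")
add_job_to_group(matrix, "dgtest")  # deploying again does not duplicate
print(sorted(matrix.groups[(TYPE_JOB, "examples")]))
```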
In dgtest.jdl, the test class (line 42) defines only the required job_started_event (line 44) method. This method runs on the Orchestration server when the job is run to launch the joblets.
When job_started_event is executed, it gets the name of the file assigned to the jobargs.filename variable and prints useful tracing information (lines 45-47). It then tries to find the resource group named dgtest. If the resource group doesn’t exist, the job calls fail() with a message that informs the user and returns without scheduling any joblets (lines 49-58).
After finding the dgtest group, the job gets the member list and counts how many nodes are both online and enabled (lines 60-65). After setting the memo line shown in the Console (lines 67-69), the job schedules count testnode joblets (line 70).
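The counting step can be sketched in plain Python. Group members are modelled as dictionaries of facts, and the node named offline-node is made up for the example; in the real job each member is a resource object queried with getFact().

```python
# Plain-Python version of the counting loop in lines 60-65 of dgtest.jdl.
# Each member is a dict of facts (hypothetical data).

def count_schedulable(members):
    """Count group members that are both online and enabled."""
    count = 0
    for facts in members:
        if facts.get("resource.online") and facts.get("resource.enabled"):
            count += 1
    return count


members = [
    {"resource.id": "melt", "resource.online": True, "resource.enabled": True},
    {"resource.id": "freeze", "resource.online": True, "resource.enabled": True},
    {"resource.id": "offline-node", "resource.online": False, "resource.enabled": True},
]

memo = "Scheduling Datagrid Test on %d Joblets" % count_schedulable(members)
print(memo)
```

An empty or fully offline group yields a count of 0, which matches the run shown later in which no joblets were scheduled.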
In dgtest.jdl, the testnode class (line 73) defines only the required joblet_started_event (line 75) method. This method runs on the Orchestrator agent nodes when scheduled by a Job class.
The joblet_started_event prints some trace information (lines 76-77), gets the name of the file to transfer (line 78) and the mode of transfer (line 79), and creates the grid URL for the file (line 82).
A DataGrid is instantiated (line 86), set not to cache (line 89), and set to use the multicast jobarg (line 94). The next four settings, which control multicast behavior, are commented out (lines 97, 100, 103, and 106). See setMulticastWait, setMulticastQuorum, setMulticastRate, and setMulticastMin.
The joblet prints a memo line for the Orchestrator console (lines 108-115), sets the location for the file on the local node (line 121), and tries to transfer the file from the datagrid (line 123).
If the datagrid copy at line 123 fails for some reason, the exception handler (lines 124-143) provides a retry mechanism. The information about why the exception occurred is fetched at line 125.
The variable retryUnicast (line 126) is set to False and is set to True only if the failed download attempt used multicast transport and the exception text contains the string "Multicast receive timed out" (lines 127-134). If the string is not found, find() returns -1, so the comparison index != -1 leaves retryUnicast False. With this logic, a unicast attempt is made only when a multicast download times out.
If you get to line 136 from a multicast copy that timed out, a memo for the Orchestrator console is set and printed to the log (lines 137-139), multicast is turned off (line 140), and another copy from the datagrid is attempted (line 141).
If you get to line 136 from a failed unicast copy, or from a multicast copy that failed for any other reason, the exception is re-raised (line 143) and the joblet fails.
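The fallback logic can be exercised without a grid by using a stand-in object. FakeDataGrid below is hypothetical: it only records attempts and always times out when multicast is enabled. Note that this sketch inspects the exception message, whereas dgtest.jdl inspects str(exc_type).

```python
class FakeDataGrid:
    """Hypothetical stand-in for the JDL DataGrid object.

    Multicast copies always time out; unicast copies always succeed.
    """

    def __init__(self):
        self.multicast = False
        self.attempts = []

    def setMulticast(self, enabled):
        self.multicast = enabled

    def copy(self, src, dest):
        self.attempts.append("multicast" if self.multicast else "unicast")
        if self.multicast:
            raise RuntimeError("Multicast receive timed out")


def copy_with_fallback(dg, src, dest, multicast):
    """Copy once; on a multicast timeout, retry the copy using unicast."""
    dg.setMulticast(multicast)
    try:
        dg.copy(src, dest)
    except Exception as exc:
        timed_out = "Multicast receive timed out" in str(exc)
        if not (multicast and timed_out):
            raise  # unicast failure or unrelated error: give up
        dg.setMulticast(False)
        dg.copy(src, dest)


dg = FakeDataGrid()
copy_with_fallback(dg, "grid:///images/disk.img", "disk.img", multicast=True)
print(dg.attempts)
```

Only one retry is made, so a failure of the unicast retry propagates to the caller just as it does in the joblet.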
> zos run dgtest filename=suse-9-flat.vmd
JobID: vmmanager.dgtest.323
Looks like it ran successfully; let’s see what the log says:
> zos log vmmanager.dgtest.323
Starting Datagrid Test Job.
Filename: suse-9-flat.vmd
Job 'vmmanager.dgtest.323' terminated because of failure. Reason: The resource group 'dgtest' was not found. It is required for this job.
There is no resource group. Using the Orchestration Console, create the resource group dgtest, then run the job again:
> zos run dgtest filename=suse-9-flat.vmd
JobID: vmmanager.dgtest.324
> zos log vmmanager.dgtest.324
Starting Datagrid Test Job.
Filename: suse-9-flat.vmd
Scheduling Datagrid Test on 0 Joblets
NOTE: No joblets were scheduled because the group has no active nodes.
Using the Orchestration Console, populate the dgtest group with nodes that are both online and enabled:
> zos run dgtest filename=suse-9-flat.vmd
JobID: vmmanager.dgtest.325
> zos log vmmanager.dgtest.325
Starting Datagrid Test Job.
Filename: suse-9-flat.vmd
Scheduling Datagrid Test on 2 Joblets
[freeze] Running datagrid test joblet #0
[freeze] Starting unicast download of file: grid:///images/suse-9-flat.vmd
[freeze] Traceback (innermost last):
[freeze]   File "dgtest.jdl", line 143, in joblet_started_event
[freeze] copy() failed: DataGrid file "/images/suse-9-flat.vmd" does not exist.
Job 'vmmanager.dgtest.325' terminated because of failure. Reason: Job failed because of too many joblet failures (job.joblet.maxfailures = 0)
[melt] Running datagrid test joblet #1
[melt] Starting unicast download of file: grid:///images/suse-9-flat.vmd
[melt] Traceback (innermost last):
[melt]   File "dgtest.jdl", line 143, in joblet_started_event
[melt] copy() failed: DataGrid file "/images/suse-9-flat.vmd" does not exist.
Because the path and the file in the DataGrid are missing, we need to create and populate them:
> zos mkdir grid:///images
Directory created.
> zos copy suse-9-flat.vmd grid:///images/
suse-9-flat.vmd copied.
> zos run dgtest filename=suse-9-flat.vmd
JobID: vmmanager.dgtest.326
> zos log vmmanager.dgtest.326
Starting Datagrid Test Job.
Filename: suse-9-flat.vmd
Scheduling Datagrid Test on 2 Joblets
[melt] Running datagrid test joblet #1
[melt] Starting unicast download of file: grid:///images/suse-9-flat.vmd
[melt] grid:///images/suse-9-flat.vmd downloaded successfully.
[melt] 16732 -rw-r--r-- 1 root root 17108462 Dec 21 21:32 suse-9-flat.vmd
[melt] Datagrid test completed
[freeze] Running datagrid test joblet #0
[freeze] Starting unicast download of file: grid:///images/suse-9-flat.vmd
[freeze] grid:///images/suse-9-flat.vmd downloaded successfully.
[freeze] 16732 -rw-r--r-- 1 root root 17108462 Dec 21 21:31 suse-9-flat.vmd
[freeze] Datagrid test completed
Finally, the file is deployed from the dataGrid and copied successfully. However, you will not find it if you look for it on the agent after the joblet is finished. By default, the file is deployed only for the joblet’s lifetime into a directory for the joblet, like the following:
/var/opt/novell/zenworks/zos/agent/node.default/melt/vmmanager.dgtest.326.0
So, for a more permanent demonstration, see lines 118-120 in dgtest.jdl. Uncomment line 120 and comment out line 121 to store your file in the /tmp directory and have it continue to exist on the agent after the joblet executes completely.
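The choice between the two destinations can be expressed as a small plain-Python helper. The function name choose_dest and its keep_dir parameter are hypothetical; dgtest.jdl simply switches between lines 120 and 121. posixpath is used so the result is the same regardless of the platform running the sketch.

```python
import posixpath


def choose_dest(filename, keep_dir=None):
    """Return the local destination path for the downloaded file.

    With keep_dir unset, the bare file name is returned, which (per the
    comment at line 117 of dgtest.jdl) lands in the joblet's directory.
    With keep_dir set, the file is placed in that directory instead.
    """
    if keep_dir is None:
        return filename
    return posixpath.join(keep_dir, filename)


print(choose_dest("suse-9-flat.vmd"))
print(choose_dest("suse-9-flat.vmd", "/tmp"))
```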