dgtest.job

This job demonstrates downloading a file from the datagrid.

Usage

> zos login --user zenuser
Please enter current password for 'zenuser':
Logged into grid as zenuser

> zos jobinfo --detail dgtest
Jobname/Parameters    Attributes
------------------    ----------
dgtest                 Desc: This job demonstrates downloading from the Datagrid

    multicast        Desc: Whether to download using multicast or unicast
                     Type: Boolean
                   Default: false

    filename         Desc: The filename to download from the Datagrid
                     Type: String
                  Default: None! Value must be specified

Description

Demonstrates usage of the datagrid to download a file stored on the ZENworks Orchestrator Server to a node. For additional background information, see Section 3.1, Defining the Datagrid.

Because the datagrid typically grows quite large, the physical location of its root directory is important. Use the following procedure to determine the location of the datagrid in the Orchestrator server console:

  1. Select the grid id on the left in the Orchestrator Explorer window.

  2. Click the Constraints/Facts tab.

    The read-only fact matrix.datagrid.root holds the location, which by default is:

    /var/opt/novell/zenworks/zos/server
    

    The top level directory name is dataGrid.

    You can list the contents of the datagrid with the zos dir command:

    > zos dir grid:///
         <DIR>        Dec-6-2007 6:55 installs
         <DIR>        Dec-6-2007 6:55 jobs
         <DIR>        Dec-6-2007 22:01 users
         <DIR>        Dec-6-2007 6:55 vms
         <DIR>        Dec-6-2007 6:56 warehouse
    

Job Files

The files that make up the Dgtest job include:

dgtest                                          # Total: 238 lines
|-- dgtest.jdl                                  #  172 lines
`-- dgtest.policy                               #   66 lines

dgtest.jdl

  1  # -----------------------------------------------------------------------------
  2  #  Copyright © 2008 Novell, Inc. All Rights Reserved.
  3  #
  4  #  NOVELL PROVIDES THE SOFTWARE "AS IS," WITHOUT ANY EXPRESS OR IMPLIED
  5  #  WARRANTY, INCLUDING WITHOUT THE IMPLIED WARRANTIES OF MERCHANTABILITY,
  6  #  FITNESS FOR A PARTICULAR PURPOSE, AND NON-INFRINGMENT.  NOVELL, THE AUTHORS
  7  #  OF THE SOFTWARE, AND THE OWNERS OF COPYRIGHT IN THE SOFTWARE ARE NOT LIABLE
  8  #  FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
  9  #  TORT, OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE
 10  #  OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 11  # -----------------------------------------------------------------------------
 12  #  $Id: dgtest.jdl,v 1.4 2008/03/05 20:05:43 ray Exp $
 13  # -----------------------------------------------------------------------------
 14
 15  """
 16  Example usage of DataGrid to download a file stored on the Server to a node.
 17
 18  Setup:
 19      Before running the job, you must:
 20          (1) Create a dgtest resource group using the management console.
 21          (2) Copy a suitable file into the Server DataGrid
 22          (3) Modify the dgtest policy with the filename to download
 23                  (to not use the default test file).
 24
 25      For example, use the following command to copy the file 'suse-10-flat.vmdk'
 26      into the deployment area for the job 'dgtest'
 27          >zos mkdir grid:///images
 28
 29          >zos copy suse-10-flat.vmdk grid:///images/
 30
 31      To verify the file is there:
 32          >zos dir grid:///images
 33
 34
 35  To start the job after the above setup steps are complete:
 36          >zos run dgtest filename=suse-10-flat.vmdk
 37
 38  """
 39  import os, sys, time
 40
 41  #
 42  # Add to the 'examples' group on deployment
 43  #
 44  if __mode__ == "deploy":
 45      try:
 46          jobgroupname = "examples"
 47          jobgroup = getMatrix().getGroup(TYPE_JOB, jobgroupname)
 48          if jobgroup == None:
 49              jobgroup = getMatrix().createGroup(TYPE_JOB, jobgroupname)
 50          jobgroup.addMember(__jobname__)
 51      except:
 52          exc_type, exc_value, exc_traceback = sys.exc_info()
 53          print "Error adding %s to %s group: %s %s" % (__jobname__, jobgroupname, exc_type, exc_value)
 54
 55
 56  class test(Job):
 57
 58       def job_started_event(self):
 59            filename = self.getFact("jobargs.filename")
 60            print "Starting Datagrid Test Job."
 61            print "Filename: %s" % (filename)
 62
 63            rg = None
 64            try:
 65               rg = getMatrix().getGroup("resource","dgtest")
 66            except:
 67               # no such group
 68               pass
 69
 70            if rg == None:
 71               self.fail("The resource group 'dgtest' was not found. It is required for this job.")
 72               return
 73
 74            members = rg.getMembers()
 75            count = 0
 76            for resource in members:
 77                if resource.getFact("resource.online") == True and \
 78                      resource.getFact("resource.enabled") == True:
 79                     count += 1
 80
 81            memo = "Scheduling Datagrid Test on %d Joblets" % (count)
 82            self.setFact("jobinstance.memo",memo)
 83            print memo
 84            self.schedule(testnode,count)
 85
 86
 87  class testnode(Joblet):
 88
 89       def joblet_started_event(self):
 90            jobletnum = self.getFact("joblet.number")
 91            print "Running datagrid test joblet #%d" % (jobletnum)
 92            filename = self.getFact("jobargs.filename")
 93            multicast = self.getFact("jobargs.multicast")
 94
 95            # Test download a file from server job directory
 96            dg_url = "grid:///images/" + filename
 97
 98            # Create an instance of the JDL DataGrid object
 99            # This object is used to manage DataGrid operations
100            dg = DataGrid()
101
102            # Set to always force a download.
103            dg.setCache(False)
104
105            # Set whether to use multicast or unicast
106            # If set to True, then the following 4 multicast
107            # options are applicable
108            dg.setMulticast(multicast)
109
110            # how long to wait for a quorum (milliseconds)
111            #dg.setMulticastWait( 10000 )
112
113            # Number of receivers that constitute a quorum
114            #dg.setMulticastQuorum(4)
115
116            # Requested data rate in bytes per second. 0 means use default
117            #dg.setMulticastRate(0)
118
119            # Min number of receivers
120            #dg.setMulticastMin(1)
121
122            if multicast:
123                mode = "multicast"
124            else:
125                mode = "unicast"
126
127            memo = "Starting %s download of file: %s" % (mode,dg_url)
128            self.setFact("joblet.memo",memo)
129            print memo
130
131            # Destination defaults to Node's Joblet dir.
132            # Change this path to go to any other local filesystem.
133            # e.g. to store in /tmp:
134            #     dest = "/tmp/" + filename
135            dest = filename
136            try:
137                dg.copy(dg_url,dest)
138            except:
139                exc_type, exc_value, exc_traceback = sys.exc_info()
140                retryUnicast = False
141                if multicast == True:
142                    # If node's OS and/or NIC does not fully support multicast,
143                    # then the node will timeout waiting for broadcasts.
144                    # Note the error and fallback to unicast
145                    if exc_type != None and len(str(exc_type)) > 0:
146                        msg = str(exc_type)
147                        index = msg.find("Multicast receive timed out")
148                        retryUnicast = index != -1
149
150                if retryUnicast:
151                    memo = "Multicast timeout. Fallback to unicast"
152                    self.setFact("joblet.memo",memo)
153                    print memo
154                    dg.setMulticast(False)
155                    dg.copy(dg_url,dest)
156                else:
157                    raise exc_type,exc_value
158
159            if os.path.exists(dest):
160                 print dg_url + " downloaded successfully."
161
162                 # Show directory listing of downloaded file to job log
163                 if self.getFact("resource.os.family") == "windows":
164                     cmd = "dir %s" % (dest)
165                 else:
166                     cmd = "ls -lsart %s" % (dest)
167
168                 system(cmd)
169            else:
170                 raise RuntimeError, "Datagrid copy() failed"
171
172            print "Datagrid test completed"

dgtest.policy

 1  <!--
 2   *=============================================================================
 3   * Copyright © 2008 Novell, Inc. All Rights Reserved.
 4   *
 5   * NOVELL PROVIDES THE SOFTWARE "AS IS," WITHOUT ANY EXPRESS OR IMPLIED
 6   * WARRANTY, INCLUDING WITHOUT THE IMPLIED WARRANTIES OF MERCHANTABILITY,
 7   * FITNESS FOR A PARTICULAR PURPOSE, AND NON INFRINGMENT.  NOVELL, THE AUTHORS
 8   * OF THE SOFTWARE, AND THE OWNERS OF COPYRIGHT IN THE SOFTWARE ARE NOT LIABLE
 9   * FOR ANY CLAIM, DAMAGES, OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
10   * TORT, OR OTHERWISE, ARISING FROM, OUT OF, OR IN CONNECTION WITH THE SOFTWARE
11   * OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
12   *=============================================================================
13   * $Id: dgtest.policy,v 1.2 2008/02/27 20:49:29 john Exp $
14   *=============================================================================
15   -->
16
17  <policy>
18
19       <jobargs>
20
21            <!--
22               Name of file that is stored in the Datagrid area to
23               download to the resource.
24
 25               A value for this fact is assigned when using the
 26               'zos run' command.
27            -->
28            <fact name="filename"
29                  type="String"
30                  description="The filename to download from the Datagrid"
31                  />
32
33            <fact name="multicast"
34                  type="Boolean"
35                  description="Whether to download using multicast or unicast"
36                  value="false" />
37
38       </jobargs>
39
40       <job>
41            <fact name="description"
42                  type="String"
43                  value="This job demonstrates downloading from the Datagrid" />
44
45            <!-- limit to one per host -->
46            <fact name="joblet.maxperresource"
47                  type="Integer"
48                  value="1" />
49       </job>
50
51
52       <!--
53          This job will only run on resources in the "dgtest" resource group.
54
55          You must create a Resource Group named 'dgtest' using the management
56          console and populate the new group with resources that you wish to have
57          participate in the datagrid test.
58       -->
59       <constraint type="resource" reason="No resources are in the dgtest group" >
60
61            <contains fact="resource.groups" value="dgtest"
62              reason="Resource is not in the dgtest group" />
63
64       </constraint>
65
66  </policy>

Classes and Methods

Definitions:

Job

A representation of a running job instance.

Joblet

Defines execution on the resource.

MatrixInfo

A representation of the matrix grid object, which provides operations for retrieving and creating grid objects in the system. MatrixInfo is retrieved using the built-in getMatrix() function. Write capability is dependent on the context in which getMatrix() is called. For example, in a joblet process on a resource, creating new grid objects is not supported.

GroupInfo

A representation of Group grid objects. Operations include retrieving the group member lists and adding/removing from the group member lists, and retrieving and setting facts on the group.

test

Class test (line 56 in dgtest.jdl) is derived from the Job class.

testnode

Class testnode (line 87 in dgtest.jdl) is derived from the Joblet class.

Job Details

dgtest.job can be broken down into the following parts:

Policy

In addition to describing the filename and multicast jobargs and the default setting for multicast (lines 19-38 in the dgtest.policy file), the policy contains a <job/> section (lines 40-49), which describes static facts (Section 5.1.2, Facts).

You must assign the filename argument when executing this example. This is only the name of the file in the “images” area of the datagrid. For example, for grid:///images/disk.img, assign just disk.img to the argument. The file must already be in the datagrid so that it can be fetched and delivered to the remote nodes used in this example.

To populate the datagrid, use the zos copy command. For example, for a file named suse-9-flat.vmd in the current directory, use the following commands:

> zos mkdir grid:///images
> zos copy suse-9-flat.vmd grid:///images/

The multicast jobarg is a Boolean that defaults to false, so unicast is used for transport. Set this value to true to use multicast transport for delivery of the file.

The policy also defines a resource.groups constraint (lines 52-64 in dgtest.policy), which sits outside the <job/> section. (For more information, see Constraints.) The constraint requires a resource group named dgtest, and that group should have member nodes. Consequently, you must create this resource group using the Orchestrator server console and assign it some nodes to run this example successfully.
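
The effect of the contains constraint can be illustrated outside the Orchestrator runtime. The following is a minimal Python sketch, not the server's constraint engine: the dictionaries stand in for resource facts, the node name thaw and the group name all are hypothetical, and in_group is an illustrative helper rather than an Orchestrator function.

```python
# Standalone illustration of the "contains" test on resource.groups.
# The fact dictionaries, the node "thaw", and the group "all" are
# hypothetical; they only mimic what the policy constraint inspects.

def in_group(resource_facts, group="dgtest"):
    """Return True if the resource lists the given group in resource.groups."""
    return group in resource_facts.get("resource.groups", [])

resources = {
    "melt": {"resource.groups": ["all", "dgtest"]},
    "freeze": {"resource.groups": ["all", "dgtest"]},
    "thaw": {"resource.groups": ["all"]},
}

# Only resources that belong to the dgtest group are eligible for joblets.
eligible = sorted(name for name, facts in resources.items() if in_group(facts))
print(eligible)
```

In this sketch only melt and freeze pass the test, which mirrors why the job cannot run anywhere until the dgtest group exists and has members.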

zosadmin deploy

When the Orchestrator server deploys a job for the first time (see Section 7.5, Deploying Jobs), the job JDL files are executed in a special deploy mode. In dgtest.jdl, when the job is deployed (line 44), either through the Orchestrator console or the zosadmin deploy command, it attempts to find the examples jobgroup (lines 46-47), creates it if it is missing (lines 48-49), and adds the dgtest job to the group (line 50).

If this deployment step fails for some reason, the exception is caught (line 51) and the handler prints (line 53) the job name, group name, exception type, and value.
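
The find-or-create pattern used at deploy time can be tried outside the Orchestrator runtime. The following is a minimal Python sketch under stated assumptions: FakeMatrix and FakeGroup are hypothetical stand-ins for the object returned by getMatrix() and for GroupInfo, and they mimic only the three calls that dgtest.jdl makes (getGroup, createGroup, and addMember). The helper add_job_to_group is illustrative and is not part of the product.

```python
# Hypothetical stand-ins for the JDL grid objects. They are NOT the
# Orchestrator API; they only mimic the calls used in dgtest.jdl.
TYPE_JOB = "job"  # placeholder value for the JDL constant


class FakeGroup:
    def __init__(self, name):
        self.name = name
        self.members = []

    def addMember(self, member):
        # Adding the same job twice leaves a single entry in this sketch
        if member not in self.members:
            self.members.append(member)


class FakeMatrix:
    def __init__(self):
        self._groups = {}

    def getGroup(self, kind, name):
        # Assumed to return None for a missing group, as the deploy
        # code in dgtest.jdl checks for
        return self._groups.get((kind, name))

    def createGroup(self, kind, name):
        group = FakeGroup(name)
        self._groups[(kind, name)] = group
        return group


def add_job_to_group(matrix, jobname, groupname="examples"):
    """Find the job group, create it if missing, then add the job."""
    group = matrix.getGroup(TYPE_JOB, groupname)
    if group is None:
        group = matrix.createGroup(TYPE_JOB, groupname)
    group.addMember(jobname)
    return group


matrix = FakeMatrix()
first = add_job_to_group(matrix, "dgtest")
second = add_job_to_group(matrix, "dgtest")  # redeploy reuses the group
print(first is second, first.members)
```

Running the helper twice shows that a redeployment reuses the existing group instead of creating a second one.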

job_started_event

In dgtest.jdl, the test class (line 56) defines only the required job_started_event method (line 58). This method runs on the Orchestrator server when the job is run, and it launches the joblets.

When job_started_event is executed, it gets the name of the file assigned to the jobargs.filename fact and prints useful tracing information (lines 59-61). It then tries to find the resource group named dgtest. If the resource group doesn’t exist, the method calls fail() with a message that informs the user and returns without scheduling any joblets (lines 63-72).

After finding the dgtest group, the job gets the member list and counts how many nodes are both online and enabled (lines 74-79). After setting the memo line in the console (lines 81-83), the job schedules count testnode joblets (line 84).
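
The counting step can be sketched in plain Python. This is an illustration only: FakeResource is a hypothetical stand-in for a resource grid object that exposes getFact(), the node name thaw is invented, and count_schedulable is not an Orchestrator function.

```python
# Hypothetical stand-in for a resource grid object; only getFact() is mimicked.
class FakeResource:
    def __init__(self, name, online, enabled):
        self.name = name
        self._facts = {
            "resource.online": online,
            "resource.enabled": enabled,
        }

    def getFact(self, fact):
        return self._facts[fact]


def count_schedulable(members):
    """Count members that are both online and enabled."""
    return sum(
        1
        for resource in members
        if resource.getFact("resource.online")
        and resource.getFact("resource.enabled")
    )


members = [
    FakeResource("melt", online=True, enabled=True),
    FakeResource("freeze", online=True, enabled=True),
    FakeResource("thaw", online=False, enabled=True),  # offline, so skipped
]

count = count_schedulable(members)
memo = "Scheduling Datagrid Test on %d Joblets" % count
print(memo)
```

An empty or entirely offline group yields a count of 0, which is the case where the job reports that it scheduled 0 joblets.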

joblet_started_event

In dgtest.jdl, the testnode class (line 87) defines only the required joblet_started_event method (line 89). This method runs on the Orchestrator agent nodes when scheduled by a Job class.

The joblet_started_event method prints some trace information (lines 90-91), gets the name of the file to transfer (line 92) and the mode of transfer (line 93), and creates the grid URL for the file (line 96).

A DataGrid object is instantiated (line 100), set not to cache (line 103), and set to use the value of the multicast jobarg (line 108). The next four settings, which control multicast behavior, are commented out (lines 111, 114, 117, and 120).

The joblet prints a memo line for the Orchestrator console (lines 122-129), sets the location for the file on the local node (line 135), and tries to transfer the file from the datagrid (line 137).

If the datagrid copy at line 137 fails for some reason, the exception handler provides a retry mechanism (lines 139-157). The information about why the exception occurred is fetched first (line 139).

The variable retryUnicast (line 140) is initialized to False and is set to True only if the failed download attempt used multicast transport and the exception type contains the string "Multicast receive timed out" (lines 141-148). If the string is not found, find() returns -1 and retryUnicast remains False. With this logic, a unicast retry is attempted only after a multicast timeout; any other failure is raised again.

If retryUnicast is True at line 150, a memo for the Orchestrator console is set and printed to the log (lines 151-153), multicast is turned off with setMulticast(False) (line 154), and another copy from the datagrid is attempted (line 155).

Otherwise, for example after a failed unicast copy, the original exception is raised again (line 157) and the joblet fails.
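
The fallback logic can be exercised without a datagrid. The following is a minimal Python sketch under stated assumptions: FakeDataGrid is a hypothetical stand-in whose multicast copies always time out, and copy_with_fallback is an illustrative helper. Unlike dgtest.jdl, which searches the string form of the exception type, this sketch searches the exception message.

```python
# Hypothetical stand-in for the JDL DataGrid object. Multicast copies
# always fail with a timeout so that the fallback path is exercised.
class FakeDataGrid:
    def __init__(self):
        self.multicast = False
        self.attempts = []  # records the transport used for each copy()

    def setMulticast(self, flag):
        self.multicast = flag

    def copy(self, src, dest):
        self.attempts.append("multicast" if self.multicast else "unicast")
        if self.multicast:
            raise RuntimeError("Multicast receive timed out")


def copy_with_fallback(dg, src, dest, multicast):
    """Copy once; retry with unicast only after a multicast timeout."""
    dg.setMulticast(multicast)
    try:
        dg.copy(src, dest)
    except Exception as exc:
        # Assumption: the timeout is recognizable from the message text
        timed_out = multicast and "Multicast receive timed out" in str(exc)
        if not timed_out:
            raise  # any other failure is passed on unchanged
        dg.setMulticast(False)
        dg.copy(src, dest)
    return dg.attempts


attempts = copy_with_fallback(
    FakeDataGrid(), "grid:///images/disk.img", "disk.img", multicast=True
)
print(attempts)
```

The recorded attempts show one multicast try followed by one unicast try. A job that starts in unicast mode makes a single attempt and never enters the retry branch.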

Configure and Run

> zos run dgtest filename=suse-9-flat.vmd
JobID: zenuser.dgtest.323

Looks like it ran successfully; let’s see what the log says:

> zos log zenuser.dgtest.323
Starting Datagrid Test Job.
Filename: suse-9-flat.vmd
Job 'zenuser.dgtest.323' terminated because of failure. Reason: The resource group 'dgtest' was not found. It is required for this job.

There is no resource group. Using the Orchestrator console, create the resource group dgtest, then run the job again:

> zos run dgtest filename=suse-9-flat.vmd
JobID: zenuser.dgtest.324

> zos log zenuser.dgtest.324
Starting Datagrid Test Job.
Filename: suse-9-flat.vmd
Scheduling Datagrid Test on 0 Joblets

NOTE: No joblets were scheduled because there are no active nodes in the group.

Using the Orchestrator console, populate the dgtest group with nodes that are both online and enabled, then run the job again:

> zos run dgtest filename=suse-9-flat.vmd
JobID: zenuser.dgtest.325

> zos log zenuser.dgtest.325
Starting Datagrid Test Job.
Filename: suse-9-flat.vmd
Scheduling Datagrid Test on 2 Joblets
[freeze] Running datagrid test joblet #0
[freeze] Starting unicast download of file: grid:///images/suse-9-flat.vmd
[freeze] Traceback (innermost last):
[freeze]   File "dgtest.jdl", line 143, in joblet_started_event
[freeze] copy() failed: DataGrid file "/images/suse-9-flat.vmd" does not exist.
Job 'zenuser.dgtest.325' terminated because of failure. Reason: Job failed because of too many joblet failures (job.joblet.maxfailures = 0)
[melt] Running datagrid test joblet #1
[melt] Starting unicast download of file: grid:///images/suse-9-flat.vmd
[melt] Traceback (innermost last):
[melt]   File "dgtest.jdl", line 143, in joblet_started_event
[melt] copy() failed: DataGrid file "/images/suse-9-flat.vmd" does not exist.

Because the path and the file in the DataGrid are missing, we need to create and populate them:

> zos mkdir grid:///images
Directory created.

> zos copy suse-9-flat.vmd grid:///images/
suse-9-flat.vmd copied.

> zos run dgtest filename=suse-9-flat.vmd
JobID: zenuser.dgtest.326

> zos log zenuser.dgtest.326
Starting Datagrid Test Job.
Filename: suse-9-flat.vmd
Scheduling Datagrid Test on 2 Joblets
[melt] Running datagrid test joblet #1
[melt] Starting unicast download of file: grid:///images/suse-9-flat.vmd
[melt] grid:///images/suse-9-flat.vmd downloaded successfully.
[melt] 16732 -rw-r--r-- 1 root root 17108462 Dec 21 21:32 suse-9-flat.vmd
[melt] Datagrid test completed
[freeze] Running datagrid test joblet #0
[freeze] Starting unicast download of file: grid:///images/suse-9-flat.vmd
[freeze] grid:///images/suse-9-flat.vmd downloaded successfully.
[freeze] 16732 -rw-r--r-- 1 root root 17108462 Dec 21 21:31 suse-9-flat.vmd
[freeze] Datagrid test completed

Finally, the file is deployed from the datagrid and copied successfully. However, you will not find it if you look for it on the agent after the joblet is finished. By default, the file is deployed only for the joblet’s lifetime into a directory for the joblet, like the following:

/var/opt/novell/zenworks/zos/agent/node.default/melt/zenuser.dgtest.326.0

So, for a more permanent demonstration, see lines 132-134 in dgtest.jdl. Uncomment line 134 and comment out line 135 to store the file in the /tmp directory so that it continues to exist on the agent after the joblet finishes.
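
The choice between the two destinations can be summarized in a short Python sketch. The helper destination and its keep and keep_dir parameters are hypothetical and exist only for illustration; the relative name and the /tmp path correspond to the two alternatives in dgtest.jdl.

```python
import posixpath


def destination(filename, keep=False, keep_dir="/tmp"):
    """Pick where the downloaded file is written on the node.

    A bare filename is relative, so the file lands in the joblet
    directory and lasts only for the joblet's lifetime. An absolute
    path such as /tmp/<filename> is outside that directory.
    """
    if keep:
        return posixpath.join(keep_dir, filename)
    return filename


temporary = destination("suse-9-flat.vmd")
permanent = destination("suse-9-flat.vmd", keep=True)
print(temporary)
print(permanent)
```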