Workflow Guide

CHAPTER 7

Workflow Administration

This chapter describes how to use the Workflow administration tools and features.

Using the workflow administration portlets

The exteNd Director install includes two workflow administration portlets for managing the engine, queue, and runtime processes:

Engine and Queue Administration Console (WorkflowEngineAdmin)
Workflow Administration Client (WorkflowAdminClient)

Workflow administrators can access these portlets from the Director Administration Console (DAC).

For more information, see the chapter about the Director Administration Console in Developing exteNd Director Applications.

Engine and Queue Administration Console

The engine and queue administration console provides the following functionality:

Start, suspend, or shut down the workflow queue
Start, suspend, or shut down the workflow engine
Start, suspend, or shut down both

Start functions

Start the selected operation. This option starts an operation from scratch, resumes a suspended operation, or restarts an operation that was shut down.

Suspend functions

Suspend the specified operation. Any messages sent to the engine or queue are stored but not executed until you select Start.

Shutdown functions

Shut down the specified operation. Any messages sent to the engine or queue during the shutdown phase are lost.

NOTE: You cannot shut down the engine or the queue if it is in the suspended state. In other words, the engine or queue must be running in order for the shutdown function to work.
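The start, suspend, and shutdown rules can be pictured as a small state machine. The following is a minimal sketch only: ServiceLifecycle and its methods are hypothetical names invented for illustration and are not part of the Workflow API. It assumes that messages held during a suspend are executed when the service is started again, and that a service that has been shut down simply drops incoming messages.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical model of the engine or queue lifecycle; not part of the Workflow API. */
class ServiceLifecycle {
    enum State { RUNNING, SUSPENDED, SHUT_DOWN }

    private State state = State.SHUT_DOWN;
    private final Deque<String> held = new ArrayDeque<>();    // messages stored while suspended
    private final List<String> executed = new ArrayList<>();  // messages that actually ran

    /** Start from scratch, resume after a suspend, or restart after a shutdown. */
    void start() {
        state = State.RUNNING;
        while (!held.isEmpty()) {
            executed.add(held.poll());  // assumption: stored messages run once Start is selected
        }
    }

    /** Suspend a running service; other states are left unchanged. */
    void suspend() {
        if (state == State.RUNNING) {
            state = State.SUSPENDED;
        }
    }

    /** Returns false unless the service is running, mirroring the NOTE above. */
    boolean shutDown() {
        if (state != State.RUNNING) {
            return false;
        }
        state = State.SHUT_DOWN;
        return true;
    }

    /** Deliver a message according to the current state. */
    void send(String message) {
        switch (state) {
            case RUNNING:
                executed.add(message);
                break;
            case SUSPENDED:
                held.add(message);   // stored, not executed
                break;
            case SHUT_DOWN:
                break;               // lost
        }
    }

    State state() { return state; }

    List<String> executed() { return executed; }
}
```

In this model, a shutdown request issued while the service is suspended returns false and leaves the state unchanged; the service has to be started first.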

Workflow Administration Client

This portlet allows you to manipulate the execution of a process instance by allowing you to:

See lists of running, suspended, and finished process instances
See a list of the activities and their states in a selected process instance
Suspend or resume a process instance
Suspend or resume an activity in a process instance
Terminate a process instance

Process functions

Process functions operate on all workitems associated with the process, unless that work is already displayed in a user's queue. In this case, workitems are not affected by suspending or resuming either processes or activities. Essentially, the user owns this work, and no existing functions (apart from locking) can prevent the user from updating and forwarding this work.

Suspending a process

Suspending a process tells the Workflow subsystem that no updates should be made to workitems associated with that process. Resuming a process returns it to the running state. The effect of suspending and resuming is apparent only when a user forwards work from one activity to another.

For example, assume there are two work queues, workqueue-A and workqueue-B. Also assume that workitem-1 is displayed in workqueue-A, and a forward on workitem-1 would normally result in the work appearing in workqueue-B.

If the workflow process currently being executed by workitem-1 is suspended (by someone selecting the Suspend Process button on the Process Administration Console), a forward on workitem-1 would result in the work disappearing from workqueue-A but not appearing in workqueue-B.

Resuming a process

Resuming a process returns it to its original state. Continuing with the same example, assume the workflow process is resumed (by selecting the Resume Process button on the Process Administration Console). A refresh of workqueue-B will reveal workitem-1, ready for updating and forwarding by the user.
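The workqueue-A and workqueue-B example can be traced with a small in-memory model. This is a sketch under stated assumptions, not the Workflow subsystem's implementation: ProcessForwardSketch and all of its methods are hypothetical, and the model simply holds a forwarded workitem aside while the process is suspended.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of forwarding under a suspended process. */
class ProcessForwardSketch {
    private boolean suspended;
    private final Map<String, List<String>> queues = new HashMap<>();
    // workitem -> destination queue, for forwards made while the process is suspended
    private final Map<String, String> held = new LinkedHashMap<>();

    /** Returns the named work queue, creating it on first use. */
    List<String> queue(String name) {
        return queues.computeIfAbsent(name, k -> new ArrayList<>());
    }

    void place(String queueName, String workitem) {
        queue(queueName).add(workitem);
    }

    void suspendProcess() {
        suspended = true;
    }

    /** Resuming delivers every workitem that was forwarded during the suspension. */
    void resumeProcess() {
        suspended = false;
        held.forEach((workitem, destination) -> queue(destination).add(workitem));
        held.clear();
    }

    /** The workitem always leaves the source queue; it arrives only if the process is running. */
    void forward(String workitem, String from, String to) {
        if (!queue(from).remove(workitem)) {
            throw new IllegalArgumentException(workitem + " is not in " + from);
        }
        if (suspended) {
            held.put(workitem, to);
        } else {
            queue(to).add(workitem);
        }
    }
}
```

Forwarding workitem-1 while the process is suspended empties workqueue-A without filling workqueue-B; after resumeProcess() the workitem is present in workqueue-B.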

Terminating a process

Terminating a process tells the Workflow subsystem to remove the process instance and its associated activities from the engine queue process. When a process is terminated, it cannot be recovered.

Activity functions

Suspending and resuming activities is similar to suspending and resuming processes, but enables more precision.

An activity can be in one of the following states:

This state	Means the workitem
Idle	Is not yet running and has not been suspended
Running	Is active and cannot be suspended
Suspended	Has been suspended; resuming the activity will reopen the workitem
Finished	Has been completed and forwarded to the next activity; cannot be suspended
Pending	Activity has finished executing and is waiting for the engine that owns the process to forward to the next activity. Applies only to presentation activities running in a cluster.

Suspending an activity

Suspending an activity tells the Workflow subsystem not to forward work to this activity in this process. Only idle activities may be suspended; attempting to suspend a running or finished activity will have no effect. Other combinations may also be ignored (such as attempting to suspend a process or activity that is already suspended).

IMPORTANT: Java activities cannot be suspended, so you cannot access this type of activity using the workflow administration client portlet or the administration APIs.
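The suspend rule for activities can be stated compactly in code. The sketch below is illustrative only: ActivitySuspendRule and its members are hypothetical names, and the enum merely mirrors the state names in the table above rather than any type in the Workflow API.

```java
/** Hypothetical encoding of the activity suspend rule; not part of the Workflow API. */
class ActivitySuspendRule {
    enum ActivityState { IDLE, RUNNING, SUSPENDED, FINISHED, PENDING }

    /**
     * Returns the state after a suspend request.
     * Only an idle, non-Java activity changes state; every other request is ignored.
     */
    static ActivityState suspend(ActivityState current, boolean javaActivity) {
        if (javaActivity) {
            return current;  // Java activities cannot be suspended
        }
        return current == ActivityState.IDLE ? ActivityState.SUSPENDED : current;
    }
}
```

For example, suspend(ActivityState.RUNNING, false) returns RUNNING unchanged, which corresponds to the request having no effect.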

Continuing with the example from the preceding section, assume there is an additional work queue (workqueue-C) but that workitem-1 is currently displayed in workqueue-A as before. Further assume that a forward from workqueue-B would result in the work appearing in workqueue-C (giving a simple A-B-C queue sequence).

If the workflow activity for workqueue-C is suspended (by someone selecting the Suspend Activity button), then a forward on workitem-1 (currently in workqueue-A) would result in the work showing up in workqueue-B, as expected. However, a forward from workqueue-B would not result in workitem-1 showing up in workqueue-C, since the associated activity is currently suspended.

Resuming an activity

Now assume that the activity for workqueue-C is resumed (by someone selecting the Resume Activity button). A refresh of workqueue-C will then reveal workitem-1, ready for updating and forwarding by the user.

Auditing runtime processes

The Workflow API provides the EbiAuditDelegate.getAuditInfo() method for getting audit information about workflow processes, activities, workitems, and workitem properties. Here is the method declaration:

  public Document getAuditInfo(
      String processId, String runNumber, String activityName,
      Date startDate, Date endDate)
    throws com.sssw.wf.client.EboAuditLoggerException

The DTD for the return document is available in your exteNd Director project directory at library/WorkflowService/DTD/workflow-auditlist.dtd.

NOTE: All parameter values are optional. To return all information, specify null for every parameter.
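The following sketch shows how the optional parameters might be combined. Because the exteNd Director libraries are not available here, it uses a stand-in interface (AuditSource) that copies the parameter list from the declaration above; AuditSource, AuditQuerySketch, fetchAll, and fetchForProcess are hypothetical names. How a real EbiAuditDelegate instance is obtained is not shown.

```java
import java.util.Date;
import org.w3c.dom.Document;

/** Hypothetical helper showing null-as-wildcard usage of getAuditInfo(). */
class AuditQuerySketch {

    /**
     * Stand-in with the same parameter list as the documented method.
     * The real method also declares com.sssw.wf.client.EboAuditLoggerException,
     * which is omitted here so the sketch compiles on its own.
     */
    interface AuditSource {
        Document getAuditInfo(String processId, String runNumber, String activityName,
                              Date startDate, Date endDate);
    }

    /** Null in every position requests all audit information. */
    static Document fetchAll(AuditSource source) {
        return source.getAuditInfo(null, null, null, null, null);
    }

    /** Filters on one process and a date window; the other filters stay null. */
    static Document fetchForProcess(AuditSource source, String processId, Date from, Date to) {
        return source.getAuditInfo(processId, null, null, from, to);
    }
}
```

The returned document would then be read according to workflow-auditlist.dtd.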

Generating runtime exception reports

The Workflow API includes exception-handling classes for the Workflow subsystem portlets, including EboEngineException, EboQueueException, and EboWorkitemException. By default, exception messages are written to the server console. However, you can have the messages sent by e-mail to a specified user. The setting for this option is in config.xml.

Procedure

To configure exception messages to be sent to an administrator:

Outside exteNd Director, open the following file:

  My_Project/library/WorkflowService/WorkflowService-conf/config.xml

Uncomment the following lines:

  <!--
  <property>
    <key>WorkflowService/administrator-host</key>
    <value>SomeHost</value>
  </property>
  <property>
    <key>WorkflowService/administrator-address</key>
    <value>SomeAddress</value>
  </property>
  -->

NOTE: After you uncomment the properties, you can access them in the XML Editor in exteNd Director.

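Set administrator-host to the hostname of your SMTP mail server, and set administrator-address to a valid e-mail address for the user or admin group that should receive the messages. For example: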

  <property>
    <key>WorkflowService/administrator-host</key>
    <value>SMTP mail server hostname</value>
  </property>
  <property>
    <key>WorkflowService/administrator-address</key>
    <value>admin@address.com</value>
  </property>

Save the file.
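After saving, you may want to confirm that both properties are active rather than still commented out. The sketch below parses a configuration fragment with the standard Java XML parser and checks for the two keys. It is an illustration only: ConfigPropertyCheck and its methods are hypothetical, and the sketch assumes nothing about config.xml beyond the property, key, and value elements shown above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical check that the administrator mail properties are uncommented. */
class ConfigPropertyCheck {
    static final String HOST_KEY = "WorkflowService/administrator-host";
    static final String ADDRESS_KEY = "WorkflowService/administrator-address";

    /** Collects every property element as a key/value pair. Commented-out properties are not seen. */
    static Map<String, String> readProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> result = new LinkedHashMap<>();
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String key = property.getElementsByTagName("key").item(0).getTextContent().trim();
                String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
                result.put(key, value);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }

    /** True only when both mail properties are present and non-empty. */
    static boolean mailSettingsPresent(Map<String, String> properties) {
        String host = properties.get(HOST_KEY);
        String address = properties.get(ADDRESS_KEY);
        return host != null && !host.isEmpty() && address != null && !address.isEmpty();
    }
}
```

Because XML comments are not part of the element tree, mailSettingsPresent() returns false while the block is still commented out.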

Configuring workflow to run in a cluster

The workflow subsystem provides support for clustering. A cluster is a group of application servers running on different hosts that share the processing load for a single application. In a cluster configuration, clients interact with the cluster as if it were a single high-performance server. Clustering offers several benefits, including scalability and high availability.

To ensure that each workflow process instance executes successfully in a cluster, the workflow subsystem carefully manages the state of the process throughout its execution. The workflow subsystem also manages state information for the workitem and the queue.

What you need to do

To deploy a workflow application to run in a cluster, you need to deploy the project that contains the workflow to each server in the cluster. Then you start each server with a system property that specifies the ID for the workflow engine.

Deploy the application to each server

To run a workflow application in a cluster, you must first deploy the application to each server in the cluster. The procedure for deploying the application is the same as for any exteNd Director application. The only restriction you need to be aware of is that all of the servers within the cluster must share a single exteNd Director database.

For more information, see the chapter on deploying applications in Developing exteNd Director Applications.

Start each server with a unique engine ID

A workflow process instance must execute on a single workflow engine running within a cluster. To bind a process instance to an engine, the workflow subsystem associates the process with an engine ID that is specified when the server is first started. The system property used to specify the engine ID is com.novell.afw.wf.engine-id. This property setting overrides the value specified in the config.xml file for the workflow subsystem.

The standard format for specifying system properties for the server is:

  -DpropName=propvalue

You can use this format to specify the engine ID. For example, when starting the exteNd Application Server, you would include this argument to set the engine ID:

  +Dcom.novell.afw.wf.engine-id=engine-id-value

The value you specify for the engine ID is a logical name that can include numbers or letters.

IMPORTANT: The workflow administrator is responsible for ensuring that each engine ID value is unique for all the workflow engine instances running in the cluster.
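The override behavior and the uniqueness requirement can be sketched as follows. Only the property name com.novell.afw.wf.engine-id comes from this chapter; EngineIdSketch, resolveEngineId, and duplicates are hypothetical helpers, and the way the workflow subsystem actually reads the property is not documented here.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helpers for planning engine IDs in a cluster. */
class EngineIdSketch {
    static final String ENGINE_ID_PROPERTY = "com.novell.afw.wf.engine-id";

    /** The system property, when set, takes precedence over the config.xml value. */
    static String resolveEngineId(String configValue) {
        return System.getProperty(ENGINE_ID_PROPERTY, configValue);
    }

    /** Returns the IDs that appear more than once in a planned cluster layout. */
    static Set<String> duplicates(List<String> engineIds) {
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new TreeSet<>();
        for (String id : engineIds) {
            if (!seen.add(id)) {
                repeated.add(id);
            }
        }
        return repeated;
    }
}
```

An administrator could run duplicates() over the list of IDs assigned to each server before starting the cluster; an empty result means every ID is unique.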

Once started by an engine running on a particular server, a workflow process instance can only run and complete on that server. This ensures that the workflow process executes safely. However, it does not provide process instance failover support. If a server in the cluster crashes, the process instance will not be restarted until an engine with the same ID is restarted.

What to do if a machine cannot be recovered

If a server machine cannot be restarted because of a serious hardware or software failure, you can start the application server on a new machine, associating the server with the workflow engine ID that was specified on the unrecoverable machine. Since the engine ID is a logical name, not a direct mapping to the physical machine on which the engine was running, the dangling process instance will complete successfully on the new machine.

What happens at server restart

When a server is restarted after a crash, the workflow engine restarts automatically rather than waiting for a client component to make a request. At this point, the engine restarts any processes that were left dangling at the time of the crash. These processes in turn forward execution to any activities that have not been completed.

Process restart is handled by the engine

All logic required for process startup is handled by the workflow engine. During process restart, the engine restarts only those processes whose engine ID matches its own. Therefore, each process instance is restarted by only one server in the cluster. When the engine is restarted, it reads process state information from the WFPROCESSTATE table in the exteNd Director database.

Activity dispatch is handled by the process

All logic required for activity dispatch (forwarding of execution) is handled by the process. Any activities (user or system activities) that have completed successfully (including corresponding database updates) are not reexecuted when the process is restarted.

Process instances are tied to the engine that started the process. However, a user may log on to any engine in a cluster to execute a presentation activity (user activity or pageflow activity). The forward from a presentation activity to the next activity in the workflow must execute on the engine that started the workflow process instance. When a workflow engine finishes executing a presentation activity for a process that it does not own, the activity state in the database is set to PENDING. The engines in the cluster periodically poll the activity state table looking for activities that belong to a process they own. Whenever an engine finds one of these activities, it forwards to the next activity in the process.
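One polling pass can be pictured as a filter over the activity state rows. This is a sketch under stated assumptions: the ActivityRow fields are hypothetical and do not reflect the real table schema, and the sketch does not model the polling interval or the forward itself.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of one polling pass over the activity state table. */
class PendingPollSketch {

    /** Simplified row; the field names are assumptions, not the actual column names. */
    static final class ActivityRow {
        final String activityName;
        final String owningEngineId;  // engine that started the process instance
        final String state;

        ActivityRow(String activityName, String owningEngineId, String state) {
            this.activityName = activityName;
            this.owningEngineId = owningEngineId;
            this.state = state;
        }
    }

    /** Returns the PENDING activities whose process is owned by the polling engine. */
    static List<String> pendingFor(String engineId, List<ActivityRow> rows) {
        List<String> toForward = new ArrayList<>();
        for (ActivityRow row : rows) {
            if ("PENDING".equals(row.state) && engineId.equals(row.owningEngineId)) {
                toForward.add(row.activityName);
            }
        }
        return toForward;
    }
}
```

Each engine would forward only the activities returned for its own ID, so a presentation activity completed on another server is still advanced by the engine that owns the process.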

Activity state management

To keep track of which activities should be executed at restart time, the system stores activity states in the WFACTIVITYSTATE table in the exteNd Director database. These activity states provide checkpoints that can be used to determine where execution should resume after a restart.

The activity states are as follows:

Activity state	Description
Idle	Activity is not executing but can be dispatched to.
Running	Activity is executing.
Suspended	Activity has been suspended.
Finished	Activity has finished execution.
Pending	Activity has finished executing and is waiting for the engine that owns the process to forward to the next activity. Applies only to presentation activities running in a cluster.

Once an activity is finished processing, it is transitioned to Idle.

Here's a scenario that illustrates what happens when two activities (activity A and activity B) are executed sequentially.

Before execution is dispatched to activity A, the process makes an entry in the WFACTIVITYSTATE table that contains the name of the activity and its state. The entry has the values {"Activity A", "Running"}.
Activity A is executed.
After activity A finishes executing, the entry is modified to be {"Activity A", "Finished"}.
Before execution is dispatched to activity B, the process makes an entry for this activity {"Activity B", "Running"}. Following that, the entry for activity A is removed.

When the engine restarts a process, the process uses data in the WFACTIVITYSTATE table to determine where to resume execution within the process. The data in the WFACTIVITYSTATE table indicates the last finished activity of every executing branch. This information is treated as a kind of checkpoint from which processing can resume. Once the last finished activity is identified, execution is resumed with the following activity.
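The checkpoint scenario above can be traced with a small in-memory stand-in for the WFACTIVITYSTATE table. The sketch is illustrative only: CheckpointSketch and its methods are hypothetical, it handles a single sequential branch, and it assumes that an entry still marked Running at restart is dispatched again because that activity did not complete.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical single-branch model of activity checkpoints. */
class CheckpointSketch {
    private final List<String> order;  // activities in execution order
    private final Map<String, String> table = new LinkedHashMap<>();  // stand-in for WFACTIVITYSTATE

    CheckpointSketch(List<String> order) {
        this.order = order;
    }

    /** Write the Running entry before dispatch, then remove the predecessor's entry. */
    void dispatch(String activity) {
        table.put(activity, "Running");
        int index = order.indexOf(activity);
        if (index > 0) {
            table.remove(order.get(index - 1));
        }
    }

    /** Mark the activity as finished once it has executed. */
    void finish(String activity) {
        table.put(activity, "Finished");
    }

    /** Activity to execute after a restart, or null when nothing remains. */
    String resumePoint() {
        if (table.isEmpty()) {
            return order.isEmpty() ? null : order.get(0);
        }
        Map.Entry<String, String> checkpoint = table.entrySet().iterator().next();
        int index = order.indexOf(checkpoint.getKey());
        if ("Finished".equals(checkpoint.getValue())) {
            // Resume with the activity that follows the last finished one.
            return index + 1 < order.size() ? order.get(index + 1) : null;
        }
        return checkpoint.getKey();  // assumption: an unfinished activity is dispatched again
    }

    Map<String, String> table() {
        return table;
    }
}
```

With the order Activity A, Activity B, a restart after activity A finishes yields Activity B as the resume point, and after activity B is dispatched the table holds only the entry for activity B.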

Workitem state management

Workitems retrieved from the engine always update their state directly from the database, rather than from a cache. Therefore, all process engines running in a cluster access queue, workitem, and lock state that is consistent and up-to-date.

Workitem changes made by a user on one machine in the cluster are not overwritten by changes made by another user on a different machine. The workflow subsystem uses an optimistic concurrency control that relies on the use of the SQL WHERE clause to safeguard changes made to the workitem. Each UPDATE statement issued against the database includes a WHERE clause that specifies some of the values initially retrieved. If the record has been modified by another engine since it was first retrieved, the update operation fails.
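The guarded update can be illustrated without a database. In the sketch below, the SQL string shows the general shape of such a statement, and an in-memory map emulates the effect of the WHERE clause. The table name, column names, and the OptimisticUpdateSketch class are hypothetical; the actual statements issued by the workflow subsystem are not documented here.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical emulation of an optimistic, WHERE-guarded workitem update. */
class OptimisticUpdateSketch {
    // General shape of a guarded statement; the names are illustrative assumptions.
    static final String GUARDED_UPDATE =
            "UPDATE WORKITEM SET STATUS = ? WHERE ID = ? AND STATUS = ?";

    // id -> status, standing in for the database table
    private final ConcurrentMap<String, String> rows = new ConcurrentHashMap<>();

    void insert(String id, String status) {
        rows.put(id, status);
    }

    String read(String id) {
        return rows.get(id);
    }

    /**
     * Succeeds only if the row still holds the value that was first read,
     * which corresponds to the WHERE clause matching one row instead of none.
     */
    boolean update(String id, String valueFirstRead, String newValue) {
        return rows.replace(id, valueFirstRead, newValue);
    }
}
```

If two engines read the same workitem and both attempt an update, the first update succeeds and the second returns false, because the stored value no longer matches what the second engine originally read.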