Run Book Orchestration
Automate any data center IT workflow with PlateSpin Orchestrate
Written by Adam Spiers, Till Franke and Bill Tobey
First, a brief disclaimer: this is not exactly an article about run book automation. Within Novell and the industry at large, that term has taken on the sense of workflow-driven IT automation in a management environment that includes centralized process definition, monitoring and compliance. Today, we'll be talking about automation in the data center, but at a lower level, using the workflow execution and job scripting features of PlateSpin Orchestrate. It’s an approach that can be very powerful and flexible for anyone who has a specific problem to solve and is comfortable with a little scripting. It can be ideal for consultants, and can provide the first step in an incremental approach to more complete process automation and governance functions.
Our main purpose here is to show off the automation features of PlateSpin Orchestrate that tend to be overlooked as we emphasize its capabilities as a virtual machine manager within a larger solution for intelligent workload management.
The Run Book: A Coping Mechanism for Data Center Complexity
For the uninitiated, a run book is an IT staff’s hard copy collection of system management cheat sheets. Data centers, of course, are like snowflakes—no two alike. Massive diversity of platforms, hardware and applications is the norm, as is some degree of specialization in the IT staff that manages systems, storage, network and front-line operations. Knowledge management is a chronic problem. How do you document dozens or hundreds of sequence-sensitive, dependency-ridden procedures for startup, maintenance, problem diagnosis and recovery so each instance can be executed successfully?
The traditional answer is the run book, a hard-copy collection of rough workflows, hints, approximations and warnings—invariably incomplete, with gaps guaranteed to occur at the most critical location. Today’s version is more apt to be a wiki repository than a loose-leaf binder, but the core challenges are unchanged. Run books are hard to write, hard to maintain and usually even harder to read.
The Answer Is Automation
One obvious option is to automate as many procedures as is feasible and sensible. Given the choice between reading a lengthy instruction set of questionable reliability or running a script, most of us will instantly choose the latter, usually with better results. Automation can improve productivity, cut reaction times and reduce errors in many areas of IT operations. Various analysts have called out data center automation as an important strategy for meeting SLA commitments within tightening budgets and headcount constraints. IT decision makers have been advised to look for run book automation, process automation and orchestration tools that can be implemented incrementally, with the caveat that such tools must support all the diverse elements of the typically heterogeneous data center: physical and virtual machines, network devices, middleware, applications and databases. They must also integrate with all the existing management tools and processes already in place.
Orchestrate to Automate
PlateSpin Orchestrate makes an ideal candidate for these types of automation applications, wherever in the data center they occur. Sometimes perceived as a virtual machine manager chiefly notable for its hypervisor-agnostic backend, PlateSpin Orchestrate is actually an advanced data center management product that originated in the high-performance computing space as a grid management tool. (See Figure 1.) It’s a Java-based, multi-platform, distributed automation tool designed to manage all network resources in environments that scale from tens to thousands of physical or virtual resources.
PlateSpin Orchestrate provides front-end integration through a Java API, and automation through an embedded Python engine. It manages any physical resource with a JVM, and virtual resources running under any of the leading hypervisors, including VMware ESX, VMware ESXi, Xen and Microsoft Hyper-V. It uses constraints and rule-based execution policies to manage resource allocation dynamically, and integrates with identity management solutions to provide authentication and policy-based authorization.
In short, PlateSpin Orchestrate is a general-purpose IT workflow automation product whose applications are in no way limited to run book automation. But run book automation offers an appealing entry-level automation target with short implementation cycles, fast ROI, limited integration complexity and the opportunity to incrementally address workload optimization and other more complex implementations.
Many Existing Successes
The combination of a highly capable tool set and a long list of low-hanging automation opportunities with significant ROI hasn’t gone unnoticed. The list of successful PlateSpin Orchestrate-based run book automation projects is already long and growing quickly. A few examples:
SAP Business Intelligence Accelerator failover – A German company with SAP BIA installations for disaster recovery located in two cities had a failover process so complex the run book ran to 40 pages of step-by-step instructions, with dependencies on both sides. Some steps had to be run on one server, others on up to 16. Using PlateSpin Orchestrate as the workflow engine, a group of Novell and HP engineers needed just two days to produce a proof-of-concept that automates the more complex steps on the infrastructure side, including storage system failover, SAP cleanup and restart.
Administrator password reset – When a system administrator leaves an organization it is standard practice to change the passwords on all systems, including some legacy resources that may only be powered up intermittently. PlateSpin Orchestrate is able to run predefined jobs on resources as they join the grid, including password resets. For one organization, this function alone justified the capital costs of PlateSpin Orchestrate deployment.
Service desk ticket enrichment – PlateSpin Orchestrate has been used to automate initial intelligence gathering on trouble tickets. When a new incident is reported, the ticketing system triggers a PlateSpin Orchestrate job to identify all systems associated with that service, run a support script on those systems and append the results to the ticket. This is an example of an application in which the job script can acquire additional intelligence over time. It’s also an application that can benefit from the availability of a configuration management database.
Check system health – PlateSpin Orchestrate has been used in various applications to check specific measures of system health, and to automatically initiate remedial responses. Examples include checks for full file systems, and checks on various application and system processes.
Public key handling – PlateSpin Orchestrate has been used to automate the management of SSL/TLS keys for servers. A regularly scheduled job checks all systems for certificate validity, generates new certificate signing requests for those that are expiring, collects the CSRs and distributes new certificates. A similar process is used with SSH keys to automatically update known host and authorized key lists.
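Two of the checks described above, full file systems and expiring certificates, can be sketched in a few lines of plain Python. This is a minimal illustration using only the standard library, not PlateSpin Orchestrate job code: the function names, the 90 percent threshold and the 30-day warning window are assumptions made for the example, and a real job would run such checks on each resource and report the results back to the server.

```python
import shutil
import ssl
import time

SECONDS_PER_DAY = 86400


def filesystem_usage_percent(path):
    """Return the percentage of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def find_full_filesystems(paths, threshold_percent=90.0,
                          usage_fn=filesystem_usage_percent):
    """Return (path, percent) pairs for every path at or above the threshold.

    The threshold is an assumed default; `usage_fn` can be swapped out
    so the logic can be exercised without touching real disks.
    """
    full = []
    for path in paths:
        percent = usage_fn(path)
        if percent >= threshold_percent:
            full.append((path, round(percent, 1)))
    return full


def certificate_needs_renewal(not_after, warn_days=30, now=None):
    """Return True if a certificate expires within `warn_days` days.

    `not_after` is the certificate's notAfter field in its usual text
    form, for example "Jan  5 09:34:43 2018 GMT". The 30-day window is
    an assumed default.
    """
    if now is None:
        now = time.time()
    expires_at = ssl.cert_time_to_seconds(not_after)
    return expires_at - now < warn_days * SECONDS_PER_DAY


if __name__ == "__main__":
    for path, percent in find_full_filesystems(["/"]):
        print(f"{path} is {percent}% full")
    print(certificate_needs_renewal("Jan  5 09:34:43 2018 GMT"))
```

Keeping the decision logic in small functions like these makes it easy to tighten thresholds or add new checks as the job script matures.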
A Simple Job Script Sample: Baseline
Let’s take a look at a very simple example of the type of JDL script used to automate a workflow in PlateSpin Orchestrate. This illustration (see Figure 2) is a basic configuration file management task you might use to ensure that a certain configuration file remains identical across a number of machines. The job has two modes of operation. It can do a check, which compares the file contents on all designated resources with a baseline version and reports all discovered differences for subsequent action. Or it can take a more aggressive approach and simply overwrite any versions that have diverged from the baseline.
This job does three things:
- Job code running on the PlateSpin Orchestrate server retrieves the authoritative baseline file and copies it to the data grid, where it can be seen by all other resources.
- Joblets running on the targeted resources compare their own configuration files with the authoritative file contents.
- If necessary, and only when the job is run in overwrite mode (jobargs.mode == "put"), the joblets remediate any unwanted differences by replacing the divergent file with the authoritative version from the master server.
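The compare-and-remediate logic in these steps can be sketched in plain Python. This is not the JDL script from Figure 2 and does not use the PlateSpin Orchestrate job API; it only illustrates what a joblet might do once the baseline file is visible to it. The function names and the "check" mode name are assumptions made for the example; only the "put" mode name comes from the job description above.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, or None if it does not exist."""
    path = Path(path)
    if not path.is_file():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def apply_baseline(baseline, target, mode="check"):
    """Compare `target` with `baseline` and optionally overwrite it.

    mode "check" only reports; mode "put" replaces a divergent or
    missing target with the baseline contents.
    Returns "identical", "differs" or "replaced".
    """
    if mode not in ("check", "put"):
        raise ValueError(f"unknown mode: {mode!r}")

    baseline_digest = file_digest(baseline)
    if baseline_digest is None:
        raise FileNotFoundError(f"baseline file not found: {baseline}")

    if file_digest(target) == baseline_digest:
        return "identical"
    if mode == "put":
        # Overwrite the divergent copy with the authoritative version.
        shutil.copyfile(baseline, target)
        return "replaced"
    return "differs"
```

Calling apply_baseline with mode="check" on every machine first gives a report of divergent copies that can be reviewed before anything is overwritten with mode="put".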
Now obviously, a simple script like this one will never threaten the livelihood of any entrenched configuration management tool. But if you harness this type of automation in a management environment where additional information about resource state, health, workload and events can be leveraged along with rules, policies and constraints, the possibilities for improving operational efficiency, reducing costs and enhancing service level performance become hard to ignore.
Find Out More
For more information on PlateSpin Orchestrate and its various applications in data center automation and virtual machine management, visit PlateSpin Orchestrate or contact product manager Jo De Baer at JDeBaer@novell.com.