Making Scripts Portable

Novell Cool Solutions: Feature
By Simon Nattrass

Posted: 14 Jul 2004
 

Abstract
Portable shell scripting is something of a black art, since with the evolution and derivation of the UNIX shell, the definition of "portable" is perhaps ambiguous. This paper explores the best practices involved in creating such elusive scripts.

Contents

1. The Shell Landscape
2. Which Shell?
3. Feature and Syntax Consideration
4. Common Shell Syntax Matrix
5. The Rest

The Shell Landscape

The evolution of the shell mirrors the splintered history of the Unix platform. Starting from the original 1970s Bourne Shell, each new shell has borrowed features from its predecessors, resulting today in a web of related shells, each very similar yet subtly different.

The suite of shells falls roughly into two families: those derived from or related to the original Bourne shell, and those derived from the C Shell. This is represented in the diagram below with the major predecessor influences. Note that the Z Shell has been displayed outside these two common groups, as it can be argued to fall into either: it is largely derived from the Korn shell, yet borrows features from the C Shell family. Other shells not displayed but worthy of note are rc, from the Plan 9 operating system, and es, derived for UNIX from rc.

Note: The Trusted C Shell (tcsh) is also referred to as the Toronto C Shell, Extended C Shell and TENEX C Shell.

It is exactly this diversity which causes the portability problem: not all platforms support all shells, and not all shells provide all features. When considering a shell platform, it would not be unreasonable to assume that it would be UNIX or UNIX-like, yet Windows should also be considered, given the evolution and adoption of UNIX tools for Windows such as Cygwin and UWin.

Which Shell?

With such a diversity of shells, the accepted best choice for achieving optimal portability is the lowest common denominator, the shell which is available on all platforms: the original Bourne Shell. This decision is reinforced by the general consensus that the C Shell family is best avoided due to bugs in its implementations and flaws in its design. See http://www.faqs.org/faqs/unix-faq/shell/csh-whynot/ for further information.

But what about Linux? To complicate matters further, the original Bourne shell doesn't exist on Linux. You might see /bin/sh, but further examination shows that it's simply a link to bash, the Bourne Again Shell from GNU, which is the default on Linux. When bash is executed as sh, it attempts to emulate sh as closely as possible, but ultimately fails to warn about syntax which would be incompatible with a pure Bourne shell. Although it might be desirable to use this default, bash is not distributed with many variants of UNIX. Thus an alternative is to use the Almquist Shell (ash), which is a pared-down implementation of the original Bourne Shell.

Shells available on SUSE include: ash, bash, pdksh, tcsh and zsh.

Feature and Syntax Consideration

Although using the Bourne shell is the "safest" option, not all Bourne shells are equal, particularly where another shell emulating Bourne shell compatibility is used (which is the case for most Linux distributions). In such a compatibility mode it is easy to mistakenly use extended features which are not pure Bourne shell compatible, and for the script to work in the emulated environment without warning.

The first two bytes of a shell script are expected to be #!, hence the very first line must begin with the characters #!, followed by a single space and then the fully qualified path to the shell executable. The example below numbers the lines and columns to reinforce this concept.

  1234567890
1 #! /bin/sh

The absolute pathname can cause problems when the shell executable is located in a non-standard location, or when the path is longer than the 32-character limit imposed by some systems.

Occasionally scripts omit the space between the #! and the shell executable; this form is not recognized by many older shells.

Very old Bourne scripts may replace the magic #! with a single colon (:) as the first character, followed by a newline. This arcane format has long been abandoned and should not be used.

Any code which is tightly bound to the host operating system is a bad idea, but on those rare occasions where it cannot be avoided it is advisable to switch on uname and wrap the respective sections.
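
The switch on uname described above might look like the following minimal sketch. The variable names OS and PLATFORM and the set of platforms handled are illustrative assumptions, not part of the original article; only the strings matched against the output of uname -s are standard.

```sh
#! /bin/sh
# Sketch: isolate platform-specific behavior behind one switch on uname.
# OS and PLATFORM are hypothetical names chosen for this example.
OS=`uname -s`

case "$OS" in
  Linux)
    PLATFORM="linux"
    ;;
  SunOS)
    PLATFORM="solaris"
    ;;
  HP-UX)
    PLATFORM="hpux"
    ;;
  AIX)
    PLATFORM="aix"
    ;;
  CYGWIN*)
    PLATFORM="cygwin"
    ;;
  *)
    # Unknown systems fall through to a safe default.
    PLATFORM="unknown"
    ;;
esac

echo "Running on: $OS ($PLATFORM)"
```

Keeping the switch in one place means the rest of the script can test a single variable rather than repeating calls to uname.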

A chief culprit for portability problems is the use of external utilities within the script which are either not commonly available or whose input flags differ across platforms. The "palette" of commands which are considered safe is:

cat             mkdir
cmp             mv
cp              pwd
diff            rm
echo            rmdir
egrep           sed
expr            sleep
false           sort
grep            tar
install-info    test
ln              touch
ls              true
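
When a script depends on a utility, one cautious approach is to confirm it is present before using it. The following is a minimal sketch, assuming a search of $PATH with only Bourne constructs and the test command; the variable names util, found and save_ifs are hypothetical, and sed is used merely as an example.

```sh
#! /bin/sh
# Sketch: look for a utility in each directory listed in $PATH.
# util, found and save_ifs are illustrative names, not from the article.
util=sed
found=""

save_ifs="$IFS"
IFS=:
for dir in $PATH
do
  # An empty PATH element means the current directory.
  test -z "$dir" && dir=.
  if test -f "$dir/$util"
  then
    found="$dir/$util"
    break
  fi
done
IFS="$save_ifs"

if test -n "$found"
then
  echo "$util found at $found"
else
  echo "$util not found in PATH"
fi
```

This only establishes that the utility exists; differences in the flags it accepts still have to be handled separately.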

Avoid aliases and functions; without functions there are also no local variables (local to a function). When exporting a variable, initialize the variable and export it on separate lines.

Pure Bourne          Bash
MYVAR="foo"          export MYVAR="foo"
export MYVAR

Forgo any use of [ ... ], preferring test instead. The two are equivalent and in all but the oldest of shells, test is a built-in command and incurs no additional overhead. For more details refer to chapter 22.2.6 of the "GNU Autoconf, Automake and Libtool" book at: http://sources.redhat.com/autobook/

Conditional with [ ... ]     Equivalent using test
if [ -f foo.c ]              if test -f foo.c
then                         then
  ...                          ...
fi                           fi

Abstain from the use of [[ ... ]] and $(( ... )), which are extensions found in the Korn and Bourne Again shells. The former specifies a conditional, which may be replaced with the test command as previously mentioned, while the latter is an arithmetic expression, which may be substituted with a call to expr which, although slower, provides greater portability.

Conditional with [[ ... ]]   Equivalent using test
if [[ -f foo.c ]]            if test -f foo.c
then                         then
  ...                          ...
fi                           fi

Expression with $(( ... ))   Equivalent using expr
x=$(($x+1))                  x=`expr $x + 1`
Note: another valid equivalent would be x=$(expr $x + 1), however the $(command) is a modern enhancement to the Bourne shell.
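
As an illustration of the expr substitution above, here is a minimal sketch of a counting loop that uses only test, expr and backquotes. The variable names i and total are chosen for this example.

```sh
#! /bin/sh
# Sketch: sum the integers 1 to 5 without $(( ... )).
i=1
total=0
while test $i -le 5
do
  total=`expr $total + $i`
  i=`expr $i + 1`
done
echo "total=$total"
```

Note that expr exits with a non-zero status when the result of the expression is 0, which matters if the script checks the exit status of every command.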

While wildcard expansion (also known as globbing) via * and ? is supported, brace expansion should be avoided.

$ ls my_{finger,toe}s
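
A portable substitute for the brace expansion above is to build the list of names explicitly. This is a minimal sketch; the variable names part and files are illustrative.

```sh
#! /bin/sh
# Sketch: build the names my_fingers and my_toes without brace expansion.
files=""
for part in finger toe
do
  files="$files my_${part}s"
done
# Unquoted on purpose so the list splits into separate words.
echo $files
```

The resulting list can then be passed to a command in place of the brace expression, for example ls $files.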

The $(command) form may be supported in many modern Bourne shells, but for pure Bourne shells the use of `command` is favored, although this method tends to lead to confusion when the backquotes contain the characters $, ` and \. In such cases remember to use the character \ to escape them.
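
The most common case needing such escapes is one command substitution nested inside another. The following minimal sketch uses only echo and expr; the variable names word and length are illustrative.

```sh
#! /bin/sh
# Sketch: nested backquotes. The inner pair is escaped with \ so that
# the outer substitution is not terminated early.
word="portable"
length=`expr \`echo $word\` : '.*'`
echo "$word has $length characters"
```

Where nesting makes a line hard to read, storing the inner result in a variable first avoids the escapes altogether.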

For a complete functionality matrix of shell features see: http://www.faqs.org/faqs/unix-faq/shell/shell-differences/

Common Shell Syntax Matrix

sh               bash                       ksh             csh             Meaning/Action
$                $                          $               %               Prompt
n/a              >|                         >|              >!              Force redirection
n/a              n/a                        n/a             >>!             Force append
> file 2>&1      &> file or > file 2>&1     > file 2>&1     >& file         Redirect stdout and stderr to file
n/a              { }                        n/a             { }             Expand elements in list
`command`        `command` or $(command)    $(command)      `command`       Substitute output of enclosed command
$HOME            $HOME                      $HOME           $home           Home directory
n/a              ~                          ~               ~               Home directory symbol
n/a              ~+, ~-, dirs               ~+, ~-          =-, =N          Access directory stack
var=value        var=value                  var=value       set var=value   Variable assignment
export var       export var=val             export var=val  setenv var val  Set environment variable
n/a              ${nnnn}                    ${nn}           n/a             More than 9 arguments can be referenced
"$@"             "$@"                       "$@"            n/a             All arguments as separate words
$#               $#                         $#              $#argv          Number of arguments
$?               $?                         $?              $status         Exit status of the last executed command
$!               $!                         $!              n/a             PID of the last executed backgrounded process
$-               $-                         $-              n/a             Current options
. file           source file or . file      . file          source file     Read commands in file
n/a              alias x='y'                alias x=y       alias x y       Name x stands for command y
case             case                       case            switch / case   Choose alternatives
done             done                       done            end             End a loop statement
esac             esac                       esac            endsw           End case or switch
exit n           exit n                     exit n          exit (expr)     Exit with a status
for/do           for/do                     for/do          foreach         Loop through variables
n/a              set -f, set -o nullglob|   n/a             noglob          Ignore substitution characters for filename generation
                 dotglob|nocaseglob|noglob
hash             hash                       alias -t        hashstat        Display hashed commands (tracked aliases)
hash cmds        hash cmds                  alias -t cmds   rehash          Remember command locations
hash -r          hash -r                    n/a             unhash          Forget command locations
n/a              history                    history         history         List previous commands
n/a              ArrowUp+Enter or !!        r               !!              Redo previous command
n/a              !str                       r str           !str            Redo last command that starts with "str"
n/a              !cmd:s/x/y/                r x=y cmd       !cmd:s/x/y/     Replace "x" with "y" in most recent command starting with "cmd", then execute
if [ $i -eq 5 ]  if [ $i -eq 5 ]            if ((i==5))     if ($i==5)      Sample condition test
fi               fi                         fi              endif           End if statement
ulimit           ulimit                     ulimit          limit           Set resource limits
pwd              pwd                        pwd             dirs            Print working directory
read             read                       read            $<              Read from terminal
trap 2           trap 2                     trap 2          onintr          Ignore interrupts
n/a              unalias                    unalias         unalias         Remove aliases
until            until                      until           n/a             Begin until loop
while/do         while/do                   while/do        while           Begin while loop

The Rest

There are other languages that scripts could be written in, most notably Perl or Python, and beyond these Tcl, Ruby, Java, etc. Of course, the script would no longer work with one of the common shells, and thus the first line of the script would change:

#! /usr/local/bin/perl

However, all of these rely on the prerequisite interpreter or compiler being present, which negates the aim of portability. In addition, it can be argued that these languages represent a shift toward programming as opposed to scripting, where "scripting" is defined as the "automation of tasks normally carried out interactively at the keyboard" and a program extends beyond this limitation. The boundary between a scripting language and a programming language is, however, somewhat vague and subjective, so readers may draw their own conclusions.

Summary

This document has covered the main areas in portable scripting, a topic for which there are a good many resources available, although there exists no one source for all answers. Ultimately it is recognized that the best approach for portability is the regression to the oldest shell, with the exclusion of many of the useful (but not portable!) features of its successors. For all further queries and questions beyond this document and associated references, the reader is advised to consult the comp.unix.shell newsgroup which contains a wealth of information and advice.
