Rute User's Tutorial and Exposition. 7. Shell Scripting
This chapter introduces you to the concept of computer programming.
So far, you have entered commands one at a time. Computer programming is merely
the idea of getting a number of commands to be executed that, in combination,
perform some unique and powerful function.
To execute a number of commands in sequence, create a file with a
.sh
extension, into which you will enter your commands. The
.sh extension
is not strictly necessary but serves as a reminder that the file contains special
text called a shell script. From now on, the word script will
be used to describe any sequence of commands placed in a text file. Now make the
file executable with chmod (for example, chmod 0755 myfile.sh),
which allows the file to be run in the way explained below.
Edit the file using your favorite text editor. The first line should be
#!/bin/sh
with no whitespace before it. (Whitespace means tabs and spaces and, in some contexts, newline,
that is, end-of-line, characters.)
This line dictates that the following program is a shell script, meaning
that it accepts the same sort of commands that you have normally been typing
at the prompt. Now enter a number of commands that you would like to be executed.
You can start with
|
echo "Hi there"
echo "what is your name? (Type your name here and press Enter)"
read NM
echo "Hello $NM"
|
Now, exit from your editor and type
./myfile.sh. This will execute the file, that is, cause the computer to read and act on
your list of commands (also called running the program). Note that typing
./myfile.sh is no different from typing
any other command at the shell prompt. Your file
myfile.sh has in fact
become a new UNIX command all of its own.
Note what the
read command is doing. It creates a pigeonhole called
NM, and then inserts text read from the keyboard into that pigeonhole.
Thereafter, whenever the shell encounters
NM, its contents are
written out instead of the letters NM (provided you write a
$ in front
of it). We say that
NM is a variable because its contents can
vary.
You can use shell scripts like a calculator. Try
|
echo "I will work out X*Y"
echo "Enter X"
read X
echo "Enter Y"
read Y
echo "X*Y = $X*$Y = $[X*Y]"
|
The
$[ and
] mean that everything between must be evaluated (substituted, worked out, or reduced to some simplified form)
as a numerical expression (a sequence of numbers with
+,
-,
*, etc. between them). You can, in fact, do a calculation at any time by typing a command such as
echo $[3*6+2*8+9]
at the prompt.
(Note that the shell that you are using allows such
$[ ]
notation. On some UNIX systems you will have to use the
expr
command to get the same effect.)
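As a hedged sketch of the alternatives to the $[ ] notation: the $(( )) form is the arithmetic expansion defined by the POSIX shell standard, and expr is an external command. The values 6 and 7 and the variable name RESULT are chosen only for illustration.

```shell
#!/bin/sh
X=6
Y=7

# Arithmetic expansion: the shell itself works out the expression.
RESULT=$((X*Y))
echo "X*Y = $X*$Y = $RESULT"

# The same calculation with the external expr command. The * is
# quoted so that the shell does not treat it as a file-name wildcard.
expr "$X" '*' "$Y"
```

Both commands should report the same product; which notation to prefer depends on how portable the script needs to be.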
The shell reads each line in succession from top to bottom: this is called program
flow. Now suppose you would like a command to be executed more than once--you
would like to alter the program flow so that the shell reads particular commands
repeatedly. The
while command executes a sequence of commands many
times. Here is an example (
-le stands for less than or equal):
|
N=1
while test "$N" -le "10"
do
    echo "Number $N"
    N=$[N+1]
done
|
The
N=1 creates a variable called
N and places the number
1 into it. The
while command executes all the commands between
the
do and the
done repetitively until the
test
condition is no longer true (i.e., until
N is greater than
10).
The
-le stands for less than or equal to. See
test(1) (that is, run
man 1 test)
to learn about the other types of tests you can do on variables. Also be aware
of how
N is replaced with a new value that becomes
1 greater
with each repetition of the
while loop.
You should note here that each line is a distinct command--the commands are
newline-separated. You can also have more than one command on a line
by separating them with a semicolon as follows:
|
N=1 ; while test "$N" -le "10"; do echo "Number $N"; N=$[N+1] ; done
|
(Try counting down from 10 with
-ge (greater than or equal).)
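One possible answer to this exercise, as a sketch: start N at 10, keep looping while N is greater than or equal to 1, and subtract 1 on each pass. The portable $(( )) arithmetic form is used here instead of $[ ], which is an assumption about the shell you are running.

```shell
#!/bin/sh
N=10
while test "$N" -ge "1"
do
    echo "Number $N"
    # Subtract 1 so that the test eventually becomes false.
    N=$((N-1))
done
```

When the loop finishes, N holds 0, the first value for which the -ge test fails.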
It is easy to see that shell scripts are extremely powerful, because any kind
of command can be executed with conditions and loops.
The
until statement is identical to
while except that the
reverse logic is applied. The same functionality can be achieved with
-gt (greater than):
|
N=1 ; until test "$N" -gt "10"; do echo "Number $N"; N=$[N+1] ; done
|
The
for command also allows execution of commands multiple times. It
works like this:
|
for i in cows sheep chickens pigs
do
    echo "$i is a farm animal"
done
echo -e "but\nGNUs are not farm animals"
|
The
for command takes each string after the
in, and executes
the lines between
do and
done with
i substituted
for that string. The strings can be anything (even numbers) but are often file names.
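To illustrate that the strings after the in can just as well be numbers, here is a small sketch that adds up a list of values. The numbers and the variable name TOTAL are made up for the example, and the $(( )) arithmetic form is assumed to be available in your shell.

```shell
#!/bin/sh
TOTAL=0
for i in 3 5 8
do
    echo "adding $i"
    # i takes each listed value in turn.
    TOTAL=$((TOTAL+i))
done
echo "total is $TOTAL"
```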
The
if command executes a number of commands if a condition is met
(
-gt stands for greater than,
-lt stands for less
than). The commands that run when the condition is met are the lines between the
then
and the
fi (``if'' spelled backwards).
|
X=10
Y=5
if test "$X" -gt "$Y" ; then
    echo "$X is greater than $Y"
fi
|
The
if command in its full form can contain as much as:
|
X=10
Y=5
if test "$X" -gt "$Y" ; then
    echo "$X is greater than $Y"
elif test "$X" -lt "$Y" ; then
    echo "$X is less than $Y"
else
    echo "$X is equal to $Y"
fi
|
Now let us create a script that interprets its arguments. Create a new script
called
backup-lots.sh, containing:
|
#!/bin/sh
for i in 0 1 2 3 4 5 6 7 8 9 ; do
    cp "$1" "$1.BAK-$i"
done
|
Now create a file
important_data with anything in it and then run
./backup-lots.sh important_data, which will copy the file 10 times
with 10 different extensions. As you can see, the variable
$1 has
a special meaning--it is the first argument on the command-line. Now let's
get a little bit more sophisticated (
-e tests whether the file exists):
|
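#!/bin/sh
if test "$1" = "" ; then
    echo "Usage: backup-lots.sh <filename>"
    # A non-zero exit code signals that the script was used incorrectly.
    exit 1
fi
for i in 0 1 2 3 4 5 6 7 8 9 ; do
    # Quoting keeps file names that contain spaces in one piece.
    NEW_FILE="$1.BAK-$i"
    if test -e "$NEW_FILE" ; then
        echo "backup-lots.sh: **warning** $NEW_FILE"
        echo " already exists - skipping"
    else
        cp "$1" "$NEW_FILE"
    fi
done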
|
A loop that requires premature termination can
include the
break statement within it:
|
#!/bin/sh
for i in 0 1 2 3 4 5 6 7 8 9 ; do
    NEW_FILE="$1.BAK-$i"
    if test -e "$NEW_FILE" ; then
        echo "backup-lots.sh: **error** $NEW_FILE"
        echo " already exists - exiting"
        break
    else
        cp "$1" "$NEW_FILE"
    fi
done
|
which causes program execution to continue on the line after the
done.
If two loops are nested within each other, then the command
break 2
causes program execution to break out of both loops; and so
on for values above
2.
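A minimal sketch of break 2 with two nested for loops follows. It searches a small grid of made-up numbers for the first pair whose product is 12 and then leaves both loops at once. The variable names a, b, and FOUND are illustrative, and the $(( )) arithmetic form is assumed to be supported by your shell.

```shell
#!/bin/sh
FOUND=""
for a in 1 2 3 4 ; do
    for b in 1 2 3 4 ; do
        if test "$((a*b))" -eq 12 ; then
            FOUND="$a*$b"
            # Leave the inner loop and the outer loop together.
            break 2
        fi
    done
done
echo "first match: $FOUND"
```

With a plain break, only the inner loop would end; the outer loop would carry on with a=4 and overwrite FOUND with a later match.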
The
continue statement is also useful for terminating
the current iteration of the loop. This means that if a
continue
statement is encountered, execution will immediately continue from
the top of the loop, thus ignoring the remainder of the body of
the loop:
|
#!/bin/sh
for i in 0 1 2 3 4 5 6 7 8 9 ; do
    NEW_FILE="$1.BAK-$i"
    if test -e "$NEW_FILE" ; then
        echo "backup-lots.sh: **warning** $NEW_FILE"
        echo " already exists - skipping"
        continue
    fi
    cp "$1" "$NEW_FILE"
done
|
Note that both
break and
continue work inside
for,
while, and
until loops.
We know that the shell can expand file names when given wildcards. For
instance, we can type
ls *.txt to list all files ending with
.txt.
This applies equally well in any situation, for instance:
|
#!/bin/sh
for i in *.txt ; do
    echo "found a file: $i"
done
|
The
*.txt is expanded to all matching files. These files
are searched for in the current directory. If you include an absolute path
then the shell will search in that directory:
|
#!/bin/sh
for i in /usr/doc/*/*.txt ; do
    echo "found a file: $i"
done
|
This example demonstrates the shell's ability to search for matching files and expand
an absolute path.
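One detail worth knowing: in the default configuration of most shells, a pattern that matches no files is left unexpanded, so the loop body runs once with the pattern itself. The sketch below guards against that with test -e. It wraps the loop in a function (a construct introduced later in this chapter) so that it can be pointed at any directory; the name count_txt is hypothetical.

```shell
#!/bin/sh
# count_txt is a hypothetical helper: it counts the .txt files in the
# directory given as its first argument.
count_txt ()
{
    COUNT=0
    for i in "$1"/*.txt ; do
        # If nothing matched, $i holds the pattern itself, which is not
        # the name of an existing file, so it is skipped here.
        if test -e "$i" ; then
            COUNT=$((COUNT+1))
        fi
    done
    echo "$COUNT"
}

count_txt .
```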
The
case statement can make a potentially complicated program very
short. It is best explained with an example.
|
#!/bin/sh
case $1 in
    --test|-t)
        echo "you used the --test option"
        exit 0
        ;;
    --help|-h)
        echo "Usage:"
        echo " myprog.sh [--test|--help|--version]"
        exit 0
        ;;
    --version|-v)
        echo "myprog.sh version 0.0.1"
        exit 0
        ;;
    -*)
        echo "No such option $1"
        echo "Usage:"
        echo " myprog.sh [--test|--help|--version]"
        exit 1
        ;;
esac
echo "You typed \"$1\" on the command-line"
|
Above you can see that we are trying to process the first argument to a program.
It can be one of several options, so using
if statements will result
in a long program. The
case statement allows us to specify several
possible statement blocks depending on the value of a variable. Note how each
statement block is separated by
;;. The strings before the
)
are glob expression matches. The first successful match causes that block to
be executed. The
| symbol enables us to enter several possible glob
expressions.
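Because the patterns are glob expressions, a case statement can also sort strings by shape rather than by exact value. The following sketch classifies a few made-up file names; the names and the variables KIND and SUMMARY are illustrative only.

```shell
#!/bin/sh
SUMMARY=""
for f in notes.txt backup.tgz 7zip README ; do
    case "$f" in
        *.txt|*.text)
            KIND="text"
            ;;
        *.tar.gz|*.tgz)
            KIND="archive"
            ;;
        [0-9]*)
            # Any name that begins with a digit.
            KIND="digit-first"
            ;;
        *)
            # A lone * matches anything, so it acts as the default case.
            KIND="unknown"
            ;;
    esac
    echo "$f: $KIND"
    SUMMARY="$SUMMARY $KIND"
done
```

Since only the first successful match is used, the catch-all * pattern has to come last.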
So far, our programs execute mostly from top to bottom. Often, code needs to
be repeated, but it is considered bad programming practice to repeat groups
of statements that have the same functionality. Function definitions provide
a way to group statement blocks into one. A function
groups a list of commands and assigns it a name. For example:
|
#!/bin/sh
# Portable definition; "function usage ()" is accepted by bash but not by every /bin/sh.
usage ()
{
    echo "Usage:"
    echo " myprog.sh [--test|--help|--version]"
}
case $1 in
    --test|-t)
        echo "you used the --test option"
        exit 0
        ;;
    --help|-h)
        usage
        exit 0
        ;;
    --version|-v)
        echo "myprog.sh version 0.0.2"
        exit 0
        ;;
    -*)
        echo "Error: no such option $1"
        usage
        exit 1
        ;;
esac
echo "You typed \"$1\" on the command-line"
|
Wherever the
usage keyword appears, it is effectively substituted
for the two lines inside the
{ and
}. There are obvious
advantages to this approach: if you would like to change the program usage
description, you only need to change it in one place in the code. Good programs
use functions so liberally that they never have more than 50 lines of program
code in a row.
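Functions can also be given arguments of their own. Inside the function body, $1, $2, and so on refer to the words written after the function name, not to the script's command-line arguments. A minimal sketch, with the hypothetical function name greet:

```shell
#!/bin/sh
greet ()
{
    # $1 and $2 here are the function's own arguments.
    echo "Hello $1, welcome to $2"
}

greet Alice UNIX
greet Bob "shell scripting"
```

The quoted "shell scripting" is passed as a single argument, so it arrives in $2 as one piece.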
Most programs we have seen can take many command-line arguments, sometimes in any order.
Here is how we can make our own shell scripts with this functionality. The command-line
arguments can be reached with
$1,
$2, etc. The script,
|
#!/bin/sh
echo "The first argument is: $1, second argument is: $2, third argument is: $3"
|
can be run with
|
./myfile.sh dogs cats birds
|
and prints
|
The first argument is: dogs, second argument is: cats, third argument is: birds
|
Now we need to loop through each argument and decide what to do with it. A script
like
|
for i in $1 $2 $3 $4 ; do
    <statements>
done
|
doesn't give us much flexibility. The
shift keyword is meant to make
things easier. It shifts up all the arguments by one place so that
$1
gets the value of
$2,
$2 gets the value of
$3,
and so on. (
!= tests that
"$1" is not
equal to
"", that is, that it is not empty. Once
"$1" is empty, we are past
the last argument and the loop ends.) Try
|
while test "$1" != "" ; do
    echo $1
    shift
done
|
and run the program with lots of arguments.
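One caveat with this loop is that it also stops early if one of the arguments happens to be an empty string. A sketch of an alternative is to test the number of remaining arguments, which the shell keeps in $# (a standard parameter not otherwise covered in this chapter). The function names below are hypothetical and exist only to compare the two loops side by side.

```shell
#!/bin/sh
# Counts arguments until the first empty one is met.
count_until_empty ()
{
    COUNT=0
    while test "$1" != "" ; do
        COUNT=$((COUNT+1))
        shift
    done
    echo "$COUNT"
}

# Counts every argument, using $# (the number of arguments left).
count_all ()
{
    COUNT=0
    while test "$#" -gt 0 ; do
        COUNT=$((COUNT+1))
        shift
    done
    echo "$COUNT"
}

count_until_empty dogs "" birds
count_all dogs "" birds
```

The first call reports 1 because the loop gives up at the empty second argument; the second call reports 3.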
Now we can put any sort of condition
statements within the loop to process the arguments in turn:
|
#!/bin/sh
# Portable definition; "function usage ()" is accepted by bash but not by every /bin/sh.
usage ()
{
    echo "Usage:"
    echo " myprog.sh [--test|--help|--version] [--echo <text>]"
}
while test "$1" != "" ; do
    case $1 in
        --echo|-e)
            echo "$2"
            # Extra shift: this option consumes the argument that follows it.
            shift
            ;;
        --test|-t)
            echo "you used the --test option"
            ;;
        --help|-h)
            usage
            exit 0
            ;;
        --version|-v)
            echo "myprog.sh version 0.0.3"
            exit 0
            ;;
        -*)
            echo "Error: no such option $1"
            usage
            exit 1
            ;;
    esac
    shift
done
|
myprog.sh can now run with multiple arguments on the command-line.
Whereas
$1,
$2,
$3, etc. expand to the individual
arguments passed to the program,
$@ expands to all arguments.
This behavior is useful for passing all remaining arguments onto a second command. For
instance,
|
if test "$1" = "--special" ; then
    shift
    myprog2.sh "$@"
fi
|
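The double quotes around $@ matter when an argument contains whitespace. The sketch below passes the same arguments on with and without the quotes and counts how many words arrive. All three function names are hypothetical.

```shell
#!/bin/sh
# Prints how many arguments it received.
count_words ()
{
    N=0
    for w in "$@" ; do
        N=$((N+1))
    done
    echo "$N"
}

# Passes its arguments on unchanged.
forward_quoted ()
{
    count_words "$@"
}

# Without quotes, the shell splits arguments again along whitespace.
forward_unquoted ()
{
    count_words $@
}

forward_quoted "henry john" mary
forward_unquoted "henry john" mary
```

The quoted version reports 2 arguments and the unquoted version reports 3, because "henry john" is split into two words on the way through.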
$0 means the name of the program itself and not any command-line argument.
It is the command used to invoke the current program. In the above cases, it
is
./myprog.sh. Note that
$0 is immune to
shift operations.
Single forward quotes
' protect the enclosed text from
the shell. In other words, you can place any odd characters inside forward quotes,
and the shell will treat them literally and reproduce your text exactly. For
instance, you may want to
echo an actual
$ to the screen to
produce an output like
costs $1000. You can use
echo 'costs $1000' instead of
echo "costs $1000", in which the shell would replace
$1 with the first command-line argument.
Double quotes
" are less strict than single quotes.
They still allow the shell to substitute variables and backquoted commands inside them. The
main reason they are used is to group text containing whitespace into
a single word, because the shell will usually break up text along whitespace
boundaries. Try,
|
for i in "henry john mary sue" ; do
    echo "$i is a person"
done
|
compared to
|
for i in henry john mary sue ; do
    echo $i is a person
done
|
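The difference between the two kinds of quotes can be seen side by side in the following sketch. The variable name PRICE and its value are made up for the example.

```shell
#!/bin/sh
PRICE=1000

# Single quotes: the text is reproduced exactly, $ and all.
echo 'costs $PRICE'

# Double quotes: the variable is replaced by its contents.
echo "costs $PRICE"

# A backslash protects one character inside double quotes, so the
# first $ is printed literally and the second starts a variable.
echo "costs \$$PRICE"
```

The three lines print costs $PRICE, costs 1000, and costs $1000 respectively.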
Backward quotes
` have a special meaning to the shell. When a command
is inside backward quotes it means that the command should be run and its output
substituted in place of the backquotes. Take, for example, the
cat command.
Create a small file,
to_be_catted, with only the text
daisy
inside it. Create a shell script
|
X=`cat to_be_catted`
echo $X
|
The value of
X is set to the output of the
cat command, which in this
case is the word
daisy. This is a powerful tool. Consider the
expr
command:
|
X=`expr 100 + 50 '*' 3`
echo $X
|
Hence we can use
expr and backquotes to do mathematics inside our shell
script. Here is a function to calculate factorials. Note how we enclose the
*
in forward quotes. They prevent the shell from expanding the
* into matching file names:
|
factorial ()
{
    N=$1
    A=1
    while test "$N" -gt 0 ; do
        A=`expr $A '*' $N`
        N=`expr $N - 1`
    done
    echo $A
}
|
We can see that the square brackets used further above can actually suffice
most of the time where we would otherwise use
expr. (However,
$[]
notation is an extension of the GNU shells and is not
a standard feature on all variants of UNIX.) We can now run
factorial 20 and see the output. If we want to assign the output to a variable,
we can do this with
X=`factorial 20`.
Note that another notation which gives the effect of a backward quote is
$(command), which is identical to
`command`.
Here, I will always use the older backward quote style.
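For comparison, here is a sketch of the same factorial function written with the $( ) notation and with $(( )) arithmetic in place of expr. This is only one possible variant, not a replacement for the version above; it assumes a shell that supports both notations, as POSIX shells do.

```shell
#!/bin/sh
factorial ()
{
    N=$1
    A=1
    while test "$N" -gt 0 ; do
        # The shell does the arithmetic itself; no expr is needed.
        A=$((A*N))
        N=$((N-1))
    done
    echo "$A"
}

X=$(factorial 5)
echo "5! = $X"

# One $( ) can be placed inside another without extra escaping.
echo "(3!)! = $(factorial $(factorial 3))"
```

The nested call works out 3! first, which is 6, and then 6!, which is 720. Doing the same with backward quotes would require escaping the inner pair with backslashes.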