Lab 12

Computer Architecture I ShanghaiTech University

Shell Programming and Regular Expressions

Introduction to Shell Scripts in Unix-like Systems

In computing, a shell is a user interface for accessing the operating system. Through scripting, the command-line interface (CLI) provides a flexible mechanism to invoke programs and combine their functions. Although many scripting environments exist, for example in DOS, OS/2, Windows PowerShell and VAX/VMS, this lab focuses on shell scripts in Unix-like systems. A shell script is a program designed to be run by the Unix shell, which is a command-line interpreter. Typical operations performed by shell scripts include file manipulation, program execution, and printing text. A script that sets up the environment, runs the program, and does any necessary cleanup, logging, etc. is called a wrapper [1].

The recommended shell for this lab is Bash, since all the scripts involved have been tested on it. Shell dialects differ, so the semantics or behavior may be slightly different in other shells; you are encouraged to explore these differences and build a broader view of shell scripting.

Enter and Exit

After starting the terminal, you are usually already in a shell environment. You can enter a nested shell, for example by typing bash to start a new Bash environment. Type exit or press Ctrl+D to leave the current shell. Ctrl+D signals end-of-file (EOF) on the current input, and when a shell reaches the end of its input, it exits.

Try it yourself.

Run the Commands From a File

Writing commands into a file lets you invoke them repeatedly without typing them over and over again. Now create a new file named hello.sh in the current directory and fill it with the following lines:

#!/bin/bash
echo "Hello, Computer Architecture!"

There are two methods to execute the content of this file. The first one is to type

$ bash hello.sh

In this case, hello.sh is passed to the interpreter (bash), which runs the commands in a new shell environment. This has the same form as the common way of executing Python scripts, python hello.py. Notice that lines starting with # are treated as comments and are ignored during execution.

Another method is to directly run the script file as an executable. Try typing

$ ./hello.sh

This should report a "Permission denied" error, because the file is plain text without the executable permission. Grant that permission first by typing chmod +x hello.sh. When a file is run directly, an interpreter must be chosen before execution, and that is what the first line does: if the first line starts with #!, it is not an ordinary comment but names the interpreter. This special line is called a shebang. Without a shebang, the script is typically run by a default shell such as sh.
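
The sequence above can be sketched as a single script. It is only an illustration: it writes hello.sh into a temporary directory created with mktemp (an assumption made to keep the lab folder untouched), and the names workdir, first_try and second_try are hypothetical.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing in the lab folder is touched.
workdir=$(mktemp -d)
script="$workdir/hello.sh"

# Write the two-line script from the text.
printf '%s\n' '#!/bin/bash' 'echo "Hello, Computer Architecture!"' > "$script"

# Without the executable bit, the direct invocation is refused.
if "$script" 2>/dev/null; then
    first_try="ran"
else
    first_try="refused"
fi

# Grant execute permission, then run the file directly.
chmod +x "$script"
second_try=$("$script")

echo "before chmod: $first_try"
echo "after chmod:  $second_try"
rm -rf "$workdir"
```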

Now try to output the sentence "Hello Lab" by directly running an executable file which is interpreted by Python 3.

Checkoff

  • Show the steps to your TA

Variables

You can define variables in the shell like this:

a=1
b=2.0
c="foo $a"
d='bar $b'

Note that spaces around the equals sign are not allowed, unlike in most other programming languages, because the shell uses spaces to separate a command from its arguments. The value of a plain variable is just literal text without a data type, similar to a macro in C/C++. As the example shows, the dollar sign refers to the content of a variable; however, text wrapped in single quotes is not interpreted at all.

e="actually not e"
echo "This is $empty"
echo "This is ${e}mpty"

Referring to an undefined variable does not raise an error; the variable is simply treated as empty. The braces mark where the variable name begins and ends.

unset <variable name> deletes a variable from the current running environment.
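
The quoting, brace and unset rules can be checked together in one short script. This is a sketch built from the examples above; the variable names double, single, no_braces, with_braces and after_unset are introduced only for the illustration.

```shell
#!/bin/bash
a=1
double="foo $a"      # double quotes: $a is expanded
single='foo $a'      # single quotes: the text is kept literally

e="actually not e"
no_braces="This is $empty"      # refers to the undefined variable "empty"
with_braces="This is ${e}mpty"  # refers to e, followed by the text "mpty"

unset e
after_unset="This is ${e}mpty"  # e no longer exists, so it expands to nothing

printf '%s\n' "$double" "$single" "$no_braces" "$with_braces" "$after_unset"
```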

Several built-in variables are useful in shell script programming:

  • $#
  • $*
  • $$
  • $!
  • $?

Checkoff

  • Explain the meaning of each variable above to your TA

Shell scripts in Bash also support one-dimensional arrays, which can be declared like this:

a=(A1 B2 C3)

Try running the following lines to get familiar with the syntax.

a=(A1 "B2" 'C3')
a[2]=CC
echo "the length is ${#a[*]}"
echo "${a[0]} ${a[2]}"
echo "${a[@]}"
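
As a further sketch, arrays are often appended to and traversed in a loop. The example below assumes Bash; the += append form and the ${!a[*]} index list are Bash features not used in the lines above, and the names count, joined and item are hypothetical.

```shell
#!/bin/bash
a=(A1 "B2 with space" C3)
a+=(D4)                      # append one element

count=${#a[@]}               # number of elements
joined=""
for item in "${a[@]}"; do    # quoted [@] keeps "B2 with space" as one item
    joined+="<$item>"
done

echo "count: $count"
echo "items: $joined"
echo "indices: ${!a[*]}"
```

Quoting "${a[@]}" matters here: without the quotes, the element containing spaces would be split into three separate words.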

Branch Statements

As a programming language, the shell supports functions and control flow. Read the following statements, understand their format, and look up further usage on the internet.

echo '--- if statement ---'
if [ "$a" == "$b" ] # The spaces around [ and ] are necessary
then
    echo "equal"
else
    echo "not equal"
fi

echo '--- for statement ---'
for filename in `ls` # think about what the backquotes `` mean
do
    echo "$filename"
done

echo '--- while & case statement ---'
i=1
while (( $i < 8 )) # Arithmetic test; spaces inside (( )) are optional
do
    case $i in
        1|2) echo 'one or two'
        ;;
        3) echo 'three'
        ;;
        *) echo 'otherwise'
        ;;
    esac
    i=$((i+1)) # (()) calculates the numeric value
done
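
The statements above do not show a function, so here is a minimal sketch of one that combines case and if. The function classify is hypothetical and is not part of functions.sh or sort.sh; it assumes Bash because it uses local and (( )).

```shell
#!/bin/bash
# classify is a hypothetical helper, not part of the lab files.
# It prints a label for its first argument and returns 0 only when the
# argument is a non-negative integer written with digits only.
classify() {
    local value=$1
    case $value in
        ''|*[!0-9]*) echo "not a number"; return 1 ;;
    esac
    # 10# forces base 10 so that input such as 08 is not read as octal.
    if (( 10#$value < 8 )); then
        echo "small"
    else
        echo "large"
    fi
}

classify 3
classify 42
classify abc || echo "classify reported failure"
```

Arguments are read through $1, $2 and so on, and the return status can be tested with || or if, just like the exit status of any other command.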

Then complete the missing parts of functions.sh and sort.sh so that they work as described.

Checkoff

  • Show your TA your scripts and the execution results

Parallel Operations

Suppose the four task scripts, task1.sh to task4.sh, are independent and you want to execute them in parallel. Find a way to launch them as multiple processes from the script parallel.sh and to wait correctly until all of them have finished.

Checkoff

  • Show the modified scripts to your TA and display the result
  • Explain the commands in your scripts

Regular Expressions

A regular expression is a sequence of characters that defines a search pattern [2]. Although the concepts and explanations of regular expressions are easy to find with a search engine, you may still be interested in the following websites.

The first one provides convenient online pattern matching as well as detailed explanations. You can also verify a search pattern locally in an editor with regular expression support, e.g., Sublime Text, VS Code or Notepad++. The second website visualizes the behavior of the entered pattern.
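
Patterns can also be tried directly in Bash with the [[ string =~ regex ]] test, which uses POSIX extended regular expressions. The sketch below uses a hypothetical pattern (the word lab followed by exactly two digits) and a hypothetical helper named matches; neither comes from the lab material.

```shell
#!/bin/bash
# Hypothetical pattern: the word "lab" followed by exactly two digits.
pattern='^lab[0-9]{2}$'

matches() {
    # The regex on the right-hand side must stay unquoted,
    # otherwise Bash compares it as a literal string.
    [[ $1 =~ $pattern ]]
}

for candidate in lab12 lab1 lab123 xlab12; do
    if matches "$candidate"; then
        echo "$candidate: match"
    else
        echo "$candidate: no match"
    fi
done
```

Only lab12 matches: ^ and $ anchor the pattern to the whole string, and {2} requires exactly two digits.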

Now learn about regular expressions and complete the checkoff below.

Checkoff

Show your TA the answers to the following questions:
  • The explanation for the pattern ^([^re]|re).*
  • The explanation for the pattern 2{2}\(3{3,}CA?(lab).A+\)?$
  • The pattern to match a non-negative finite decimal number written in decimal notation. For example, 0, 12345, 1.26 and 0.00 are legal forms, while 02 and 1.xy are not.