August 30 --- Class 2 --- UNIX Basics, Hierarchical File System, Special Characters, Redirection

Activities: Introduction to UNIX and hierarchical file system

UNIX has a hierarchical file system. It is a lot like a file cabinet, with subdirectories playing the role of drawers and files playing the role of file folders. However, UNIX allows more levels, because any subdirectory can contain a file that is itself a subdirectory. There is a special directory called the root of the file system, denoted by / . A full path name of a file starts with /. However, when you are typing commands to a "shell", or UNIX command interpreter, you are considered to be "in" a particular directory. The command pwd stands for "print working directory" and will tell you where you are. You can refer to a file with a relative path name by starting the name with a subdirectory of your current working directory. You should also know that ~ stands for your home directory in most shells, and that ~username stands for the home directory of the user whose login name is username. Thus, ~sg/bin stands for my bin subdirectory.

Commands relevant to directories:

    pwd     print working directory
    cd      change directory
    mkdir   make a new directory

The ls command stands for list files. Explore what it does by trying the following commands:

    ls
    ls /
    ls -a
    ls -l
    ls -lt

To find out more about ls, use the manual command:

    man ls

Special directory names

    .    stands for the current directory
    ..   stands for one level up in the directory hierarchy

Special or Metacharacters in UNIX

UNIX has a number of characters that are special to the command interpreter, or shell. The following characters are used in filename matching:

    * ? [ ] - { } ,

* matches any number of characters (including none), so ls m* lists all files that start with the letter m. UNIX is case sensitive, so ls M* is not the same.

? matches any single character, so ls m? lists only files whose names are two characters long and begin with m. ls m??
would match files whose names are three characters long and begin with m.

[ and ] are used to make a list of single characters. [abf] matches any one of the characters a, b or f. [a-j] matches any one character in the range a, b, ... j. [0-9] matches any digit. For example,

    ls *.[cf]

would match any file whose name ends in .c or .f, which is usually how we name C and Fortran programs, respectively.

{ and } are used to make a list of multicharacter matches. For example, {abc,red,green} will match the three strings "abc", "red" or "green".

In some versions of UNIX it is also possible to get the ls command to list all files that do not match a certain pattern. This doesn't seem to work on the Macintosh, but it does work on Linux:

    ls -I "*.c"
    ls --ignore="*.c"

The double quotes keep the shell from expanding the expression containing the metacharacter, so ls receives the pattern itself.

There are other special characters in UNIX that do things other than match characters.

$ is used to refer to the value of a variable. For example, there is a path variable. Try the following examples:

    echo path
    echo $path

In the second line we get the value of the path variable, not just the string path.

; is used to separate two commands on a single line, e.g.

    echo $path; date

\ at the end of the line is used for line continuation if you have a long command. The carriage return must immediately follow the \.

\ is also used to escape the special meaning of a character. Try this:

    echo \$path

/ as we have already seen, is used for separating (sub)directory names, and for the root

~ is used for home directories

( ) are used in pairs to combine the output of two UNIX commands. This will be explained more when we need it.

& at the end of a command line puts the command in the background so that you immediately get back a prompt and can issue another command. The command in the background continues to run, however.

| < > are very useful, as we will now explain.
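Before moving on, the filename patterns and escapes described above can be tried safely in a scratch directory. The following is only a minimal sketch, and it makes some assumptions that are not in the notes: it is written for a Bourne-style shell (sh or bash) rather than the csh-style shell used in class, so it uses $HOME where the notes use $path; the file names are made up for illustration; and brace lists are left out because not every sh expands them.

```shell
#!/bin/sh
# Assumption: Bourne-style shell. LC_ALL=C fixes the order in which matches are listed.
LC_ALL=C; export LC_ALL

# Work in a throwaway directory so the patterns only see files created here.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch m ma mb mab main.c solve.f notes.txt Makefile   # hypothetical file names

star=$(echo m*)        # names starting with lowercase m
two=$(echo m?)         # m followed by exactly one character
three=$(echo m??)      # m followed by exactly two characters
src=$(echo *.[cf])     # names ending in .c or .f
upper=$(echo M*)       # case matters: only names starting with capital M

literal=$(echo \$HOME) # backslash removes the special meaning of $
quoted=$(echo "*.c")   # quotes keep the shell from expanding the pattern

echo "m*      -> $star"
echo "m?      -> $two"
echo "m??     -> $three"
echo "*.[cf]  -> $src"
echo "M*      -> $upper"
echo "escaped -> $literal"; echo "quoted  -> $quoted"   # ; separates two commands

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

Using echo instead of ls shows exactly what the shell hands to a command after it has expanded the pattern, which is often the quickest way to check a pattern before using it with a command that changes files.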
Filters, Pipelines and Redirection

UNIX has some wonderful facilities that make it easy to combine simple commands to do complex jobs. The key ideas are "standard input", "standard output", and pipelines. Many commands require input to do their work and are set up to use the "standard input", which is normally the keyboard. Suppose you have prepared a file with the input you want, because you know there are a lot of input commands and you may make a typo. In UNIX do this:

    command < infile

This runs "command", but tells it to "redirect" the input from the keyboard to the file named infile. So infile is used as the input and you don't have to type all the input commands. If the command is set up so that its output goes to the "standard output", you will see the output on the screen. But what is good for the input is good for the output, so if we type

    command < infile > outfile

then what would go to the screen is instead redirected to the file outfile.

Any program that takes its input from the standard input and sends its output to the standard output is called a filter. Think of a red filter. On the input side is any kind of light. On the output side is only the red component of the incoming light. With filters, you can then pass the light through another filter to further modify the light. In UNIX, a filter takes information from the standard input, modifies it and sends it to the standard output. But how do we send the output of one filter to the input of another? We do that with a pipeline.

    filter1 < in1 | filter2 > out2

takes input from in1 and runs command filter1, then it takes the output of filter1 and uses it as the input to command filter2. Finally, it puts the results in file out2. We will have many opportunities to use this capability.

There are other types of redirection.

>> appends to the output file.

If you set the noclobber variable and you say

    cmd > outfile

when outfile already exists, the command will not clobber outfile with new output. However, cmd >!
outfile will override the noclobber variable and replace the existing outfile with the new information. Similarly, if noclobber is set, >> outfile will complain if outfile is not already there to append to. >>! will eliminate such a complaint if outfile does not exist. Of course, if you don't set noclobber, you don't have to worry about this, but you do have to worry about writing over a file by redirecting output to an existing file.

We also looked at my .cshrc and .alias files to see how we can set up a useful environment and save some typing. For example,

    alias cp "cp -i"
    alias mv "mv -i"

save us from overwriting an existing file when we copy or rename a file. If cp or mv sees that we are about to write over an existing file, the command will inquire (hence -i) whether we really want to do that. I find these aliases very convenient.

Aliases

In my file ~/.alias, I have a number of useful aliases. We will talk more about them in the next class.
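
The redirection and pipeline forms from this class can also be tried in a scratch directory. What follows is a minimal sketch under stated assumptions, not the setup used in class: it targets a Bourne-style shell (sh or bash), where noclobber is switched on with set -C and overridden with >| instead of the csh forms set noclobber and >! described above. The file names, and the choice of sort and tr as the two filters, are made up for illustration.

```shell
#!/bin/sh
# Assumption: Bourne-style shell; csh users would write "set noclobber" and ">!".
LC_ALL=C; export LC_ALL

work=$(mktemp -d)                      # throwaway directory, removed at the end
cd "$work" || exit 1

printf 'pear\napple\nfig\n' > in1      # > sends standard output to the file in1

sort < in1 > sorted                    # command < infile > outfile
sort < in1 | tr 'a-z' 'A-Z' > out2     # filter1 < in1 | filter2 > out2
echo 'KIWI' >> out2                    # >> appends to the existing out2

sorted_first=$(head -n 1 sorted)
out2_before=$(cat out2)

set -C                                 # turn on noclobber
# The subshell keeps a refused redirection from stopping the script.
if (echo 'oops' > out2) 2>/dev/null; then
    clobbered=yes
else
    clobbered=no                       # the shell refused to overwrite out2
fi
echo 'replaced' >| out2                # >| overrides noclobber (csh: >!)
set +C                                 # turn noclobber off again

out2_after=$(cat out2)

echo "first line of sorted: $sorted_first"
echo "overwrite allowed under noclobber? $clobbered"
echo "out2 after forced overwrite: $out2_after"

cd / && rm -rf "$work"
```

In the same shells the two aliases from the notes would be written with an equals sign, as alias cp='cp -i' and alias mv='mv -i'. They are normally placed in an interactive startup file rather than in a script.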