Beyond the Mouse LAB 7: Unix Tools 1
October 17,19
Instructor: Jeff Freymueller
x7286 Elvey 413B jfreymueller@alaska.edu
TA: Shanshan Li
Last Updated: October 10, 2017
Due: Tuesday Oct 24, before class
Lab slides
Refer to the last slides from the lecture.
We are going to run the exercises for the next few labs in a "Virtual Machine", or a computer within a computer. This allows us to have you use a machine with the Linux operating system even though we only have Windows machines in the lab. There are two parts to this solution: a software program that sets up a "virtual computer" in software, and a file on the Windows machine that contains the image of the hard disk(s) for the virtual machine. Because of the setup of our computer lab, you will need to bring a USB drive with you for the lab and save all your work to that -- you cannot assume that any files you create within the virtual machine will be there the next time you log in (and if they are there, they might be someone else's files).
Setting up the VirtualBox (Read this before doing anything else!)
The VirtualBox software is installed on all the machines in the lab. VirtualBox allows you to install many different guest operating systems into one host operating system. For us the guest operating system is a Linux distribution, namely Fedora. You can read through one of the comparisons of the most widely used Linux distributions if you're interested and wondering what you would choose if you installed it on your own machine (personally, I use a different distribution, openSuSE).
Because of the setup of our computer lab, you will need to go through several setup steps before you can use it. This should only be needed one time:
- Open up a Windows Explorer window and go to the folder "C:\Program Files\Oracle\VirtualBox\HardDisks". Copy the file "fbtm22.vdi" to your Documents folder.
- Double-click on the "Oracle VM VirtualBox" icon on the desktop, or select the program from the Start Menu.
- Click on the "New" button in the upper left of the window, which will start the process of setting up a new virtual machine.
- Type "Fedora" or some other name you prefer (this is only for your use) in the "Name" box. Select "Fedora-64bit" from the menu for Version. Type should be set to "Linux". Then click Next to go to the next dialog box.
- Change the Memory size from 1024 MB to 2048 and click Next again.
- The next dialog box should say "Hard drive". Click on "Use an existing virtual hard drive file", and then click on the folder icon to browse and select the file "C:\Users\username\My Documents\fbtm22.vdi", where username should be replaced with your username. Double-click this file, or click once and then click Open.
- Click Create, and your virtual machine is now configured.
All of the above steps should only have to be done one time. Now you need to start up your virtual machine. The remaining steps you will do each time you run the program and use the Virtual Machine.
- Click Start to start it up. It will take just a few seconds -- it is actually booting up the virtual computer and loading the operating system.
- You will get a couple of dialog boxes with information, which you can dismiss. Basically they are telling you that as long as the mouse is over the virtual computer's window and that window is in front, everything you type and every mouse movement will go to the virtual machine.
- You may get an error message with a "Press S to skip" message -- press S. This is actually a leftover from the configuration of the previous computer lab -- it is nothing to worry about.
- Some messages will flash up while Fedora boots up. You can ignore them. You will eventually get a window with a colorful background, which is the desktop of the virtual machine.
- You may see a dialog box telling you that this version of Fedora is no longer supported. Click Close to dismiss the box. Also, something called "Update Manager" will start up and offer to install updates for you. Don't install any updates! Click the red X in the upper left corner of its window to get rid of Update Manager.
- When you are ready to exit VirtualBox, you have the choice of shutting down the virtual machine, or suspending it. If you suspend it, then it will be in the same state you left it when you come back to the program later.
The Fedora installation doesn't know a thing about the Windows it's running in; poor thing. As far as it can tell, it is running on a real computer! The VirtualBox software has some settings and controls that determine how the virtual machine interacts with the real computer, but the only one you need to use here is the one that looks like a USB connector. You will use this to make a USB drive attached to the real computer usable within the virtual machine.
At some point, after the system is initialized, you will see a Desktop with a task bar and all that stuff. Welcome to your working environment for the next labs (this desktop environment is called GNOME).
You can resize the window and make it as big as you like (bigger is better here).
You will be automatically logged in as user geos436. This is a local user which exists on all the computers in the lab. Your home directory is /home/geos436/. Everything on the virtual machine is local to the machine you're working on. So save your scripts (NOT THE DATA!) to your personal flash drive, or open a web browser (mozilla, an early version of Firefox) and upload files to your Google Drive, or email them to yourself. See below for instructions on using a USB drive.
This distribution comes with quite a few applications already; check out the applications menu, or go to /usr/bin. The text editor we will use is gedit (press alt+F2, type gedit, and press enter). I added a few other things, including MATLAB, GMT, a LaTeX compiler, and several other standard development tools, though we'll probably not use them. Feel free to poke around; one objective of these labs is indeed getting you acquainted with a (maybe) unfamiliar operating system (a crusade in disguise: one can do the VirtualBox thing also the other way around -- Windows inside a Linux ;) ).
In the upper task bar, you'll find a shortcut to the gnome-terminal application (you could also press alt+F2, type gnome-terminal, and press enter to open such a window). Another popular terminal application is xterm. Whichever you choose, your shell will be tcsh. Yes, I want you to open that window now ...
You will be greeted by a prompt: geos436>>
Let's look at what this means ('>' marks the prompt):
geos436: The hostname, the actual machine you're working on (at the moment, all of them actually have the same name because the hard disk file was cloned from one machine).
Using a USB drive on the virtual machine
Here is how to mount and dismount a USB drive on the virtual machine. Note that once you have mounted the USB drive to the virtual machine, Windows can't see it any more. I don't know why they did it that way -- it is actually possible for the two OSes to share, but I suspect it is because Windows doesn't like sharing control.
- Insert your USB drive into the computer. Windows will go through the usual stuff, and then tell you that the device is ready to use.
- There is a little icon that looks like a USB connector in the window frame of the virtual machine's window, at the bottom. You need to right-click on this, and then select your USB device from the menu. The keyboard and mouse will also be listed. Don't select one of these by accident as it might turn them off.
- Now wait a minute or two. You will see Windows telling you again about device drivers, and then after that is done a window will appear on your VM desktop with icons for all the files on your USB drive. You can browse the files using that window -- it is more or less the same as Windows Explorer.
- You can also access the USB drive via the command line. It is called /media/drivename, where drivename is whatever name you gave the device. For example, if in Windows your device is called "USB8GB", then you would access it using /media/USB8GB.
- The safest way to unmount the drive when you are done is to click on the little eject icon in the file browser window. You can also do umount /media/drivename. As with Windows, if any program is still using the drive, it won't unmount.
Exercise 0: Warm-up with some shell tricks.
Type these commands and follow the output on the screen. I want you to get to know this world a bit better:
> cd /usr/bin
change directory to /usr/bin, look at the prompt; it should have changed
> cd ~
go home
> pwd
print working directory
> whoami
get current user name (sometimes that's pretty helpful)
> echo $USER
same thing, the user name
> ls
list contents in current directory ... wait what?
> ls | more
list contents in directory and pipe it into a pager
> man more
learn about more or other commands you find (exit with 'q')
> man man
man is a command, too
A useful thing to know is that with the up and down arrows you can browse through the history of your commands. Quite handy if you just typed a very long command and you want to do something quite similar again (well, in that case it might be time for some shell scripting). You can learn more about other environment variables by using env | more.
You don't need to turn anything in from this exercise; just play around.
Exercise 1: Commands and Piping
Below we give a list of unix commands which we find useful and you'll get to know this week. Some of these commands work on files/directories. The home directory of your virtual machine may have a directory called lab07_ex1, which contains a lot of files. If for some reason it is not there, or if you accidentally mangle it, then look for the file lab07_ex1.tar.gz in your home directory (if it is not there, download it to your home directory). You can unpack it using tar xfz lab07_ex1.tar.gz. Now cd into the newly created directory lab07_ex1.
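If you want to see what tar does before touching the real archive, here is a small practice run. It is only a sketch: the directory tar_demo and the file note.txt are made-up names, not part of the lab data. The lines avoid shell-specific syntax, so they should behave the same in tcsh.

```shell
# Make a tiny practice directory (hypothetical names) and pack it.
mkdir -p tar_demo
printf 'hello\n' > tar_demo/note.txt
tar cfz tar_demo.tar.gz tar_demo

# t = list the archive contents without unpacking anything.
tar tfz tar_demo.tar.gz

# Remove the directory, then x = extract to recreate it from the archive.
rm -r tar_demo
tar xfz tar_demo.tar.gz
cat tar_demo/note.txt
```

Listing with t first is a cheap way to check where the files will land before you extract them.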
Each of the commands given in the table below does something and creates output that can be piped into the others (multiple pipes are perfectly fine). The general syntax of Unix commands given in the man pages is something like command [options] file(s). This means you write the command name on the command line, you get to choose whether you want to use any of the options the command offers (square brackets usually indicate that things can be left out), and then you operate on one or more files. Here are the commands:
command | useful options | explanation |
ls | -l, -s, -t, -r | list files in current directory |
wc | -l, -c, -w | count lines, characters (bytes), words |
head | -### (### represents a number) | output first part of a file |
tail | -### (### represents a number) | output last part of a file |
diff | | compare files line by line |
sort | -n, -r, -k | sort lines of text files |
history | none | list history of commands |
df / du | -k, -h | show available disk space / space used by files |
cat | | display / concatenate files |
top | -n | show process statistics; for piping use top -n 1, as it keeps running otherwise |
Here are a few examples that can be run in lab07_ex1, which should illustrate how these commands work:
> ls -lt --color=auto
list the current directory contents in long format, sorted by modification date, with directories colored differently than files
> ls crusde_src
list contents in directory crusde_src
> ls -R --color=auto
list contents of current directory and all subdirectories (recursive)
> cat ascii
dumps the ASCII table stored in the file ascii to the screen (check with gedit ascii &)
> head -5 FAIR.pfiles
show first 5 lines of file FAIR.pfiles
> sort -nrk2 tohoku/353093400_hori.gmtvec | more
sort the file 353093400_hori.gmtvec in the directory tohoku by decreasing numerical values in the second column (latitude), i.e. sort from North to South, and use the pager more to control output.
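To watch sort, head, and wc work on something small enough to check by eye, you can build a throwaway file first. This is a sketch with invented data: sizes_demo.txt and its contents are hypothetical and not part of lab07_ex1. Type the lines without the comments if you like; they should work the same in tcsh.

```shell
# Build a four-line practice table: a name in column 1, a number in column 2.
printf 'alpha 30\nbravo 4\ncharlie 200\ndelta 15\n' > sizes_demo.txt

# Sort numerically (-n) on column 2 (-k2), largest first (-r),
# and keep only the first two lines of the result.
sort -nrk2 sizes_demo.txt | head -2

# Count the lines in the file.
wc -l < sizes_demo.txt
```

The sort line should print charlie 200 followed by alpha 30. Try dropping the -n to see why a numerical sort matters: without it, the numbers are compared as text.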
The first thing you will do is play with these commands and their command line options. If you need any files to work on, explore the example directory we gave you, maybe with the example commands listed above. You should go to the man pages and read up on the options we've highlighted in the table if you want to know what they actually mean (> man COMMAND). You will send to us (1) three commands that may include command line options of your choice (please tell us from which directory you invoke the command), (2) an explanation of what each command does, and (3) the output of each command.
Now it's time to step it up a bit. From the list of commands, use as many as you like and pipe the output of one into another to create 5 new command lines that do something you find useful, then redirect the final output to a new file. Send in: (1) the command, (2) a description of the command, (3) a file with the redirected output for each command.
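As one possible illustration of piping plus redirection (not a model answer), the sketch below ranks three practice files by their number of lines and saves the ranking to a file. The directory pipe_demo and the output file linecounts.txt are made-up names.

```shell
# Create three small practice files with 1, 3, and 2 lines.
mkdir -p pipe_demo
printf 'one\n' > pipe_demo/a.txt
printf 'one\ntwo\nthree\n' > pipe_demo/b.txt
printf 'one\ntwo\n' > pipe_demo/c.txt

# wc -l prints one line per file plus a final "total" line;
# head -3 keeps only the three per-file lines,
# sort -nr puts the largest count first,
# and > redirects the result into a new file instead of the screen.
wc -l pipe_demo/*.txt | head -3 | sort -nr > linecounts.txt

# Look at what was saved.
cat linecounts.txt
```

Note that > overwrites linecounts.txt if it already exists, so pick a new file name for each command you hand in.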
Exercise 2a: Permanently changing your Path and stuff
Since there is only one user geos436 for the VirtualBox, but many Windows users that will use this log-in, everybody will have to use the same directory name for the scripts. Create a directory btm_unix_scripts. Check here to see how to do this.
You will want to save a copy of your scripts directory to your USB drive, because you
might need to use another machine the next time, and the directory might or might not
be there next time (this is a limitation of the Department's computers; it has nothing to
do with the virtual machine or Linux).
Now that you all have this directory you will have to edit the .tcshrc file. This is the "Run Command" (rc) file for the tcsh shell. It is executed every time you log into a shell or open a new terminal window or subshell. All environment variables, aliases, etc. that you define in it will therefore be available in any shell session you start on this system. Here is a brief description of things that happen during the login process (for a shell). You might see that you can easily configure your working environment using this file. If it does not exist, you will have to create it.
(The leading dot is important; it's part of the filename and 'hides' the file in normal ls listings. This is generally used for configuration files and directories that have to be in your home directory, but that you don't have much business messing around with (or so the developer thought). To see all the stuff that's in your directory, try ls -lisa. The options l, i, s, a are explained in the man pages of ls.)
After all this talk, here's what to do (assuming you created /home/geos436/btm_unix_scripts):
- go to your home directory (a cd w/o arguments will get you there)
- Run gedit .tcshrc & to open the run command (rc) file in an editor
- Recall from the lecture the syntax for setenv: setenv VARIABLE value
- first add another environment variable called BTM_BIN and set the value to /home/geos436/btm_unix_scripts
- now modify the value of PATH: put .:${BTM_BIN}: at the start of the long list of directories.
  - the ':' is a field separator which the shell uses to tell different directories in the path apart.
  - the '.' will also include the current directory into the search path.
- save the file
- for these changes to take effect you need to open a new terminal window or type source ~/.tcshrc in the currently open terminal (rehash alone only rebuilds the shell's list of known commands; it does not re-read the file)
- TEST 1: cd $BTM_BIN should beam you into ~/btm_unix_scripts
- TEST 2: env | grep btm_unix will show you whether your changes were successful for the path ... and you learn something about variables
- If either test fails, fix it!
- Question: What does env | grep btm_unix do? Consult man-pages, lecture notes, the Internet.
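To see the ':' field separator at work, you can print a search path with one directory per line. This is just a sketch; the first path string is typed in by hand as an example of what your PATH should start with, and the second line uses whatever PATH your current shell actually has.

```shell
# A hand-written example path with three entries, separated by ':'.
# tr replaces every ':' with a newline, so each directory gets its own line.
echo ".:/home/geos436/btm_unix_scripts:/usr/bin" | tr ':' '\n'

# The same treatment for the real search path of the current shell
# (only the first five entries are shown).
echo "$PATH" | tr ':' '\n' | head -5
```

The shell searches these directories from top to bottom, which is why the instructions above ask you to put your additions at the start of the list.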
Exercise 2b: Writing a Shell script
- go to your home directory in a Terminal
- There should be a directory called lab07_ex2 there. If not, you will need to download it using wget (all one line): wget http://www.gps.alaska.edu/jeff/Classes/GEOS436+636/lab07_ex2.tar.gz (warning, it is 290MB in size, so it will take a few minutes).
- Unpack the *.tar.gz archive using tar: tar xfz lab07_ex2.tar.gz
- Now you should have a directory lab07_ex2
- Keep the downloaded tar file as a backup in case you mess your data directory up further down.
In this directory you will find GPS data for a certain day. The files are binary files and you can't read them directly, but knowing the contents of the files is not essential. The key point is that there are many, many files. Some of them are gzipped, and some exist as duplicates, in both a gzipped and an unzipped version. What I want you to do now is find all the duplicates and rename the unzipped files to all upper case:
- open a text editor (gedit)
- write a shell script which you will save to $BTM_BIN that will:
  - loop over all qm files in the current directory (use foreach)
  - check whether the file exists in gzipped version (hint: if statement)
  - echo the duplicate
While developing this script you might want to test it. To test it you will have to make it executable and make sure that the first line of the file reads #!/bin/tcsh. To make a file executable, in a terminal window:
cd $BTM_BIN
chmod u+x NAME_OF_YOUR_SCRIPT
- Question: What does chmod u+x NAME_OF_YOUR_SCRIPT do? Consult man-pages, lecture notes, the Internet.
You will have to repeat this for any script you want to be executed on the command line. Otherwise you'll get a "could not find ..." response.
The testing happens in ~/lab07_ex2! Open a new terminal window (to refresh the path contents with your new executable), go to ~/lab07_ex2, and execute whatever you called your script. And yes, I will do exactly that and expect your script to work, no matter where it is stored.
Now the funky part: Rename the duplicate files such that all lower case letters are upper case. You can find a nearly complete solution at this website. You will have to do the conversion from lower case to upper case since they convert from upper case to lower case. I find this task challenging and yet rewarding enough that giving the solution away is fine with me. However, you still have to find the correct line on the website, copy it correctly into your script, modify and explain what this line does (use man pages and Internet to find answers).
As a guideline: my neatly formatted, yet uncommented solution script is 8 lines long.
There are also some tools you can use to do this in a different way with fewer commands (see, for example, rename), but those may require cryptic options, and this way will require you to put together a loop, tests, and some useful shell commands and tools.
Once you're done try
ls *QM | wc -l
The result should be 944. I think. It has been a scramble to get the virtual machines up so I am not 100% sure that the one that was saved from last year had exactly that many duplicate files.
You've just changed the name of 944 files. Given the boredom caused by doing the actual conversion by hand and the number of files, writing the script, testing, failing, fixing, testing, succeeding was still a lot faster.
There should be no files matching *qm any more:
> ls *qm
ls: No match.
To be fair though, in real life you might simply call gzip *qm and let gzip complain about existing files. But the point of this exercise was to introduce you to a few unix tools, get you to do some scripting and do a simple task on many, many files. I hope this objective was accomplished.
Dr. Jeffrey T. Freymueller
Professor of Geophysics
Geophysical Institute
University of Alaska, Fairbanks
Fairbanks, AK 99775-7320