Beyond the Mouse LAB 8: Unix Tools 2
October 31, November 2
Instructor: Jeff Freymueller
x7286 Elvey 413B jfreymueller@alaska.edu
TA: Matvey Debolskiy
Last Updated: October 10, 2017
Due: Tuesday Nov 8 before class
Lab slides
none.
Note
As your solution, send me your scripts and your answers to the questions. No data files please, I have plenty of those :)
Running the VirtualBox
Check here if you forgot how that works. Really. Go there if you forgot something.
Exercise 1: Data Handling with awk
Hopefully you remember Exercise 2 of Lab 05. If not, you may remember that, at some point in the past, we had you fiddle with pesky formatting strings to extract some data from a file with a lot more data. Now we'll go back to the FAIR.pfiles.txt text file and treat it with Unix tools to extract the information we want.
- Download the file FAIR.pfiles.txt to somewhere in your home directory.
- On the command line, extract the epoch, longitude, latitude, and height from the FAIR.pfiles.txt file using awk (HINT: separate variables with a comma in the print statement and they will be separated by a space in the output).
- Now figure out how to redirect this output into a new file, FAIR2.llh.
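The two actions above can be sketched on a toy file. This is only an illustration: toy.txt, toy.out, and the column layout are made up for the example and are not the layout of FAIR.pfiles.txt, so you still need to work out which columns you want in the real file.

```shell
# Build a tiny stand-in file (hypothetical layout, NOT the real pfiles format).
printf 'A 1.5 2.5 3.5\nB 4.5 5.5 6.5\n' > toy.txt

# A comma in the print statement inserts awk's output field separator,
# which is a single space by default. Here we print columns 1 and 3.
awk '{print $1, $3}' toy.txt

# Same command, but the output goes into a new file instead of the terminal.
awk '{print $1, $3}' toy.txt > toy.out
cat toy.out
```

Note that `>` overwrites toy.out if it already exists.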
Now that you know how to do those two key actions, create a new tcsh script pfiles2llh in $BTM_BIN, which generalizes this for any .pfiles file it gets as a command line argument.
The format for executing this script at the command line should be like this:
> pfiles2llh STATION_NAME.pfiles
Command line arguments are given to a script in various forms. ONE is using the built-in variables $0, $1 ... $N. Inside your script, $0 is the name of the program that has been called, and $1, $2, ..., $N are the first, second, ..., N-th arguments for that program. This convention is generally used when you have a few arguments that you expect to be handed to the script in a certain order.
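As a rough sketch of how these variables behave, the snippet below writes a small demo script and calls it twice. The name show_args.sh and the usage text are made up for the example. It is written in POSIX sh so that it runs on most systems; in a tcsh script $0 and $1 mean the same thing, but the test would be written as `if ($#argv != 1) then ... endif`.

```shell
# Write a small demo script (hypothetical name: show_args.sh).
cat > show_args.sh << 'EOF'
#!/bin/sh
# $# holds the number of arguments; accept one or two, otherwise complain.
if [ $# -lt 1 ] || [ $# -gt 2 ]; then
    echo "usage: $0 FILE [OUTDIR]"
    exit 1
fi
echo "script: $0"
echo "count:  $#"
echo "first:  $1"
EOF

# Called with one argument: prints the script name, the count, and the argument.
sh show_args.sh FAIR.pfiles

# Called with no arguments: prints the usage line and exits with a non-zero status.
sh show_args.sh || echo "the script reported an error"
```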
Here is an interesting article that tells you how to find the maximum number of arguments for a shell command, if you are curious.
Here's what your script is expected to do:
- Your script should expect ONE file to be the FIRST argument when it is called. Bonus exercise: use $# to test for the existence of an argument (i.e. give an error/usage message when there are no arguments or too many).
- Extract the epoch, longitude, latitude, and height from the .pfiles file that has been passed, using awk (you did this above).
- Redirect the output into a new file STATION_NAME.llh:
  - You will assume that a pfiles file is named following the convention STATION_NAME.pfiles.
  - Great, so you can extract the STATION_NAME using the program basename; save the value to a variable inside your script (e.g. sta_name).
  - The output file should end up in the same directory as the input file. You can extract the path or directory name of the input file using dirname. Again, save the result to a variable in the script (e.g. sta_path).
  - Now redirect the output of your awk call into ${sta_path}/${sta_name}.llh.
  - HINT: echo the values of the variables you set for testing, to make sure you're doing things right!
- You should be able to call your script from anywhere (e.g. from other scripts), with the data sitting in any other directory.
- BONUS: Allow for an optional second argument that will be the output directory (not the filename though).
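The basename and dirname steps can be tried out on their own before they go into the script. In this sketch the input path is invented and the file does not need to exist, because both programs only manipulate the text of the path. The assignments use POSIX sh syntax; in tcsh the equivalent form is `set sta_name = ...` with the same backtick expression on the right.

```shell
# Hypothetical input path, used only for illustration.
infile=/tmp/btm_demo/data/FAIR.pfiles

# basename strips the directory and, given a second argument, that suffix too.
sta_name=`basename "$infile" .pfiles`

# dirname keeps only the directory part of the path.
sta_path=`dirname "$infile"`

# Echo the pieces to check them before using them in a redirect.
echo "station: $sta_name"
echo "path:    $sta_path"
echo "output:  ${sta_path}/${sta_name}.llh"
```

For a file given without any directory part, dirname returns `.`, so the same output expression still points at the current directory.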
Exercise 2: More Data Handling with awk
For this exercise, you will do some further manipulation of the FAIR.pfiles.txt file.
Columns 4, 5, 6 of this file are the longitude, latitude and height data for each day, in decimal degrees and meters.
Columns 7, 8, 9 are the uncertainties in the east, north, and vertical component estimates, in mm.
- The first step for this exercise will be to write code using awk to calculate the mean latitude, longitude and height. To do this, you will need to use internal awk variables to keep track of the sum of each individual estimate and the number of estimates, and then at the end of the file compute the mean.
- Now modify your code to store the means into a variable. You will need to use the set command and the backtick to get the output from awk into a variable.
- Finally, read the file again. For each line of the file, output the difference of that day's longitude, latitude and height from the mean, and the 3D uncertainty (the square root of the sum of squares of the uncertainties in the east, north and vertical components). Add some useful text to each line so that anyone can understand what your output means. (HINT: you can multiply numbers using the '*' operator, and the square root function is 'sqrt()'. You should use the -v option to awk to pass the mean values into the program).
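The three steps above can be sketched on a toy file with one value column and three uncertainty columns. Everything here is an assumption made for illustration: the file toy_h.txt, its column layout, and the names mean_h and report are invented, and the toy ignores units. The assignments use POSIX sh syntax; in tcsh you would write `set mean_h = ...` with the same backtick expression.

```shell
# Toy data (hypothetical layout): day, value, and three uncertainty components.
printf '1 10.0 3 4 12\n2 12.0 1 2 2\n3 14.0 2 3 6\n' > toy_h.txt

# Steps 1 and 2: accumulate a sum and a count in awk variables, print the
# mean in the END block, and capture that output with backticks.
mean_h=`awk '{ sum += $2; n++ } END { if (n > 0) print sum / n }' toy_h.txt`
echo "mean value: $mean_h"

# Step 3: read the file again, passing the mean into awk with -v.
# For each line, print the offset from the mean and the 3D uncertainty,
# i.e. the square root of the sum of squares of the three components.
report=`awk -v mean="$mean_h" '{
    d = $2 - mean
    s = sqrt($3*$3 + $4*$4 + $5*$5)
    print "day", $1, "offset_from_mean", d, "sigma3d", s
}' toy_h.txt`
echo "$report"
```

With this toy input the mean is 12, and the first output line reads `day 1 offset_from_mean -2 sigma3d 13`.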
Dr. Jeffrey T. Freymueller
Professor of Geophysics
Geophysical Institute
University of Alaska, Fairbanks
Fairbanks, AK 99775-7320