GPS analysis system

From GeodesyLab

We use a GPS data analysis system based on the GIPSY software developed at JPL. Most of the GIPSY programs are called by shell scripts written by Jeff Freymueller. Using these scripts, we can analyze a large amount of data either as part of network solutions or in Precise Point Positioning (PPP) mode.

Documentation of Solution Strategies (new)

The links below describe our solution strategy as it has evolved over time.

1.0: 1990s strategy.
2.0: Network solutions used from ~2002 through 2008.
2.5: PPP solutions, but otherwise as in strategy 2.0.
3.0: New solution strategy adopted in 2010.

Where do you put RINEX files?

RINEX files should be put into the hopper, $RAWDATA/hopper. What, you don't have RINEX files yet? See RINEXing. Once files are in the hopper, you can either let the first processing stages happen automatically overnight (see next section), or run the autofront and autoclean programs manually.

What happens automatically?

Quite a lot.

Autoftp runs every night beginning at 6pm local time and fetches data files. These are placed into the hopper ($RAWDATA/hopper), a directory where all data files are put for entry into the system and processing. Autofront then runs at midnight to process all files in the hopper, including any placed there manually (from campaigns, for example). Finally, autoclean runs at 4am local to carry out automated screening for cycle slips and other bad data.
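The nightly schedule above might be wired up under cron along these lines. This is a sketch only: the script paths and the exact crontab entries are assumptions, not the lab's actual configuration.

```
# m  h   dom mon dow  command                     (times are local)
0    18  *   *   *    /usr/local/bin/autoftp      # 6pm: fetch data into the hopper
0    0   *   *   *    /usr/local/bin/autofront    # midnight: front-end processing
0    4   *   *   *    /usr/local/bin/autoclean    # 4am: automated cleaning
```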

Autoftp

Autoftp is an efficient data-fetching tool that uses wget to automatically get data from any of several internet GPS data archives. It reads a list of desired sites from a request file, which contains the date in the filename, and attempts to find and download data from as many sites as possible. It is intended to run automatically on a daily basis under cron, and when accompanied by another simple program to generate a standard request file every day, it can easily fetch a standard set of sites on a daily basis for analysis. Because it keeps track in the request file of sites that it has found already, autoftp can be run multiple times with the same request file and it will not repeatedly fetch data. This is ideal for the real world, in which data from some sites are available rapidly while data from other sites may require many hours or days to become available.
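The companion program that generates the daily request file is not shown here, but a minimal sketch might look like the following. The request-file location, the naming pattern, and the site list are all assumptions; only the fact that the date appears in the filename comes from the text above.

```shell
# Sketch: write a dated request file for autoftp.
# The filename pattern and the site list are hypothetical, not autoftp's documented format.
RAWDATA=${RAWDATA:-/tmp/rawdata}           # normally set in the analysis environment
sites="FAIR GUS2 WHIT"                     # hypothetical standard site list
d=$(date -u +%Y%m%d)
req=$RAWDATA/requests/request.$d
mkdir -p "${req%/*}"
printf '%s\n' $sites > "$req"              # one site per line
echo "wrote $req"
```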

Autofront

Autofront is a script intended to run under cron that carries out the initial "front end" processing on a set of GPS data files. When executed, it will process all files in the hopper directory, and will place each resulting qm file into the appropriate week directory.

Autofront runs the following steps:

1. Checks the validity of each RINEX file and repairs some common problems.
2. Depending on receiver type, runs clockprep -fixtags.
3. (Optional, presently not the default) Runs PhasEdit.
4. Runs ninja.

Autoclean

Autoclean carries out automated cleaning of cycle slips, based on point positioning solutions. It is quite effective and at present it rarely misses cycle slips unless they are smaller than its minimum tolerance (10 cm). Autoclean operates on an edit-request file, which contains the name of the directory (week directory) and a list of qm files that need to be cleaned. It will clean all files on the list as long as orbits and clocks are available, and it marks off files that have been cleaned so that it can safely be run multiple times.

Autoclean operates in an iterative mode. Its zeroth iteration is a pseudorange-only solution that identifies and deletes extremely bad pseudorange data; in this step it uses a tolerance that catches only grossly biased data. It then carries out one or more iterations of screening the phase data. In each iteration, it uses postbreak to identify discontinuities in the residuals of a point positioning solution. Postbreak is run with an adaptive tolerance (minimum 10 cm), and it is critical that my slightly modified version of postbreak be used. If any cycle slips are discovered, they are flagged and another iteration is run. Autoclean runs a maximum of 4 iterations on the phase data.

Where do the data files go?

Data files from each station are stored in the QM format that is native to GIPSY. QM files (and all other files) are stored in directories by GPS week. Each week directory has several subdirectories; qm files are stored in $ANALYSIS/wwww/qm, where wwww is the 4-digit GPS week number (with leading zeros if needed).
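As a sketch of how the week directory name relates to the calendar date, the GPS week count starts at the GPS epoch, 1980-01-06. This example assumes GNU date:

```shell
# Compute the 4-digit GPS week number for a date (assumes GNU date).
epoch=$(date -u -d 1980-01-06 +%s)        # start of GPS week 0000
day=$(date -u -d 2006-06-25 +%s)          # a date from the examples below
week=$(( (day - epoch) / 604800 ))        # 604800 seconds per week
printf '%04d\n' "$week"                   # → 1381
```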

Running Static Solutions

In the flt directory for each week there will (hopefully) be a UNIX script called make-*. This script runs another script called standard_*_solution, which in turn runs the solve script for each network of stations (view each subnet with Google Earth) for each day. The solve script runs a solution for each network based on the data from sites in that network that are available in the qm directory.

To run solutions, copy the make script to a file called make-flt (for example).

Check that the make-flt script contains all the days that you want to run.

Check which computer is free to run the script by typing:

  check-solves

Log on to a free computer and type:

  submit make-flt

As the script runs, files will appear in the flt directory for each network for each day. Usually this script will have been run once automatically, so there will often already be files in the flt directory ready to be cleaned and then re-run.


Solve, a very flexible script. (link to detailed help)

Philosophy of solve

Subnet files and campaign files

Standard solutions

(text of standard_Alaska_solution)

Running several days at once: make-make-flt and make-alaska

(text of a standard make-alaska file and variant)

Running several weeks at once

(text of sample rerun-* file)

Cleaning Static Solutions

Sometimes bad data (outliers and cycle slips) make it past the automatic editors. When this happens, the bad data are removed from the qm files either by deleting points or by inserting new phase ambiguities to deal with cycle slips. The steps, commands and scripts to use are somewhat explained here. Once the data are cleaned, the files should be deleted from the flt directory and the solutions re-run (run make-flt again). Usually you will have to go through 2-3 iterations of the cleaning-rerunning cycle. A solution is clean when its .point files either do not exist or are small (below 1000 bytes). Once a solution is clean, its files should remain in the flt directory and the lines to rerun that solution should be deleted from the make-flt file.
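The 1000-byte rule can be checked mechanically. This shell sketch (the function name is made up) lists the .point files that indicate a solution still needs cleaning:

```shell
# A solution counts as clean when its .point files are absent or under 1000 bytes.
# List the .point files at or above that size; those solutions need more cleaning.
unclean() {
    find "$1" -name '*.point' -size +999c    # +999c = 1000 bytes or more
}
# usage (week 1381 is illustrative):
#   unclean $ANALYSIS/1381/flt
```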

Initial Explanation of terms

Expected residuals from a clean solution

Automated screening: postfit, the point file, postbreak

Checking for bad pseudorange data: badp, allbadp

Removing biased pseudorange data: del_pcode_arc

Automatically identified cycle slips: breaks, allbreaks

Quickly scanning through residuals: short_hand
The data problems can be identified and fixed using the program

  short_hand 

(follow the link to read about and ask for help to get started with this program).
Limitations of short_hand

Manually checking residuals and fixing problems

Procedure for Running Solutions for a week for the first time

A few special things need to be done the very first time solutions in a week are run. First, you need to make up a script to run all days of the week. This may need to be edited if the JPL final orbits are not available at the time. The standard_Alaska_solution uses the non-fiducial orbits and thus requires that the final JPL orbits be present. If they are not, you can run rapid_Alaska_solution instead.

Then, the log files from autoclean should be moved away to a subdirectory, and any problem stations identified by autoclean should be checked. Then, you are ready to run the solutions.

First, make a script to run solutions. For example, to make a script to run all days of the week for the Alaska solution:

cd $ANALYSIS
make-make-flt 1381
cd 1381/flt
vi make-alaska
#  Edit the file if needed so that you are ready to run the rapid solutions.

cat make-alaska
#!/bin/csh -f
#
setenv CAMP $ANALYSIS/1381
#
#standard_Alaska_solution 06jul01
#standard_Alaska_solution 06jun25
#standard_Alaska_solution 06jun26
#standard_Alaska_solution 06jun27
#standard_Alaska_solution 06jun28
#standard_Alaska_solution 06jun29
#standard_Alaska_solution 06jun30
#
rapid_Alaska_solution 06jul01
rapid_Alaska_solution 06jun25
rapid_Alaska_solution 06jun26
rapid_Alaska_solution 06jun27
rapid_Alaska_solution 06jun28
rapid_Alaska_solution 06jun29
rapid_Alaska_solution 06jun30

The script make-make-flt finds all unique dates for qm files in that week's directory and uses them to generate the script, so if you run it before the end of the week you will get a partial script. If the final JPL orbits are not yet present, you will need to edit the script to change "standard" to "rapid". Or better yet, copy all the lines and comment one set out, then modify the others to read "rapid_Alaska_solution <date>".
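The "standard" to "rapid" edit can also be done in one pass with sed. The sketch below writes a tiny stand-in make-alaska first so it is self-contained; on a real week's file you would only run the sed line.

```shell
# Demo make-alaska standing in for the real one generated by make-make-flt.
printf '%s\n' 'setenv CAMP $ANALYSIS/1381' 'standard_Alaska_solution 06jun25' > make-alaska
# Switch every solution line from final to rapid orbits; sed -i.bak keeps a backup.
sed -i.bak 's/standard_Alaska_solution/rapid_Alaska_solution/' make-alaska
```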

Next, from the WEEK/flt directory, call mv_logfiles, which creates a subdirectory called logfiles and moves all of autoclean's log files (of the form *____*.i*) into it:

HOSTNAME WWWW/flt> mv_logfiles

Now look for a file called make-problems, which lists all files that autoclean had a problem with. Sometimes these files are almost clean, but sometimes they are full of junk or horribly mangled by the automated editing. There should be PPP solutions already run for these files, so they are ready to be checked.

Set the CAMP variable (if not set):

setenv CAMP $ANALYSIS/wwww

Again, wwww is the 4-digit GPS week number.

Now run the solutions. The first time you run the solutions, look at the residuals very carefully before trying short_hand. Uncompress the postlog, postfit and postbreak files, and then use allbadp to check the pseudorange and allbreaks to check for major cycle slips.

A very common problem is that for several stations per week, there will be one satellite arc of pseudorange data whose residuals are all roughly 2000 cm. If you see these, don't delete the data; instead run del_pcode_arc to remove only the pseudorange data. I am not sure why these show up, but it could be either a hardware channel bias or a pre-processing glitch. They happen much more often with Ashtechs than with other receivers, and are particularly common at the US Coast Guard CORS sites. In the qm directory:

del_pcode_arc *02gus2* GUS2 GPS41

If you just run short_hand without looking first, it will probably either throw out all the pseudorange data for a site, or delete a lot of data (phase and pseudorange) where only the pseudorange needs to be deleted. So don't do that. Instead, first delete a batch of bad pseudorange data to get the number of pseudorange outliers under control for the next run.

cd $ANALYSIS/1381/flt
gunzip *alaska*post*
allbadp
allbreaks
# Based on this, run del_pcode_arc as above, and add ambiguities manually if needed.
#
delete_allbadp 50
#  This creates a file called delete.
vi delete
#    Remove lines for any points for which you have already run del_pcode_arc.
cd ../qm
sh ../flt/delete

At this point, don't worry too much about phase outliers. Basically we are trying to get the number of pseudorange outliers down into the range where short_hand will do the right thing when we run it later. Now may be a good time to run Alaska_cleaning_solution $date, which runs a smaller and much faster solution including the sites that most often need some cleaning.

Data Backup

RINEX file backups. There are either 1 or 2 separate backups of the raw RINEX files. For data we collected ourselves, a copy of the original RINEX files can be found either in the campaign directory (/gps/akda/Campaigns/Data.2007/<project>, where <project> is the project name, and Data.2007 changes with the year), or in the continuous site ftp area (/gps/akda/Permanent/2007/260/, where 2007 is the year and 260 is the day of year). Also, every RINEX file put through the hopper is moved to a directory like $RAWDATA/2007/260/ (again, year and day of year vary). However, the $RAWDATA/2007/260/ directories are not really archived and eventually they will be deleted. But in practice we have most of the last few years of RINEX files online in case something goes wrong.
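As a sketch, the year/day-of-year part of these backup paths can be built from a date like so. This assumes GNU date, and the default $RAWDATA value here is made up:

```shell
# Build the $RAWDATA/<year>/<day-of-year> hopper-archive path for a date
# (assumes GNU date; 2007-09-17 is day 260, matching the example above).
RAWDATA=${RAWDATA:-/gps/rawdata}           # illustrative default only
d=2007-09-17
yr=$(date -u -d "$d" +%Y)
doy=$(date -u -d "$d" +%j)                 # zero-padded day of year
echo "$RAWDATA/$yr/$doy"
```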

QM file backups. Before autoclean makes any changes to a qm file, it copies the file to a subdirectory called "original" in the qm directory. So if you completely destroy a qm file by accident, you can still go back to the original version. Of course, that loses all editing done to the file, but at least the original data can be recovered easily. In general, it is not a good idea to go back to the version in the original subdirectory unless you know what you are doing, because doing so can make a lot more work for everyone. Mostly we do that when files have been mangled by autoclean. It is actually hard to mangle data files using our usual editing procedures.


Customized Solutions

Sometimes customized solutions are required for various reasons. The links below provide some strategies that may improve your situation.
Kinematic Processing
Ambiguity Resolution

Products / file contents

Where do you find certain information, and what are the structure and contents of the output files? A summary has been started on the files page.

Velocity solutions

/> su akda
/> cd $ANALYZED
/> mkdir <USEFUL NEW PROJECTNAME>

Find another project and copy the following files into your new directory:

/> cp *.nml $ANALYZED/new_project
/> cp make_vel* $ANALYZED/new_project

Rename the *.nml file to something reasonable for your project and edit it: it needs to list the stations you want to include in your velocity solution, as well as the locations of the input files the data come from. To get the input-file entries in the correct syntax you might want to use:

/> grep_vel_infiles.pl --from-week=WWWW --to-week=WWWW --infile-index=x --sum-id=<alaska2.0_nfxigs03 | NEAsia2.0_nfxigs03 | ...>

"WWWW" stands for the week you want to start / end with and "x" is the starting value for the id-counter. These id's are useful later to reference certain input files for additional editing purposes (see below). From-week, to-week, and infile-index are optional. "sum-id" is basically the solution name. Copy all lines of the format " infile(x) = '...'" into your namelist (nml) file (insert before the &end). (grep_vel_infiles.pl documentation)

Once the editing of the namelist file is finished:

/> refresh_zebu yournamelist.nml outfile.ref

This is necessary to order and re-number the entries in your namelist file, which can contain comments. Rather than making you go through the namelist file and renumber everything by hand whenever you want to throw out a station or some data, refresh_zebu does that for you.

Once you have a nice reference file (.ref):

/> rzebu2 outfile.ref > & out

You should redirect the output to a file so you can look at it later :).

As soon as the solution has finished running, look at the total chi-squared value at the bottom of the output in "out". It should be 1. If that's not the case, which is likely the first couple of times you run a solution, look for sites that cause the deviation from chi-squared = 1. Note the site names with the largest chi-squared values. Then you can do three things:

A) /> grep SITE outlier.inf
B) /> grep SITE residual.inf
C) /> vi $ANALYSIS/solution_timeseries/SITE.pfiles

In all three cases you want to find dates on which the sigmas for this site are rather large. Note down the date, find it in the namelist file (*.nml), and remove the respective site from the velocity solution for that day by adding a line:

removedat(a, infile_id) = 'SITENAME'

"a" is the id that simply counts how many removedats have been invoked on that one infile_id. "infile_id" is the counter I mentioned above. An example is probably best:

   infile(161) = '/gps/analysis/1222/post/03jun10NEAsia2.0_nfxigs03.sum'
    removedat(1,161) = 'ELD '
    removedat(2,161) = 'PETP'

Here I assume that both sites, "ELD" and "PETP", misbehave on June 10, 2003 in the North East Asia solution. Hence I remove them from the velocity solution.

Remove all the files created by rzebu2. Creating a Makefile of the form:

clean:
        rm *.ref ATWA ATY solution.* *.inf out nnr.* *.dat *.gmtvec *.vel fort.* gmt.format STACOV argus.weights

might be useful.

Repeat the above until your reduced chi squared is <= 1.0. If you can't get there, change the fudge_factor as follows:

new_fudge = old_fudge * Chi_squared

and rerun the solution one last time. The reduced chi-squared value should then be 1.0.
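A worked example with made-up numbers may help; the awk call here just does the multiplication:

```shell
# Hypothetical numbers: if the reduced chi-squared came out at 2.25
# with fudge factor 4.0, the updated fudge factor is 4.0 * 2.25 = 9.
old_fudge=4.0
chi2=2.25
new_fudge=$(awk -v f="$old_fudge" -v c="$chi2" 'BEGIN { print f * c }')
echo "$new_fudge"    # → 9
```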

Once you have achieved that, you can go to the next level and run one of the make_vel files you copied into your directory:

make_vel_Sella: velocities relative to a stable North America
make_vel_ITRF: velocities in ITRF 
make_vel_EURA: velocities relative to a stable Eurasia 
make_vel_XXXX: velocities with reference station XXXX

There might be others, or you could go ahead and edit these files to adapt them to your needs. The make files create *.gmtvec output, which you can use with, e.g., psvelo in a GMT script.