Processing of pingdata with quick_adcp.py

This document was originally written for a unix-savvy, data-literate person going on a cruise with both an NB150 (DAS2.48) and an OS75 (VmDAS). References are made to that cruise as an example.

Note

There are three sections here. If you are looking for a tutorial about pingdata processing, all you want is the first section, “Batch mode; one pass”.

Batch mode; one pass

The first step is to complete one full pass through the processing. Sufficient documentation exists (online or contained within a CODAS installation) to train a person to process pingdata, but that documentation assumes the reader has the time and the desire to learn how to do it. The approach of this document is to take a stab at a black box, and hope it works. I will describe the black box, the inputs and outputs, and some rough sanity checks on the data, liberally referring to the existing documentation (only linking to it in the html version).

CODAS (Common Ocean Data Access System) is fundamentally a database. We load the averaged data into it and then in various steps, access the database and write files to the disk, manipulate those files to create different ones, and put information from those new files back into the database.

Extractions are done with C programs, called as “action action.cnt” (where “action.cnt” is an ascii file with a specified format that contains configurable options for “action”). Manipulations are done with a combination of C and Matlab code. Usually the Matlab code is a small script or wrapper which contains variables to change, and when run, calls outside scripts or functions to perform its duty. Changes to the database are implemented with more C code. “Pingdata” refers to the original 5-minute averages assembled by DAS2.48 and written to disk as “PINGDATA.???” files.

DAS2.48 runs on an old PC and has several additional programs that run with it, including:

  1. a “watchdog” which reboots the computer if no DAS activity is seen within a specified time, and
  2. ue4 which logs Ashtech data, does QC on it and keeps track of the Ashtech-gyro correction, and logs GGA navigation.

There are two mechanisms for getting the 5-minute averaged data to your computer:

1. Ue4 can also spit out the 5-minute data (a short blast of binary) on a serial port. With sufficient warning and Marine Tech willingness and participation, ue4 can be set up to do this. You must acquire the data on an appropriately-configured linux box with ser_bin (eg. as one-day files called “ensemble.DDD”, where DDD is the decimal yearday).

2. The acquisition PC may conceivably be networked, or you could use sneakernet (once a day, stop data acquisition and carry the pingdata files over to your computer by zip disk or something).

In this document, “ensemble.DDD” and “pingdata.???” are equivalent.

Because the steps in processing are quite regular...

  • scanping scanping.cnt
  • loadping loadping.cnt
  • ubprint ubprint.cnt (etc)

...we have written a Python script to automate these steps. The Python program quick_adcp.py runs through the steps; for each one it changes to the right directory, writes the control file (or the matlab file), runs it, and changes back to the original directory, asking the user at each step whether to run the next step. All the control files are suffixed with “.tmp”, all the matlab files end in “_tmp.m”, and a record of steps is written to a file ending in “.runlog”.
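The per-step pattern described above (change directory, write a control file, run the program on it, change back) can be sketched in a few lines of Python. This is only an illustration of the idea, not the code inside quick_adcp.py: the names `working_directory` and `run_step`, the `<program>.tmp` file name, and the injectable `runner` argument are all assumptions made for this sketch.

```python
import os
import subprocess
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Change to `path` for the duration of the block, then change back."""
    original = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always return to the starting directory, even if the step fails.
        os.chdir(original)


def run_step(subdir, program, control_text, runner=subprocess.run):
    """Write <program>.tmp in `subdir` and run `program <program>.tmp` there.

    Hypothetical helper: the control file name is an assumption based on
    the ".tmp" suffix mentioned in the text. `runner` can be replaced so
    the sketch can be tried without a CODAS installation.
    """
    control_name = program + ".tmp"
    with working_directory(subdir):
        with open(control_name, "w") as f:
            f.write(control_text)
        return runner([program, control_name], check=True)
```

Wrapping the directory change in a context manager means a failed step cannot leave you stranded in a subdirectory, which is the main thing the automation buys you over typing the commands by hand.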

Complete “processing” of ADCP data actually requires several iterations:

1. get the velocity data and the navigation into the database, clean up (smooth) the navigation using an oceanic reference layer, rotate the headings according to the Ashtech-gyro correction, and calculate residual phase and amplitude calibration values

2. apply any necessary phase or amplitude correction (** see below), clean up (smooth) the navigation using an oceanic reference layer, and calculate residual phase and amplitude calibration values

3. edit bad bins, profiles, and bottom interference. For historical reasons this is a two-stage process: (a) write ascii files to the disk which contain information about the bad data; (b) apply these flags to the database (we turn on bits indicating bad data; we do not actually delete any of the data). Then clean up (smooth) the navigation using an oceanic reference layer, and calculate residual phase and amplitude calibration values

4. make some matlab files suitable for vector or contour plotting

Calibration values: After the Ashtech-gyro correction has been applied, there may still be an offset in heading. Look for “phase” under the “mean” and “median” columns in cal/watertrk/adccal.out and cal/botmtrk/btcaluv.out. If there are sufficient points in “edited” (at least 10 watertrack and 40 bottomtrack is reasonable), then use a value between the bottomtrack and watertrack values for the rotation. Amplitude for the Melville looks OK, so you shouldn’t have to make that adjustment. Ultimately, with enough points, we expect the final calibration amplitude mean to be around 0.997-1.003 and phase to be around -0.01 to 0.01 (stddev around 0.3-0.4 is good). In this example, we are starting with an unknown calibration phase and amplitude, but once we know what they are we can build that in on the first step (see part II).
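The rules of thumb in the paragraph above can be written down as a small Python sketch. The function names are hypothetical, and taking the midpoint is my assumption: the text only says to use "a value between" the watertrack and bottomtrack phases. The sketch returns a candidate phase and says nothing about how that number should be passed to quick_adcp.py.

```python
def suggest_phase(wt_phase, wt_count, bt_phase, bt_count,
                  min_wt=10, min_bt=40):
    """Return a phase between the watertrack and bottomtrack values.

    Returns None when there are too few edited points to trust either
    estimate (thresholds from the text: 10 watertrack, 40 bottomtrack).
    The midpoint is an assumption; any value between the two would
    satisfy the rule of thumb.
    """
    if wt_count < min_wt or bt_count < min_bt:
        return None
    return 0.5 * (wt_phase + bt_phase)


def calibration_looks_final(amplitude_mean, phase_mean):
    """Check the residual calibration against the ranges quoted above."""
    amplitude_ok = 0.997 <= amplitude_mean <= 1.003
    phase_ok = -0.01 <= phase_mean <= 0.01
    return amplitude_ok and phase_ok
```

For example, with 12 watertrack points at phase -2.1 and 50 bottomtrack points at phase -1.9, `suggest_phase` gives a value of about -2.0; with only 5 watertrack points it returns None, which is the cue to collect more data before choosing a rotation.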

Data processing takes place in a directory created by adcptree.py. Run adcptree.py once to start a new processing directory. Manually, each step would take place within a subdirectory devoted to that particular step (scanning, loading, navigation-related, calibration-related, editing, etc). For each step, quick_adcp.py changes directory to the appropriate location, runs the step, and changes directory back. quick_adcp.py needs to know various things about the data, the most basic of which are:

variable name    :    what it is
-------------    :    ----------
yearbase         :    current year
datadir          :    where the data are
datafile_glob    :    wildcard expansion for files
                 :       !! QUOTE IT if it is on the command line
                 :       (do not quote if in a control file)
dbname           :    5-character basename for database
use_refsm        :    reference layer calculation (choose one of:
                 :       use_refsm
                 :       use_smoothr)

NOTE:

The quick_adcp.py command must be entered as one line of text. HTML may wrap the line, so you may be misled.
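If you would rather not type a long one-line command by hand, a short Python sketch can assemble it from the variables in the table and take care of quoting the wildcard. The helper name `build_command` is hypothetical; the option names and values are the ones used in this document, and `shlex.quote` is from the Python standard library.

```python
import shlex


def build_command(options):
    """Join quick_adcp.py options into a single shell-safe line.

    `options` is a list of (name, value) pairs; a value of None means a
    bare switch such as --use_refsm. Values containing shell wildcards
    (e.g. the datafile glob) are quoted so the shell passes them through
    unexpanded.
    """
    parts = ["quick_adcp.py"]
    for name, value in options:
        parts.append("--" + name)
        if value is not None:
            parts.append(shlex.quote(str(value)))
    return " ".join(parts)


command = build_command([
    ("yearbase", 2004),
    ("dbname", "a0407"),
    ("datadir", "/home/data/mv0407/adcp"),
    ("datafile_glob", "ensemble.???"),
    ("use_refsm", None),
])
print(command)
```

The result is one line with the glob in single quotes, which does the same job as the double quotes used in the examples here.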

For data residing in /home/data/mv0407/adcp, named ensemble.???, the quick_adcp.py steps corresponding to the numbers above are:

  1. load the data
quick_adcp.py --yearbase 2004 --dbname a0407 --datadir /home/data/mv0407/adcp --datafile_glob "ensemble.???" --use_refsm
  2. apply rotation
quick_adcp.py --yearbase 2004 --use_refsm --rotate_angle -2.0 --steps2rerun rotate:navsteps:calib
3a. do the editing
  • Go to the edit/ subdirectory, run gautoedit, and edit the data (i.e. create the ascii files which will then become flags in the database).

3b. apply the editing to the database

quick_adcp.py --yearbase 2004 --use_refsm --steps2rerun apply_edit:navsteps:calib:matfiles
  4. Matlab files are generated automatically when “matfiles” is added to the “steps2rerun” switch. See the following files (do “help load_adcp” for info on reading them):

    • vector/*.mat (for 1-hour 50m averages)
    • contour/*.mat (for 15minute, 10m averages)

automated batch mode first pass, but editing is manual

(All the data creates a database)

There is no such thing as “simply appending” to a CODAS database. Therefore, the next step in automation is to delete the database (or start a new processing directory) and automate the processing of one whole batch of files. Start by

  1. /bin/rm adcpdb/*blk - or -
  2. make a new processing directory with adcptree.py
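For the first option, here is a cautious Python equivalent of the /bin/rm command, which lists the block files before deleting anything. The function name `remove_database_blocks` and the dry-run default are my own additions for this sketch; the adcpdb/*blk pattern is the one given above.

```python
import glob
import os


def remove_database_blocks(procdir, dry_run=True):
    """Find (and optionally delete) adcpdb/*blk under `procdir`.

    With dry_run=True (the default in this sketch) nothing is deleted;
    the matching paths are only returned so you can look at them first.
    """
    pattern = os.path.join(procdir, "adcpdb", "*blk")
    paths = sorted(glob.glob(pattern))
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

Run it once with the default to see what would go, then again with `dry_run=False` from the top of the processing directory.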

You did the first run-through and found out what the phase and amplitude should be. (NOTE: since the Ashtech has been replaced, a new value is to be expected. But we found -2.0 from before, so we’ll use that in this example.)

Now, combining (first pass + calibrating [only if you already know the angle]) above becomes:

quick_adcp.py --yearbase 2004 --dbname a0407 --use_refsm --rotate_angle -2.0 --datadir /home/data/mv0407/adcp --datafile_glob "ensemble.???" --auto

(and run editing as above)

one-pass complete processing; apply calib and editing

(with previously-determined phase and amplitude corrections, using default editing parameters) (All the data creates a database)

  1. /bin/rm adcpdb/*blk - or -
  2. make a new processing directory with adcptree.py

Now, combining (first pass + calibrating + editing) becomes:

quick_adcp.py --yearbase 2004 --dbname a0407 --use_refsm --rotate_angle -2.0 --datadir /home/data/mv0407/adcp --datafile_glob "ensemble.???" --find_pflags --auto

If this is too hard to read, you can run it with a control file as:

quick_adcp.py --cntfile mv0407qpy.cnt

where the control file "mv0407qpy.cnt" would be
#### begin mv0407qpy.cnt.  Anything after a "#" is a comment
--yearbase 2004                    ### current year
--dbname a0407                     ### prefix to adcpdb/*dir.blk
--use_refsm                        ### reference layer calculation
--rotate_angle -2.0                ### rotate by -2.0
# --rotate_amplitude 1.0           ### if you needed an amplitude correction
--datadir /home/data/mv0407/adcp   ### look for data in this directory
--datafile_glob ensemble.???       ### look for ensemble.??? NOT QUOTED HERE
--find_pflags                      ### find editing flags, apply when done
--auto                             ### don't ask. just do it
########    end of cnt file #####
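To make the control file format concrete, here is a rough Python sketch of how such a file could be turned back into a list of command-line arguments: drop everything after a "#", skip empty lines, and split what remains on whitespace. This is not quick_adcp.py's actual parser, and the function name `read_control_file` is hypothetical.

```python
def read_control_file(text):
    """Return a flat argument list from control-file text.

    Anything after a "#" is treated as a comment. The wildcard is taken
    literally, which is why it is not quoted in the control file: no
    shell ever sees it.
    """
    args = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            args.extend(line.split())
    return args


sample = """\
#### begin sample.  Anything after a "#" is a comment
--yearbase 2004                    ### current year
# --rotate_amplitude 1.0           ### commented out, so ignored
--datafile_glob ensemble.???       ### NOT QUOTED HERE
--auto                             ### don't ask. just do it
"""
print(read_control_file(sample))
```

A commented-out option such as `# --rotate_amplitude 1.0` disappears entirely, so you can keep rarely-used switches in the file as reminders.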

If you are going to run this unattended, you might want to add the

--hidedisplay

option so Matlab won’t throw figures up to the screen when it runs.

“On-demand” Shipboard ADCP processing

Basically, there is no incremental pingdata processing. The recommended procedure is to delete the processing directory and start again with adcptree.py. All is not lost, however, if you do the following (example for cruise id mv0407):

  • start with a directory made for the purpose, eg at_sea

  • keep a readme, scripts, and other information in this directory, eg

    o quick_adcp.py control file
    o script to run adcptree.py
    o copy of a*.asc from the editing directory from previous run

Now, for each processing update:

  1. copy mv0407/edit/a*.asc to the at_sea directory
  2. delete the mv0407 (adcp processing) directory
  3. run adcptree.py again
  4. copy the quick_adcp.py control file into the new mv0407
  5. run quick_adcp.py from mv0407
  6. copy a*.asc into the edit directory
  7. apply editing; add more editing
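The file shuffling in steps 1, 2 and 6 is easy to get wrong when you are tired, so here is a Python sketch of it. The function name `refresh_processing_dir` and the `rebuild` argument are hypothetical: `rebuild` is a function you supply that covers steps 3 to 5 (running adcptree.py, copying the control file, running quick_adcp.py), since the exact commands depend on your setup. Step 7 stays manual.

```python
import glob
import os
import shutil


def refresh_processing_dir(at_sea, procdir, rebuild):
    """Save edit/a*.asc, delete `procdir`, rebuild it, restore the files.

    `rebuild(procdir)` is caller-supplied and stands in for steps 3-5;
    this sketch does not run adcptree.py or quick_adcp.py itself.
    Returns the names of the .asc files copied back into edit/.
    """
    edit_dir = os.path.join(procdir, "edit")

    # Step 1: save the editing files from the previous run.
    for path in glob.glob(os.path.join(edit_dir, "a*.asc")):
        shutil.copy2(path, at_sea)

    # Step 2: delete the old processing directory.
    if os.path.isdir(procdir):
        shutil.rmtree(procdir)

    # Steps 3-5: recreate and reprocess (supplied by the caller).
    rebuild(procdir)

    # Step 6: put the saved editing files back.
    # (makedirs is a safety net in case rebuild did not create edit/.)
    os.makedirs(edit_dir, exist_ok=True)
    restored = []
    for path in sorted(glob.glob(os.path.join(at_sea, "a*.asc"))):
        shutil.copy2(path, edit_dir)
        restored.append(os.path.basename(path))
    return restored
```

Because the saved files are copied before anything is deleted, a failure partway through still leaves your editing work in the at_sea directory.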