Using quick_adcp.py to process ADCP data
Table of Contents
Shipboard ADCP data processing requires several necessary steps. The reality of data (and all the things that can to wrong) make the steps more complicated than one might expect at first. Basic processing of a clean dataset is easy; any problem with the data increases the complexity of the processing.
CODAS (Common Oceanographic Data Access System) is a database designed to store ADCP data and associated information (eg. heading, time, position). The CODAS database is simply a vehicle for storage and organization of the ADCP data while various processing steps are run. “CODAS Processing” refers to the University of Hawaii collection of programs that use the CODAS database to process ADCP data.
CODAS processing steps are designed to be flexible (to cope with different data sources and problems encountered during processing) and automatable (so the basic steps can be run easily with minimal overhead per dataset). Many datasets do not need all of the flexibility available, but since some data streams sporadically fail and some improve over time, there are necessarily many options in CODAS.
Quick_adcp.py is a tool to streamline the basic processing steps and provide a uniform naming convention for the various files used in processing. It has various switches that are required and others that are used to specify the kind of data being processed.
This document is designed to introduce the CODAS processing steps run by quick_adcp.py, point to other tools and resources, and help the user understand how to use quick_adcp.py for their particular dataset.
NOTE: Prior to using Univ. Hawaii CODAS processing software, a computer must be set up with the appropriate software. Details of the computer setup are available here.
Here are the basic steps in CODAS processing of a shipboard ADCP dataset. You may have other steps that need to be addressed.
Stage averages on disk (one of these)
load pre-averaged ADCP data
- DAS2.48 (RDI’s DOS software created “pingdata files”, usually from a NB150 ADCP).
stage pre-averaged ADCP data on disk
- VmDAS (RDI’s Windows software creates “LTA” or “STA” files) Runs on BB, WH, or OS (not NB).
make averages from single-ping data, stage on disk
- UHDAS (Univ Hawaii linux acquisition system, can create averages for use with CODAS processing)
- VmDAS (RDI’s Windows software creates ENR files that can be converted to a UHDAS-like directory structure (along with supporting ancillary data) and averaged as if they are UHDAS data.
Transect (RDI Windows 3.1 software, usually used with BB, is not supported by CODAS Python processing)
Return to TOP
ADCP processing is done in a directory that is created by running adcptree.py. The processing directory is initialized with a particular collection of subdirectories and files. Some of those files are present for all CODAS processing directories, and some are specific to particular kinds of datasets. The processing directory should be in a working area of your disk, NOT in the UH programs directory tree.
Type “adcptree.py for usage, specifically if processing averaged data (LTA, STA, pingdata). Type the following to get more usage information for single-ping processing:
NOTE: Examples for this document were run on a linux machine, with UH programs rooted at /home/currents/programs.
It is important to keep the processing directory relatively free of clutter. If the data are worth using, it is likely that someone will look around in the processing directory for information about how the data were processed and what anomalies were present. Preserving a relatively linear path from the start of processing to the end of processing helps the person later figure out what was done.
Sometimes the best thing to do is simply delete the whole processing directory and start over. It is therefore best to keep original data and processing notes out of the actual processing directory until you are sure you won’t be deleting it for some reason.
We recommend a directory structure like this for adcp processing:
Bearing that in mind, here is one approach to processing strategy
- make a directory for the cruise that will hold:
- the processing directory
- summary notes (metadata file; instrument configuration, dates...)
- detailed notes (eg. comments about the data, such as gaps, biases)
- instructions (suitable for cut-and-paste for your next attempt)
- quality directory for exploration of the data
- run “adcptree.py” to make your processing directory
- write down the adcptree command you used in a file
- name the file something related to the processing directory name
- keep the documentation file outside the processing directory until you’re done with it. That makes it easier to redo the steps from scratch without losing the file (if you have to start over)
- type the following to see the right prompts for the datatype you have (eg. LTA)
- type the appropriate help query for quick_adcp.py, eg:
quick_adcp.py --commands lta
proceed to process the data
write down enough of what you do that you can
- can use it in 4 months to remind yourself of what to do with the next dataset
- can show it to someone else as a tutorial
- start over if you need to change processing parameters or made a mistake.
Inside the processing directory is a file called cruise_info.txt which has the beginning of a reasonable text file. Start with that; it is already tailored for your processing directory.
The CODAS database is actually a collection of binary files whose names are composed of a prefix, a 3-digit number, the suffix “blk”. There is one database directory file (also binary), whose name has the same prefix, and ends in “dir.blk”. Here is an example:
ademo001.blk ademo002.blk ademo003.blk ademo004.blk ademo005.blk ademodir.blk
The prefix here, ademo is called the database name. In the control files used by quick_adcp.py, you will see examples of a relative path to the database, such as
This name is specified when quick_adcp.py is run, and is the prefix for many files. Those files are distinguishable by the directory they are in and the suffix they have.
Anytime a database name is referred to in this document, the example will use ademo. Anytime the string ademo is used, it is referring to the database name.
Following the instructions in the demo or in the quick_adcp.py help, use adcptree.py to create a processing directory.
The processing directory is created with the following subdirectories:
The ping directory is the default repository of pingdata files (usually called PINGDATA.* (The data location can be overridden in quick_adcp.py)
The config directory holds UHDAS single-ping processing parameters. It is only relevant if you are going to re-process the data. For post-processing (edit, calibration) it is irrelevant
The scan directory will be used to hold a list of time ranges and time info for each data file. The full time range of the dataset is also stored here
The load directory will be used for loading the data into the database. For pingdata, one executable does this. For all other data types, a two-step process exists:
- three sets of files are generated: a pair (suffix bin and cmd) with data for the database, and another set (suffix gps2) with time and position at the end of each averaging period.
- load *.bin files into the database; stage *.gps2 for naigation
The adcpdb directory will contain the database and the configurations used during acquisition
The nav directory will be used for navigation calculations, including smoothed reference layer and plots of same.
The edit directory will be used by gautoedit (graphical editing).
The cal directory will be used for calibration calculations.
- cal/rotate (time-dependent heading correction stored here)
- cal/watertrk (watertrack calibration and ADCP-GPS offset)
- cal/botmtrk (bottom track calibration)
- cal/heading (not used by quick_adcp.py)
The contour directory is used to store data suitable for making contour plots (eg. 15 minute averages of 10-20m vertical bins)
The vector directory is used to store coarser averages suitable for vector plots (eg hourly averages of 50m vertical bins)
other directories – These are sort of fossils
- The grid directory will be used to grid the data for plotting.
- The quality directory contains Matlab scripts for plotting on-station and underway profile statistics; this is a good place to stage your own QC investigations.
- The stick directory contains programs to make summary plots of some specific spectral information, treating the data as a time-series (not used by quick_adcp.py)
Return to TOP
Under most circumstances, a UHDAS dataset brought back from sea is already processed, and all that remains is manual post-processing (check heading correction, check calibration, edit, document)
Examples for quick_adcp.py processing start here. You should work through the example and become familiar with the postprocessing steps (heading correction, calibration, editing) before attempting to process a single-ping dataset from scratch.
Return to TOP
Quick_adcp.py contains quite a bit of documentation about itself.
Running “quick_adcp.py –help” shows how to get help with the usual kinds of datasets. On-line access to that help is here
Always run quick_adcp.py from the root of the processing directory (the directory created by adcptree.py)
Return to TOP
Prior to loading the database, we scan the data files in order to determine whether there are issues with timestamps that need to be addressed. The “Scan” step performs two operations:
- list time ranges and perhaps otherinformation about the data files
- create a file with the time range of the data
Pingdata are scanned using the executable “scanping”. Go to the PINGDATA demo (Python Processing), and look at the “scan” section for a complete description of the output of “scanping”.
All other data (VmDAS and UHDAS) are scanned by the processing engine (Python). In either case a little standalone program is written and then run. The main point of this step is to get the time range.
If the database name of this example is “ademo”, the following files are written:
- ademo.scn (contains timestamp information about the data files)
- ademo.tr (a human-readable time range, extracted from ademo.scn)
CODAS time stamps come in two flavors.
CODAS database timestamps refer to the end of the ensemble
Return to TOP
The “load” step refers to “loading” the data into the database. Historically (“Pingdata”), the files were loaded into the database with one executable program. For modern (VmDAS, UHDAS) data, files with averaged profile information are staged in this directory and then the ldcodas program reads data and creates the database from those files.
The averaged profiles are stored in the load directory as pairs of files (*cmd and *bin), with the cmd files containing instructions to “ldcodas” and the bin files containing the data. The *.gps2 files contain times and positions at the end of the ensembles. Those are used later in the “nav” dirctory.
For singleping data, more configuration information is required, since the raw data do not yet have corrected timestamps or any ancillary data.
Processing UHDAS data entails reading the single-ping ADCP data (correcting time and adding navigation and attitude), editing the single pings, averaging the data, and writing the *bin and *cmd files. Then ldcodas creates the database.
Processing UHDAS data from scratch requires a good grounding in CODAS processing, which can be obtained by working through examples with LTA data or by post-processing a UHDAS dataset.
VmDAS data include three kinds of single-ping data. Although CODAS Python programs can read ENS and ENX data, they may contain timing or other errors, so we use the ENR data. Configuration information is required to properly parse the ancillary data. These are reformed into fake UHDAS data, and then processed as if they were acquired as UHDAS data.
“Pingdata” refers to averaged data acquired by DAS2.48. See this note.
Return to TOP
After any quick_adcp.py step, you may want to check the database to see what changes you have made and to ensure that everything is working as expected. This is not a step run by quick_adcp.py. You can go to another commandline window and run the following command as you work your way through the quick_adcp.py steps:
The program showdb is ascii menu-driven commandline utility exists to probe the database and determine what is stored in it. There is no substitute for this important but old-fashioned program.
The PINGDATA demo (Python Processing) explains showdb in detail, using the original pingdata demo, in which a database was created using two pingdata files. There are various examples of showdb throughout the original pingdata demo instructions.
You can use showdb to check various aspects of the database. For instance, immediately after loading, the database will contain measured velocities, depths, heading, and various configuration information, but the positions will show MAX, i.e. bad values: positions are loaded in a later step.
Return to TOP
Accurate heading is essential for high-quality ADCP data. An error in heading of X degrees causes an error in the cross-track direction that scales as
error = shipspeed * sin(X).
For a ship travelling at typical cruising speeds, i.e. 5m/s (10kts), a one degree error in heading causes a cross-track error of 10cm/s. Gyros, especially older gyros and especially in low latitudes, can wander significantly, causing completely spurious cross-track errors that manifest as “eddies” in the data.
An example of a few degrees offset is shown deep in the documentation for the gui editing utility, “gautoedit”.
This page graphically illustrates the difference an error of 2 degrees degrees makes on a dataset.
Headings can come from a variety of sources, some more accurate than others. Your access to headings depends on a variety of factors
the acquisition system (UHDAS, VmDAS, DAS2.48+ue4, DAS2.48)
Ashtech, Seapath, POSMV, various optical gyros)
instrument failed? data recorded elsewhere?)
You need to know (or find out) the sources of heading for your dataset, what is available where (which files contain which information), and what heading source was used for processing. If there is only one heading device, your options are limited. If there are both gyro and some other heading source, you can compare them to see what the differences are in quality and behavior, and either correct the data (if acquired with gyro) to the other source, or (if processed with the other source) possibly make a statement about that instrument’s data quality.
If you have UHDAS data, the strategy is to use gyro data for the initial conversion from beam to earth coordinates, and correct that with the 5-minute average of the difference between the gyro and the other heading device. In older processing, this was done in a batch mode, rotating the database after the database was created. In newer UHDAS processing, the heading correction is built into the averages (*bin and *cmd files) before they are loaded into the database, and the values used are recorded to disk (cal/rotate/ens_hcorr.ang)
If you have VmDAS data, you may have more than one source for heading, depending on whether there is a synchro gyro input (Ocean Surveyor only). Check the N1R`, N2R, and N3R files for heading messages. If gyro was used as the primary heading device, a time-dependent heading correction might be computed and be applied to the database.
If you have pingdata, and if ue4 was used, and if there was an ashtech on board, the likely source of heading correction data is extracted from ubprint. After loading the database, the time-dependent heading correction can be examined, (corrected if necessary), and applied to the database.
The rotation step takes place in the cal/rotate directory. If there is a time-dependent heading correction file, plot the corrections and make sure you are applying something reasonable to the database.
Return to TOP
Bottom track calibration uses bottomtrack data (if there is any) to determine the remaining transducer offset to make the ship track over ground match the track measured by the ADCP.
Watertrack calibration uses the idea that the water velocities should not look any different whether the ship is stopped or moving; turning or going straight. It uses parameters to identify times when a significant acceleration was detected (turn and/or speed change) and calculates what rotation and scale factor would be necessary to make sure the ocean velocity looks the same before and after the acceleration. The calculation is necessarily noisy, and depends on ship behavior. For instance there are probably no watertrack calibration points on a transit, and many on a CTD hydrography cruise or a bathymetric mapping cruise.
The parameters used to detect watertrack calibration points can be tuned, but quick_adcp.py uses particular set and runs the calibration steps during the first pass. These are diagnostics, and give the user some information about further rotation or scaling necessary.
A summary of calibration files and the sign convention (plus or minus) is here.
A new calibration calculation has been introduced which is sometimes important: it estimates the horizontal offset between the ADCP transducer and the GPS (used to calculate ship speed). A preliminary page documenting this is here.
Return to TOP
After you’ve run quick_adcp.py for the first pass, you must edit the averaged data (to remove things like wire interference, data below the bottom, bad profiles), investigate the necessity for further rotation, and decide whether a scale factor is required. This is an iterative process.
If there is a large constant heading error or a time-dependent heading error necessary, it is much harder to edit the data because changes in speed will cause changes in the ocean velocity which may look like errors.
To converge on your final dataset,
Ensure that the time-dependent heading correction (if it exists) as good as possible. If you change the heading correction, rerun quick_adcp.py with --steps2rerun navsteps:calib
look at cal/watertrk/guess_xducerxy.out to see what transducer offset was guessed. This value represents the offset remaining between the GPS and the ADCP taking into account the values used so far.
- “Signal” values less than 500-1000 probably indicate a poor guess
- depending on how the ship is driven, and the cruisetrack for your cruise, offsets less than 5m may not make a big difference.
- Apply offsets if one offset is 10m or greater AND the signal is over 1000.
Look at bottomtrack and watertrack calculations, and apply any gross (larger than 1/2 degree) phase correction. If you rotate the database, rerun quick_adcp.py with --steps2rerun navsteps:calib
Apply any scale factor if necessary (see below). This “should” be unnecessary for ocean surveyors, but is not unexpected for fixed transducer heads, such as NB150 or WH300.
Go through the data with gautoedit deleting obvious additional bad data. Apply the editing by running quick_adcp.py with these options:
quick_adcp.py --steps2rerun apply_edit:navsteps:calib --auto
- repeat until they do not change: check editing; apply calibrations (normally there will be up to 3 passes through the editing, with the last requiring no additional flagging, and up to two applications of phase and scale factor calibration values)
More discussion follows:
Heading correction has two components: a time-dependent correction of gyro to Ashtech (POSMV, or Seapath), and a remaining contant offset. See the Heading Correction section for more detail.
To inspect the heading correction used during in an at-sea UHDAS processing directory, go to the cal/rotate directory and look at the ens_hcorr.asc file.
If you need to fix the heading correction (eg. there are gaps where no heading correction was applied), you must remove the already-existing heading correction by “unrotating” the database. Then fix the heading correction file, and rotate using the new file. See references to patch_hcorr.py
After the time-dependent correction is made, there may still be a constant offset. Estimates of that value are in the “phase” in watertrack and bottom track calibration files, or from “recip.m” (if there is a reciprocal track available). More details are in the PINGDATA demo (Python Processing) for bottom track and watertrack calibrations.
For a fixed transducer instrument (NB, BB, or WH) a scale factor may be necessary. Check the thermistor temperature to make sure the thermistor is not broken. You may need to fix the speed of sound. It is possible that application of constant scale factor is all that is necessary.
See the PINGDATA demo (Python Processing) discussion about thermistor checking.
Although the Ocean Surveyors should not (in theory) need a scale factor, experience suggests it is not uncommon to need about 1.003-1.005; these are reasonable numbers. If, after editing, the scale factor for an ocean surveyor is still greater than 1%, either there is still a problem with the data (eg. underway bias not edited out) or there is a problem with the instrument. You may need to look at additional datasets or talk to others who have used data from this instrument to try and determine whether you have a problem.
Quick_adcp.py sets up the edit directory for the graphical editing tool, gautoedit (gautoedit docs here ).
This tool was designed to screen data for things like ringing, on-station wire interference, jittery navigation or velocities, and data below the bottom.
A UHDAS at-sea processing directory already has the default gautoediting parameters applied. You can use showdb to see that this is true.
NOTE: “Flagging data as bad” is mostly a one-way trip: you can add flags to the collection, but if you want to remove them, things get complicated. This page discusses various scenarios.
Use gautoedit to look through the database and decide whether any of the missing data should be unflagged, or whether you just need to flag some additional data. Click on the button “do not show autoedit editing” to see what is in the database. Click on “do not show profile flags” to see the original data.
In general, the gautoedit summary is:
When you are finished, apply the editing as
quick_adcp.py --steps2rerun apply_edit:navsteps:calib --auto
More detail regarding profile flags (and the underlying application of PROFILE_FLAGS to the database) can be found
After you think you are done editing, use “dataviewer.py” to review the data. You should not see any of these signatures in the ocean velocities
- transitions between on/off station
- ship turn
- bias in the direction of ship motion (usually with low PG)
- big stripes of missing data at turns or accelerations
If you have sufficient calibration values to tell, your watertrack and bottom track calibrations should have phase within a few tenths of a degree of zero, with all estimates agreeing (mean, median, watertrack and bottom track) also to within a few tenths of a degree (if there are enough points). Scale factor should be within a fraction of a percent of 1.00 (0.997-0.003) and different estimates should agree (mean, median, watertrack and bottom track)