3.1.4. Cruise directory contents
The data for the present cruise are stored under the cruise name used for logging. During a cruise, all logging and processing take place in this directory. Figures and data accessible through the web interface are generated from these data. After the ADCP Operator has stopped ADCP data acquisition and clicked “End Cruise”, the archived figures are copied to this directory, which then contains all pertinent data from the ADCP.
3.1.4.1. Data for Scientists
There are three categories of data, all located in the logging directory, /home/data/CRUISEID. The cruise distribution should contain the entire /home/data/CRUISEID directory. The png_archive directory in each instrument processing directory is supposed to contain a summary collection of PNG files from the cruise.
This table is a short description of the contents:
subdirectory | contents | importance | back up for … |
---|---|---|---|
raw | all raw data | critical | archiving |
rbin | intermediate files | nice to have | anyone who gets ‘raw’ |
gbin | intermediate files | nice to have | anyone who gets ‘raw’ |
proc | processed data | final at-sea product | science CD after cruise |
reports | mini-webpage with metadata and overview of processed data | nice to have (only in modern cruise directories) | science CD after cruise |
If the cruise name is “VG1711”:
Data are logged under /home/data/VG1711 on the ADCP Linux computer “currents”. The active UHDAS cruise directory contains the following subdirectories:
raw: logging takes place here, in directories named by instrument
  ADCP (wh300, nb150, os150, os75, os38)
  ancillary (gpsnav, gyro, ashtech, posmv, seapath, mahrs, phins…)
  config (configuration directory)
  log (diagnostics directory)
rbin: intermediate version of the ancillary ascii data in the “raw” subdirectory, stored as binary
  ancillary only (gpsnav, gyro, ashtech, posmv, seapath, mahrs, phins…)
gbin: time-matched ADCP, navigation, attitude; time in UTC
  enhancements added for Python processing make the Python-generated gbin directory slightly different from the Matlab-generated gbin directory
proc: one CODAS processing directory for each possible instrument+pingtype
  standard CODAS processing subdirectories
  config (instrument configurations used at sea)
  png_archive (figures from the web site)
  Matlab and netCDF files extracted from the CODAS database
reports: web page with summary information about the cruise (recent addition)
  calibration
  heading correction quality
  summary of ocean data from the figures at sea
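The layout described above can be checked on a cruise distribution with a few lines of Python. This is a minimal sketch and not part of UHDAS: the helper name `check_cruise_dir` is hypothetical, the list of expected subdirectories is copied from the table above, and the helper only tests that each directory exists.

```python
from pathlib import Path

# Top-level subdirectories listed in the table above.
# "reports" is only present in modern cruise directories.
EXPECTED_SUBDIRS = ("raw", "rbin", "gbin", "proc", "reports")


def check_cruise_dir(cruise_dir):
    """Return {subdirectory name: True/False} for the expected layout.

    Hypothetical helper for illustration; it checks existence only,
    not the contents of each subdirectory.
    """
    root = Path(cruise_dir)
    return {name: (root / name).is_dir() for name in EXPECTED_SUBDIRS}


if __name__ == "__main__":
    # Example path from the text; adjust for your own cruise.
    for name, present in check_cruise_dir("/home/data/VG1711").items():
        print(f"{name:8s} {'ok' if present else 'MISSING'}")
```

A missing `reports` entry is not necessarily an error, since that directory is a recent addition.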
3.1.4.2. Cruise Directory Details
3.1.4.2.1. UHDAS “raw” (logged) data:
For a ship with an NB150 and an OS38 (for example), the raw subdirectory contains:
ADCP data
Directories with instrument names (such as nb150 or os38) contain:
*.raw (binary files with ADCP data)
*.log (ascii file with time stamps that match each ping)
The raw instrument files can be loaded into Matlab with the overloaded read method, which is on your path once you have run radcppath.m (you must specify the instrument and logging program). Type help read in Matlab for more information.
Serial ascii data:
For example gpsnav, gyro, posmv, ashtech, soundspeed.
All files are ascii, and contain alternating time stamps and messages:
$UNIXD, (time stamps)
$NMEA,ascii NMEA message
For instruments that send multiple messages, there should be a Unix time stamp for every message, e.g., for the Ashtech there should be:
$UNIXD …
$PASHR,ATT …
$UNIXD …
$GPGGA …
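The alternating layout described above can be read back by pairing each `$UNIXD` line with the message that follows it. The sketch below is illustrative only and is not the UHDAS reader: the function name `pair_timestamps` is hypothetical, the fields inside the `$UNIXD` line are not interpreted (their layout is not given here), and the sample lines use placeholder fields rather than real data.

```python
def pair_timestamps(lines):
    """Yield (unixd_line, message_line) pairs from alternating lines.

    Hypothetical helper: each $UNIXD line is paired with the next
    non-empty line. A message that has no preceding $UNIXD line is
    skipped, because it has no time stamp.
    """
    stamp = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("$UNIXD"):
            stamp = line          # remember the most recent time stamp
        elif stamp is not None:
            yield stamp, line     # message belonging to that time stamp
            stamp = None


if __name__ == "__main__":
    # Placeholder content: "<...>" stands for fields not shown in the text.
    sample = [
        "$UNIXD,<time fields>",
        "$PASHR,ATT,<attitude fields>",
        "$UNIXD,<time fields>",
        "$GPGGA,<position fields>",
    ]
    for stamp, message in pair_timestamps(sample):
        print(message.split(",")[0], "<-", stamp.split(",")[0])
```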
Directory tree for raw logging and processing
(example for VG1711), root directory is VG1711, all others below that:
VG1711/
    raw/
        config/     (logging configurations;
                    end-cruise snapshot of '/home/adcp/config')
        log/        (serial data dialogs;
                    end-cruise snapshot of '/home/adcp/log')
        os38/       (raw instrument data)
        nb150/      (raw instrument data)
        gpsnav/     (serial data)
        gyro/       (serial data) -----------------------------
        posmv/      (serial data)                             |
        ashtech/    (serial data)                             |
        simrad/     (serial data)                             |
    rbin/                                                     |
        gpsnav/     (binary version of serial data)           |
        gyro/       (binary version of serial data) <----------
        posmv/      (binary version of serial data)
        ashtech/    (binary version of serial data)
        simrad/     (binary version of serial data)
    gbin/
        ztimefit.txt    (time coefficients for instrument/time) (Python processing)
        os38/
            time/       (time-matching for os38)
            gpsnav/     (time-matched binary version of serial data)
            gyro/       (time-matched binary version of serial data)
            posmv/      (time-matched binary version of serial data)
            ashtech/    (time-matched binary version of serial data)
            simrad/     (time-matched binary version of serial data)
        nb150/
            time/       (time-matching for nb150)
            gpsnav/     (time-matched binary version of serial data)
            gyro/       (time-matched binary version of serial data)
            posmv/      (time-matched binary version of serial data)
            ashtech/    (time-matched binary version of serial data)
            simrad/     (time-matched binary version of serial data)
        heading/        (all specified headings on one gps timestamp) (Python processing)
    proc/
        os38bb/     (os38 bb processing directory)
        os38nb/     (os38 nb processing directory)
        nb150/      (nb150 processing directory)
    reports/
        cals.txt    (calibration summary)
        index.html  (open in a browser!)
where:
os38 and nb150 are instrument names
os38bb is for os38 ‘bb’ pings
os38nb is for os38 ‘nb’ pings
nb150 is for nb150 pings
3.1.4.2.2. UHDAS “rbin” data
As ascii data are logged to serial directories in ‘raw/’ (e.g. ‘raw/ashtech’), a Python thread created by UHDAS continually extracts the useful numbers from each NMEA line and appends them to files in a parallel rbin directory (e.g. ‘rbin/ashtech’).
The rbin files have a header followed by a binary array. They are read with BinfileSet.
For example, for the $GPGGA message, the process (and header) looks like this:
from pycurrents.file.binfile_n import BinfileSet
# Load all matching gpsnav rbin files as one set
data = BinfileSet('endeavor/rbin/gpsnav/en2019_243_2*gps.rbin')
# Show the column names
data.columns
The columns are:
['u_dday', 'dday', 'lon', 'lat', 'quality', 'hdop', 'm_dday']
For an instrument like Ashtech where multiple messages are logged, there is an rbin file created for each type of data, i.e., one for the position (GGA) and one for the attitude (ATT) messages.
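To make "extracts the useful numbers from each NMEA line" concrete, the sketch below pulls the lon, lat, quality, and hdop values out of a single GGA sentence. It is not the UHDAS implementation: the function names are hypothetical, the field positions follow the standard NMEA 0183 GGA layout, and the time columns (u_dday, dday, m_dday) are left out because their computation is not described here.

```python
def nmea_to_degrees(value, hemisphere):
    """Convert NMEA (d)ddmm.mmmm plus a hemisphere letter to signed degrees."""
    raw = float(value)
    degrees = int(raw // 100)          # leading digits are whole degrees
    minutes = raw - 100 * degrees      # remainder is decimal minutes
    result = degrees + minutes / 60.0
    return -result if hemisphere in ("S", "W") else result


def parse_gga(sentence):
    """Extract lon, lat, quality, and hdop from one GGA sentence.

    Hypothetical helper for illustration; no checksum validation.
    """
    body = sentence.strip().split("*")[0]   # drop the trailing checksum
    fields = body.split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    return {
        "lon": nmea_to_degrees(fields[4], fields[5]),
        "lat": nmea_to_degrees(fields[2], fields[3]),
        "quality": int(fields[6]),
        "hdop": float(fields[8]),
    }


if __name__ == "__main__":
    # Widely used textbook GGA sentence, not data from a cruise.
    example = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"
    print(parse_gga(example))
```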
3.1.4.2.3. UHDAS “gbin” files:
The gbin file name is based on the raw data.
A gbin file is a gridded binary file in which all fields are gridded in time. The times in the instrument log file (e.g., raw/nb150/*.log), i.e., the Unix times of the instrument pings, determine the grid. The Ashtech, GPS, seapath, heading, and soundspeed data are all interpolated to the Unix times of the pings and written to gbin files.
Next, the best time must be determined: decisions are made to keep time monotonically increasing and to track any other time corrections.
The results go into the files in the time subdirectory. *.tim.gbin has 3 rows containing the Unix ping times, the “best” times, and the pc_sec-UTC difference. *.best.gbin contains the gridded “best” values of position and heading, given that there are multiple sources for these.
The decision whether to use seapath or Ashtech, for example, or which is the primary navigation device, is NOT made by the program on the fly, but rather is set initially at the start of the cruise in CRUISE_proc.py.
Python processing adds two elements to this directory:
ztimefit.txt (used to correct ADCP time to UTC for each file)
heading (contains an independent directory of binfiles with all the specified heading devices gridded onto the same time base. QC has already been applied to these files, so (for example) bad Ashtech or POSMV data will be NaNs)
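The gridding idea in this section, putting ancillary data onto the ping times, can be illustrated with a small interpolation routine. This is a sketch under stated assumptions, not the UHDAS code: linear interpolation is assumed (the method is not named here), the function name `interp_to_ping_times` is hypothetical, and the numbers in the example are invented.

```python
from bisect import bisect_left


def interp_to_ping_times(ping_times, t, values):
    """Linearly interpolate the series (t, values) onto ping_times.

    Assumes t is strictly increasing. Ping times outside the range of t
    give NaN rather than an extrapolated value.
    """
    out = []
    for tp in ping_times:
        if tp < t[0] or tp > t[-1]:
            out.append(float("nan"))
            continue
        i = bisect_left(t, tp)         # first index with t[i] >= tp
        if t[i] == tp:
            out.append(values[i])      # exact match, no interpolation needed
            continue
        t0, t1 = t[i - 1], t[i]
        w = (tp - t0) / (t1 - t0)      # fractional position between samples
        out.append(values[i - 1] + w * (values[i] - values[i - 1]))
    return out


if __name__ == "__main__":
    # Invented sample: latitude logged at 0, 1, 2 s; pings at 0.5, 1.0, 3.0 s.
    print(interp_to_ping_times([0.5, 1.0, 3.0], [0.0, 1.0, 2.0], [10.0, 20.0, 40.0]))
```

Angles such as heading wrap around at 360 degrees, so they would need wrap-aware handling that this sketch omits.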
3.1.4.2.4. UHDAS “proc” directory:
The proc directory contains one CODAS processing directory per instrument+pingtype. For the example of a ship with an NB150 and an OS38, the processing directories would be
nb150
os38bb
os38nb
Pingtypes (“bb” and “nb”) are processed separately because the single-ping editing criteria are different, and because comparing the two datasets is a useful diagnostic. Averaging and merging can be done by the user after postprocessing.
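The naming rule above (instrument name plus pingtype, or just the instrument name when there is no pingtype suffix) can be written out as a short sketch. The helper `proc_dir_names` is hypothetical, and the mapping of instruments to pingtypes is supplied by the caller rather than detected from the data.

```python
def proc_dir_names(pingtypes_by_instrument):
    """Build processing directory names from instrument -> pingtypes.

    Hypothetical helper: an instrument with no listed pingtypes gets a
    single directory named after the instrument itself.
    """
    names = []
    for instrument, pingtypes in pingtypes_by_instrument.items():
        if pingtypes:
            names.extend(instrument + pingtype for pingtype in pingtypes)
        else:
            names.append(instrument)
    return names


if __name__ == "__main__":
    # The example ship from the text: an NB150 and an OS38.
    print(proc_dir_names({"nb150": [], "os38": ["bb", "nb"]}))
```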
Inside each processing directory
CODAS processing is documented in more detail here.
The contents of a CODAS processing directory are (in order of processing stages)
directories: pre-processing and metadata
  config (metadata and processing configuration)
  scan (characterize timestamps, time range of data)
  ping (default data location for pingdata, LTA, ENX (not UHDAS))
  load (stage averages in the form of *.bin, *.cmd files)
  adcpdb (contains *.blk (the CODAS files))
directories: CODAS processing
  nav (positions: reference-layer smoothing of positions here)
  edit (editing of averaged (CODAS) data; plot temperature)
  cal
    rotate (apply correction (time-dependent or constant))
    watertrk (watertrack calibration calculation)
    bottomtrk (bottomtrack calibration calculation)
    rotate (testing effects of rotation; staging area)
directories: output for users
  grid (staging lon/lat or time grid for data extraction)
  vector (coarsely gridded data, suitable for vector plots)
  contour (finely gridded data, suitable for contour plots)
  png_archive (after “end cruise” on UHDAS installations, archive of png files (one per day) from the at-sea web site)
files: metadata and cruise info
Cruise ID, such as km1001, is CRUISEID
Database name, such as a_km, is DBNAME
“decimal day” is floating-point days of the year, starting with 0.0
file name | contents |
---|---|
 | for the user: documentation template with metadata and processing (for user) |
 | parameters used during processing |
 | output of most recent quick_adcp.py command |
 | control file used to process these data |
 | |
 | transducer temperature |
 | watertrack calibration file |
 | bottom track calibration file |
 | heading correction QC file (if relevant) |
 | time range of database (YYYY/MM/DD hh:mm:ss) |
 | time ranges for each block file |
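The “decimal day” convention and the YYYY/MM/DD hh:mm:ss time-range format mentioned above can be related with the standard library. This is a minimal sketch: the function names are hypothetical, times are treated as naive UTC values (an assumption), and the example date is invented.

```python
from datetime import datetime, timedelta


def to_decimal_day(when):
    """Zero-based decimal day of the year: Jan 1 00:00:00 is 0.0."""
    start_of_year = datetime(when.year, 1, 1)
    return (when - start_of_year).total_seconds() / 86400.0


def from_decimal_day(year, dday):
    """Inverse of to_decimal_day for a given year."""
    return datetime(year, 1, 1) + timedelta(days=dday)


def parse_time_range_stamp(text):
    """Parse a 'YYYY/MM/DD hh:mm:ss' stamp into a datetime."""
    return datetime.strptime(text, "%Y/%m/%d %H:%M:%S")


if __name__ == "__main__":
    stamp = parse_time_range_stamp("2017/01/02 12:00:00")   # invented example
    print(to_decimal_day(stamp))
    print(from_decimal_day(2017, 1.5))
```

Because the count starts at 0.0, noon on January 2 has decimal day 1.5, not 2.5.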