Gentle introduction to EOFs via a velocity time series¶

The central idea behind EOF analysis is illustrated most easily using a time series with only two variable. The horizontal velocity components in a moored current meter record are ideal.

%matplotlib notebook

import os
import urllib

import numpy as np
import matplotlib.pyplot as plt
from pycurrents.file import matfile
from pycurrents.num import eof
#plt.rcParams['figure.dpi'] = 70

We will use a nice current meter record from the Alenuihaha Channel. I don't know the exact location, or even the year, but that doesn't matter for present purposes.

fname = 'uv_alenuihaha.mat'
url = 'http://currents.soest.hawaii.edu/ocn_data_analysis_files/data/' + fname
if not os.path.exists(fname):
    urllib.request.urlretrieve(url, filename=fname)
mcm = matfile.loadmatbunch(fname)
print(matfile.showmatbunch(mcm))
print('start time: %.3f  last time: %.3f' % (mcm.t[0], mcm.t[-1]))

t  : ndarray, shape (2900,), dtype >f8
u  : ndarray, shape (2900,), dtype >f8
v  : ndarray, shape (2900,), dtype >f8

start time: 331.583  last time: 452.375

fig, ax = plt.subplots()
ax.plot(mcm.t, mcm.u, label='u', alpha=0.5)
ax.plot(mcm.t, mcm.v, label='v', alpha=0.5)
ax.set_xlabel('days')
ax.set_ylabel('cm/s')
ax.legend(loc='upper right')

<matplotlib.legend.Legend at 0x11a0e0a58>

We can see from the raw u, v time series that the two velocity components are correlated, and have roughly similar standard deviations. Another way to look at the correlation is with a scatter plot.

fig, ax = plt.subplots()
ax.plot(mcm.u, mcm.v, '.', alpha=0.3)
ax.set_xlabel('u (cm/s)')
ax.set_ylabel('v (cm/s)')
ax.set_aspect('equal')
ax.axhline(0, color='0.5')
ax.axvline(0, color='0.5')

<matplotlib.lines.Line2D at 0x11a12ce48>

The east-north coordinate system is arbitrary. It looks like a more natural coordinate system would be rotated about the mean value, with the first axis aligned with the long axis of the scatter cloud. To look at this more closely, we will glue the u and v components together as columns in an (n,2) matrix, and then use that to calculate the variance-covariance matrix. If we call this $U$, then the variance-covariance matrix will be $U^TU$.

mcm.udm = mcm.u - mcm.u.mean()
mcm.vdm = mcm.v - mcm.v.mean()
uv = np.vstack((mcm.udm, mcm.vdm)).T
print('The shape of U is %s.' % (uv.shape,))
nt = len(mcm.u)

The shape of U is (2900, 2).

The variance-covariance matrix shows strong covariance between u and v.

vcv = (uv.T @ uv) / nt
print(vcv)

[[610.70253438 614.80282843]
 [614.80282843 997.52676612]]

Let's use a function that returns the semi-major and semi-minor axes, and the angle of the semi-major axis with respect to the $x$-axis, of the standard deviation ellipse for a velocity time series. The first two arguments returned are the square root of the eigenvalues of the variance-covariance matrix.

s1, s2, ang = eof.std_ellipse_from_vel(mcm.u, mcm.v)
# The function above might be changed to do the following conversion
# to scalars:
if s1.ndim == 1:
    s1 = s1[0]
    s2 = s2[0]
    ang = ang[0]

print('semi-major: %.0f cm/s,  semi-minor: %.0f cm/s,  angle: %.0f degrees CCW' 
      % (s1, s2, np.rad2deg(ang)))

semi-major: 38 cm/s,  semi-minor: 13 cm/s,  angle: 54 degrees CCW

Expressing 2-D vectors as complex numbers and multiplying by a complex exponential provides a simple method of rotating the vector, or equivalently, of transforming to a rotated coordinate system. Here, the semi-major axis is $\theta = 54^\circ$ CCW from the $x$-axis. To put all the velocities in a coordinate system in which this is the $x^\prime$-axis, we rotate them CW by this amount. In the complex plane this is a multiplication by $e^{-i\theta}$.

uvc = uv[:, 0] + 1j * uv[:, 1]

uvc_rot = uvc * np.exp(-1j * ang)
uv_rot = np.vstack((uvc_rot.real, uvc_rot.imag)).T

vcv_rot = (uv_rot.T @ uv_rot) / nt
print(vcv_rot)

[[1.44862281e+03 1.11648315e-13]
 [1.11648315e-13 1.59606493e+02]]

Above, we verified that the variance-covariance matrix of the rotated velocity matrix is diagonal, to within numerical precision. Transformation to the new coordinate system has diagonalized the matrix. We can illustrate this by adding the new coordinate axes to the original scatter plot.

xgrid = np.array([-80, 120])
um, vm = mcm.u.mean(), mcm.v.mean()
y_xprime = vm + (xgrid - um) * np.tan(ang)
y_yprime = vm + (xgrid - um) * np.tan(ang + np.pi/2)

fig, ax = plt.subplots()
ax.plot(mcm.u, mcm.v, '.')
ax.set_xlabel('u (cm/s)')
ax.set_ylabel('v (cm/s)')
ax.set_aspect('equal')
ax.axhline(0, color='0.5')
ax.axvline(0, color='0.5')
ax.autoscale(enable=False)

ax.plot(xgrid, y_xprime, color='r')
ax.plot(xgrid, y_yprime, color='c')

[<matplotlib.lines.Line2D at 0x11a188240>]

The velocity components in the new coordinates are:

\begin{align} u^\prime &= u \cos(\theta) + v \sin(\theta)\\ v^\prime &= -u \sin(\theta) + v \cos(\theta) \end{align}

In matrix form we have

\begin{equation} \left[ \begin{array}{c} u^\prime\\v^\prime \end{array} \right] = \left[ \begin{array}{cc} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{array} \right] \left[ \begin{array}{c} u \\ v \end{array} \right] \end{equation}

so we see that the primed components are the original components projected onto the new unit vectors forming the rows of the rotation matrix. The unit vector along the red line and oriented to the upper right in the plot above is the top row.

fig, ax = plt.subplots()
ax.plot(mcm.t, uv_rot[:, 0], 'r', 
        label=r'$u^\prime = u\, \cos(\theta) + v\, \sin(\theta)$')
ax.plot(mcm.t, uv_rot[:, 1], 'c', 
        label=r'$v^\prime = -u\, \sin(\theta) + v\, \cos(\theta)$')
ax.legend(loc='lower right')

<matplotlib.legend.Legend at 0x11a1c3278>

Make the two unit vectors. Again, e1 is the first row of the rotation matrix, e2 is the second row.

e1 = np.array([np.cos(ang), np.sin(ang)])
e2 = np.array([-np.sin(ang), np.cos(ang)])
print(f"e1 = {e1}")
print(f"e2 = {e2}")

e1 = [0.59156885 0.80625448]
e2 = [-0.80625448  0.59156885]

Check: are these really eigenvectors of the variance-covariance matrix? If $A$ is the vcv matrix, then we need to verify that $A e_i = \lambda_i e_i$, where $\lambda_i = s_i^2$. (Recall that in the calculations above, s1 and s2 are the square roots of the eigenvalues of the variance-covariance matrix (vcv), $U^TU$.)

Programming: in the cells above and below, note the use of the new "f-string" syntax for strings. The expression inside each set of curly brackets is evaluated in the current scope and then converted to a string.

print(f"A e1 = {vcv @ e1}, lambda_1 e1 = {s1**2 * e1}")
print(f"A e2 = {vcv @ e2}, lambda_2 e2 = {s2**2 * e2}")

A e1 = [ 856.96013494 1167.95863143], lambda_1 e1 = [ 856.96013494 1167.95863143]
A e2 = [-128.6834506    94.41823043], lambda_2 e2 = [-128.6834506    94.41823043]