The central idea behind EOF analysis is illustrated most easily using a time series with only two variable. The horizontal velocity components in a moored current meter record are ideal.
%matplotlib notebook
import os
import urllib
import numpy as np
import matplotlib.pyplot as plt
from pycurrents.file import matfile
from pycurrents.num import eof
#plt.rcParams['figure.dpi'] = 70
We will use a nice current meter record from the Alenuihaha Channel. I don't know the exact location, or even the year, but that doesn't matter for present purposes.
fname = 'uv_alenuihaha.mat'
url = 'http://currents.soest.hawaii.edu/ocn_data_analysis_files/data/' + fname
if not os.path.exists(fname):
urllib.request.urlretrieve(url, filename=fname)
mcm = matfile.loadmatbunch(fname)
print(matfile.showmatbunch(mcm))
print('start time: %.3f last time: %.3f' % (mcm.t[0], mcm.t[-1]))
fig, ax = plt.subplots()
ax.plot(mcm.t, mcm.u, label='u', alpha=0.5)
ax.plot(mcm.t, mcm.v, label='v', alpha=0.5)
ax.set_xlabel('days')
ax.set_ylabel('cm/s')
ax.legend(loc='upper right')
We can see from the raw u, v time series that the two velocity components are correlated, and have roughly similar standard deviations. Another way to look at the correlation is with a scatter plot.
fig, ax = plt.subplots()
ax.plot(mcm.u, mcm.v, '.', alpha=0.3)
ax.set_xlabel('u (cm/s)')
ax.set_ylabel('v (cm/s)')
ax.set_aspect('equal')
ax.axhline(0, color='0.5')
ax.axvline(0, color='0.5')
The east-north coordinate system is arbitrary. It looks like a more natural coordinate system would be rotated about the mean value, with the first axis aligned with the long axis of the scatter cloud. To look at this more closely, we will glue the u and v components together as columns in an (n,2) matrix, and then use that to calculate the variance-covariance matrix. If we call this $U$, then the variance-covariance matrix will be $U^TU$.
mcm.udm = mcm.u - mcm.u.mean()
mcm.vdm = mcm.v - mcm.v.mean()
uv = np.vstack((mcm.udm, mcm.vdm)).T
print('The shape of U is %s.' % (uv.shape,))
nt = len(mcm.u)
The variance-covariance matrix shows strong covariance between u and v.
vcv = (uv.T @ uv) / nt
print(vcv)
Let's use a function that returns the semi-major and semi-minor axes, and the angle of the semi-major axis with respect to the $x$-axis, of the standard deviation ellipse for a velocity time series. The first two arguments returned are the square root of the eigenvalues of the variance-covariance matrix.
s1, s2, ang = eof.std_ellipse_from_vel(mcm.u, mcm.v)
# The function above might be changed to do the following conversion
# to scalars:
if s1.ndim == 1:
s1 = s1[0]
s2 = s2[0]
ang = ang[0]
print('semi-major: %.0f cm/s, semi-minor: %.0f cm/s, angle: %.0f degrees CCW'
% (s1, s2, np.rad2deg(ang)))
Expressing 2-D vectors as complex numbers and multiplying by a complex exponential provides a simple method of rotating the vector, or equivalently, of transforming to a rotated coordinate system. Here, the semi-major axis is $\theta = 54^\circ$ CCW from the $x$-axis. To put all the velocities in a coordinate system in which this is the $x^\prime$-axis, we rotate them CW by this amount. In the complex plane this is a multiplication by $e^{-i\theta}$.
uvc = uv[:, 0] + 1j * uv[:, 1]
uvc_rot = uvc * np.exp(-1j * ang)
uv_rot = np.vstack((uvc_rot.real, uvc_rot.imag)).T
vcv_rot = (uv_rot.T @ uv_rot) / nt
print(vcv_rot)
Above, we verified that the variance-covariance matrix of the rotated velocity matrix is diagonal, to within numerical precision. Transformation to the new coordinate system has diagonalized the matrix. We can illustrate this by adding the new coordinate axes to the original scatter plot.
xgrid = np.array([-80, 120])
um, vm = mcm.u.mean(), mcm.v.mean()
y_xprime = vm + (xgrid - um) * np.tan(ang)
y_yprime = vm + (xgrid - um) * np.tan(ang + np.pi/2)
fig, ax = plt.subplots()
ax.plot(mcm.u, mcm.v, '.')
ax.set_xlabel('u (cm/s)')
ax.set_ylabel('v (cm/s)')
ax.set_aspect('equal')
ax.axhline(0, color='0.5')
ax.axvline(0, color='0.5')
ax.autoscale(enable=False)
ax.plot(xgrid, y_xprime, color='r')
ax.plot(xgrid, y_yprime, color='c')
The velocity components in the new coordinates are:
\begin{align} u^\prime &= u \cos(\theta) + v \sin(\theta)\\ v^\prime &= -u \sin(\theta) + v \cos(\theta) \end{align}
In matrix form we have
\begin{equation} \left[ \begin{array}{c} u^\prime\\v^\prime \end{array} \right] = \left[ \begin{array}{cc} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{array} \right] \left[ \begin{array}{c} u \\ v \end{array} \right] \end{equation}
so we see that the primed components are the original components projected onto the new unit vectors forming the rows of the rotation matrix. The unit vector along the red line and oriented to the upper right in the plot above is the top row.
fig, ax = plt.subplots()
ax.plot(mcm.t, uv_rot[:, 0], 'r',
label=r'$u^\prime = u\, \cos(\theta) + v\, \sin(\theta)$')
ax.plot(mcm.t, uv_rot[:, 1], 'c',
label=r'$v^\prime = -u\, \sin(\theta) + v\, \cos(\theta)$')
ax.legend(loc='lower right')
Make the two unit vectors. Again, e1
is the first row of the rotation matrix, e2
is the second row.
e1 = np.array([np.cos(ang), np.sin(ang)])
e2 = np.array([-np.sin(ang), np.cos(ang)])
print(f"e1 = {e1}")
print(f"e2 = {e2}")
Check: are these really eigenvectors of the variance-covariance matrix? If $A$ is the vcv matrix, then we need to verify that $A e_i = \lambda_i e_i$, where $\lambda_i = s_i^2$. (Recall that in the calculations above, s1
and s2
are the square roots of the eigenvalues of the variance-covariance matrix (vcv), $U^TU$.)
Programming: in the cells above and below, note the use of the new "f-string" syntax for strings. The expression inside each set of curly brackets is evaluated in the current scope and then converted to a string.
print(f"A e1 = {vcv @ e1}, lambda_1 e1 = {s1**2 * e1}")
print(f"A e2 = {vcv @ e2}, lambda_2 e2 = {s2**2 * e2}")