At work I told a friend of mine that the covariance matrix can have imaginary eigenvalues. This supposedly would encode rotation of some sort in empirically collected data. However, I was wrong. The covariance matrix cannot have imaginary eigenvalues, because it is a real symmetric matrix, and all eigenvalues of a real symmetric matrix are real.
How can I save face? Maybe I was not actually wrong?
Those were the thoughts that went through my head and made me write this post.
The covariance matrix
The covariance matrix is commonly defined as:
\[ \begin{aligned} &\text{cov}(X,Y) := \mathbb{E}[(X - \mathbb{E}(X))(Y - \mathbb{E}(Y))^T] \\ &\text{where} \\ &\quad\quad X, Y : (\Omega, \mathcal{F}, P) \xrightarrow{\text{measurable}} (\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n)) \\ &\quad\quad (\Omega, \mathcal{F}, P) \text{ is a probability space} \\ &\quad\quad \mathcal{B}(\mathbb{R}^n) \text{ is the Borel sigma-algebra on } \mathbb{R}^n \end{aligned} \]
If we are given only one random variable \(X\), we can define the covariance matrix of \(X\) as \(\text{cov}(X) := \text{cov}(X,X)\), which is also called the auto-covariance matrix of \(X\). Clearly this auto-covariance matrix is real and symmetric:
\[ \begin{aligned} \text{cov}(X) &= \mathbb{E}[(X - \mathbb{E}(X))(X - \mathbb{E}(X))^T] \\ &= \begin{bmatrix} \mathbb{E}[(X_1 - \mathbb{E}(X_1))^2] & \mathbb{E}[(X_1 - \mathbb{E}(X_1))(X_2 - \mathbb{E}(X_2))] & \cdots & \mathbb{E}[(X_1 - \mathbb{E}(X_1))(X_n - \mathbb{E}(X_n))] \\ \mathbb{E}[(X_2 - \mathbb{E}(X_2))(X_1 - \mathbb{E}(X_1))] & \mathbb{E}[(X_2 - \mathbb{E}(X_2))^2] & \cdots & \mathbb{E}[(X_2 - \mathbb{E}(X_2))(X_n - \mathbb{E}(X_n))] \\ \vdots & \vdots & \ddots & \vdots \\ \mathbb{E}[(X_n - \mathbb{E}(X_n))(X_1 - \mathbb{E}(X_1))] & \mathbb{E}[(X_n - \mathbb{E}(X_n))(X_2 - \mathbb{E}(X_2))] & \cdots & \mathbb{E}[(X_n - \mathbb{E}(X_n))^2] \end{bmatrix} \\ &= \text{cov}(X)^T \end{aligned} \]
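As a quick numerical sanity check of the symmetry claim, here is a minimal sketch using NumPy. The data, the `mixing` matrix and the sample size are made up for illustration; `np.cov` computes the sample analogue of the expectation above, and `np.linalg.eigvals` is a general solver that does not assume symmetry.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 samples of a 3-dimensional real random vector,
# mixed by an arbitrary matrix so that the components are correlated.
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, -1.0, 3.0]])
samples = rng.standard_normal((500, 3)) @ mixing.T

# Sample version of E[(X - E X)(X - E X)^T]; rowvar=False: columns are variables.
cov = np.cov(samples, rowvar=False)

# General-purpose eigenvalue solver, so real eigenvalues are not built in.
eigenvalues = np.linalg.eigvals(cov)

is_symmetric = bool(np.allclose(cov, cov.T))
max_imag = float(np.max(np.abs(np.imag(eigenvalues))))

print("symmetric:", is_symmetric)
print("largest imaginary part:", max_imag)
```

For real data the matrix should come out symmetric and the imaginary parts should vanish up to floating-point error.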
As a real symmetric matrix, it cannot have imaginary eigenvalues. Remember that an eigenvalue \(\lambda\) of a matrix \(A\) is a scalar for which some nonzero vector \(v\) satisfies:
\[ Av = \lambda v \]
and it can therefore be obtained by solving the characteristic equation:
\[ \chi_A(\lambda) = \det(A - \lambda I) = 0. \]
Remember that the determinant of a matrix can be thought of as measuring the “volume” of the parallelepiped spanned by the columns of the matrix. Stating that \(\chi_A\) only has real roots geometrically means that the “volume” of the parallelepiped spanned by the columns of \(A - \lambda I\) can only be zero for real values of \(\lambda\).
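For a \(2 \times 2\) real symmetric matrix the claim can be checked by hand. The entries \(a, b, c \in \mathbb{R}\) are illustrative symbols introduced only for this example:

\[ A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}, \quad \chi_A(\lambda) = (a - \lambda)(c - \lambda) - b^2 = \lambda^2 - (a + c)\lambda + (ac - b^2), \]

\[ \lambda_{\pm} = \frac{(a + c) \pm \sqrt{(a + c)^2 - 4(ac - b^2)}}{2} = \frac{(a + c) \pm \sqrt{(a - c)^2 + 4b^2}}{2}. \]

The discriminant \((a - c)^2 + 4b^2\) is a sum of squares, so it is never negative and both roots are real. With two different off-diagonal entries \(b\) and \(b'\) the term \(4b^2\) would become \(4bb'\), which can be negative; this is exactly where symmetry is used.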
TODO: explain fully why the eigenvalues of a real symmetric matrix are real.
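A sketch of the standard argument, for a real symmetric \(A\) and a possibly complex eigenpair \((\lambda, v)\) with \(v \neq 0\). Here \(v^*\) denotes the conjugate transpose, a notation not used elsewhere in this post:

\[ \begin{aligned} v^* A v &= v^* (\lambda v) = \lambda \, v^* v, \\ \overline{v^* A v} &= (v^* A v)^* = v^* A^* v = v^* A v \quad (\text{since } A^* = A^T = A). \end{aligned} \]

The second line says that the scalar \(v^* A v\) equals its own complex conjugate, so it is real. Since \(v^* v = \sum_i |v_i|^2 > 0\) is real as well, \(\lambda = v^* A v / v^* v\) must be real.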
Breaking the symmetry
How would we be able to have a non-symmetric auto-covariance matrix? We would need to have \[ \exists i, j \in \{1, \ldots, n\} : \mathbb{E}[(X_i - \mathbb{E}(X_i))(X_j - \mathbb{E}(X_j))] \neq \mathbb{E}[(X_j - \mathbb{E}(X_j))(X_i - \mathbb{E}(X_i))]. \]
Here, let us first establish without loss of generality that all \(X_i\) are centered, i.e. \(\mathbb{E}(X_i) = 0\) for all \(i\). Then the above condition can be rewritten as:
\[ \exists i, j \in \{1, \ldots, n\} : \mathbb{E}[X_i X_j] \neq \mathbb{E}[X_j X_i]. \]
The multiplication of real random variables is commutative, as it is lifted from the multiplication of real numbers. We will have to consider a non-commutative multiplication to break the symmetry of the auto-covariance matrix, obviously! We can now move over to another ring. In this example, we will consider \(\mathbb{H}\), the ring of quaternions, which is a non-commutative extension of the complex numbers. The quaternions are defined as:
\[ \mathbb{H} := \{a + bi + cj + dk : a, b, c, d \in \mathbb{R}, i^2 = j^2 = k^2 = ijk = -1\} \]
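Alternatively, we could mod out the following two-sided ideal from the free (non-commutative) algebra \(\mathbb{R}\langle i,j,k \rangle\). The ordinary polynomial ring \(\mathbb{R}[i,j,k]\) would not do, because it is commutative and so is every quotient of it:
\[ \mathbb{H} \cong_\text{Ring} \mathbb{R}\langle i,j,k \rangle / (i^2 + 1, j^2 + 1, k^2 + 1, ij - k, jk - i, ki - j). \]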
We call the elements \(bi + cj + dk\) imaginary, in analogy to the complex numbers. The elements \(a\) are called real.
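To make the multiplication rules concrete, here is a minimal sketch of the Hamilton product in Python. The tuple representation `(a, b, c, d)` for \(a + bi + cj + dk\) and the helper name `qmul` are choices made for this example, not part of any library.

```python
from typing import Tuple

# (a, b, c, d) stands for a + bi + cj + dk.
Quaternion = Tuple[int, int, int, int]


def qmul(p: Quaternion, q: Quaternion) -> Quaternion:
    """Hamilton product p * q (order matters)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k part
    )


ONE: Quaternion = (1, 0, 0, 0)
I: Quaternion = (0, 1, 0, 0)
J: Quaternion = (0, 0, 1, 0)
K: Quaternion = (0, 0, 0, 1)
MINUS_ONE: Quaternion = (-1, 0, 0, 0)
MINUS_K: Quaternion = (0, 0, 0, -1)

# The defining relations i^2 = j^2 = k^2 = ijk = -1.
relations_hold = (
    qmul(I, I) == MINUS_ONE
    and qmul(J, J) == MINUS_ONE
    and qmul(K, K) == MINUS_ONE
    and qmul(qmul(I, J), K) == MINUS_ONE
)

# Non-commutativity: ij = k but ji = -k.
ij = qmul(I, J)
ji = qmul(J, I)

print("relations hold:", relations_hold)
print("ij == k:", ij == K)
print("ji == -k:", ji == MINUS_K)
```

The only property needed later is that swapping the two factors of a product can flip its sign.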
We get an obvious injection:
\[ \iota : \mathbb{R}^3 \hookrightarrow \mathbb{H}, \quad \iota(x,y,z) = xi + yj + zk. \]
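and by using \(ij = k\) and \(ji = -k\) we can find a non-symmetric auto-covariance matrix. Let \(\varepsilon\) be a random sign, \(+1\) or \(-1\) with probability \(\tfrac{1}{2}\) each, so that \(X\) below is centered and \(\varepsilon^2 = 1\):
\[ X := \varepsilon \begin{bmatrix} 0 \\ i \\ j \end{bmatrix} \implies \text{cov}(X) = \mathbb{E}[X X^T] = \begin{bmatrix} 0 & 0 & 0 \\ 0 & i^2 & ij \\ 0 & ji & j^2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & -1 & k \\ 0 & -k & -1 \end{bmatrix} \neq \text{cov}(X)^T. \]
Since multiplication in \(\mathbb{H}\) is not commutative, we have to say on which side the eigenvalue acts. Read literally as \(Av = \lambda v\), with \(\lambda\) multiplying from the left, this matrix has non-real eigenvalues: for example \(\lambda = -1 + j\) with \(v = (0, 1, i)^T\), because \(ki = j\) and \((-1 + j)i = -i - k\).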
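The entries can also be checked mechanically. The sketch below computes the outer product of \((0, i, j)\) with itself (no conjugation, matching \(X X^T\)), compares it with its transpose, and tests one candidate pair \(\lambda = -1 + j\), \(v = (0, 1, i)^T\) against \(Av = \lambda v\). The tuple representation and all helper names are hypothetical, written only for this example.

```python
# (a, b, c, d) stands for a + bi + cj + dk.
ZERO = (0, 0, 0, 0)
ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)


def qmul(p, q):
    """Hamilton product p * q (order matters)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qadd(p, q):
    """Componentwise sum of two quaternions."""
    return tuple(x + y for x, y in zip(p, q))


def outer(x):
    """Matrix with entries x_r * x_c, i.e. x x^T without conjugation."""
    return [[qmul(x_r, x_c) for x_c in x] for x_r in x]


def transpose(m):
    return [list(column) for column in zip(*m)]


def matvec(m, v):
    """Matrix-vector product with matrix entries multiplying from the left."""
    result = []
    for row in m:
        acc = ZERO
        for entry, component in zip(row, v):
            acc = qadd(acc, qmul(entry, component))
        result.append(acc)
    return result


cov = outer([ZERO, I, J])
is_symmetric = cov == transpose(cov)

# Candidate left eigenpair (an assumption being tested): lambda = -1 + j, v = (0, 1, i).
lam = (-1, 0, 1, 0)
v = [ZERO, ONE, I]
left_side = matvec(cov, v)                          # A v
right_side = [qmul(lam, component) for component in v]  # lambda v

print("diagonal entries:", cov[1][1], cov[2][2])
print("off-diagonal entries:", cov[1][2], cov[2][1])
print("symmetric:", is_symmetric)
print("A v == lambda v:", left_side == right_side)
```

Note that the eigenvalue multiplies the vector from the left here; with the scalar on the right, \(Av = v\lambda\), the check is a different one.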
TODO: add calculations for the eigenvalues of the above matrix.
TODO: explain how this relates to the rotation of data.