What is so special about symmetric matrices? While not every square matrix is diagonalizable, every symmetric matrix can be diagonalized. Diagonal matrices are easier to work with and have many fascinating properties. In addition, every symmetric matrix admits a spectral decomposition, which plays a key role in multivariate statistics and other disciplines.

 

Here is our first theorem concerning symmetric matrices and their diagonalizability.

 

Theorem 1

If A is an n×n matrix, then the following are equivalent.

  1. A is orthogonally diagonalizable.
  2. A has an orthonormal set of n eigenvectors.
  3. A is symmetric.

Note:

  1. A square matrix A is called orthogonally diagonalizable if there is an orthogonal matrix P such that P^{-1}AP (= P^tAP) is diagonal; the matrix P is said to orthogonally diagonalize A.
  2. A set of vectors in an inner product space is called an orthogonal set if all pairs of distinct vectors in the set are orthogonal. An orthogonal set in which each vector has norm 1 is called orthonormal.
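
To see Theorem 1 in action numerically, here is a minimal sketch in Python with NumPy (not part of the original discussion; the random matrix is just an illustrative choice). For a symmetric matrix, numpy.linalg.eigh returns real eigenvalues together with an orthonormal set of eigenvectors, so the matrix formed from those eigenvectors orthogonally diagonalizes A.

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M + M.T                        # any matrix of this form is symmetric

# eigh is intended for symmetric matrices; its second output P has
# orthonormal eigenvectors of A as its columns
eigenvalues, P = np.linalg.eigh(A)

print(np.allclose(P.T @ P, np.eye(4)))                  # True: columns are orthonormal
print(np.allclose(P.T @ A @ P, np.diag(eigenvalues)))   # True: P orthogonally diagonalizes A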

 

Example 1

Consider the matrix A = \begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}.

  1. Find an orthogonal matrix P which diagonalizes A.
  2. Compute P^{-1}AP and verify that it is a diagonal matrix.

Answer

Part a

The characteristic equation of A is \vert A - \lambda I  \vert = 0, which is equivalent to λ(λ-3)^2 = 0. This gives two eigenvalues, λ1 = 0 and λ2 = 3. Using the method discussed in the article First Encounter with Eigenvalues and Eigenvectors, it can be shown that the eigenspaces of A corresponding to λ = 0 and λ = 3 are, respectively, E_1 = \begin{Bmatrix} t \begin{bmatrix}1 \\ 1 \\ 1 \end{bmatrix} \Bigr| t \in \mathbb{R} \end{Bmatrix} and E_2 = \begin{Bmatrix} s \begin{bmatrix}1 \\ -1 \\ 0 \end{bmatrix} + t \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \Bigr| s, t \in \mathbb{R} \end{Bmatrix}.

The matrix whose columns are the basis vectors of the eigenspaces, i.e. \begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \\ 1 & 0 & -1 \end{bmatrix}, indeed diagonalizes A, but it is not an orthogonal matrix as required in part a. Theorem A at the end of this article implies that the column vectors of P must form an orthonormal set in \mathbb{R}^3. Since \vec{u}_1 = \begin{bmatrix}1 \\ 1 \\ 1 \end{bmatrix} comes from E1 and the other column vectors come from E2, Theorem B at the end of this article guarantees that \vec{u}_1 is orthogonal to both of the other column vectors. By normalizing \vec{u}_1, we get \vec{v}_1 = \begin{bmatrix} 1/\sqrt{3} \\ 1/\sqrt{3} \\ 1/\sqrt{3}  \end{bmatrix}, which also spans E1.

It remains to find an orthonormal basis for E2. Applying the Gram-Schmidt process to the basis vectors of E2, we obtain the orthonormal basis \begin{Bmatrix}\vec{v}_2, \: \vec{v}_3 \end{Bmatrix} where \vec{v}_2 = \begin{bmatrix}1/\sqrt{2} \\ -1/\sqrt{2} \\ 0 \end{bmatrix} and \vec{v}_3 = \begin{bmatrix}1/\sqrt{6} \\ 1/\sqrt{6} \\ -2/\sqrt{6}\end{bmatrix}. Therefore, the desired orthogonal matrix P has \vec{v}_1, \: \vec{v}_2, \: \vec{v}_3 as its column vectors, i.e. P = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & 0 & -2/\sqrt{6} \end{bmatrix}.
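
The construction above can also be sketched numerically. The following Python/NumPy snippet (an illustration added here, not part of the original solution) starts from the basis vectors of E1 and E2 found above, applies the Gram-Schmidt process to the basis of E2, and assembles the orthogonal matrix P.

import numpy as np

# Basis vectors of the eigenspaces found above
u1 = np.array([1.0, 1.0, 1.0])     # spans E1 (eigenvalue 0)
u2 = np.array([1.0, -1.0, 0.0])    # basis vector of E2 (eigenvalue 3)
u3 = np.array([1.0, 0.0, -1.0])    # basis vector of E2 (eigenvalue 3)

v1 = u1 / np.linalg.norm(u1)

# Gram-Schmidt on the basis of E2
v2 = u2 / np.linalg.norm(u2)
w3 = u3 - (u3 @ v2) * v2           # remove the component of u3 along v2
v3 = w3 / np.linalg.norm(w3)

P = np.column_stack([v1, v2, v3])
print(np.round(P, 4))                      # matches the matrix P obtained above
print(np.allclose(P.T @ P, np.eye(3)))     # True: P is orthogonal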

Part b

Since P is orthogonal, P^{-1} = P^t. Consequently, P^{-1}AP = P^tAP. It will be shown that P^tAP is a diagonal matrix.

P^t = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 1/\sqrt{2} & -1/\sqrt{2} & 0 \\ 1/\sqrt{6} & 1/\sqrt{6} & -2/\sqrt{6} \end{bmatrix}

P^t AP = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 1/\sqrt{2} & -1/\sqrt{2} & 0 \\ 1/\sqrt{6} & 1/\sqrt{6} & -2/\sqrt{6} \end{bmatrix} \begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix} \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & -1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{3} & 0 & -2/\sqrt{6} \end{bmatrix}

P^t AP = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 1/\sqrt{2} & -1/\sqrt{2} & 0 \\ 1/\sqrt{6} & 1/\sqrt{6} & -2/\sqrt{6} \end{bmatrix} \begin{bmatrix}0 & 3/\sqrt{2} & 3/\sqrt{6} \\ 0 & -3/\sqrt{2} & 3/\sqrt{6} \\ 0 & 0 & -6/\sqrt{6}\end{bmatrix} = \begin{bmatrix}0 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3\end{bmatrix}

This verifies that P^{-1}AP is a diagonal matrix; its diagonal entries are the eigenvalues 0, 3, and 3.
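
The same verification can be reproduced numerically with a short Python/NumPy check (a sketch added for convenience, using the matrices A and P from part a):

import numpy as np

s2, s3, s6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
P = np.array([[1/s3,  1/s2,  1/s6],
              [1/s3, -1/s2,  1/s6],
              [1/s3,  0.0,  -2/s6]])
A = np.array([[ 2.0, -1.0, -1.0],
              [-1.0,  2.0, -1.0],
              [-1.0, -1.0,  2.0]])

# Because P is orthogonal, P^{-1} = P^t, so P^t A P should be diag(0, 3, 3)
print(np.round(P.T @ A @ P, 10))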

 

Theorem 2 below describes two other important properties of symmetric matrices.

 

Theorem 2

  1. The characteristic equation of a symmetric matrix A has only real roots.
  2. If an eigenvalue λ of a symmetric matrix A is repeated k times as a root of the characteristic equation, then the eigenspace corresponding to λ is k-dimensional.

 

In Example 1, the characteristic equation is λ(λ-3)^2 = 0. It has two distinct roots, λ1 = 0 and λ2 = 3, which serve as the eigenvalues. In terms of algebraic multiplicity, we can say that the algebraic multiplicity of λ1 = 0 is 1 and that of λ2 = 3 is 2. The eigenspaces corresponding to λ1 = 0 and λ2 = 3 (i.e. E1 and E2 above) are 1- and 2-dimensional, respectively. In terms of geometric multiplicity, we can say that the geometric multiplicities of λ1 = 0 and λ2 = 3 are, respectively, 1 and 2. In each case, the algebraic multiplicity equals the geometric multiplicity. This is not a coincidence; the equality is guaranteed by part 2 of Theorem 2.
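
The equality of the two multiplicities can also be checked numerically. The sketch below (Python/NumPy, added here as an illustration) counts how often each eigenvalue of the matrix from Example 1 is repeated and compares that with the dimension of the null space of A - λI.

import numpy as np

A = np.array([[ 2.0, -1.0, -1.0],
              [-1.0,  2.0, -1.0],
              [-1.0, -1.0,  2.0]])

eigenvalues = np.linalg.eigvalsh(A)          # approximately [0, 3, 3]

for lam in np.unique(np.round(eigenvalues, 8)):
    algebraic = int(np.sum(np.isclose(eigenvalues, lam)))
    # geometric multiplicity = dimension of the null space of A - lam*I
    geometric = A.shape[0] - np.linalg.matrix_rank(A - lam * np.eye(3))
    print(lam, algebraic, geometric)         # the two multiplicities agree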

 

Example 2

Consider the matrix A = \begin{bmatrix} 4 & 3 & 1 & 1 \\ 3 & 4 & 1 & 1 \\ 1 & 1 & 4 & 3 \\ 1 & 1 & 3 & 4 \end{bmatrix}.

  1. Verify that the characteristic equation of A has only real roots.
  2. Find the eigenspaces corresponding to all the eigenvalues. Verify that the algebraic and geometric multiplicities of each eigenvalue are equal.
  3. Find an orthogonal matrix P which diagonalizes A.

Answer

Part a

It can be checked that the characteristic equation of A is (λ-9)(λ-5)(λ-1)^2 = 0. The roots are λ1 = 9, λ2 = 5, and λ3 = 1. They are all real numbers. Thus, the characteristic equation of A has only real roots.
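
As a quick numerical cross-check (a Python/NumPy sketch added here), eigvalsh always returns real eigenvalues for a symmetric matrix, and they agree with the roots found above:

import numpy as np

A = np.array([[4.0, 3.0, 1.0, 1.0],
              [3.0, 4.0, 1.0, 1.0],
              [1.0, 1.0, 4.0, 3.0],
              [1.0, 1.0, 3.0, 4.0]])

print(np.round(np.linalg.eigvalsh(A), 8))    # [1. 1. 5. 9.]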

Part b

Using the method discussed in the article First Encounter with Eigenvalues and Eigenvectors, it can be shown that the eigenspaces of A corresponding to λ = 9, λ = 5, and λ = 1 are, respectively, E_1 = \begin{Bmatrix} t \begin{bmatrix}1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \Bigr| t \in \mathbb{R} \end{Bmatrix}, E_2 = \begin{Bmatrix} t \begin{bmatrix}-1 \\ -1 \\ 1 \\ 1 \end{bmatrix} \Bigr| t \in \mathbb{R} \end{Bmatrix}, and E_3 = \begin{Bmatrix} s \begin{bmatrix}-1 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} 0 \\ 0 \\ -1 \\ 1 \end{bmatrix} \Bigr| s, t \in \mathbb{R} \end{Bmatrix}.

It remains to compare the algebraic and geometric multiplicities of each eigenvalue. For λ = 9, the algebraic multiplicity is 1 because (λ-9) occurs once as a factor of the characteristic polynomial. The corresponding eigenspace is E1, which is spanned by exactly one basis vector, so the dimension of E1 is 1. It follows that the geometric multiplicity of λ = 9 is 1, which equals its algebraic multiplicity. Likewise, the algebraic multiplicity of λ = 5 is 1, as (λ-5) occurs once as a factor of the characteristic polynomial; the corresponding eigenspace is E2, whose dimension is 1, so the geometric multiplicity of λ = 5 is also 1, equal to its algebraic multiplicity. Unlike the previous cases, the algebraic multiplicity of λ = 1 is 2 because (λ-1) occurs twice as a factor of the characteristic polynomial. Its geometric multiplicity is also 2 because the corresponding eigenspace, E3, is 2-dimensional. Therefore, the algebraic and geometric multiplicities of each eigenvalue are equal.

Part c

Let \vec{u}_1 = \begin{bmatrix}1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \: \vec{u}_2 = \begin{bmatrix}-1 \\ -1 \\ 1 \\ 1 \end{bmatrix}, \: \vec{u}_3 = \begin{bmatrix}-1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, and \vec{u}_4 = \begin{bmatrix}0 \\ 0 \\ -1 \\ 1 \end{bmatrix}. Since \vec{u}_1 and \vec{u}_2 are from different eigenspaces, Theorem B implies that they are orthogonal to each other. Moreover, also by Theorem B, each of them is orthogonal to both \vec{u}_3 and \vec{u}_4. Fortunately, in this case \vec{u}_3 is also orthogonal to \vec{u}_4, which can be checked easily by computing the inner product of the two vectors. As a consequence, we do not have to apply the complete Gram-Schmidt process to produce an orthonormal set of vectors; all we have to do is normalize each of the vectors. This results in the orthogonal matrix P = \begin{bmatrix}1/2 & -1/2 & -1/\sqrt{2} & 0 \\ 1/2 & -1/2 & 1/\sqrt{2} & 0 \\ 1/2 & 1/2 & 0 & -1/\sqrt{2} \\ 1/2 & 1/2 & 0 & 1/\sqrt{2}\end{bmatrix}, which diagonalizes A.
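
As a final numerical check (a Python/NumPy sketch added here, using the basis vectors listed above), the normalized eigenvectors can be assembled into P and tested directly:

import numpy as np

A = np.array([[4.0, 3.0, 1.0, 1.0],
              [3.0, 4.0, 1.0, 1.0],
              [1.0, 1.0, 4.0, 3.0],
              [1.0, 1.0, 3.0, 4.0]])

# Basis vectors of the eigenspaces, each normalized to unit length
vectors = [np.array([ 1.0,  1.0,  1.0, 1.0]),   # eigenvalue 9
           np.array([-1.0, -1.0,  1.0, 1.0]),   # eigenvalue 5
           np.array([-1.0,  1.0,  0.0, 0.0]),   # eigenvalue 1
           np.array([ 0.0,  0.0, -1.0, 1.0])]   # eigenvalue 1
P = np.column_stack([v / np.linalg.norm(v) for v in vectors])

print(np.allclose(P.T @ P, np.eye(4)))          # True: P is orthogonal
print(np.round(P.T @ A @ P, 10))                # diag(9, 5, 1, 1)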

 

SUPPORTING THEOREMS:

Theorem A

Let A be an n×n matrix. The following are equivalent:

  1. A is orthogonal.
  2. The row vectors of A form an orthonormal set in \mathbb{R}^n with the Euclidean inner product.
  3. The column vectors of A form an orthonormal set in \mathbb{R}^n with the Euclidean inner product.

Note:

By definition, an n×n matrix A is an orthogonal matrix if A^{-1} = A^t, where A^t denotes the transpose of A.

 

Theorem B

If A is a symmetric matrix, then eigenvectors from different eigenspaces are orthogonal.
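
A brief numerical illustration (a Python sketch added here, using the eigenvectors from Example 1): vectors drawn from different eigenspaces of the symmetric matrix have zero inner product, while orthogonality within the same eigenspace is not guaranteed, which is why the Gram-Schmidt process was needed in Example 1.

import numpy as np

u1 = np.array([1.0, 1.0, 1.0])     # from the eigenspace of eigenvalue 0
u2 = np.array([1.0, -1.0, 0.0])    # from the eigenspace of eigenvalue 3
u3 = np.array([1.0, 0.0, -1.0])    # from the eigenspace of eigenvalue 3

print(u1 @ u2, u1 @ u3)   # 0.0 0.0 -> orthogonal across different eigenspaces
print(u2 @ u3)            # 1.0 -> not automatically orthogonal within one eigenspace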

 
