Section 10.5 Diagonalization and powers of a matrix

Activity 10.5.1.

Suppose that \(A\) is a \(2\times2\) matrix having eigenvectors \(\vvec_1\) and \(\vvec_2\) with associated eigenvalues \(\lambda_1=3\) and \(\lambda_2 = -6\text{.}\)
  1. What are the products \(A\vvec_1\) and \(A\vvec_2\) in terms of \(\vvec_1\) and \(\vvec_2\text{?}\)
  2. If we form the matrix \(P = \begin{bmatrix} \vvec_1 \amp \vvec_2 \end{bmatrix} \text{,}\) what is the product \(AP\) in terms of \(\vvec_1\) and \(\vvec_2\text{?}\)
  3. Use the eigenvalues to form the diagonal matrix \(D = \begin{bmatrix} 3 \amp 0 \\ 0 \amp -6 \end{bmatrix}\) and determine the product \(PD\) in terms of \(\vvec_1\) and \(\vvec_2\text{.}\)
  4. Suppose that \(A=\begin{bmatrix} -3 \amp 6 \\ 3 \amp 0 \\ \end{bmatrix}\text{.}\) Verify that \(\vvec_1=\twovec11\) and \(\vvec_2=\twovec2{-1}\) are eigenvectors of \(A\) with eigenvalues \(\lambda_1 = 3\) and \(\lambda_2=-6\text{.}\)
  5. The results from parts 2 and 3 of this activity demonstrate that \(AP=PD\text{.}\) Explain why \(P\) is invertible and why we must have \(A=PDP^{-1}\text{.}\)
  6. Use Sage to define the matrices \(P\) and \(D\) and then verify that \(A=PDP^{-1}\text{.}\)
Solution.
  1. We have \(A\vvec_1=3\vvec_1\) and \(A\vvec_2 = -6\vvec_2\text{.}\)
  2. \(AP = \begin{bmatrix} A\vvec_1 \amp A\vvec_2 \end{bmatrix} = \begin{bmatrix} 3\vvec_1 \amp -6\vvec_2 \end{bmatrix}\text{.}\)
  3. \(PD = \begin{bmatrix} \vvec_1 \amp \vvec_2 \end{bmatrix} \begin{bmatrix} 3 \amp 0 \\ 0 \amp -6\\ \end{bmatrix} = \begin{bmatrix} 3\vvec_1 \amp -6\vvec_2 \end{bmatrix} \text{.}\) Comparing the result of this part of the activity to the previous, we see that \(AP = PD\text{.}\)
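  4. We compute \(A\vvec_1 = \twovec33 = 3\vvec_1\) and \(A\vvec_2 = \twovec{-12}{6} = -6\vvec_2\text{,}\) so \(\vvec_1\) and \(\vvec_2\) are eigenvectors of \(A\) with eigenvalues \(3\) and \(-6\text{.}\)
  5. Since \(\det P = -3 \ne 0 \text{,}\) \(P\) is invertible. Multiplying the equation \(AP=PD\) on the right by \(P^{-1}\) gives \(APP^{-1} = A = PDP^{-1}\text{.}\)
The final part of the activity can be verified using Sage.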
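The verification requested in the last part of the activity can also be sketched in plain Python, which Sage builds on. This is a minimal sketch rather than the book's own Sage cell: the helper names `matmul` and `inverse_2x2` are introduced here for illustration, and exact fractions are used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def inverse_2x2(M):
    """Invert a 2x2 matrix with the adjugate formula; requires det != 0."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[-3, 6], [3, 0]]
P = [[1, 2], [1, -1]]   # columns are the eigenvectors v1 and v2
D = [[3, 0], [0, -6]]   # eigenvalues on the diagonal, in the same order

AP = matmul(A, P)
PD = matmul(P, D)
PDPinv = matmul(PD, inverse_2x2(P))

print(AP == PD)      # does AP equal PD?
print(PDPinv == A)   # does P D P^{-1} recover A?
```

The order of the eigenvalues in `D` has to match the order of the eigenvectors in the columns of `P`; swapping one without the other would break both comparisons.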

Example 10.5.1.

We have seen that \(A = \begin{bmatrix} 1 \amp 2 \\ 2 \amp 1 \\ \end{bmatrix}\) has eigenvectors \(\vvec_1 = \twovec11\) and \(\vvec_2=\twovec{-1}1\) with associated eigenvalues \(\lambda_1 = 3\) and \(\lambda_2 = -1\text{.}\) Forming the matrices
\begin{equation*} P = \begin{bmatrix} \vvec_1 \amp \vvec_2 \end{bmatrix} = \begin{bmatrix} 1 \amp -1 \\ 1 \amp 1 \\ \end{bmatrix},~~~ D = \begin{bmatrix} 3 \amp 0 \\ 0 \amp -1 \\ \end{bmatrix}, \end{equation*}
we see that \(A=PDP^{-1}\text{.}\)
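As a quick check of this factorization, note that \(\det P = 2\text{,}\) so the inverse of \(P\) can be written down by hand. Multiplying out \(PD\) first and then applying \(P^{-1}\) recovers \(A\text{:}\)
\begin{equation*} P^{-1} = \frac12\begin{bmatrix} 1 \amp 1 \\ -1 \amp 1 \\ \end{bmatrix},~~~ PDP^{-1} = \begin{bmatrix} 3 \amp 1 \\ 3 \amp -1 \\ \end{bmatrix} \cdot \frac12\begin{bmatrix} 1 \amp 1 \\ -1 \amp 1 \\ \end{bmatrix} = \begin{bmatrix} 1 \amp 2 \\ 2 \amp 1 \\ \end{bmatrix} = A. \end{equation*}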

Definition 10.5.2.

We say that the matrix \(A\) is diagonalizable if there is a diagonal matrix \(D\) and invertible matrix \(P\) such that
\begin{equation*} A = PDP^{-1}. \end{equation*}

Subsection 10.5.1 Powers of a diagonalizable matrix

Activity 10.5.2.

  1. Let’s begin with the diagonal matrix
    \begin{equation*} D = \left[\begin{array}{rr} 2 \amp 0 \\ 0 \amp -1 \\ \end{array}\right]\text{.} \end{equation*}
    Find the powers \(D^2\text{,}\) \(D^3\text{,}\) and \(D^4\text{.}\) What is \(D^k\) for a general value of \(k\text{?}\)
  2. Suppose that \(A\) is a matrix with eigenvector \(\vvec\) and associated eigenvalue \(\lambda\text{;}\) that is, \(A\vvec = \lambda\vvec\text{.}\) By considering \(A^2\vvec\text{,}\) explain why \(\vvec\) is also an eigenvector of \(A^2\) with eigenvalue \(\lambda^2\text{.}\)
  3. Suppose that \(A= PDP^{-1}\) where
    \begin{equation*} D = \left[\begin{array}{rr} 2 \amp 0 \\ 0 \amp -1 \\ \end{array}\right]\text{.} \end{equation*}
    Remembering that the columns of \(P\) are eigenvectors of \(A\text{,}\) explain why \(A^2\) is diagonalizable and find a diagonalization in terms of \(P\) and \(D\text{.}\)
  4. Give another explanation of the diagonalizability of \(A^2\) by writing
    \begin{equation*} A^2 = (PDP^{-1})(PDP^{-1}) = PD(P^{-1}P)DP^{-1}\text{.} \end{equation*}
  5. In the same way, find a diagonalization of \(A^3\text{,}\) \(A^4\text{,}\) and \(A^k\text{.}\)
  6. Suppose that \(A\) is a diagonalizable \(2\times2\) matrix with eigenvalues \(\lambda_1 = 0.5\) and \(\lambda_2=0.1\text{.}\) What happens to \(A^k\) as \(k\) becomes very large?
Solution.
  1. We have
    \begin{equation*} D^2=\mattwo4001, D^3=\mattwo800{-1}, D^4=\mattwo{16}001, \ldots, D^k=\mattwo{2^k}00{(-1)^k}\text{.} \end{equation*}
  2. We know that \(A^2\vvec = A(A\vvec) = A(\lambda\vvec) = \lambda A\vvec = \lambda^2\vvec\) so that \(\vvec\) is also an eigenvector of \(A^2\) with associated eigenvalue \(\lambda^2\text{.}\)
  3. Since eigenvectors of \(A\) are also eigenvectors of \(A^2\text{,}\) we can use the matrix \(P\) to diagonalize \(A^2\text{.}\) The eigenvalues are squared, however, so we have \(A^2=PEP^{-1}\) where \(E=D^2=\mattwo4001\text{.}\)
  4. We can also see this by noting that
    \begin{equation*} \begin{aligned} A^2 \amp = (PDP^{-1})(PDP^{-1}) = PD(P^{-1}P)DP^{-1} \\ \amp = PDIDP^{-1} = PD^2P^{-1}\text{.}\\ \end{aligned} \end{equation*}
  5. \(A^3=PD^3P^{-1}\text{,}\) \(A^4=PD^4P^{-1}\text{,}\) and \(A^k=PD^kP^{-1}\text{.}\)
  6. We can write \(A=PDP^{-1}\) where \(D=\mattwo{0.5}00{0.1}\text{.}\) Therefore, \(A^k=PD^kP^{-1}\) where \(D^k=\mattwo{0.5^k}00{0.1^k}\text{.}\) As \(k\) becomes very large, \(0.5^k\) and \(0.1^k\) become very close to zero. Hence \(D^k\) and \(A^k\) become very close to the zero matrix.
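The formula \(A^k = PD^kP^{-1}\) and the decay described in the last part can be checked numerically. The sketch below is only an illustration: the activity does not specify \(P\text{,}\) so the matrix `P` used here is a hypothetical choice, and the helper names `matmul`, `matpow`, and `power_by_diagonalization` are introduced for this example.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def matpow(M, k):
    """Compute M^k by repeated multiplication (k >= 0)."""
    n = len(M)
    result = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, M)
    return result

# Hypothetical P (any invertible matrix would do). Since det P = 1,
# its inverse can be written out by hand.
P = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(2)]]
P_inv = [[Fraction(2), Fraction(-1)], [Fraction(-1), Fraction(1)]]

lam1, lam2 = Fraction(1, 2), Fraction(1, 10)   # eigenvalues 0.5 and 0.1
D = [[lam1, 0], [0, lam2]]
A = matmul(matmul(P, D), P_inv)

def power_by_diagonalization(k):
    """Compute A^k as P D^k P^{-1}, raising only the diagonal entries."""
    Dk = [[lam1 ** k, 0], [0, lam2 ** k]]
    return matmul(matmul(P, Dk), P_inv)

for k in (1, 5, 20):
    Ak = power_by_diagonalization(k)
    agrees = Ak == matpow(A, k)   # compare with repeated multiplication
    largest = max(abs(entry) for row in Ak for entry in row)
    print(k, agrees, float(largest))
```

The largest entry of \(A^k\) printed in the last column shrinks as \(k\) grows, which is the numerical counterpart of \(D^k\) approaching the zero matrix.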

Activity 10.5.3.

Let’s consider an example that illustrates how we can put these ideas to use.
Suppose that we work for a car rental company that has two locations, \(P\) and \(Q\text{.}\) When a customer rents a car at one location, they have the option to return it to either location at the end of the day. After doing some market research, we determine:
  • 80% of the cars rented at location \(P\) are returned to \(P\) and 20% are returned to \(Q\text{.}\)
  • 40% of the cars rented at location \(Q\) are returned to \(Q\) and 60% are returned to \(P\text{.}\)
  1. If we let \(P_k\) and \(Q_k\) be the number of cars at locations \(P\) and \(Q\text{,}\) respectively, at the end of day \(k\text{,}\) we then have
    \begin{equation*} \begin{aligned} P_{k+1}\amp {}={} 0.8P_k + 0.6Q_k \\ Q_{k+1}\amp {}={} 0.2P_k + 0.4Q_k\text{.} \\ \end{aligned} \end{equation*}
    We can write the vector \(\xvec_k = \twovec{P_k}{Q_k}\) to reflect the number of cars at the two locations at the end of day \(k\text{,}\) which says that
    \begin{equation*} \xvec_{k+1} = \twovec{P_{k+1}}{Q_{k+1}}= \left[\begin{array}{rr} 0.8 \amp 0.6 \\ 0.2 \amp 0.4 \\ \end{array}\right] \twovec{P_k}{Q_k} \end{equation*}
    or \(\xvec_{k+1} = A\xvec_k\) where \(A=\left[\begin{array}{rr}0.8 \amp 0.6 \\ 0.2 \amp 0.4 \end{array}\right]\text{.}\) That is, we have
    \begin{equation*} \begin{aligned} \xvec_1 \amp {}={} A\xvec_0 \\ \xvec_2 \amp {}={} A\xvec_1 = A^2\xvec_0 \\ \xvec_3 \amp {}={} A\xvec_2 = A^3\xvec_0\text{.} \\ \end{aligned} \end{equation*}
    Suppose that
    \begin{equation*} \vvec_1 = \twovec{3}{1}, \qquad \vvec_2 = \twovec{-1}{1}\text{.} \end{equation*}
    Compute \(A\vvec_1\) and \(A\vvec_2\) to demonstrate that \(\vvec_1\) and \(\vvec_2\) are eigenvectors of \(A\text{.}\) What are the associated eigenvalues \(\lambda_1\) and \(\lambda_2\text{?}\)
  2. What will happen to the number of cars at the two locations after a very long time?
Solution.
The solution to this activity is given in the text below.
Let’s revisit Activity 10.5.3, where we had the matrix \(A = \begin{bmatrix} 0.8 \amp 0.6 \\ 0.2 \amp 0.4 \\ \end{bmatrix}\text{,}\) and suppose that all 1000 cars begin at location P so that the initial vector is \(\xvec_0 = \ctwovec{1000}0\text{.}\) We are interested in understanding the sequence of vectors \(\xvec_{k+1} = A\xvec_k\text{,}\) which means that \(\xvec_k = A^k\xvec_0\text{.}\)
In particular, we want to find \(\xvec_k=A^k\xvec_0\) and determine what happens as \(k\) becomes very large. If a matrix \(A\) is diagonalizable, writing \(A=PDP^{-1}\) can help us understand powers of \(A\) more easily.
We have that \(\vvec_1 = \twovec31\) and \(\vvec_2 = \twovec{-1}1\) are eigenvectors of \(A\) having associated eigenvalues \(\lambda_1=1\) and \(\lambda_2 = 0.2\text{.}\) This means that \(A = PDP^{-1}\) where
\begin{equation*} P = \begin{bmatrix} 3 \amp -1 \\ 1 \amp 1 \\ \end{bmatrix},~~~ D = \begin{bmatrix} 1 \amp 0 \\ 0 \amp 0.2 \\ \end{bmatrix}. \end{equation*}
Therefore, the powers of \(A\) have the form \(A^k = PD^kP^{-1}\text{.}\)
Notice that \(D^k = \begin{bmatrix} 1^k \amp 0 \\ 0 \amp 0.2^k \\ \end{bmatrix} = \begin{bmatrix} 1 \amp 0 \\ 0 \amp 0.2^k \end{bmatrix} \text{.}\) As \(k\) increases, \(0.2^k\) becomes closer and closer to zero. This means that for very large powers \(k\text{,}\) we have
\begin{equation*} D^k \approx \begin{bmatrix} 1 \amp 0 \\ 0 \amp 0 \\ \end{bmatrix} \end{equation*}
and therefore
\begin{equation*} A^k = PD^kP^{-1} \approx P\begin{bmatrix} 1 \amp 0 \\ 0 \amp 0 \\ \end{bmatrix}P^{-1} = \begin{bmatrix} \frac 34 \amp \frac 34 \\ \frac 14 \amp \frac 14 \end{bmatrix}. \end{equation*}
Beginning with the vector \(\xvec_0 = \ctwovec{1000}{0}\text{,}\) we find that \(\xvec_k = A^k\xvec_0\approx \twovec{750}{250}\) when \(k\) is very large. This means that after a very long time, about 750 cars will end up at location P and about 250 will end up at location Q. This will be the case regardless of how the 1000 cars are initially distributed! Convince yourself that this is true by computing \(\xvec_k = A^k\xvec_0\) for \(\xvec_0 = \ctwovec{t}{1000-t}\text{,}\) where \(t\) is a number between 0 and 1000 representing the initial number of cars at location P.
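One way to carry out this check is to iterate \(\xvec_{k+1} = A\xvec_k\) directly for several starting values of \(t\text{.}\) The following is a minimal sketch; the function names `step` and `cars_after`, the choice of 30 days, and the sample values of \(t\) are all assumptions made for illustration.

```python
from fractions import Fraction

# Transition matrix from the activity, written with exact fractions.
A = [[Fraction(4, 5), Fraction(3, 5)],
     [Fraction(1, 5), Fraction(2, 5)]]

def step(x):
    """Apply one day of rentals: x_{k+1} = A x_k."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

def cars_after(days, t, total=1000):
    """Return x_k for x_0 = (t, total - t) after the given number of days."""
    x = [Fraction(t), Fraction(total - t)]
    for _ in range(days):
        x = step(x)
    return x

# Try several initial distributions of the 1000 cars.
for t in (0, 250, 1000):
    p, q = cars_after(30, t)
    print(t, round(float(p), 6), round(float(q), 6))
```

Each row of \(A\) weights the previous day's counts, and the columns of \(A\) sum to 1, so the total number of cars is preserved from day to day. Only the split between the two locations changes, and it settles near \(\twovec{750}{250}\) for every starting value of \(t\) tried here.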