Section 8.3 Orthogonal bases
We often need to write a given vector \(\bvec\) as a linear combination of given basis vectors. We do this by solving the system \(A\xvec=\bvec\text{,}\) where the columns of the matrix \(A\) are the basis vectors and the solution \(\xvec\) provides the weights for the linear combination. The next activity illustrates how this task simplifies when the basis vectors are orthogonal to each other.
Exploration 8.3.1.
For this activity, it will be helpful to recall the distributive property of dot products:
\begin{equation*}
\bvec\cdot(x_1\wvec_1+x_2\wvec_2) = x_1\bvec\cdot\wvec_1 +
x_2\bvec\cdot\wvec_2\text{.}
\end{equation*}
We’ll work with the basis of \(\real^2\) formed by the vectors
\begin{equation*}
\wvec_1=\twovec12,\hspace{24pt}
\wvec_2=\twovec{-2}1\text{.}
\end{equation*}
Verify that the vectors \(\wvec_1\) and \(\wvec_2\) are orthogonal.
Suppose that \(\bvec =\twovec74\) and find the dot products \(\wvec_1\cdot\bvec\) and \(\wvec_2\cdot\bvec\text{.}\)
We would like to express \(\bvec\) as a linear combination of \(\wvec_1\) and \(\wvec_2\text{,}\) which means that we need to find weights \(x_1\) and \(x_2\) such that
\begin{equation*}
\bvec = x_1\wvec_1 + x_2\wvec_2\text{.}
\end{equation*}
To find the weight \(x_1\text{,}\) dot both sides of this expression with \(\wvec_1\text{:}\)
\begin{equation*}
\bvec\cdot\wvec_1 = (x_1\wvec_1 +
x_2\wvec_2)\cdot\wvec_1\text{,}
\end{equation*}
and apply the distributive property.
In a similar fashion, find the weight \(x_2\text{.}\)
Verify that \(\bvec = x_1\wvec_1+x_2\wvec_2\) using the weights you have found.
Solution.
We can compute that \(\wvec_1\cdot\wvec_2 = 0\text{.}\)
\(\bvec\cdot\wvec_1 = 15\) and \(\bvec\cdot\wvec_2 = -10\text{.}\)
\(x_1 = \frac{\bvec\cdot\wvec_1}{\wvec_1\cdot\wvec_1}
= 3\text{.}\)
\(x_2 = \frac{\bvec\cdot\wvec_2}{\wvec_2\cdot\wvec_2}
= -2\text{.}\)
\(\bvec = 3\wvec_1 - 2\wvec_2\text{.}\)
Definition 8.3.1.
By an orthogonal set of vectors, we mean a set of nonzero vectors each of which is orthogonal to the others.
Example 8.3.2.
The 3-dimensional vectors
\begin{equation*}
\wvec_1 = \threevec1{-1}1,\hspace{24pt}
\wvec_2 = \threevec1{1}0,\hspace{24pt}
\wvec_3 = \threevec1{-1}{-2}
\end{equation*}
form an orthogonal set, which can be verified by computing
\begin{equation*}
\begin{array}{rcl}
\wvec_1\cdot\wvec_2 \amp {}={} \amp 0 \\
\wvec_1\cdot\wvec_3 \amp {}={} \amp 0 \\
\wvec_2\cdot\wvec_3 \amp {}={} \amp 0\text{.} \\
\end{array}
\end{equation*}
Notice that this set of vectors forms a basis for \(\real^3\text{.}\)
Example 8.3.3.
The vectors
\begin{equation*}
\wvec_1 = \fourvec1111,\hspace{24pt}
\wvec_2 = \fourvec11{-1}{-1},\hspace{24pt}
\wvec_3 = \fourvec1{-1}1{-1}
\end{equation*}
form an orthogonal set of 4-dimensional vectors. Since there are only three vectors, this set does not form a basis for \(\real^4\text{.}\) It does, however, form a basis for a 3-dimensional subspace \(W\) of \(\real^4\text{.}\)
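The pairwise checks in Examples 8.3.2 and 8.3.3 can also be automated. The following is a minimal Python sketch rather than part of the text's own tooling; the helper names `dot` and `is_orthogonal_set` are hypothetical, and the exact comparison with zero assumes integer or rational entries.

```python
from itertools import combinations

def dot(v, w):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(v, w))

def is_orthogonal_set(vectors):
    """True if every vector is nonzero and each pair has dot product 0.

    Uses exact comparison, so it is intended for integer or rational entries.
    """
    if any(dot(v, v) == 0 for v in vectors):
        return False
    return all(dot(v, w) == 0 for v, w in combinations(vectors, 2))

# Example 8.3.2: three vectors in R^3
example_2 = [(1, -1, 1), (1, 1, 0), (1, -1, -2)]
# Example 8.3.3: three vectors in R^4
example_3 = [(1, 1, 1, 1), (1, 1, -1, -1), (1, -1, 1, -1)]

print(is_orthogonal_set(example_2))  # True
print(is_orthogonal_set(example_3))  # True
```

The check rejects sets containing the zero vector, in keeping with Definition 8.3.1.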
Suppose that a vector \(\bvec\) is a linear combination of an orthogonal set of vectors \(\wvec_1,\wvec_2,\ldots,\wvec_n\text{;}\) that is, suppose that
\begin{equation*}
c_1\wvec_1 + c_2\wvec_2 + \cdots + c_n\wvec_n = \bvec.
\end{equation*}
Just as in Exploration 8.3.1, we can find the weight \(c_1\) by dotting both sides with \(\wvec_1\) and applying the distributive property of dot products:
\begin{align*}
(c_1\wvec_1 + c_2\wvec_2 + \cdots + c_n\wvec_n)\cdot\wvec_1
\amp = \bvec\cdot\wvec_1\\
c_1\wvec_1\cdot\wvec_1 + c_2\wvec_2\cdot\wvec_1 +\cdots +
c_n\wvec_n\cdot\wvec_1 \amp = \bvec\cdot\wvec_1\\
c_1\wvec_1\cdot\wvec_1 \amp = \bvec\cdot\wvec_1\\
c_1 \amp =
\frac{\bvec\cdot\wvec_1}{\wvec_1\cdot\wvec_1}\text{.}
\end{align*}
Notice how the presence of an orthogonal set causes most of the terms in the sum to vanish. In the same way, we find that
\begin{equation*}
c_i = \frac{\bvec\cdot\wvec_i}{\wvec_i\cdot\wvec_i}
\end{equation*}
so that
\begin{equation*}
\bvec = \frac{\bvec\cdot\wvec_1}{\wvec_1\cdot\wvec_1}\wvec_1 +
\frac{\bvec\cdot\wvec_2}{\wvec_2\cdot\wvec_2}\wvec_2 +
\cdots +
\frac{\bvec\cdot\wvec_n}{\wvec_n\cdot\wvec_n}\wvec_n\text{.}
\end{equation*}
We’ll record this fact in the following proposition.
Proposition 8.3.4.
If a vector \(\bvec\) is a linear combination of an orthogonal set of vectors \(\wvec_1,\wvec_2,\ldots,\wvec_n\text{,}\) then
\begin{equation*}
\bvec = \frac{\bvec\cdot\wvec_1}{\wvec_1\cdot\wvec_1}\wvec_1 +
\frac{\bvec\cdot\wvec_2}{\wvec_2\cdot\wvec_2}\wvec_2 +
\cdots +
\frac{\bvec\cdot\wvec_n}{\wvec_n\cdot\wvec_n}\wvec_n\text{.}
\end{equation*}
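As an illustration of Proposition 8.3.4, here is a short Python sketch that computes the weights for the vectors of Exploration 8.3.1. The function name `orthogonal_weights` is hypothetical, and the code assumes integer entries so that `Fraction` keeps the arithmetic exact. It does not check that the given set is orthogonal or that \(\bvec\) lies in its span.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(v, w))

def orthogonal_weights(b, basis):
    """Weights c_i = (b . w_i) / (w_i . w_i) from Proposition 8.3.4.

    Assumes `basis` is an orthogonal set with integer entries and that
    b lies in its span; neither assumption is checked here.
    """
    return [Fraction(dot(b, w), dot(w, w)) for w in basis]

# Vectors from Exploration 8.3.1
w1, w2 = (1, 2), (-2, 1)
b = (7, 4)

weights = orthogonal_weights(b, [w1, w2])
print(weights)  # [Fraction(3, 1), Fraction(-2, 1)]

# Rebuild b from the weights to confirm the linear combination.
rebuilt = tuple(
    sum(c * w[i] for c, w in zip(weights, [w1, w2])) for i in range(len(b))
)
print(rebuilt == b)  # True
```

No linear system is solved here: each weight requires only two dot products.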
Using this proposition, we can see that an orthogonal set of vectors must be linearly independent. Suppose, for instance, that \(\wvec_1,\wvec_2,\ldots,\wvec_n\) is a set of nonzero orthogonal vectors and that one of the vectors is a linear combination of the others, say,
\begin{equation*}
\wvec_3 = c_1\wvec_1 + c_2\wvec_2\text{.}
\end{equation*}
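Since \(\wvec_1\) and \(\wvec_2\) form an orthogonal set, Proposition 8.3.4 tells us that
\begin{equation*}
\wvec_3 =
\frac{\wvec_3\cdot\wvec_1}{\wvec_1\cdot\wvec_1}\wvec_1 +
\frac{\wvec_3\cdot\wvec_2}{\wvec_2\cdot\wvec_2}\wvec_2
= \zerovec\text{,}
\end{equation*}
because \(\wvec_3\cdot\wvec_1 = \wvec_3\cdot\wvec_2 = 0\text{.}\) This cannot happen since we know that \(\wvec_3\) is nonzero, which gives the following proposition.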
Proposition 8.3.5.
An orthogonal set of vectors \(\wvec_1,\wvec_2,\ldots,\wvec_n\) is linearly independent.
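The argument above treats one representative case. One way to sketch the general version, working directly from the definition of linear independence, is to suppose that
\begin{equation*}
c_1\wvec_1 + c_2\wvec_2 + \cdots + c_n\wvec_n = \zerovec\text{.}
\end{equation*}
Dotting both sides with \(\wvec_i\) and using \(\wvec_j\cdot\wvec_i = 0\) for \(j\neq i\) leaves
\begin{equation*}
c_i\,\wvec_i\cdot\wvec_i = \zerovec\cdot\wvec_i = 0\text{.}
\end{equation*}
Since \(\wvec_i\) is nonzero, \(\wvec_i\cdot\wvec_i \neq 0\text{,}\) so \(c_i = 0\) for every \(i\text{.}\) Only the trivial linear combination produces \(\zerovec\text{,}\) which is what linear independence requires.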
Activity 8.3.2.
Consider the vectors
\begin{equation*}
\wvec_1 = \threevec1{-1}1,\hspace{24pt}
\wvec_2 = \threevec1{1}0,\hspace{24pt}
\wvec_3 = \threevec1{-1}{-2}.
\end{equation*}
Verify that this set forms an orthogonal set of \(3\)-dimensional vectors.
Explain why we know that this set of vectors forms a basis for \(\real^3\text{.}\)
Suppose that \(\bvec=\threevec24{-4}\text{.}\) Find the weights \(c_1\text{,}\) \(c_2\text{,}\) and \(c_3\) that express \(\bvec\) as a linear combination \(\bvec=c_1\wvec_1 + c_2\wvec_2 + c_3\wvec_3\) using Proposition 8.3.4.
If we multiply a vector \(\vvec\) by a positive scalar \(s\text{,}\) the length of \(\vvec\) is also multiplied by \(s\text{;}\) that is, \(\len{s\vvec} = s\len{\vvec}\text{.}\)
Using this observation, find a vector \(\uvec_1\) that is parallel to \(\wvec_1\) and has length 1. Such vectors are called unit vectors.
Similarly, find a unit vector \(\uvec_2\) that is parallel to \(\wvec_2\) and a unit vector \(\uvec_3\) that is parallel to \(\wvec_3\text{.}\)
Construct the matrix \(Q=\begin{bmatrix}
\uvec_1 \amp \uvec_2 \amp \uvec_3
\end{bmatrix}\) and find the product \(Q^TQ\text{.}\) Use Proposition 8.2.3 to explain your result.
Answer.
We compute the dot products \(\wvec_1\cdot\wvec_2=0\text{,}\) \(\wvec_1\cdot\wvec_3=0\text{,}\) and \(\wvec_2\cdot\wvec_3=0\text{.}\)
An orthogonal set of vectors is linearly independent.
\(\bvec = -2\wvec_1 + 3\wvec_2 +
\wvec_3\text{.}\)
\(\displaystyle \uvec_1 = \frac{1}{\sqrt{3}}\wvec_1 =
\threevec{1/\sqrt{3}}{-1/\sqrt{3}}{1/\sqrt{3}}\)
We find that
\begin{equation*}
\uvec_2 =
\threevec{1/\sqrt{2}}{1/\sqrt{2}}0,\hspace{24pt}
\uvec_3 =
\threevec{1/\sqrt{6}}{-1/\sqrt{6}}{-2/\sqrt{6}}
\end{equation*}
\(\displaystyle Q^TQ=I\)
Solution.
We compute the dot products \(\wvec_1\cdot\wvec_2=0\text{,}\) \(\wvec_1\cdot\wvec_3=0\text{,}\) and \(\wvec_2\cdot\wvec_3=0\text{.}\)
We know that an orthogonal set of vectors is linearly independent. Therefore, we have a set of three linearly independent vectors in \(\real^3\) so they must form a basis for \(\real^3\text{.}\)
We find that \(\bvec = -2\wvec_1 + 3\wvec_2 +
\wvec_3\text{.}\)
Since \(\len{\wvec_1} = \sqrt{3}\text{,}\) we find
\begin{equation*}
\uvec_1 = \frac{1}{\sqrt{3}}\wvec_1 =
\threevec{1/\sqrt{3}}{-1/\sqrt{3}}{1/\sqrt{3}}
\end{equation*}
We find that
\begin{equation*}
\uvec_2 =
\threevec{1/\sqrt{2}}{1/\sqrt{2}}0,\hspace{24pt}
\uvec_3 =
\threevec{1/\sqrt{6}}{-1/\sqrt{6}}{-2/\sqrt{6}}
\end{equation*}
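We find \(Q^TQ=I\) since each entry in this matrix product is the dot product of two columns of \(Q\text{.}\) The diagonal entries are \(\uvec_i\cdot\uvec_i = 1\) because each column has unit length, and the off-diagonal entries are \(\uvec_i\cdot\uvec_j = 0\) because distinct columns are orthogonal.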
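For reference, the weights reported in the third part of the solution follow from Proposition 8.3.4. Written out, the computation is
\begin{align*}
c_1 \amp = \frac{\bvec\cdot\wvec_1}{\wvec_1\cdot\wvec_1} = \frac{2-4-4}{3} = -2\\
c_2 \amp = \frac{\bvec\cdot\wvec_2}{\wvec_2\cdot\wvec_2} = \frac{2+4+0}{2} = 3\\
c_3 \amp = \frac{\bvec\cdot\wvec_3}{\wvec_3\cdot\wvec_3} = \frac{2-4+8}{6} = 1\text{.}
\end{align*}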
This activity introduces an important way of modifying an orthogonal set so that the vectors in the set have unit length. Recall that we may multiply any nonzero vector \(\wvec\) by a scalar so that the new vector has length 1. For instance, we know that if \(s\) is a positive scalar, then \(\len{s\wvec} = s\len{\wvec}\text{.}\) To obtain a vector \(\uvec\) having unit length, we want
\begin{equation*}
\len{\uvec} = \len{s\wvec} = s\len{\wvec} = 1
\end{equation*}
so that \(s=1/\len{\wvec}\text{.}\) Therefore,
\begin{equation*}
\uvec = \frac{1}{\len{\wvec}}\wvec
\end{equation*}
becomes a unit vector parallel to \(\wvec\text{.}\)
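The scaling step above translates directly into code. The following Python sketch uses floating-point arithmetic, so lengths come out as 1 only up to rounding; the function names `length` and `normalize` are hypothetical.

```python
import math

def length(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(a * a for a in v))

def normalize(w):
    """Return the unit vector (1/|w|) w parallel to a nonzero vector w."""
    norm = length(w)
    if norm == 0:
        raise ValueError("the zero vector cannot be scaled to unit length")
    return tuple(a / norm for a in w)

# w_1 from Activity 8.3.2, which has length sqrt(3)
w1 = (1, -1, 1)
u1 = normalize(w1)

print(u1)          # entries are 1/sqrt(3), -1/sqrt(3), 1/sqrt(3) as floats
print(length(u1))  # approximately 1, up to floating-point rounding
```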
Orthogonal sets in which the vectors have unit length are called orthonormal and are especially convenient.
Definition 8.3.6.
An orthonormal set is an orthogonal set of vectors each of which has unit length.
Example 8.3.7.
The vectors
\begin{equation*}
\uvec_1=\twovec{1/\sqrt{2}}{1/\sqrt{2}},\hspace{24pt}
\uvec_2=\twovec{-1/\sqrt{2}}{1/\sqrt{2}}
\end{equation*}
are an orthonormal set of vectors in \(\real^2\) and form an orthonormal basis for \(\real^2\text{.}\)
If we form the matrix
\begin{equation*}
Q=\begin{bmatrix}
\uvec_1 \amp \uvec_2
\end{bmatrix}
= \begin{bmatrix}
1/\sqrt{2} \amp -1/\sqrt{2} \\
1/\sqrt{2} \amp 1/\sqrt{2} \\
\end{bmatrix}\text{,}
\end{equation*}
we find that
\begin{equation*}
Q^TQ = \begin{bmatrix}
\uvec_1\cdot\uvec_1 \amp \uvec_1\cdot\uvec_2 \\
\uvec_2\cdot\uvec_1 \amp \uvec_2\cdot\uvec_2 \\
\end{bmatrix}
=
\begin{bmatrix}
1 \amp 0 \\
0 \amp 1 \\
\end{bmatrix}\text{.}
\end{equation*}
The previous activity and example illustrate the next proposition.
Proposition 8.3.8.
If the columns of the \(m\times n\) matrix \(Q\) form an orthonormal set, then \(Q^TQ = I_n\text{,}\) the \(n\times n\) identity matrix.
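The proposition does not require \(Q\) to be square. As a hedged illustration, the Python sketch below builds a \(4\times3\) matrix from the vectors of Example 8.3.3, each of which has length 2, scaled by \(1/2\). The helper name `gram_matrix` is hypothetical; it computes \(Q^TQ\) entry by entry as dot products of columns.

```python
def dot(v, w):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(v, w))

def gram_matrix(columns):
    """Entry (i, j) is the dot product of columns i and j, that is, Q^T Q."""
    return [[dot(u, v) for v in columns] for u in columns]

# Columns of a 4 x 3 matrix Q: the vectors of Example 8.3.3,
# scaled by 1/2 so that each has unit length.
columns = [
    (0.5, 0.5, 0.5, 0.5),
    (0.5, 0.5, -0.5, -0.5),
    (0.5, -0.5, 0.5, -0.5),
]

for row in gram_matrix(columns):
    print(row)
# [1.0, 0.0, 0.0]
# [0.0, 1.0, 0.0]
# [0.0, 0.0, 1.0]
```

The result is the \(3\times3\) identity, matching \(n=3\) columns, even though each column has \(m=4\) entries.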
Proposition 8.3.9. \(QR\) factorization.
If \(A\) is an \(m\times n\) matrix whose columns are linearly independent, we may write \(A=QR\) where \(Q\) is an \(m\times n\) matrix whose columns form an orthonormal basis for \(\col(A)\) and \(R\) is an \(n\times n\) upper triangular matrix.
Example 8.3.10.
We’ll consider the matrix \(A=\begin{bmatrix}
2 \amp -3 \amp -2 \\
-1 \amp 3 \amp 7 \\
2 \amp 0 \amp 1 \\
\end{bmatrix}\) whose columns, which we’ll denote \(\vvec_1\text{,}\) \(\vvec_2\text{,}\) and \(\vvec_3\text{,}\) form a basis for \(\real^3\text{.}\) The Gram-Schmidt procedure begins with a basis for a subspace and produces an orthogonal basis for the same subspace. Applying it to these columns, we find an orthogonal basis \(\wvec_1\text{,}\) \(\wvec_2\text{,}\) and \(\wvec_3\) that satisfies
\begin{align*}
\vvec_1 \amp {}={} \wvec_1\\
\vvec_2 \amp {}={} -\wvec_1 + \wvec_2\\
\vvec_3 \amp {}={} -\wvec_1 + 2\wvec_2 + \wvec_3\text{.}
\end{align*}
In terms of the resulting orthonormal basis \(\uvec_1\text{,}\) \(\uvec_2\text{,}\) and \(\uvec_3\text{,}\) we have
\begin{equation*}
\wvec_1 = 3 \uvec_1,\hspace{24pt}
\wvec_2 = 3 \uvec_2,\hspace{24pt}
\wvec_3 = 3 \uvec_3
\end{equation*}
so that
\begin{align*}
\vvec_1 \amp {}={} 3\uvec_1\\
\vvec_2 \amp {}={} -3\uvec_1 + 3\uvec_2\\
\vvec_3 \amp {}={} -3\uvec_1 + 6\uvec_2 + 3\uvec_3\text{.}
\end{align*}
Therefore, if \(Q=\begin{bmatrix} \uvec_1 \amp \uvec_2 \amp
\uvec_3 \end{bmatrix}\text{,}\) we have the \(QR\) factorization
\begin{equation*}
A = Q\begin{bmatrix}
3 \amp -3 \amp -3 \\
0 \amp 3 \amp 6 \\
0 \amp 0 \amp 3 \\
\end{bmatrix}
=QR\text{.}
\end{equation*}
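The factorization in Example 8.3.10 can be checked numerically. This section does not spell out the Gram-Schmidt steps, so the Python sketch below assumes the standard classical formulation: subtract from each column its components along the previously found unit vectors, then normalize. The function name `gram_schmidt_qr` is hypothetical, and the arithmetic is floating point, so the entries of \(R\) agree with the example only up to rounding.

```python
import math

def dot(v, w):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(v, w))

def gram_schmidt_qr(columns):
    """Classical Gram-Schmidt on linearly independent columns.

    Returns (q_columns, R), where q_columns are orthonormal vectors and
    R[i][j] = u_i . v_j for i < j, with the lengths |w_j| on the diagonal.
    """
    n = len(columns)
    q_columns = []
    R = [[0.0] * n for _ in range(n)]
    for j, v in enumerate(columns):
        w = list(v)
        for i, u in enumerate(q_columns):
            R[i][j] = dot(u, v)
            # Remove the component of v_j along u_i.
            w = [a - R[i][j] * b for a, b in zip(w, u)]
        R[j][j] = math.sqrt(dot(w, w))
        q_columns.append([a / R[j][j] for a in w])
    return q_columns, R

# Columns v_1, v_2, v_3 of the matrix A in Example 8.3.10
A_columns = [(2, -1, 2), (-3, 3, 0), (-2, 7, 1)]
Q_columns, R = gram_schmidt_qr(A_columns)

# Rows are approximately [3, -3, -3], [0, 3, 6], [0, 0, 3].
for row in R:
    print([round(x, 10) for x in row])
```

Column \(j\) of \(R\) holds the weights that express \(\vvec_j\) in terms of \(\uvec_1,\ldots,\uvec_j\text{,}\) which is why \(R\) is upper triangular.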