Mastering the Least-Squares Problem in Linear Algebra
Dive into the world of least-squares problems and unlock powerful techniques for data fitting and optimization. Learn to solve overdetermined systems and apply your skills to real-world challenges in engineering and science.


Intros
  1. Least Squares Problem Overview:
    The Least Squares Solution
    • Ax = b has no solution
    • Approximate the closest solution \hat{x}
    • The least-squares solution \hat{x} = (A^T A)^{-1} A^T b
    • Not always a unique solution
  2. Least Squares Problem Overview:
    The Least Squares Error
    • Finding the error of the solution \hat{x}
    • Use the formula \lVert b - A\hat{x} \rVert
Examples
  1. Finding the Least Squares Solutions with A^T A\hat{x} = A^T b
    Find a least-squares solution of Ax = b if …

    Least-squares problem

    Method of least squares


    In linear algebra, we have talked about the matrix equation Ax = b and the unique solutions that can be obtained for the vector x. Sometimes, however, Ax = b has no solution, and in those cases the best we can do is approximate a value for x.
    The topic of today's lesson is a technique for obtaining that close-enough solution. The technique is called least squares, and it is based on a simple principle: since we cannot solve the matrix equation Ax = b for the vector x exactly, we instead look for an x that makes the product Ax as close as possible to the vector b.

    If A is an m \times n matrix and b is a vector in \Bbb{R}^m, then a least-squares solution of Ax = b is a vector \hat{x} in \Bbb{R}^n such that the following condition holds for all x in \Bbb{R}^n:

    \lVert b - A\hat{x} \rVert \; \leq \; \lVert b - Ax \rVert
    Equation 1: Condition for a least-squares solution

    This means that if we cannot solve the equation Ax = b exactly, our best approximation to the true value of x, which we call \hat{x} or the least-squares solution (also called the least-squares approximation), is any vector satisfying the condition in equation 1. The left-hand side of this equation is the magnitude of the error made by the best approximation \hat{x}; the right-hand side is the error made by any other candidate x. The condition says that no choice of x produces a smaller error than \hat{x} does.

    Having seen how the method of least squares is related to errors, we can talk about how it relates to real-life scenarios. As much as we want measurements taken during studies or experiments to be accurate, the truth is that there is only so much precision we can achieve as humans with imperfect tools in an imperfect world: limits on the scale of the measurements, random changes in the environment, and so on. There will be scenarios in which our measurements are not good enough, and we have to come up with a solution that approximates the true value of the quantity we are measuring as closely as possible. This is where the least-squares solution provides a value that is otherwise unattainable (keeping in mind that this is only the best available approximation, not the true value itself).
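    To make the data-fitting idea concrete, here is a minimal sketch in Python (assuming NumPy is available; the measurement values are made up for illustration): fitting a line y = c_0 + c_1 t to noisy readings by solving an overdetermined system in the least-squares sense.

```python
import numpy as np

# Hypothetical noisy measurements: times t_i and readings y_i (made-up data).
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# The line model y = c0 + c1*t gives one equation per measurement, so
# Ax = b is overdetermined: each row of A is [1, t_i], b holds the y_i.
A = np.column_stack([np.ones_like(t), t])
b = y

# Least-squares solution via the normal equations A^T A x = A^T b.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)
print("intercept and slope:", x_hat)
```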

    Least squares solution


    Given an m \times n matrix A and a column vector b in \Bbb{R}^m, the least-squares solution \hat{x} satisfies the following matrix equation relating it to A and its transpose:

    A^T A\hat{x} = A^T b
    Equation 2: Matrix equation (the normal equations) relating \hat{x} to A and its transpose
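
    For context, equation 2 comes from the geometry of the problem: A\hat{x} is the point of Col(A) closest to b precisely when the residual b - A\hat{x} is orthogonal to Col(A), and hence to every column of A. Writing those n orthogonality conditions at once gives

    A^T(b - A\hat{x}) = 0 \quad \Longrightarrow \quad A^T A\hat{x} = A^T b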

    Provided that A^T A is invertible, we can solve for \hat{x} and we have that:

    \hat{x} = (A^T A)^{-1} A^T b
    Equation 3: Least-squares solution

    This is what is called the least-squares solution of the matrix equation Ax = b.

    The steps to obtain the least-squares solution \hat{x} for a problem where you are provided with the matrix A and the vector b are as follows:

    1. If you follow the matrix equation found in equation 2:
      1. Find the transpose of matrix A: A^T
      2. Multiply A^T by matrix A to obtain a new matrix: A^T A
      3. Multiply the transpose of matrix A with the vector b: A^T b
      4. Construct the matrix equation A^T A\hat{x} = A^T b using the results from steps 2 and 3
      5. Transform the matrix equation above into an augmented matrix
      6. Row reduce the augmented matrix into its echelon form to find the components of vector \hat{x}
      7. Construct the least-squares solution \hat{x} with the components found.

    2. If you follow the formula for \hat{x} that is found in equation 3:
      1. Start by obtaining the transpose of matrix A: A^T
      2. Multiply A^T by matrix A to obtain a new matrix: A^T A
      3. Find the inverse of matrix A^T A
      4. Multiply the transpose of matrix A with the vector b
      5. Multiply the results you obtained in steps 3 and 4 together to obtain: \hat{x} = (A^T A)^{-1} A^T b

    Both routes are sketched in code right after this list.
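    Here is a minimal sketch of both routes in Python with NumPy (the matrix and vector are illustrative, not taken from the examples below):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])      # a 3x2 matrix with linearly independent columns
b = np.array([6.0, 0.0, 0.0])

# Route 1: form the normal equations A^T A x = A^T b and solve them
# (the solver plays the role of the row reduction in steps 5-6).
AtA = A.T @ A
Atb = A.T @ b
x_hat = np.linalg.solve(AtA, Atb)

# Route 2: apply the explicit formula x = (A^T A)^{-1} A^T b, which is
# valid here because the columns of A are linearly independent.
x_hat_formula = np.linalg.inv(AtA) @ Atb

print(x_hat, x_hat_formula)     # both routes agree
```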

    There are also some alternative calculations for finding the least-squares solution of a matrix equation. For example: let A be an m \times n matrix where a_1, a_2, ..., a_n are the columns of A. If Col(A) = \{a_1, a_2, ..., a_n\} forms an orthogonal set (meaning that all the columns of matrix A are orthogonal to each other), then we can find the least-squares solution using the equation:

    A\hat{x} = \hat{b}
    Equation 4: Matrix equation for the least-squares solution via the projection \hat{b}

    Where we have that \hat{b} is the orthogonal projection of b onto the columns of A:

    \hat{b} = proj_{Col(A)} b = \frac{b \cdot a_1}{a_1 \cdot a_1}a_1 \; + ... \; + \frac{b \cdot a_n}{a_n \cdot a_n}a_n
    Equation 5: Orthogonal projection of b onto the columns of A

    Before we continue to the next section, let us focus for a moment on what is called the least-squares error. Remember that in the first section we called the left-hand side of equation 1 the magnitude of the error of the best approximation to the vector x; in other words, the magnitude of the smallest possible error in \hat{x}. This part of equation 1 is what is called the least-squares error:

    \lVert b - A\hat{x} \rVert
    Equation 6: Least-squares error
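
    In code, the least-squares error is just the length of the residual vector; a short sketch in Python with NumPy (reusing the illustrative A and b from the earlier snippet):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# Least-squares error: the norm of the residual b - A x_hat
# (the square root of the sum of squared residual components).
error = np.linalg.norm(b - A @ x_hat)
print(error)
```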


    Solving least squares problems


    During the next exercises we will take a look at different examples of the method of least squares. Make sure you follow all of the operations, and if in doubt, do not hesitate to message us!

    Example 1

    Given A and b as shown below, find a least-squares solution for the equation Ax = b.

    Equation 7: Matrix A and vector b

    This problem is asking us to solve for \hat{x} using the least-squares formula defined in equation 3, and so we must follow the steps we have described before. The first thing to do is to obtain the transpose of matrix A. Remember that the transpose of a matrix is obtained by turning the rows of the original matrix into the columns of the transpose, and so we swap rows to columns and columns to rows to find A^T:

    Equation 8: Transpose of matrix A

    Then we multiply the matrix obtained through the transpose in equation 8 with matrix A, and so A^T A goes as follows:

    Equation 9: Matrix multiplication A^T A

    Now we find the inverse of matrix A^T A:

    Equation 10: Inverse of the matrix resulting from the multiplication A^T A

    Next, we multiply the transpose of matrix A, found in equation 8, with the vector b:

    Equation 11: Multiplication A^T b

    Now we can finally obtain \hat{x} by multiplying the results found in equations 10 and 11: \hat{x} = (A^T A)^{-1} A^T b

    Equation 12: Least-squares solution


    Example 2

    Describe all least-squares solutions of the equation Ax = b if:

    Equation 13: Matrix A and vector b

    Now we follow the steps to solve for \hat{x} using the least-squares equation 2: A^T A\hat{x} = A^T b.
    We start by finding the transpose A^T:

    Equation 14: Transpose of matrix A

    Next, we multiply the transpose by matrix A:

    Equation 15: Matrix multiplication A^T A

    Now we multiply the transpose of matrix A with the vector b:

    Equation 16: Multiplication of the transpose and vector b

    With the results found in equations 15 and 16, we can construct the matrix equation A^T A\hat{x} = A^T b as follows:

    Equation 17: Constructing the matrix equation from equation 2

    Transform the matrix equation above into an augmented matrix and row reduce it into its echelon form to find the components of vector \hat{x}:

    Equation 18: Solving for the components of \hat{x}

    And so, we can construct the least-squares solution \hat{x} with the components found:

    Equation 19: Least-squares solution


    Example 3

    You are given that the least-squares solution of Ax = b is

    Equation 20: Least-squares solution

    Compute the least-squares error if A and b are as follows:

    Equation 21: Matrix A and vector b

    This problem is asking us to calculate the magnitude of the smallest possible error in \hat{x}. That error is defined in equation 6, so now we just have to follow these simple steps to compute it:
    Start by computing the multiplication A\hat{x}:

    Equation 22: Multiplication of matrix A and \hat{x}

    Having found the column vector from the multiplication above, we need to perform a subtraction of vectors by subtracting the vector found in equation 22 from the vector b, just as shown below:

    Equation 23: Subtraction of vector b and A\hat{x}

    And now, you just need to find the length of the vector found in equation 23. Remember that we do that by computing the square root of the sum of the squared components of the vector:

    Equation 24: Least-squares error

    And we are done: the least-squares error is equal to \sqrt{27} = 3\sqrt{3}.

    Example 4

    Find the orthogonal projections of b onto the columns of A and find a least-squares solution of Ax = b.

    Equation 25: Matrix A and vector b

    In order to find the orthogonal projections and the least-squares solution for this problem we need to use an alternative approach. For this case we start by computing the orthogonal projections of b onto the columns of A using the formula found in equation 5, and then go ahead and solve for the least-squares solution using equation 4.
    For the orthogonal projections we have that:

    \hat{b} = proj_{Col(A)} b = \frac{b \cdot a_1}{a_1 \cdot a_1}a_1 \; + \; \frac{b \cdot a_2}{a_2 \cdot a_2}a_2
    Equation 26: Orthogonal projection of b onto the columns of A

    Where a_1 and a_2 are the column vectors that compose the matrix A. Therefore, solving for \hat{b} goes as follows:

    \hat{b} = proj_{Col(A)} b = \frac{(1)(1)+(-1)(1)+(3)(-1)+(2)(0)}{(1)(1)+(1)(1)+(-1)(-1)+(0)(0)}a_1 \; + \; \frac{(1)(-1)+(-1)(1)+(3)(0)+(2)(1)}{(-1)(-1)+(1)(1)+(0)(0)+(1)(1)}a_2 = -\frac{3}{3}a_1 \; + \; 0a_2

    Equation 27: Solving for \hat{b}

    And so, we have found the orthogonal projections. Now, for the least-squares solution, we use equation 4 as a reference and take the matrix A and the vector \hat{b} so we can form the matrix equation:

    Equation 28: Matrix equation for the least-squares solution

    Where we can easily read off the vector \hat{x}, since its components are x_1 = -1 and x_2 = 0, and so:

    Equation 29: Least-squares solution
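
    As a numerical check of this example, here is the same projection computation in Python with NumPy. The columns a_1, a_2 and the vector b are read off from the expanded dot products in equation 26, so treat those entries as our reading of the original matrices:

```python
import numpy as np

# Columns of A and vector b as read off from the dot products above.
a1 = np.array([1.0, 1.0, -1.0, 0.0])
a2 = np.array([-1.0, 1.0, 0.0, 1.0])
b = np.array([1.0, -1.0, 3.0, 2.0])

# The columns are orthogonal (a1 . a2 = 0), so project b onto each one
# separately, following equation 5.
c1 = (b @ a1) / (a1 @ a1)    # = -3/3 = -1
c2 = (b @ a2) / (a2 @ a2)    # =  0/3 =  0
b_hat = c1 * a1 + c2 * a2    # = -a1

print(c1, c2)   # -1.0 0.0  ->  x_hat = (-1, 0)
print(b_hat)    # [-1. -1.  1.  0.]
```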


    ***
    To finalize this lesson, we would like to recommend taking a look at this lesson on least squares, which contains a thorough explanation of the topic plus some extra examples.
    This is it for our lesson on least squares. We hope you enjoyed it, and see you in the next lesson, the final one for this course!
    Notes

    In linear algebra, we have dealt with questions in which Ax = b does not have a solution. When a solution does not exist, the best thing we can do is approximate x. In this section, we will learn how to find an \hat{x} that makes A\hat{x} as close as possible to b.

    If A is an m \times n matrix and b is a vector in \Bbb{R}^m, then a least-squares solution of Ax = b is an \hat{x} in \Bbb{R}^n where
    \lVert b - A\hat{x} \rVert \leq \lVert b - Ax \rVert

    for all x in \Bbb{R}^n.

    The smaller the distance, the smaller the error, and thus the better the approximation. So the smallest distance gives the best approximation for x, and we call that best approximation \hat{x}.

    The Least-Squares Solution

    The set of least-squares solutions of Ax = b coincides with the non-empty set of solutions of the matrix equation A^T A\hat{x} = A^T b.

    In other words,
    A^T A\hat{x} = A^T b
    \hat{x} = (A^T A)^{-1} A^T b

    where \hat{x} is the least-squares solution of Ax = b (the second line applies when A^T A is invertible).

    Keep in mind that \hat{x} is not always a unique solution. However, it is unique if any one of the following (equivalent) conditions holds:
    1. The equation Ax = b has a unique least-squares solution for each b in \Bbb{R}^m.
    2. The columns of A are linearly independent.
    3. The matrix A^T A is invertible.
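
    A quick numerical sketch of checking conditions 2 and 3 in Python with NumPy (the matrix is illustrative):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])

# Condition 2: the columns of A are linearly independent exactly when
# rank(A) equals the number of columns.
independent = np.linalg.matrix_rank(A) == A.shape[1]

# Condition 3: A^T A is invertible (nonzero determinant; for small,
# well-scaled examples this check is adequate).
invertible = np.linalg.det(A.T @ A) != 0

print(independent, invertible)   # True True -> the solution is unique
```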

    The Least-Squares Error
    To find the least-squares error of the least-squares solution of Ax = b, we compute

    \lVert b - A\hat{x} \rVert

    Alternative Calculations to Least-Squares Solutions
    Let A be an m \times n matrix where a_1, \cdots, a_n are the columns of A. If Col(A) = \{a_1, \cdots, a_n\} forms an orthogonal set, then we can find the least-squares solutions using the equation
    A\hat{x} = \hat{b}

    where \hat{b} = proj_{Col(A)} b.

    Let A be an m \times n matrix with linearly independent columns, and let A = QR be the QR factorization of A. Then for each b in \Bbb{R}^m, the equation Ax = b has a unique least-squares solution, given by
    \hat{x} = R^{-1} Q^T b
    or, in practice, by solving R\hat{x} = Q^T b.
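
    A minimal sketch of the QR route in Python with NumPy (np.linalg.qr returns the reduced factorization by default; solving R\hat{x} = Q^T b avoids forming R^{-1} explicitly):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])

# Reduced QR factorization: Q is m x n with orthonormal columns,
# R is n x n and upper triangular.
Q, R = np.linalg.qr(A)
x_hat = np.linalg.solve(R, Q.T @ b)   # solves R x = Q^T b

# Cross-check against NumPy's built-in least-squares solver.
x_check, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_hat, x_check)
```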