Instead of finding the least-squares solution of $Ax = b$, we will be finding it for $X\beta = y$, where
$X$ → design matrix
$\beta$ → parameter vector
$y$ → observation vector
Least-Squares Line
Suppose we are given data points, and we want to find the line that best fits them. Let the best-fit line be the linear equation $y = \beta_0 + \beta_1 x$,
and let the data points be $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. The graph should look something like this:
[Figure: best-fit line $y = \beta_0 + \beta_1 x$ through the data points]
Our goal is to determine the parameters $\beta_0$ and $\beta_1$. Suppose for the moment that every data point lies on the line. Then
$$
\begin{aligned}
\beta_0 + \beta_1 x_1 &= y_1 \\
\beta_0 + \beta_1 x_2 &= y_2 \\
&\;\,\vdots \\
\beta_0 + \beta_1 x_n &= y_n
\end{aligned}
$$
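This is a linear system, which we can write as $X\beta = y$:
$$
X = \begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}, \quad
\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \quad
y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
$$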
Then the least-squares solution of $X\beta = y$ is the solution $\hat{\beta}$ of the normal equations $X^TX\beta = X^Ty$.
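As a rough illustration of the least-squares line, the sketch below builds the design matrix with rows $[1, x_i]$, forms the $2 \times 2$ normal equations $X^TX\beta = X^Ty$, and solves them with Cramer's rule. The function name `fit_line` and the four sample points are assumptions made up for this example; they do not come from the lesson.

```python
def fit_line(xs, ys):
    """Return (beta0, beta1) solving the normal equations X^T X beta = X^T y."""
    # Design matrix X: one row [1, x_i] per data point.
    X = [[1.0, x] for x in xs]

    # Entries of the symmetric 2x2 matrix X^T X = [[a, b], [b, d]].
    a = sum(row[0] * row[0] for row in X)  # equals the number of points
    b = sum(row[0] * row[1] for row in X)  # sum of x_i
    d = sum(row[1] * row[1] for row in X)  # sum of x_i^2

    # Entries of the vector X^T y = [p, q].
    p = sum(row[0] * y for row, y in zip(X, ys))  # sum of y_i
    q = sum(row[1] * y for row, y in zip(X, ys))  # sum of x_i * y_i

    # Solve the 2x2 system by Cramer's rule.
    det = a * d - b * b
    if det == 0:
        raise ValueError("X^T X is singular; the x-values must not all be equal")
    beta0 = (p * d - b * q) / det
    beta1 = (a * q - b * p) / det
    return beta0, beta1


# Hypothetical sample data: (2, 1), (5, 2), (7, 3), (8, 3).
beta0, beta1 = fit_line([2.0, 5.0, 7.0, 8.0], [1.0, 2.0, 3.0, 3.0])
print(f"y = {beta0:.4f} + {beta1:.4f} x")
```

For these four points the normal equations have the exact solution $\beta_0 = 2/7$, $\beta_1 = 5/14$. Cramer's rule is used only because the system is $2 \times 2$; larger models need a general linear solver.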
General Linear Model
Since the data points do not actually lie on the line, there are residual values, also known as errors. So we introduce a vector called the residual vector $\epsilon$, where
$\epsilon = y - X\beta$ → $y = X\beta + \epsilon$
Our goal is to minimize the length of $\epsilon$ (the error), so that $X\beta$ is approximately equal to $y$. This means we are finding a least-squares solution of $X\beta = y$ using $X^TX\beta = X^Ty$.
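To make the role of the residual vector concrete, here is a small sketch that computes $\epsilon = y - X\beta$ for a line and checks two things: at the least-squares parameters, $\epsilon$ is orthogonal to both columns of $X$ (which is what the normal equations say), and a different parameter choice gives a longer $\epsilon$. The sample data, the helper names `residual` and `length`, and the comparison parameters `other` are all assumptions chosen for illustration.

```python
import math


def residual(xs, ys, beta0, beta1):
    """Return the residual vector eps = y - X beta for the line y = beta0 + beta1 * x."""
    return [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]


def length(v):
    """Euclidean length of the vector v."""
    return math.sqrt(sum(c * c for c in v))


# Hypothetical sample data: (2, 1), (5, 2), (7, 3), (8, 3).
xs = [2.0, 5.0, 7.0, 8.0]
ys = [1.0, 2.0, 3.0, 3.0]

# Least-squares parameters for this data (the solution of X^T X beta = X^T y).
best = (2 / 7, 5 / 14)
eps = residual(xs, ys, *best)

# X^T eps = 0: eps is orthogonal to the column of 1s and to the column of x-values.
dot_ones = sum(eps)
dot_xs = sum(x * e for x, e in zip(xs, eps))

# An arbitrary alternative line, used only for comparison.
other = (0.0, 0.4)

print("X^T eps       :", dot_ones, dot_xs)
print("||eps|| best  :", length(eps))
print("||eps|| other :", length(residual(xs, ys, *other)))
```

Both dot products come out as zero up to floating-point rounding, and the residual for `other` is longer than the one for `best`.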
Least-Squares of Other Curves
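Let the data points be $(x_1, y_1), \ldots, (x_n, y_n)$, and suppose we want to find the best fit using the function $y = \beta_0 + \beta_1 x + \beta_2 x^2$, where $\beta_0, \beta_1, \beta_2$ are parameters. In other words, we are now fitting a quadratic function instead of a line.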
[Figure: best-fit quadratic function through the data points]
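Again, the data points don't actually lie on the curve, so we add residual values $\epsilon_1, \ldots, \epsilon_n$, where
$$
\begin{aligned}
y_1 &= \beta_0 + \beta_1 x_1 + \beta_2 x_1^2 + \epsilon_1 \\
y_2 &= \beta_0 + \beta_1 x_2 + \beta_2 x_2^2 + \epsilon_2 \\
&\;\,\vdots \\
y_n &= \beta_0 + \beta_1 x_n + \beta_2 x_n^2 + \epsilon_n
\end{aligned}
$$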
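In matrix form this is $y = X\beta + \epsilon$, where row $i$ of the design matrix $X$ is $(1, x_i, x_i^2)$. Since we are minimizing the length of $\epsilon$, we can find the least-squares solution using $X^TX\beta = X^Ty$. The same approach applies to other functions, as long as they are linear in the parameters.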
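The quadratic case can be sketched in code by changing only the rows of the design matrix to $[1, x_i, x_i^2]$. The helper below, `least_squares`, is a hypothetical name; it forms the augmented matrix $[X^TX \mid X^Ty]$ and solves it by Gauss-Jordan elimination with partial pivoting, which is one reasonable choice of solver rather than anything prescribed by the lesson. The data points are made up.

```python
def least_squares(X, y):
    """Solve the normal equations X^T X beta = X^T y for beta."""
    k = len(X[0])  # number of parameters (columns of X)

    # Augmented matrix [X^T X | X^T y], built entry by entry.
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]

    for col in range(k):
        # Partial pivoting: bring the largest remaining entry into the pivot row.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; the columns of X are dependent")
        A[col], A[pivot] = A[pivot], A[col]

        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    # The matrix is now diagonal, so each parameter is a simple quotient.
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical data lying roughly on a parabola.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [4.1, 0.9, 0.2, 1.1, 3.9]

# Design matrix for y = beta0 + beta1 * x + beta2 * x^2.
X = [[1.0, x, x * x] for x in xs]
beta = least_squares(X, ys)
print("beta0, beta1, beta2 =", beta)
```

Because `least_squares` only looks at the rows of `X`, fitting a different model that is linear in its parameters means changing the list comprehension that builds `X` and nothing else.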
Multiple Regression
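Let the data points be $(u_1, v_1, y_1), \ldots, (u_n, v_n, y_n)$, and suppose we want to use the best-fit function $y = \beta_0 + \beta_1 u + \beta_2 v$, where $\beta_0, \beta_1, \beta_2$ are parameters.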
[Figure: best-fit function for multiple regression]
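Again, the data points don't actually lie on the function, so we add residual values $\epsilon_1, \ldots, \epsilon_n$, where
$$
\begin{aligned}
y_1 &= \beta_0 + \beta_1 u_1 + \beta_2 v_1 + \epsilon_1 \\
y_2 &= \beta_0 + \beta_1 u_2 + \beta_2 v_2 + \epsilon_2 \\
&\;\,\vdots \\
y_n &= \beta_0 + \beta_1 u_n + \beta_2 v_n + \epsilon_n
\end{aligned}
$$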
Since we are minimizing the length of $\epsilon$, we can find the least-squares solution using $X^TX\beta = X^Ty$. The same approach applies to other multivariable functions.
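For reference, the multiple regression model can be written out in the matrix form $y = X\beta + \epsilon$. The block below is a sketch that assumes the two independent variables are called $u$ and $v$, so that each data point is $(u_i, v_i, y_i)$ and the model is $y = \beta_0 + \beta_1 u + \beta_2 v$; that naming is an assumption of this sketch.

```latex
\[
\underbrace{\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}}_{y}
=
\underbrace{\begin{bmatrix}
  1 & u_1 & v_1 \\
  1 & u_2 & v_2 \\
  \vdots & \vdots & \vdots \\
  1 & u_n & v_n
\end{bmatrix}}_{X}
\underbrace{\begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{bmatrix}}_{\beta}
+
\underbrace{\begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}}_{\epsilon}
\]
```

Each independent variable contributes one column to $X$, and the column of 1s carries the constant term $\beta_0$. The normal equations $X^TX\beta = X^Ty$ are then a $3 \times 3$ system.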