
14.5 (Optional) Derivation of Linear Regression Line

Goal:

• Learn how to derive the formulas for the slope and intercept in linear regression;

• Learn how calculus is involved in a statistical concept.

Note that this section requires knowledge of calculus.

Assume we have data $x_{1},x_{2},\ldots,x_{n}$ from a random variable $X$ and $y_{1},y_{2},\ldots,y_{n}$ from a random variable $Y$ such that $x_{i}$ is paired with $y_{i}$, for $i=1,\ldots,n$. Writing these as pairs, we have

(x_{1},y_{1}),(x_{2},y_{2}),\ldots,(x_{n},y_{n}).

The goal is to find $a$ and $b$ in $Y=bX+a$. For each point $(x_{i},y_{i})$, for $i=1,\ldots,n$, we can calculate the error obtained by using such a formula. That is, for each point, we have

\text{Error at }i\text{-th point}=e_{i}=(bx_{i}+a)-y_{i}.

Note that $e_{i}$ can be positive or negative, depending on whether $bx_{i}+a$ overestimates or underestimates $y_{i}$. The line of best fit should keep the errors as small as possible across all points, so we care about how close each error is to zero rather than about its sign. Squaring each error removes the sign.⁵⁹

⁵⁹ Absolute value would also do this, but differentiability becomes an issue.
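To see why the sign matters, here is a minimal Python sketch. The data values and the helper name `errors` are hypothetical, chosen only for illustration; they do not come from the text. The candidate line is the horizontal line $Y=4$ (so $a=4$, $b=0$), whose signed errors cancel exactly even though it misses most of the points.

```python
# Hypothetical sample data (not from the text), used only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def errors(a, b, xs, ys):
    """Signed errors e_i = (b*x_i + a) - y_i for the candidate line Y = bX + a."""
    return [(b * x + a) - y for x, y in zip(xs, ys)]


# Candidate line Y = 0*X + 4 (a horizontal line through the mean of ys).
e = errors(a=4.0, b=0.0, xs=xs, ys=ys)

signed_total = sum(e)                     # positive and negative errors cancel
squared_total = sum(ei ** 2 for ei in e)  # squared errors cannot cancel

print(e)              # [2.0, 0.0, -1.0, 0.0, -1.0]
print(signed_total)   # 0.0
print(squared_total)  # 6.0
```

The signed total is zero although the line is a poor fit, while the squared total still registers every miss.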

Our problem is now to find the slope and intercept, $b$ and $a$, that minimize the sum of squared errors. That is,

\min\limits_{(a,b)\in\mathbb{R}^{2}}\sum\limits_{i=1}^{n}e_{i}^{2}=\min\limits_{(a,b)\in\mathbb{R}^{2}}\sum\limits_{i=1}^{n}(bx_{i}+a-y_{i})^{2}.

We do this by letting

E(a,b)=\sum\limits_{i=1}^{n}(bx_{i}+a-y_{i})^{2}.

Calculus tells us to first find the critical points of $E$ and then check whether they satisfy the conditions for a local extremum. First, we find the critical points. By the chain rule, each term $(bx_{i}+a-y_{i})^{2}$ contributes $2(bx_{i}+a-y_{i})$ to the partial with respect to $a$ and $2x_{i}(bx_{i}+a-y_{i})$ to the partial with respect to $b$. Summing over $i$, we have

E_{a}(a,b)=2na+2b\sum\limits_{i=1}^{n}x_{i}-2\sum\limits_{i=1}^{n}y_{i}

and

E_{b}(a,b)=2a\sum\limits_{i=1}^{n}x_{i}+2b\sum\limits_{i=1}^{n}x_{i}^{2}-2\sum\limits_{i=1}^{n}x_{i}y_{i}.
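One way to sanity-check the two partial derivatives is to compare them with central finite differences of $E$. The sketch below does this on hypothetical data; the data, the evaluation point `(a0, b0)`, and the helper names are assumptions made for illustration only.

```python
# Hypothetical sample data (not from the text).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)


def E(a, b):
    """Sum of squared errors for the line Y = bX + a."""
    return sum((b * x + a - y) ** 2 for x, y in zip(xs, ys))


def E_a(a, b):
    """Analytic partial derivative of E with respect to a."""
    return 2 * n * a + 2 * b * sum(xs) - 2 * sum(ys)


def E_b(a, b):
    """Analytic partial derivative of E with respect to b."""
    return (2 * a * sum(xs)
            + 2 * b * sum(x * x for x in xs)
            - 2 * sum(x * y for x, y in zip(xs, ys)))


def central_diff(f, a, b, h=1e-6):
    """Numerical partials of f at (a, b) using central differences."""
    da = (f(a + h, b) - f(a - h, b)) / (2 * h)
    db = (f(a, b + h) - f(a, b - h)) / (2 * h)
    return da, db


a0, b0 = 0.7, 1.3  # arbitrary evaluation point
num_a, num_b = central_diff(E, a0, b0)

# Both comparisons should agree up to floating-point rounding.
print(abs(num_a - E_a(a0, b0)) < 1e-4)
print(abs(num_b - E_b(a0, b0)) < 1e-4)
```

Because $E$ is quadratic in $a$ and $b$, central differences match the analytic partials up to rounding error, so a loose tolerance is enough here.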

Setting each of them equal to zero, we obtain

$\displaystyle 2na^{*}$	$\displaystyle=2\sum\limits_{i=1}^{n}y_{i}-2b^{*}\sum\limits_{i=1}^{n}x_{i}$
$\displaystyle a^{*}$	$\displaystyle=\dfrac{1}{n}\sum\limits_{i=1}^{n}y_{i}-\dfrac{b^{*}}{n}\sum\limits_{i=1}^{n}x_{i}$
$\displaystyle a^{*}$	$\displaystyle=\bar{y}-b^{*}\bar{x}$	(14.5)

and, substituting Eq. (14.5) into $E_{b}(a^{*},b^{*})=0$ and dividing by 2,

$\displaystyle(\bar{y}-b^{*}\bar{x})\sum\limits_{i=1}^{n}x_{i}+b^{*}\sum\limits_{i=1}^{n}x_{i}^{2}$	$\displaystyle=\sum\limits_{i=1}^{n}x_{i}y_{i}$
$\displaystyle b^{*}\left(\sum\limits_{i=1}^{n}x_{i}^{2}-\bar{x}\sum\limits_{i=1}^{n}x_{i}\right)$	$\displaystyle=\sum\limits_{i=1}^{n}x_{i}y_{i}-\bar{y}\sum\limits_{i=1}^{n}x_{i}$
$\displaystyle b^{*}$	$\displaystyle=\dfrac{\sum\limits_{i=1}^{n}x_{i}y_{i}-\bar{y}\sum\limits_{i=1}^{n}x_{i}}{\sum\limits_{i=1}^{n}x_{i}^{2}-\bar{x}\sum\limits_{i=1}^{n}x_{i}}$
$\displaystyle b^{*}$	$\displaystyle=\dfrac{n\sum\limits_{i=1}^{n}x_{i}y_{i}-\sum\limits_{i=1}^{n}x_{i}\sum\limits_{i=1}^{n}y_{i}}{n\sum\limits_{i=1}^{n}x_{i}^{2}-\left(\sum\limits_{i=1}^{n}x_{i}\right)^{2}}$	(14.6)
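The closed-form solution can be sketched in a few lines of Python. The function name `fit_line` and the sample data are hypothetical. The code uses the slope $b^{*}=\dfrac{n\sum x_{i}y_{i}-\sum x_{i}\sum y_{i}}{n\sum x_{i}^{2}-\left(\sum x_{i}\right)^{2}}$ and the intercept $a^{*}=\bar{y}-b^{*}\bar{x}$, and it guards against a zero denominator, which occurs when all $x_{i}$ are equal.

```python
def fit_line(xs, ys):
    """Return (a_star, b_star) for the least-squares line Y = bX + a."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    denom = n * sxx - sx ** 2
    if denom == 0:
        # All x values are identical, so no unique slope exists.
        raise ValueError("slope is undefined when all x values are equal")

    b_star = (n * sxy - sx * sy) / denom  # slope
    a_star = sy / n - b_star * sx / n     # intercept: y-bar minus b* times x-bar
    return a_star, b_star


# Hypothetical sample data (not from the text).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

a_star, b_star = fit_line(xs, ys)
print(a_star, b_star)  # approximately 2.2 and 0.6
```

For this data the sums are $\sum x_{i}=15$, $\sum y_{i}=20$, $\sum x_{i}^{2}=55$ and $\sum x_{i}y_{i}=66$, so $b^{*}=(330-300)/(275-225)=0.6$ and $a^{*}=4-0.6\cdot 3=2.2$.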

So, Eqs. (14.5) and (14.6) give the values of $a$ and $b$ that form a critical point $(a^{*},b^{*})$. Let’s check whether this point is a relative minimum. For that, we check that the following conditions on the second partial derivatives hold.

• $E_{aa}(a^{*},b^{*})E_{bb}(a^{*},b^{*})-\left[E_{ab}(a^{*},b^{*})\right]^{2}>0$;

• $E_{aa}(a^{*},b^{*})>0$.

Indeed, we have

		$\displaystyle E_{aa}(a,b)=2n$
		$\displaystyle E_{ab}(a,b)=2\sum\limits_{i=1}^{n}x_{i}$
		$\displaystyle E_{bb}(a,b)=2\sum\limits_{i=1}^{n}x_{i}^{2}.$

Clearly,

E_{aa}(a^{*},b^{*})=2n>0.

In addition,

	$\displaystyle E_{aa}(a^{*},b^{*})E_{bb}(a^{*},b^{*})-\left[E_{ab}(a^{*},b^{*})\right]^{2}$	$\displaystyle=\left(2n\right)\left(2\sum\limits_{i=1}^{n}x_{i}^{2}\right)-\left(2\sum\limits_{i=1}^{n}x_{i}\right)^{2}$
		$\displaystyle=4\left(n\sum\limits_{i=1}^{n}x_{i}^{2}-\left(\sum\limits_{i=1}^{n}x_{i}\right)^{2}\right)>0.$
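The two second-derivative conditions depend only on the $x_{i}$, so they are easy to evaluate numerically. The sketch below is an illustration on hypothetical data; the function name `second_derivative_test` is an assumption. It also evaluates a degenerate data set in which every $x_{i}$ is the same, to show a case where the determinant is not strictly positive.

```python
def second_derivative_test(xs):
    """Return (E_aa, D), where D = E_aa * E_bb - E_ab**2."""
    n = len(xs)
    E_aa = 2 * n
    E_ab = 2 * sum(xs)
    E_bb = 2 * sum(x * x for x in xs)
    return E_aa, E_aa * E_bb - E_ab ** 2


# Hypothetical data with distinct x values.
E_aa, D = second_derivative_test([1.0, 2.0, 3.0, 4.0, 5.0])
print(E_aa, D)  # 10 200.0

# Degenerate hypothetical data: all x values are equal.
_, D_same = second_derivative_test([3.0, 3.0, 3.0, 3.0])
print(D_same)   # 0.0
```

In the first case both conditions hold. In the second the determinant is zero, so the test is inconclusive; this matches the fact that the slope formula has a zero denominator when all $x_{i}$ coincide.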

Hence, $(a^{*},b^{*})$, given by Eqs. (14.5) and (14.6), respectively, minimizes the sum of squared errors. These formulas are the intercept and slope as in Eqs. (14.3) and (14.2).⁶⁰

⁶⁰ We leave it to the reader to determine why the last line is positive.
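As an informal check of the conclusion, one can compare $E(a^{*},b^{*})$ with the value of $E$ at a few nearby points. This is a sketch on hypothetical data, not a proof; the step size of 0.1 and all variable names are assumptions chosen for illustration.

```python
# Hypothetical sample data (not from the text).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)


def E(a, b):
    """Sum of squared errors for the line Y = bX + a."""
    return sum((b * x + a - y) ** 2 for x, y in zip(xs, ys))


# Slope and intercept from the closed-form formulas.
sx, sy = sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
b_star = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
a_star = sy / n - b_star * sx / n

best = E(a_star, b_star)

# Evaluate E on the eight neighbours of (a_star, b_star) on a small grid.
steps = (-0.1, 0.0, 0.1)
neighbours = [
    E(a_star + da, b_star + db)
    for da in steps
    for db in steps
    if (da, db) != (0.0, 0.0)
]

print(all(v > best for v in neighbours))  # True
```

Every perturbed line has a larger sum of squared errors than the fitted one, which is consistent with $(a^{*},b^{*})$ being a minimum.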