14 Regression

14.5 (Optional) Derivation of Linear Regression Line

Goal:
Learn how to derive the formulas for finding the slope and intercept for linear regression;
Learn how calculus is involved in a statistical concept.

Note that this section requires knowledge of calculus.

Assume we have data $x_1, x_2, \ldots, x_n$ from a random variable $X$ and $y_1, y_2, \ldots, y_n$ from a random variable $Y$ such that $x_i$ is paired with $y_i$, for $i = 1, \ldots, n$. Writing these as pairs, we have

\[ (x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n). \]

The goal is to find $a$ and $b$ in $Y = bX + a$. For each point $(x_i, y_i)$, for $i = 1, \ldots, n$, we can calculate the error obtained by using such a formula. That is, for each point, we have

\[ \text{Error at the $i$-th point} = e_i = (b x_i + a) - y_i. \]

Note that $e_i$ could be positive or negative depending on whether $b x_i + a$ is an overestimate or an underestimate of $y_i$. The line of best fit will be the line that incurs the least total error over all the points, so we care more about each error being close to zero than about its sign. Squaring the errors removes the sign. (Absolute value would also do this, but differentiability becomes an issue.)

Our problem is now to find the slope and intercept, b and a, that minimize the sum of squared errors. That is,

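\[ \min_{(a,b) \in \mathbb{R}^2} \sum_{i=1}^{n} e_i^2 = \min_{(a,b) \in \mathbb{R}^2} \sum_{i=1}^{n} (b x_i + a - y_i)^2. \]

We do this by letting

\[ E(a,b) = \sum_{i=1}^{n} (b x_i + a - y_i)^2. \]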
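As a minimal sketch of what this function measures, the sum of squared errors can be evaluated directly for any candidate line. The function name `sum_squared_errors` and the four data points are assumptions made up for illustration; they do not come from the text.

```python
def sum_squared_errors(a, b, xs, ys):
    """Return E(a, b): the sum of (b*x_i + a - y_i)^2 over the paired data."""
    return sum((b * x + a - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical paired data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 6.0]

# Two candidate lines y = b*x + a; the one with the smaller E fits better.
print(sum_squared_errors(a=0.0, b=1.0, xs=xs, ys=ys))
print(sum_squared_errors(a=0.5, b=1.4, xs=xs, ys=ys))
```

For this data the second candidate gives a much smaller value of E than the first, so it is the better of the two lines in the least-squares sense.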

To minimize $E$, we first find its critical points and then check whether they satisfy the conditions for a local extremum. Finding the partials with respect to $a$ and $b$, respectively, we have

\[ E_a(a,b) = 2na + 2b \sum_{i=1}^{n} x_i - 2 \sum_{i=1}^{n} y_i \]

and

\[ E_b(a,b) = 2a \sum_{i=1}^{n} x_i + 2b \sum_{i=1}^{n} x_i^2 - 2 \sum_{i=1}^{n} x_i y_i. \]
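One way to gain confidence in these two partial derivatives is to compare them with central finite differences of E. The sketch below does this at an arbitrary point; the function names, the data, the point (0.3, 1.1), and the step size `h` are all assumptions chosen for illustration.

```python
def E(a, b, xs, ys):
    """Sum of squared errors for the line y = b*x + a."""
    return sum((b * x + a - y) ** 2 for x, y in zip(xs, ys))


def E_a(a, b, xs, ys):
    """Partial derivative of E with respect to the intercept a."""
    n = len(xs)
    return 2 * n * a + 2 * b * sum(xs) - 2 * sum(ys)


def E_b(a, b, xs, ys):
    """Partial derivative of E with respect to the slope b."""
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return 2 * a * sum(xs) + 2 * b * sum_xx - 2 * sum_xy


# Hypothetical paired data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 6.0]

a, b, h = 0.3, 1.1, 1e-5  # arbitrary test point and step size

# Central differences approximate each partial derivative numerically.
numeric_a = (E(a + h, b, xs, ys) - E(a - h, b, xs, ys)) / (2 * h)
numeric_b = (E(a, b + h, xs, ys) - E(a, b - h, xs, ys)) / (2 * h)

print(E_a(a, b, xs, ys), numeric_a)
print(E_b(a, b, xs, ys), numeric_b)
```

Because E is quadratic in a and b, the central differences should agree with the formulas up to rounding error.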

Setting each of them equal to zero, we obtain

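\begin{align*}
2n a^* &= 2 \sum_{i=1}^{n} y_i - 2 b^* \sum_{i=1}^{n} x_i \\
a^* &= \frac{1}{n} \sum_{i=1}^{n} y_i - \frac{b^*}{n} \sum_{i=1}^{n} x_i \\
a^* &= \bar{y} - b^* \bar{x} \tag{14.5}
\end{align*}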

and

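\begin{align*}
(\bar{y} - b^* \bar{x}) \sum_{i=1}^{n} x_i + b^* \sum_{i=1}^{n} x_i^2 &= \sum_{i=1}^{n} x_i y_i \\
b^* \left( \sum_{i=1}^{n} x_i^2 - \bar{x} \sum_{i=1}^{n} x_i \right) &= \sum_{i=1}^{n} x_i y_i - \bar{y} \sum_{i=1}^{n} x_i \\
b^* &= \frac{\sum_{i=1}^{n} x_i y_i - \bar{y} \sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2 - \bar{x} \sum_{i=1}^{n} x_i} \\
b^* &= \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} \tag{14.6}
\end{align*}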
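The two closed-form results translate directly into a few lines of code. The following is a minimal sketch, assuming the data arrive as two equal-length lists; the function name `least_squares_line` and the sample data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a_star, b_star) from the closed-form least-squares formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        # Happens when every x value is the same: no unique slope exists.
        raise ValueError("all x values are equal; the slope is undefined")

    b_star = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    a_star = sum_y / n - b_star * sum_x / n              # intercept: y-bar - b* x-bar
    return a_star, b_star


# Hypothetical paired data, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 6.0]

a_star, b_star = least_squares_line(xs, ys)
print(a_star, b_star)
```

For this data the intercept is approximately 0.5 and the slope is approximately 1.4. The slope is computed first because the intercept formula depends on it.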

So, Eqs. (14.5) and (14.6) give the values of $a$ and $b$ that form the critical point $(a^*, b^*)$. Let’s check whether this point is a relative minimum. By the second derivative test, it is one if the following conditions on the second partial derivatives hold.

  • $E_{aa}(a^*,b^*) \, E_{bb}(a^*,b^*) - [E_{ab}(a^*,b^*)]^2 > 0$;

  • $E_{aa}(a^*,b^*) > 0$.

Indeed, we have

\begin{align*}
E_{aa}(a,b) &= 2n \\
E_{ab}(a,b) &= 2 \sum_{i=1}^{n} x_i \\
E_{bb}(a,b) &= 2 \sum_{i=1}^{n} x_i^2.
\end{align*}

Clearly,

\[ E_{aa}(a^*,b^*) = 2n > 0. \]

In addition,

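\begin{align*}
E_{aa}(a^*,b^*) \, E_{bb}(a^*,b^*) - [E_{ab}(a^*,b^*)]^2 &= (2n) \left( 2 \sum_{i=1}^{n} x_i^2 \right) - \left( 2 \sum_{i=1}^{n} x_i \right)^2 \\
&= 4 \left( n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 \right) > 0.
\end{align*}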
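The second-derivative conditions can also be checked numerically for a given data set. The sketch below evaluates both quantities from the x values alone, since the second partials do not depend on a, b, or the y values. The function name `second_derivative_test` and the data are assumptions for illustration.

```python
def second_derivative_test(xs):
    """Return (E_aa, D), where D = E_aa * E_bb - E_ab**2."""
    n = len(xs)
    E_aa = 2 * n
    E_ab = 2 * sum(xs)
    E_bb = 2 * sum(x * x for x in xs)
    D = E_aa * E_bb - E_ab ** 2
    return E_aa, D


# Hypothetical x values with some spread: both quantities are positive.
print(second_derivative_test([1.0, 2.0, 3.0, 4.0]))

# Degenerate hypothetical case: every x value is the same, so D is zero.
print(second_derivative_test([2.0, 2.0, 2.0]))
```

The second call illustrates a boundary case: when all the x values coincide, D equals zero, the test is inconclusive, and no unique best-fit line exists.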

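Hence, $(a^*, b^*)$, given by Eqs. (14.5) and (14.6), respectively, minimizes the sum of squared errors. These formulas are the intercept and slope as in Eqs. (14.3) and (14.2). (We leave it to the reader to determine why the expression $4\left(n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right)$ is positive; note that this requires that the $x_i$ are not all equal.)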