Multiple regression: least-squares estimation

2 minute read

Published:

This post covers least-squares estimation in multiple regression from Statistics for Engineers and Scientists by William Navidi.

Basic Ideas

  • The Estimation of the Coefficients

    • In any multiple regression model, the estimates $ \hat \beta_0, \hat \beta_1,\ldots, \hat \beta_p$ are computed by least squares, just as in simple linear regression. The equation

      $ \hat y = \hat \beta_0 + \hat \beta_1 x_1 + \ldots + \hat \beta_p x_p $

      is called the least-squares equation or fitted regression equation.

    • Thus we wish to minimize the sum $ \sum^n_{i=1} (y_i - \hat \beta_0 - \hat \beta_1 x_{1i} - \ldots - \hat \beta_p x_{pi})^2 $

    • We can do this by taking partial derivatives with respect to $ \hat \beta_0,\hat \beta_1,\ldots, \hat \beta_p$, setting them equal to $0$, and solving the resulting $p + 1$ equations in $p + 1$ unknowns.

    • The expressions obtained for $ \hat \beta_0, \hat \beta_1, \ldots, \hat \beta_p$ are complicated; in practice they are computed with statistical software.

    • For each estimated coefficient $ \hat \beta_i$, there is an estimated standard deviation $s_{\hat \beta_i}$; a numerical sketch is given below.
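
A minimal NumPy sketch of these ideas (the dataset, the true coefficients, and the noise level are made up purely for illustration): the estimates are obtained by solving the least-squares problem on the design matrix, which is equivalent to solving the $p + 1$ normal equations, and the estimated standard deviations $s_{\hat \beta_i}$ come from $s^2 (A^T A)^{-1}$ with $s^2 = SSE/(n - p - 1)$.

```python
import numpy as np

# Hypothetical data: n observations of p = 2 predictors, generated only for illustration.
rng = np.random.default_rng(0)
n, p = 50, 2
X = rng.normal(size=(n, p))                  # columns x_1, x_2
beta_true = np.array([1.0, 2.0, -0.5])       # made-up beta_0, beta_1, beta_2
y = beta_true[0] + X @ beta_true[1:] + rng.normal(scale=0.3, size=n)

# Design matrix with a leading column of ones for the intercept beta_0.
A = np.column_stack([np.ones(n), X])

# Least squares: equivalent to solving the p + 1 normal equations (A^T A) beta = A^T y.
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

# Estimated standard deviations s_{beta_i hat}: s^2 = SSE / (n - p - 1),
# and the variances of the estimates are the diagonal of s^2 (A^T A)^{-1}.
residuals = y - A @ beta_hat
s2 = residuals @ residuals / (n - p - 1)
s_beta = np.sqrt(s2 * np.diag(np.linalg.inv(A.T @ A)))

print(beta_hat)   # close to [1.0, 2.0, -0.5]
print(s_beta)     # one estimated standard deviation per coefficient
```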

  • Sums of Squares

    • In the multiple regression model $ y_i = \beta_0 + \beta_1 x_{1i} + \ldots + \beta_p x_{pi} + \epsilon_i $, the following sums of squares are defined:

      • Regression sum of squares:

        $ SSR = \sum^n_{i=1} ( \hat y_i - \overline y)^2 $

      • Error sum of squares:

        $ SSE = \sum^n_{i=1} (y_i - \hat y_i)^2 $

      • Total sum of squares:

        $ SST = \sum^n_{i=1} (y_i - \overline y)^2 $

      • It can be shown that

        $ SST = SSR + SSE $

        This equation is called the analysis of variance identity.
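
A quick numerical check of this identity, again with an invented dataset whose numbers have no significance beyond the example:

```python
import numpy as np

# Hypothetical data and a least-squares fit, used only to verify SST = SSR + SSE.
rng = np.random.default_rng(1)
n = 40
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(scale=0.4, size=n)

A = np.column_stack([np.ones(n), x1, x2])          # design matrix with intercept column
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares estimates
y_hat = A @ beta_hat                               # fitted values
y_bar = y.mean()

ssr = np.sum((y_hat - y_bar) ** 2)   # regression sum of squares
sse = np.sum((y - y_hat) ** 2)       # error sum of squares
sst = np.sum((y - y_bar) ** 2)       # total sum of squares

# Analysis of variance identity, up to floating-point error.
print(np.isclose(sst, ssr + sse))    # True
```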

  • Assumptions for Errors in Linear Models

    In the simplest situation, the following assumptions are satisfied:

    • The errors $\epsilon_1,\ldots, \epsilon_n$ are random and independent. In particular, the magnitude of any error $ \epsilon_i$ does not influence the value of the next error $\epsilon_{i+1}$.
    • The errors $\epsilon_1,\ldots, \epsilon_n$ all have mean $0$.
    • The errors $\epsilon_1,\ldots, \epsilon_n$ all have the same variance, which we denote by $\sigma^2$.
    • The errors $\epsilon_1,\ldots, \epsilon_n$ are normally distributed.
  • In the multiple regression model $y_i = \beta_0 + \beta_1x_{1i} + \ldots + \beta_p x_{pi} + \epsilon_i$, under assumptions 1 through 4, the observations $y_1, \ldots, y_n$ are independent random variables that follow the normal distribution.

    • The mean and variance of $y_i$ are given by

      $ \mu_{y_i} = \beta_0 + \beta_1 x_{1i} + \ldots + \beta_p x_{pi} $

      $ \sigma^2_{y_i} = \sigma^2 $

    Each coefficient $\beta_i$ represents the change in the mean of $y$ associated with an increase of one unit in the value of $x_i$, when the other $x$ variables are held constant.
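
A small simulation sketch of these last points, with arbitrary made-up values for the coefficients, $\sigma$, and the predictor settings: each simulated $y$ is normal with the stated mean and variance, and raising $x_1$ by one unit while holding $x_2$ fixed shifts the mean of $y$ by $\beta_1$.

```python
import numpy as np

# Hypothetical model y = beta_0 + beta_1*x1 + beta_2*x2 + eps with eps ~ N(0, sigma^2),
# errors independent across observations (assumptions 1 through 4).
rng = np.random.default_rng(2)
beta = np.array([1.0, 2.0, -0.5])   # made-up beta_0, beta_1, beta_2
sigma = 0.3
reps = 100_000

def simulate_y(x1, x2):
    eps = rng.normal(loc=0.0, scale=sigma, size=reps)
    return beta[0] + beta[1] * x1 + beta[2] * x2 + eps

y_a = simulate_y(x1=1.0, x2=2.0)
y_b = simulate_y(x1=2.0, x2=2.0)     # x1 increased by one unit, x2 held constant

print(y_a.mean(), y_a.var())          # close to mu = 1 + 2(1) - 0.5(2) = 2.0 and sigma^2 = 0.09
print(y_b.mean() - y_a.mean())        # close to beta_1 = 2.0
```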