===== r-square =====
  * $\displaystyle r^2=\frac{SS_{total}-SS_{res}}{SS_{total}} = \frac{\text{Explained sample variability}}{\text{Total sample variability}}$
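As a quick numerical check, here is a minimal R sketch using the five observations (x = 1..5; y = 1, 1, 2, 2, 4) from the slope test example below:

<code>
# minimal sketch: r-square from sums of squares
x <- c(1, 2, 3, 4, 5)
y <- c(1, 1, 2, 2, 4)
mod <- lm(y ~ x)
ss.total <- sum((y - mean(y))^2)    # 6
ss.res <- sum(residuals(mod)^2)     # 1.1
(ss.total - ss.res) / ss.total      # 0.8167, the r-square
summary(mod)$r.squared              # same value reported by lm()
</code>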
  
===== Adjusted r-square =====
  * $\displaystyle r^2=\frac{SS_{total}-SS_{res}}{SS_{total}} = 1 - \frac{SS_{res}}{SS_{total}}$
  
  * Therefore, the Adjusted r<sup>2</sup> = 1 - (.367 / 1.5) = 0.756 (the green cell), where .367 = SS<sub>res</sub> / (n-2) and 1.5 = SS<sub>total</sub> / (n-1)
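Reusing the objects from the sketch above, a minimal check of the adjusted value; it replaces the sums of squares with their mean squares:

<code>
# minimal sketch: adjusted r-square uses mean squares instead of sums of squares
n <- length(y)
1 - (ss.res / (n - 2)) / (ss.total / (n - 1))   # 1 - (.367 / 1.5) = 0.7556
summary(mod)$adj.r.squared                      # same value reported by lm()
</code>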
  
===== Slope test =====
If we take a look at the ANOVA result:
  
| b Dependent Variable: y    |||||||
<WRAP clear />
F-test recap:
  * ANOVA, F-test, $F=\frac{MS_{between}}{MS_{within}}$
    * MS<sub>between</sub>?
    * MS<sub>within</sub>?
  * In regression, the term that corresponds to "within" is the residual.
    * $s = \sqrt{s^2} = \sqrt{\frac{SS_{res}}{n-2}}$
    * because this SS<sub>res</sub> is what the random difference refers to (MS<sub>within</sub>): $s^2 = \frac{SS_{res}}{n-2}$
  * MS for regression . . . the obtained difference
    * apply the same procedure as above to MS for regression (see the sketch below).
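A minimal sketch of this recap, reusing x, y, mod, and the sums of squares from the sketches above; it rebuilds both mean squares and recovers the F value:

<code>
# minimal sketch: F = MS(regression) / MS(residual)
ss.reg <- ss.total - ss.res    # explained (regression) sum of squares, 4.9
ms.reg <- ss.reg / 1           # df(regression) = 1, one predictor
ms.res <- ss.res / (n - 2)     # df(residual) = n - 2 = 3
ms.reg / ms.res                # F = 13.36
anova(mod)                     # same F in the ANOVA table
</code>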
  
  * Why do we run a t-test on the slope of the X variable? Below is a mathematical explanation.
  * Sampling distribution of the slope b (the error around the slope line):
    * $\displaystyle \sigma_{b_{1}} = \frac{\sigma}{\sqrt{SS_{x}}}$
      * Recall that $\displaystyle \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$; the same logic applies here.
    * To estimate $\sigma_{b_{1}}$, substitute $\sigma$ with $s$.
  
If the errors (the residuals) cluster around the slope b, and we take them out and draw their distribution curve, they will form a normal distribution with a mean of 0 and a standard deviation equal to the standard error above.
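A minimal sketch of this point, reusing mod and n from the sketches above: the residuals average to zero, and their spread (computed with n - 2 degrees of freedom) is the residual standard error:

<code>
# minimal sketch: residuals center on 0
res <- residuals(mod)
mean(res)                     # ~ 0
sqrt(sum(res^2) / (n - 2))    # 0.6055, the residual standard error s
</code>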
  * t-test
    * $\displaystyle t=\frac{b_{1} - \text{Hypothesized value of }\beta_{1}}{s_{b_{1}}}$
    * The hypothesized value of $\beta_{1}$ is usually 0, so the t value becomes
    * $\displaystyle t=\frac{b_{1}}{s_{b_{1}}}$
    * The standard error (se) of the slope is obtained as follows:
  
\begin{eqnarray*}
s_{b_{1}} & = & \sqrt{\frac{MSE}{SS_{X}}} \\
 & = & \sqrt{\frac{1}{n-2} \cdot \frac{SSE}{SS_{X}}} \\
 & = & \sqrt{\frac{1}{n-2} \cdot \frac{\Sigma{(Y-\hat{Y})^2}}{\Sigma{(X_{i} - \bar{X})^2}}} \\
\end{eqnarray*}
  
  
Regression formula: y<sub>predicted</sub> = -0.1 + 0.7 X
SSE = Sum of Squared Errors = SS<sub>residual</sub>
The standard error of the slope beta (b) is obtained as follows.

\begin{eqnarray*}
se_{\beta} & = & \frac{\sqrt{SSE/(n-2)}}{\sqrt{SS_{X}}} \\
 & = & \frac{\sqrt{1.1/3}}{\sqrt{10}} \\
 & = & 0.1914854 \\
\end{eqnarray*}
Therefore, t = b / se = 0.7 / 0.1914854 = 3.655631.
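A minimal, self-contained R sketch that reproduces these numbers and checks them against lm():

<code>
# minimal sketch: slope se and t by hand, then from lm()
x <- c(1, 2, 3, 4, 5)
y <- c(1, 1, 2, 2, 4)
mod <- lm(y ~ x)
sse <- sum(residuals(mod)^2)                        # 1.1
ss.x <- sum((x - mean(x))^2)                        # 10
se.b <- sqrt(sse / (length(x) - 2)) / sqrt(ss.x)    # 0.1914854
coef(mod)["x"] / se.b                               # t = 3.655631
summary(mod)$coefficients                           # same Std. Error and t value
</code>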
  
====== E.g., 4. Simple regression ======
Another example of simple regression: from {{:elemapi.sav}} \\