regression


^  __ prediction for y values with__ $\overline{Y}$  ^^^^
|  bankaccount  |  prediction  |  error  |  error<sup>2</sup>  |
|  6  |  8  |  -2  |  4  |
|  5  |  8  |  -3  |  9  |
|  7  |  8  |  -1  |  1  |
|  7  |  8  |  -1  |  1  |
|  8  |  8  |  0  |  0  |
|  10  |  8  |  2  |  4  |
|  8  |  8  |  0  |  0  |
|  11  |  8  |  3  |  9  |
|  9  |  8  |  1  |  1  |
|  9  |  8  |  1  |  1  |
|  $\overline{Y}=8$  |  |  |  $SS_{total} = 30$  |

What is the sum of the squared values above? It is 30. In other words, the SS (Sum of Squares) value is 30. And, as explained earlier, this value can be called $SS_{total}$: the __total error__ variance.

...

__SS<sub>res</sub>, Residual error__

<code>
> head(datavar)
. . . .
> mod <- lm(bankaccount ~ income, data = datavar)
> summary(mod)

Residuals:
    Min      1Q  Median      3Q     Max 
. . . .
</code>

...

|  Model  |  |  Sum of Squares  |  df  |  Mean Square  |  F  |  Sig.  |
|  1  |  Regression  | @white: 18.934  | @grey: 1  |  18.934  |  13.687  |  0.006  |
|  |  Residual  | @orange: 11.066  | @green: 8  |  1.383*  |  |  |
|  |  Total  | @yellow: 30.000  |  9  |  |  |  |
| a Predictors: (Constant), income \\ b Dependent Variable: bankaccount (number of bank accounts)  |||||||

  * 1.383* = SS<sub>res</sub> / (n - 2), the mean square of the residuals. This is the error variance; its square root is the standard error of estimate.
  * This standard error carries the same meaning as the se in t = difference / se, which you saw when learning the [[:t-test]].
  * Accordingly, the F value is MS<sub>regression</sub> (18.934) divided by this error term: 18.934 / 1.383 = 13.687.

__F-test using SS<sub>total</sub>, SS<sub>reg</sub>, and SS<sub>res</sub>__

...
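The three bullet points above can be verified with plain arithmetic. Below is a minimal sketch (Python is used here only for the check; the page's own examples use R) that rebuilds the ANOVA quantities from numbers already shown above: the bankaccount column gives $SS_{total}$, and the SS values from the ANOVA table give MS, F, and r<sup>2</sup>.

```python
# bankaccount values from the prediction table above
bankaccount = [6, 5, 7, 7, 8, 10, 8, 11, 9, 9]
n = len(bankaccount)                        # 10 observations

y_bar = sum(bankaccount) / n                # the mean prediction, 8.0
ss_total = sum((y - y_bar) ** 2 for y in bankaccount)   # 30.0, as in the table

# SS values taken from the SPSS-style ANOVA table above
ss_reg, ss_res = 18.934, 11.066
ms_reg = ss_reg / 1                         # df_reg = 1 (one predictor)
ms_res = ss_res / (n - 2)                   # df_res = n - 2 = 8  ->  about 1.383
f_value = ms_reg / ms_res                   # about 13.69, matching F = 13.687 up to rounding
r_squared = ss_reg / ss_total               # about 0.631

print(ss_total, round(ms_res, 3), round(f_value, 2), round(r_squared, 3))
```

Note that F is just the regression mean square measured in units of the error variance, which is why a large F indicates that the regression explains far more variance than is left in the residuals.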
====== E.g., Simple regression ======

data: {{:acidity.sav}} \\

...

SS<sub>total</sub> = 87.733 \\
r<sup>2</sup> = SS<sub>reg</sub> / SS<sub>total</sub> = 42.462 / 87.733 = .484

====== e.g. Simple Regression ======

data: {{:AllenMursau.data.csv}}

<code>
datavar <- read.csv("http://commres.net/wiki/_media/allenmursau.data.csv")
</code>

<code>
> mod <- lm(Y ~ X, data = datavar)
> summary(mod)

Call:
lm(formula = Y ~ X, data = datavar)

Residuals:
    Min      1Q  Median      3Q     Max 
-250.22 -132.28   33.09  165.53  187.78 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  300.976    229.754   1.310    0.219   
X             10.312      3.124   3.301    0.008 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 170.5 on 10 degrees of freedom
Multiple R-squared:  0.5214,    Adjusted R-squared:  0.4736 
F-statistic:  10.9 on 1 and 10 DF,  p-value: 0.008002
</code>

<code>
> anova(mod)
Analysis of Variance Table

Response: Y
          Df Sum Sq Mean Sq F value   Pr(>F)   
X          1 316874  316874  10.896 0.008002 **
Residuals 10 290824   29082                    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
</code>

<code>
> ss_total <- var(datavar$Y) * 11   # var() divides by n - 1 = 11; multiplying back gives SS_total
> round(ss_total)
[1] 607698
> 316874 + 290824   # the Sum Sq values for X and Residuals from the output above, added together
[1] 607698
</code>

Can you obtain the R-square value from the anova output box above?

====== E.g., 3. Simple regression: Adjusted R squared & Slope test ======
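Picking up the question at the end of the previous example: R squared can indeed be recovered from the anova(mod) sums of squares, and the adjusted R squared that this section turns to follows from the same two numbers. A minimal arithmetic sketch (Python is used only for the check; n = 12 is implied by the 1 and 10 DF in the R output):

```python
# Sum Sq values for X and Residuals, read from the anova(mod) output above
ss_reg, ss_res = 316874, 290824
ss_total = ss_reg + ss_res               # 607698, matching var(Y) * (n - 1)

n, k = 12, 1                             # 12 observations, 1 predictor (df = 1 and 10)
r_squared = ss_reg / ss_total            # about 0.5214, matching summary(mod)

# Adjusted R squared penalizes for the predictor count by comparing
# the residual variance (df = n - k - 1) to the total variance (df = n - 1).
adj_r_squared = 1 - (ss_res / (n - k - 1)) / (ss_total / (n - 1))
                                         # about 0.4736, matching summary(mod)

print(round(r_squared, 4), round(adj_r_squared, 4))
```

So the anova table alone is enough: plain R squared is the regression share of the total sum of squares, while the adjusted version re-expresses the same quantities as variances with their proper degrees of freedom.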