Differences

This shows you the differences between two versions of the page.

--- multiple_regression_exercise [2023/12/10 21:41] – old revision restored (2022/11/02 23:41) hkimscil
+++ multiple_regression_exercise [2025/10/30 13:28] (current) – [추가설명] hkimscil
@@ Line 310: / Line 310: @@
 The model below drops the ShelveLoc main effect rather than Advertising, so it looks slightly different, but the point is the same.
 <code>
->
 > lm.1 <- lm(Sales ~ Advertising + Advertising:ShelveLoc, data = Carseats)
-> lm.1
+> summary(lm.1)
 Call:
 lm(formula = Sales ~ Advertising + Advertising:ShelveLoc, data = Carseats)
+Residuals:
+    Min      1Q  Median      3Q     Max
+-6.7650 -1.7351 -0.1523  1.5481  8.1350
 Coefficients:
-        (Intercept)                  Advertising    Advertising:ShelveLocGood  Advertising:ShelveLocMedium
+                            Estimate Std. Error t value Pr(>|t|)
-            6.76500                     -0.05412                      0.35597                      0.14922
+(Intercept)                  6.76500    0.17503  38.650  < 2e-16 ***
+Advertising                 -0.05412    0.03136  -1.726   0.0852 .
+Advertising:ShelveLocGood    0.35597    0.03901   9.126  < 2e-16 ***
+Advertising:ShelveLocMedium  0.14922    0.03346   4.460 1.07e-05 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 2.476 on 396 degrees of freedom
+Multiple R-squared:  0.237,	Adjusted R-squared:  0.2312
+F-statistic:    41 on 3 and 396 DF,  p-value: < 2.2e-16
+>
 </code>
 Completing the prediction model from the output above gives:
-Sales hat = 6.76500 - 0.05412 * Advertising  --> ShelveLocBad case
+  * **Sales hat = 6.76500 - 0.05412 * Advertising**  --> **ShelveLocBad** case
-Sales hat = 6.76500 - (0.05412 + 0.14922) * Advertising  --> ShelveLocMedium case
+  * Sales hat = 6.76500 + (-0.05412 + 0.14922) * Advertising  --> ShelveLocMedium case
-Sales hat = 6.76500 - (0.05412 + 0.35597) * Advertising  --> ShelveLocGood case
+    * Sales hat = **6.76500 + (0.09510)*Advertising**  --> **ShelveLocMedium** case
+  * Sales hat = 6.76500 + (-0.05412 + 0.35597) * Advertising  --> ShelveLocGood case
+    * **Sales hat = 6.76500 + (0.30185)*Advertising**  --> **ShelveLocGood** case
 So the three shelf locations share one intercept, and each has its own slope on Advertising.
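The arithmetic behind these three slopes can be checked with a short sketch. It assumes only the coefficient values printed in the summary(lm.1) output; the names `b`, `slopes`, and `predict_sales` are hypothetical helpers introduced for illustration and are not part of the original page.

```r
# Coefficients copied from the summary(lm.1) output above
b <- c(intercept  = 6.76500,
       adv        = -0.05412,  # slope for the baseline level (Bad)
       adv_good   = 0.35597,   # slope offset for Good
       adv_medium = 0.14922)   # slope offset for Medium

# Slope on Advertising for each ShelveLoc level: baseline plus offset
slopes <- c(Bad    = b[["adv"]],
            Medium = b[["adv"]] + b[["adv_medium"]],
            Good   = b[["adv"]] + b[["adv_good"]])
print(round(slopes, 5))

# Hypothetical helper: predicted Sales for a level and an Advertising value
predict_sales <- function(level, advertising) {
  b[["intercept"]] + slopes[[level]] * advertising
}
predict_sales("Good", 10)
```

Both offsets are larger in magnitude than the negative baseline slope, so the Medium and Good slopes come out positive.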
@@ Line 398: / Line 414: @@
 To see why they are the same, consider the following:
 <code>
-> lm.2 <- lm(Sales ~ Advertising:ShelveLoc, data = Carseats)
+> lm.2 <- lm(Sales ~ Advertising:ShelveLoc, data = Carseats) # ------- (1)
-> lm.2
+> lm.1 <- lm(Sales ~ Advertising + Advertising:ShelveLoc, data = Carseats) # ------- (2)
+> summary(lm.2)
 Call:
 lm(formula = Sales ~ Advertising:ShelveLoc, data = Carseats)
+Residuals:
+    Min      1Q  Median      3Q     Max
+-6.7650 -1.7351 -0.1523  1.5481  8.1350
 Coefficients:
-        (Intercept)     Advertising:ShelveLocBad    Advertising:ShelveLocGood  Advertising:ShelveLocMedium
+                            Estimate Std. Error t value Pr(>|t|)
-            6.76500                     -0.05412                      0.30185                      0.09510
+(Intercept)                  6.76500    0.17503  38.650  < 2e-16 ***
->
+Advertising:ShelveLocBad    -0.05412    0.03136  -1.726   0.0852 .
+Advertising:ShelveLocGood    0.30185    0.02982  10.123  < 2e-16 ***
+Advertising:ShelveLocMedium  0.09510    0.02222   4.281 2.34e-05 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 2.476 on 396 degrees of freedom
+Multiple R-squared:  0.237,	Adjusted R-squared:  0.2312
+F-statistic:    41 on 3 and 396 DF,  p-value: < 2.2e-16
 </code>
-ShelveLoc Good case: ShelveLocGood = 1 and the others are 0, so
+ShelveLoc Bad case:
-Sales hat = Intercept + Advertising:ShelveLocGood. That is,
+  * **Sales hat = 6.76500 + -0.05412*Advertising**
-Sales hat = 6.76500 + 0.30185 * Advertising
-ShelveLoc Medium case:
+ShelveLoc Good case: ShelveLocGood = 1 and the others are 0, so
-Sales hat = 6.76500 + 0.09510 * Advertising
+  * Sales hat = Intercept + Advertising:ShelveLocGood. That is,
+  * **Sales hat = 6.76500 + 0.30185*Advertising**
+    * The <fc #ff0000>* Advertising</fc> above
+    * comes from the Advertising:ShelveLocGood term,
+    * which means Ad coefficient (0.30185) * ShelveLocGood (1).
-ShelveLoc Bad case:
+ShelveLoc Medium case:
-Sales hat = 6.76500 + -0.05412 * Advertising
+  * **Sales hat = 6.76500 + 0.09510 * Advertising**
 In short, this is the same model as lm.1.
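A minimal sketch of why the two formulas describe the same model. It uses simulated stand-in data, not the Carseats data, so the variables `x`, `f`, `y` and the objects `d`, `m1`, `m2` are all assumptions made for illustration. `m1` mirrors the form of lm.1 and `m2` mirrors lm.2.

```r
# Simulated stand-in data (NOT Carseats): a 3-level factor and a numeric predictor
set.seed(1)
n <- 90
f <- factor(rep(c("A", "B", "C"), each = n / 3))
x <- runif(n, 0, 10)
y <- 5 + c(0.1, 0.3, 0.2)[as.integer(f)] * x + rnorm(n)
d <- data.frame(y, x, f)

m1 <- lm(y ~ x + x:f, data = d)  # baseline slope plus offsets, like lm.1
m2 <- lm(y ~ x:f, data = d)      # one slope per level, like lm.2

# Identical fitted values: the two formulas are reparameterizations of one model
all.equal(unname(fitted(m1)), unname(fitted(m2)))

# Slope for level B: read directly from m2, or baseline + offset in m1
coef(m2)[["x:fB"]]
coef(m1)[["x"]] + coef(m1)[["x:fB"]]
```

The only difference is what each coefficient measures: in `m1` the interaction terms are differences from the baseline slope, while in `m2` they are the slopes themselves. That is why the standard errors and t values differ between the two summaries while R-squared and the F-statistic do not.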
 For reference, the lines of the original lm.1 output that we examined above were as follows.
@@ Line 445: / Line 479: @@
 Coefficients:
                             Estimate Std. Error t value Pr(>|t|)
-(Intercept)                  5.01234    0.31932  15.697  < 2e-16 ***
+(Intercept)                  5.01234    0.31932  15.697  < 2e-16 ***  # (1)
-Advertising                  0.08210    0.03570   2.300    0.022 *
+Advertising                  0.08210    0.03570   2.300    0.022 *    # (2)
-ShelveLocGood                4.43573    0.48146   9.213  < 2e-16 ***
+ShelveLocGood                4.43573    0.48146   9.213  < 2e-16 ***  # (3)
-ShelveLocMedium              1.59511    0.38378   4.156 3.97e-05 ***
+ShelveLocMedium              1.59511    0.38378   4.156 3.97e-05 ***  # (4)
-Advertising:ShelveLocGood    0.02206    0.05075   0.435    0.664
+Advertising:ShelveLocGood    0.02206    0.05075   0.435    0.664      # (5)
-Advertising:ShelveLocMedium  0.02482    0.04236   0.586    0.558
+Advertising:ShelveLocMedium  0.02482    0.04236   0.586    0.558      # (6)
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
@@ Line 458: / Line 492: @@
 F-statistic: 47.05 on 5 and 394 DF,  p-value: < 2.2e-16
 </code>
+ShelveLocBad case
+  * (1) = intercept, 5.01234
+  * (2) = Advertising, 0.08210*Advertising
+  * (3) = 0
+  * (4) = 0
+  * (5) = Advertising * 0 = 0
+  * (6) = Advertising * 0 = 0
+  * y.hat = 5.01234 + (0.08210)*Advertising
+Good case
+  * (1) = intercept, 5.01234
+  * (2) = Advertising, 0.08210*Advertising
+  * (3) = 4.43573
+  * (4) = 0
+  * (5) = 0.02206 * Advertising * 1 = 0.02206*Advertising
+  * (6) = Advertising * 0 = 0
+  * y.hat = 5.01234 + (0.08210)*Advertising + 4.43573 + (0.02206)*Advertising
+  * y.hat = 9.44807 + 0.10416*Advertising
+Med case
+  * (1) = intercept, 5.01234
+  * (2) = Advertising, 0.08210*Advertising
+  * (3) = 0
+  * (4) = 1.59511
+  * (5) = Advertising * 0 = 0
+  * (6) = 0.02482 * Advertising * 1 = 0.02482*Advertising
+  * y.hat = 5.01234 + (0.08210)*Advertising + 1.59511 + (0.02482)*Advertising
+  * y.hat = 6.60745 + 0.10692*Advertising
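The three cases above follow one pattern: plug the dummy codes for a level into rows (1) to (6) and collect the constant and the Advertising terms. A small sketch, assuming only the coefficient values printed in the summary output; the names `b`, `dummies`, and `lines_by_level` are hypothetical and introduced for illustration.

```r
# Coefficients (1)-(6) copied from the summary output above
b <- c(b1 = 5.01234,  # (1) intercept
       b2 = 0.08210,  # (2) Advertising
       b3 = 4.43573,  # (3) ShelveLocGood
       b4 = 1.59511,  # (4) ShelveLocMedium
       b5 = 0.02206,  # (5) Advertising:ShelveLocGood
       b6 = 0.02482)  # (6) Advertising:ShelveLocMedium

# Dummy codes per level: column 1 = ShelveLocGood, column 2 = ShelveLocMedium
dummies <- rbind(Bad = c(0, 0), Good = c(1, 0), Medium = c(0, 1))

# Intercept and Advertising slope of the fitted line for each level
lines_by_level <- data.frame(
  intercept = b[["b1"]] + dummies[, 1] * b[["b3"]] + dummies[, 2] * b[["b4"]],
  slope     = b[["b2"]] + dummies[, 1] * b[["b5"]] + dummies[, 2] * b[["b6"]],
  row.names = rownames(dummies)
)
print(round(lines_by_level, 5))
```

Each row of `lines_by_level` corresponds to one of the y.hat lines listed above.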
 <code>
@@ Line 487: / Line 550: @@
 Looking at the result above (items (1) through (6)):
+(1) = 5.01234
+(2) = 0
+(3) = 0
+(4) = 0.08210 * (1 = location bad case) * Advertising
+(5) = 0.10417 * (0 = location good case) * Advertising = 0
+(6) = 0.10692 * (0 = location med case) * Advertising = 0
-Sales hat = 5.01234 + 0.08210
+Therefore,
+Sales hat = 5.01234 + 0.08210 * Advertising
 In this first case:
   * When ShelveLoc is Bad, only the intercept remains (5.01234)    (1)
   * ShelveLocBad is treated as 1 and the others (Medium and Good) as 0, so
   * only ShelveLocBad:Advertising (4) remains and the two lines below it ((5) and (6)) drop out as 0, so
-  * adding 0.08210 from (4) (ShelveLocBad:Advertising) to (1) gives
+  * adding 0.08210*Advertising from (4) to (1) gives
-  * plain Bad (5.01234) and
+  * y.hat = 5.01234 + 0.08210*Advertising
-  * Bad combined with Advertising (5.01234 + 0.08210).
-Second, the Medium case:
+Second, the Good case:
+  * Sales hat = (5.01234 + 4.43573), and taking the interaction with Advertising into account,
+  * Sales hat = (5.01234 + 4.43573) + 0.10417 * Advertising.
+  * y.hat = 9.44807 + 0.10417 * Advertising
+Third, the Medium case:
   * Adding 1.59511 (3) to 5.01234 (1) gives the effect of ShelveLoc Medium, so
   * Sales hat = (5.01234 + 1.59511), and taking the interaction with Advertising (6) into account,
-  * Sales hat = (5.01234 + 1.59511) + 0.10692.
+  * Sales hat = (5.01234 + 1.59511) + 0.10692 * Advertising.
+  * y.hat = 6.60745 + 0.10692*Advertising
-Third, the Good case:
-  * Sales hat = (5.01234 + 4.43573), and taking the interaction with Advertising into account,
-  * Sales hat = (5.01234 + 4.43573) + 0.10417.
 However, the slopes on Advertising are similar across the three locations (0.08210, 0.10417, 0.10692), so Sales rises with Advertising at about the same rate in each. Plotting this:
 <code>
 > plot(ShelveLoc, Sales)