Differences

This shows you the differences between two versions of the page.

--- mediation_analysis [2023/06/03 01:41] – [e.gs] hkimscil
+++ mediation_analysis [2023/06/18 07:42] (current) – hkimscil
@@ Line 1: / Line 1: @@
 ====== Mediation Analysis ======
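-Following the theory of planned behavior, the researcher builds the model below, collects data, and sets out to test it. At this stage in particular, the researcher wants to split the influence of Attitudes on Behavior into the part unique to Attitudes and the part that passes through Intention.  This kind of statistical test is called mediation analysis . . . .
+Following the theory of planned behavior, the researcher builds the model below, collects data, and sets out to test it. At this stage in particular, the researcher wants to split the influence of Attitudes on Behavior into the part unique to Attitudes and the part that passes through Intention.  This kind of statistical test is called mediation analysis, and it presupposes the following conditions.
-Usually the lavaan package (developed for path analysis) is used to estimate the effect of the mediator variable and the shared effect of the two independent variables.
+  * First, there is a significant relationship between attitudes (the x variable) and behavior (the y variable). That is, the F statistic of lm(behavior ~ attitudes) is significant. Call this coefficient c.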
+<code>
+Attitudes  ---------c----------   Behavior
+</code>
+  * Attitudes also explains a statistically significant share of intention. This relationship has the coefficient a.
+<code>
+Attitudes  ---------a----------   Intention
+</code>
+  * In the regression of behavior on both attitudes and intention, the R-squared is significant. However, attitudes, which was significant when used on its own, either loses its explanatory power or stays significant with a coefficient smaller than c. That is, in the diagram below
+  * b = significant
+  * c' = not significant, or c' < c; that is, attitudes stays significant but its influence is reduced
+<code>
+               Intention
+                          \
+                           \ b
+                            \
+Attitudes  --------c'-----==  Behavior
+b
+c' < c
+</code>
+At this point the researcher hypothesizes that the explanatory power of attitudes reaches behavior through Intention and sets out to check this; this is the mediated (indirect) effect of attitudes through intention.
+How large is the influence that attitudes exerts through intention? That is, one may ask how large the mediated effect of attitudes is, and there are broadly two ways to measure it. The first treats c - c' as the indirect effect. c is the influence of attitudes on behavior on its own (with no other variable considered); once another variable is present this shrinks to c', and the original size (c) minus the reduced size (c') is the indirect effect.
+The other considers all three variables together and asks how a one-unit change in attitudes ultimately affects behavior. Suppose the coefficient for a is 2 and the coefficient for b is 3. A one-unit change in attitudes then changes intention by 2 units, and applying this to the second step (the size of b), those 2 units change behavior by 3 x 2 = 6 units. That is, one unit of attitudes changes behavior by 6 units through the mediating effect of intention (a coefficient * b coefficient).
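The two measures of the indirect effect described above (c - c' and a*b) can be compared directly. The following is a minimal sketch on simulated data; the variable names, sample size, and true coefficients are illustrative assumptions, not values from this page. With ordinary least squares fitted on the same cases, the two measures should agree up to floating-point error.

```r
# Simulated data: every name and coefficient here is an assumption for illustration
set.seed(1)
n <- 200
attitudes <- rnorm(n)
intention <- 2 * attitudes + rnorm(n)                    # true a = 2
behavior  <- 3 * intention + 0.5 * attitudes + rnorm(n)  # true b = 3, true c' = 0.5

# c: total effect of attitudes on behavior
c_total <- unname(coef(lm(behavior ~ attitudes))["attitudes"])
# a: effect of attitudes on intention
a_path <- unname(coef(lm(intention ~ attitudes))["attitudes"])
# b and c': from the regression that includes both predictors
fit_y   <- lm(behavior ~ attitudes + intention)
b_path  <- unname(coef(fit_y)["intention"])
c_prime <- unname(coef(fit_y)["attitudes"])

indirect_diff <- c_total - c_prime  # first measure: c - c'
indirect_prod <- a_path * b_path    # second measure: a * b

print(c(c = c_total, c_prime = c_prime, a = a_path, b = b_path))
print(all.equal(indirect_diff, indirect_prod))
```

The equality is a property of linear regression with a single mediator and complete data; it need not hold for nonlinear models or when the regressions use different subsets of cases.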
+size of mediated effect
+The size of ''ab:=a*b'' can be expressed as a standardized (beta) coefficient by multiplying it by sdx/sdy. . . .
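As a hedged illustration of the standardization mentioned above, the sketch below multiplies the raw a*b by sd(x)/sd(y) and checks the result against the product of the path coefficients obtained after z-scoring all three variables. The data are simulated and the names x, m, y are assumptions.

```r
# Simulated data: names, scales, and coefficients are illustrative assumptions
set.seed(2)
n <- 200
x <- rnorm(n, sd = 2)
m <- 0.6 * x + rnorm(n)
y <- 0.4 * m + 0.2 * x + rnorm(n, sd = 3)

# Unstandardized paths
a <- unname(coef(lm(m ~ x))["x"])
b <- unname(coef(lm(y ~ x + m))["m"])

# Standardized indirect effect: raw a*b rescaled by sd(x) / sd(y)
ab_std <- a * b * sd(x) / sd(y)

# Cross-check: refit the same two regressions on z-scored variables
zx <- as.numeric(scale(x))
zm <- as.numeric(scale(m))
zy <- as.numeric(scale(y))
a_z <- unname(coef(lm(zm ~ zx))["zx"])
b_z <- unname(coef(lm(zy ~ zx + zm))["zm"])

print(c(ab_std = ab_std, a_z_times_b_z = a_z * b_z))
```

The sd(m) terms cancel in the product (a is rescaled by sd(x)/sd(m) and b by sd(m)/sd(y)), which is why only sd(x) and sd(y) appear.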
+These quantities can be worked out with ordinary regression, but usually the mediated effect of the independent variable is estimated with the lavaan package (developed for path analysis) or with the mediation package.
 <code>
@@ Line 17: / Line 51: @@
 Also, in mediation analysis the effects (explanatory power) of the independent variables can be divided into direct and indirect effects: a, b, and c' are each called direct effects, and the effect that passes through a and b is called the indirect effect. There are several ways to measure the size of the indirect effect; the most widely used are
   * taking the product of the coefficients of the a path and the b path
-  * another, less common, method is to take b - a
+  * another, less common, method is to take c - c'
 To test whether the product of a and b (the indirect effect) is significant, there are two possible methods:
@@ Line 966: / Line 1000: @@
 ====== e.g. 2 ======
+https://advstats.psychstat.org/book/mediation/index.php
+<code>
+####################################
+nlsy <- read.csv("http://commres.net/wiki/_media/r/nlsy.csv")
+attach(nlsy)
+# install.packages("bmem")
+library(bmem)
+library(sem)
+# install.packages("cfa")
+library(cfa)
+</code>
+<code>
+##########################
+nlsy.model<-specifyEquations(exog.variances=T)
+math =b*HE + cp*ME
+HE = a*ME
+<ENTER>
+effects<-c('a*b', 'cp+a*b')
+nlsy.res<-bmem.sobel(nlsy, nlsy.model,effects)
+##########################
+</code>
+<code>
+m.me.he <- lm(ME~HE)
+m.math.me <- lm(math~ME)
+m.math.he <- lm(math~HE)
+m.math.mehe <- lm(math~ME+HE)
+m.he.me <- lm(HE~ME)
+summary(m.he.me)
+res.m.he.me <- resid(m.he.me)
+m.temp <- lm(math~res.m.he.me)
+summary(m.temp)
+res.m.me.he <- resid(m.me.he)
+m.temp2 <- lm(math~res.m.me.he)
+summary(m.temp2)
+library(lavaan)
+specmod <- "
+# path c' (direct effect)
+math ~ c*ME
+# path a
+HE ~ a*ME
+# path b
+math ~ b*HE
+# indirect effect (a*b)
+# sobel test (Delta method)
+ab := a*b
+"
+# fit/estimate model
+fitmod <- sem(specmod, data=nlsy)  # nlsy is the data frame loaded above; df is not defined in this script
+# summarize/result the output
+summary(fitmod, fit.measures=TRUE, rsquare=TRUE)
+# for a
+summary(m.he.me)
+# for b
+summary(m.temp)
+# for cprime
+summary(m.temp2)
+a <- summary(m.he.me)$coefficient[2] # a
+b <- summary(m.temp)$coefficient[2]  # b
+c <- summary(m.temp2)$coefficient[2] # c' (direct effect of ME with HE partialled out)
+a
+b
+c
+a*b
+c2 <- summary(fitmod)$pe$est[1]
+a2 <- summary(fitmod)$pe$est[2]
+b2 <- summary(fitmod)$pe$est[3]
+ab2 <- summary(fitmod)$pe$est[7]
+a2
+b2
+c2
+ab2
+</code>
+<code>
+> ####################################
+> nlsy <- read.csv("http://commres.net/wiki/_media/r/nlsy.csv")
+> attach(nlsy)
+The following objects are masked from nlsy (pos = 3):
+    HE, math, ME
+The following objects are masked from nlsy (pos = 5):
+    HE, math, ME
+The following objects are masked from df:
+    HE, math, ME
+> # install.packages("bmem")
+> library(bmem)
+> library(sem)
+> # install.packages("cfa")
+> library(cfa)
+>
+</code>
+<code>
+> nlsy.model<-specifyEquations(exog.variances=T)
+: math =b*HE + cp*ME
+: HE = a*ME
+:
+Read 2 items
+NOTE: adding 3 variances to the model
+> effects<-c('a*b', 'cp+a*b')
+> nlsy.res<-bmem.sobel(nlsy, nlsy.model,effects)
+           Estimate       S.E.   z-score      p.value
+b        0.46450283 0.14304860  3.247168 1.165596e-03
+cp       0.46281480 0.11977862  3.863918 1.115825e-04
+a        0.13925694 0.04292438  3.244239 1.177650e-03
+V[HE]    2.73092635 0.20078170 13.601471 0.000000e+00
+V[math] 20.67659134 1.52017323 13.601471 0.000000e+00
+V[ME]    4.00590078 0.29451968 13.601471 0.000000e+00
+a*b      0.06468524 0.02818457  2.295059 2.172977e-02
+cp+a*b   0.52750005 0.11978162  4.403848 1.063474e-05
+>
+</code>
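The a*b row in the output above can be reproduced by hand from the reported estimates and standard errors of a and b. The sketch below assumes the first-order Sobel (delta method) standard error, sqrt(a^2 * se_b^2 + b^2 * se_a^2); that this is the formula bmem.sobel uses is an inference from the matching numbers, not something stated on this page.

```r
# Estimates and standard errors copied from the bmem.sobel output above
a <- 0.13925694; se_a <- 0.04292438
b <- 0.46450283; se_b <- 0.14304860

ab    <- a * b                                 # indirect effect
se_ab <- sqrt(a^2 * se_b^2 + b^2 * se_a^2)     # first-order Sobel standard error
z     <- ab / se_ab
p     <- 2 * pnorm(-abs(z))                    # two-sided p-value from the normal distribution

print(round(c(ab = ab, se = se_ab, z = z, p = p), 5))
```

The printed values should line up with the a*b row (estimate 0.0647, S.E. 0.0282, z 2.295, p 0.0217).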
+<code>
+> m.me.he <- lm(ME~HE)
+> m.math.me <- lm(math~ME)
+> m.math.he <- lm(math~HE)
+> m.math.mehe <- lm(math~ME+HE)
+> m.he.me <- lm(HE~ME)
+> summary(m.he.me)
+Call:
+lm(formula = HE ~ ME)
+Residuals:
+    Min      1Q  Median      3Q     Max
+-5.5020 -0.7805  0.2195  1.2195  3.3587
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept)  4.10944    0.49127   8.365 1.25e-15 ***
+ME           0.13926    0.04298   3.240   0.0013 **
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 1.655 on 369 degrees of freedom
+Multiple R-squared:  0.02766,	Adjusted R-squared:  0.02502
+F-statistic:  10.5 on 1 and 369 DF,  p-value: 0.001305
+> res.m.he.me <- resid(m.he.me)
+> m.temp <- lm(math~res.m.he.me)
+> summary(m.temp)
+Call:
+lm(formula = math ~ res.m.he.me)
+Residuals:
+     Min       1Q   Median       3Q      Max
+-12.4263  -3.1496  -0.3499   2.2826  29.7795
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept)  12.1186     0.2427  49.936  < 2e-16 ***
+res.m.he.me   0.4645     0.1471   3.159  0.00172 **
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 4.674 on 369 degrees of freedom
+Multiple R-squared:  0.02633,	Adjusted R-squared:  0.02369
+F-statistic: 9.978 on 1 and 369 DF,  p-value: 0.001715
+> res.m.me.he <- resid(m.me.he)
+> m.temp2 <- lm(math~res.m.me.he)
+> summary(m.temp2)
+Call:
+lm(formula = math ~ res.m.me.he)
+Residuals:
+     Min       1Q   Median       3Q      Max
+-14.1938  -2.7965  -0.3425   2.4081  29.5656
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept)  12.1186     0.2413   50.22  < 2e-16 ***
+res.m.me.he   0.4628     0.1224    3.78 0.000183 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 4.648 on 369 degrees of freedom
+Multiple R-squared:  0.03728,	Adjusted R-squared:  0.03467
+F-statistic: 14.29 on 1 and 369 DF,  p-value: 0.0001828
+>
+> library(lavaan)
+> specmod <- "
++ # path c' (direct effect)
++ math ~ c*ME
++
++ # path a
++ HE ~ a*ME
++
++ # path b
++ math ~ b*HE
++
++ # indirect effect (a*b)
++ # sobel test (Delta method)
++ ab := a*b
++ "
+>
+> # fit/estimate model
+> fitmod <- sem(specmod, data=df)
+>
+> # summarize/result the output
+> summary(fitmod, fit.measures=TRUE, rsquare=TRUE)
+lavaan 0.6-12 ended normally after 1 iterations
+  Estimator                                         ML
+  Optimization method                           NLMINB
+  Number of model parameters                         5
+  Number of observations                           371
+Model Test User Model:
+  Test statistic                                 0.000
+  Degrees of freedom                                 0
+Model Test Baseline Model:
+  Test statistic                                39.785
+  Degrees of freedom                                 3
+  P-value                                        0.000
+User Model versus Baseline Model:
+  Comparative Fit Index (CFI)                    1.000
+  Tucker-Lewis Index (TLI)                       1.000
+Loglikelihood and Information Criteria:
+  Loglikelihood user model (H0)              -1800.092
+  Loglikelihood unrestricted model (H1)      -1800.092
+  Akaike (AIC)                                3610.184
+  Bayesian (BIC)                              3629.765
+  Sample-size adjusted Bayesian (BIC)         3613.901
+Root Mean Square Error of Approximation:
+  RMSEA                                          0.000
+  90 Percent confidence interval - lower         0.000
+  90 Percent confidence interval - upper         0.000
+  P-value RMSEA <= 0.05                             NA
+Standardized Root Mean Square Residual:
+  SRMR                                           0.000
+Parameter Estimates:
+  Standard errors                             Standard
+  Information                                 Expected
+  Information saturated (h1) model          Structured
+Regressions:
+                   Estimate  Std.Err  z-value  P(>|z|)
+  math ~
+    ME         (c)    0.463    0.120    3.869    0.000
+  HE ~
+    ME         (a)    0.139    0.043    3.249    0.001
+  math ~
+    HE         (b)    0.465    0.143    3.252    0.001
+Variances:
+                   Estimate  Std.Err  z-value  P(>|z|)
+   .math             20.621    1.514   13.620    0.000
+   .HE                2.724    0.200   13.620    0.000
+R-Square:
+                   Estimate
+    math              0.076
+    HE                0.028
+Defined Parameters:
+                   Estimate  Std.Err  z-value  P(>|z|)
+    ab                0.065    0.028    2.298    0.022
+>
+> # for a
+> summary(m.he.me)
+Call:
+lm(formula = HE ~ ME)
+Residuals:
+    Min      1Q  Median      3Q     Max
+-5.5020 -0.7805  0.2195  1.2195  3.3587
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept)  4.10944    0.49127   8.365 1.25e-15 ***
+ME           0.13926    0.04298   3.240   0.0013 **
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 1.655 on 369 degrees of freedom
+Multiple R-squared:  0.02766,	Adjusted R-squared:  0.02502
+F-statistic:  10.5 on 1 and 369 DF,  p-value: 0.001305
+> # for b
+> summary(m.temp)
+Call:
+lm(formula = math ~ res.m.he.me)
+Residuals:
+     Min       1Q   Median       3Q      Max
+-12.4263  -3.1496  -0.3499   2.2826  29.7795
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept)  12.1186     0.2427  49.936  < 2e-16 ***
+res.m.he.me   0.4645     0.1471   3.159  0.00172 **
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 4.674 on 369 degrees of freedom
+Multiple R-squared:  0.02633,	Adjusted R-squared:  0.02369
+F-statistic: 9.978 on 1 and 369 DF,  p-value: 0.001715
+> # for cprime
+> summary(m.temp2)
+Call:
+lm(formula = math ~ res.m.me.he)
+Residuals:
+     Min       1Q   Median       3Q      Max
+-14.1938  -2.7965  -0.3425   2.4081  29.5656
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept)  12.1186     0.2413   50.22  < 2e-16 ***
+res.m.me.he   0.4628     0.1224    3.78 0.000183 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 4.648 on 369 degrees of freedom
+Multiple R-squared:  0.03728,	Adjusted R-squared:  0.03467
+F-statistic: 14.29 on 1 and 369 DF,  p-value: 0.0001828
+>
+> a <- summary(m.he.me)$coefficient[2] # a
+> b <- summary(m.temp)$coefficient[2]  # b
+> c <- summary(m.temp2)$coefficient[2] # c
+> a
+[1] 0.1392569
+> b
+[1] 0.4645028
+> c
+[1] 0.4628148
+> a*b
+[1] 0.06468524
+>
+> c2 <- summary(fitmod)$pe$est[1]
+> a2 <- summary(fitmod)$pe$est[2]
+> b2 <- summary(fitmod)$pe$est[3]
+> ab2 <- summary(fitmod)$pe$est[7]
+> a2
+[1] 0.1392569
+> b2
+[1] 0.4645028
+> c2
+[1] 0.4628148
+> ab2
+[1] 0.06468524
+>
+</code>
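The transcript above recovers b and c' by regressing math on residuals (m.temp and m.temp2) and obtains the same point estimates as lavaan. This appears to rely on the partialling-out property of least squares: regressing the outcome on the part of one predictor that is unexplained by the other gives the same slope as the multiple regression. A minimal sketch on simulated data follows; the variables only borrow the names ME, HE, and math, and the generating coefficients are assumptions.

```r
# Simulated stand-ins for ME, HE, math (not the nlsy data)
set.seed(3)
n <- 300
ME   <- rnorm(n, mean = 11, sd = 2)
HE   <- 4 + 0.15 * ME + rnorm(n)
math <- 5 + 0.45 * ME + 0.45 * HE + rnorm(n, sd = 4)

# Multiple regression: slopes are c' (ME) and b (HE)
full <- lm(math ~ ME + HE)

# Residual-based versions, mirroring m.temp and m.temp2
res_he_me <- resid(lm(HE ~ ME))   # part of HE not explained by ME
res_me_he <- resid(lm(ME ~ HE))   # part of ME not explained by HE
b_resid  <- unname(coef(lm(math ~ res_he_me))[2])
cp_resid <- unname(coef(lm(math ~ res_me_he))[2])

print(c(b_full = unname(coef(full)["HE"]), b_resid = b_resid))
print(c(cp_full = unname(coef(full)["ME"]), cp_resid = cp_resid))
```

Only the point estimates agree. The standard errors from the residual regressions differ from those of the full model because the residual variance is computed without the other predictor, which is consistent with the slightly different standard errors in the transcript.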
+====== temp ======
+<code>
+tests <- read.csv("http://commres.net/wiki/_media/r/tests_cor.csv")
+colnames(tests) <- c("ser", "sat", "clep", "gpa")
+tests <- subset(tests, select=c("sat", "clep", "gpa"))
+attach(tests)
+</code>
+<code>
+lm.gpa.clepsat <- lm(gpa ~ clep + sat, data = tests)
+summary(lm.gpa.clepsat)
+lm.gpa.clep <- lm(gpa ~ clep)
+lm.gpa.sat <- lm(gpa ~ sat)
+summary(lm.gpa.clep)
+summary(lm.gpa.sat)
+</code>
+<code>
+>  lm.gpa.clepsat <- lm(gpa ~ clep + sat, data = tests)
+> summary(lm.gpa.clepsat)
+Call:
+lm(formula = gpa ~ clep + sat, data = tests)
+Residuals:
+      Min        1Q    Median        3Q       Max
+-0.197888 -0.128974 -0.000528  0.131170  0.226404
+Coefficients:
+              Estimate Std. Error t value Pr(>|t|)
+(Intercept)  1.1607560  0.4081117   2.844   0.0249 *
+clep         0.0729294  0.0253799   2.874   0.0239 *
+sat         -0.0007015  0.0012564  -0.558   0.5940
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 0.1713 on 7 degrees of freedom
+Multiple R-squared:  0.7778,	Adjusted R-squared:  0.7143
+F-statistic: 12.25 on 2 and 7 DF,  p-value: 0.005175
+>
+</code>
+.7778 is the share of gpa explained by the two variables clep and sat together
+''summary(lm.gpa.clepsat)$r.squared'' = 0.778
+Also, in the output above the explanatory power of sat is not significant.
+What if gpa is regressed on sat alone?
+<code>
+> lm.gpa.clep <- lm(gpa ~ clep)
+> lm.gpa.sat <- lm(gpa ~ sat)
+> summary(lm.gpa.clep)
+Call:
+lm(formula = gpa ~ clep)
+Residuals:
+      Min        1Q    Median        3Q       Max
+-0.190496 -0.141167 -0.002376  0.110847  0.225207
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept)  1.17438    0.38946   3.015 0.016676 *
+clep         0.06054    0.01177   5.144 0.000881 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 0.1637 on 8 degrees of freedom
+Multiple R-squared:  0.7679,	Adjusted R-squared:  0.7388
+F-statistic: 26.46 on 1 and 8 DF,  p-value: 0.0008808
+> summary(lm.gpa.sat)
+Call:
+lm(formula = gpa ~ sat)
+Residuals:
+     Min       1Q   Median       3Q      Max
+-0.23544 -0.12184  0.00316  0.02943  0.56456
+Coefficients:
+             Estimate Std. Error t value Pr(>|t|)
+(Intercept) 1.7848101  0.4771715   3.740   0.0057 **
+sat         0.0024557  0.0008416   2.918   0.0193 *
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 0.2365 on 8 degrees of freedom
+Multiple R-squared:  0.5156,	Adjusted R-squared:  0.455
+F-statistic: 8.515 on 1 and 8 DF,  p-value: 0.01935
+>
+</code>
+As shown above, it is significant.
+''summary(lm.gpa.clep)$r.squared = 0.7679''
+''summary(lm.gpa.sat)$r.squared = 0.5156''
+To see whether the influence of sat operates through clep as a mediator:
+<code>
+> lm.clep.sat <- lm(clep ~ sat)
+> summary(lm.clep.sat)
+Call:
+lm(formula = clep ~ sat)
+Residuals:
+    Min      1Q  Median      3Q     Max
+-2.5316 -1.2437 -0.2848  0.0949  5.6329
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept) 8.556962   4.813367   1.778  0.11334
+sat         0.043291   0.008489   5.100  0.00093 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 2.386 on 8 degrees of freedom
+Multiple R-squared:  0.7648,	Adjusted R-squared:  0.7353
+F-statistic: 26.01 on 1 and 8 DF,  p-value: 0.0009303
+>
+</code>
+<code>
+res.lm.clep.sat <- resid(lm.clep.sat)
+reg.lm.clep.sat <- predict(lm.clep.sat)-mean(clep)
+lm.gpa.sat.mediated.via.clep <- lm(gpa~reg.lm.clep.sat)
+lm.gpa.clep.alone <- lm(gpa~res.lm.clep.sat)
+summary(lm.gpa.sat.mediated.via.clep)
+summary(lm.gpa.clep.alone)
+</code>
+<code>
+> res.lm.clep.sat <- resid(lm.clep.sat)
+> reg.lm.clep.sat <- predict(lm.clep.sat)-mean(clep)
+> lm.gpa.sat.mediated.via.clep <- lm(gpa~reg.lm.clep.sat)
+> lm.gpa.clep.alone <- lm(gpa~res.lm.clep.sat)
+>
+> summary(lm.gpa.sat.mediated.via.clep)
+Call:
+lm(formula = gpa ~ reg.lm.clep.sat)
+Residuals:
+     Min       1Q   Median       3Q      Max
+-0.23544 -0.12184  0.00316  0.02943  0.56456
+Coefficients:
+                Estimate Std. Error t value Pr(>|t|)
+(Intercept)      3.16000    0.07480  42.246 1.09e-10 ***
+reg.lm.clep.sat  0.05673    0.01944   2.918   0.0193 *
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 0.2365 on 8 degrees of freedom
+Multiple R-squared:  0.5156,	Adjusted R-squared:  0.455
+F-statistic: 8.515 on 1 and 8 DF,  p-value: 0.01935
+> summary(lm.gpa.clep.alone)
+Call:
+lm(formula = gpa ~ res.lm.clep.sat)
+Residuals:
+     Min       1Q   Median       3Q      Max
+-0.34523 -0.23300 -0.04416  0.27577  0.36370
+Coefficients:
+                Estimate Std. Error t value Pr(>|t|)
+(Intercept)      3.16000    0.09231  34.231  5.8e-10 ***
+res.lm.clep.sat  0.07293    0.04326   1.686     0.13
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 0.2919 on 8 degrees of freedom
+Multiple R-squared:  0.2622,	Adjusted R-squared:  0.1699
+F-statistic: 2.842 on 1 and 8 DF,  p-value: 0.1303
+>
+</code>
+Influence of sat mediated via clep = Multiple R-squared:  0.5156
+Explanatory power of clep alone, excluding the influence of sat = Multiple R-squared:  0.2622
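The two R-squared values above add up to the R-squared of the full model (0.5156 + 0.2622 = 0.7778). This is expected because the fitted and residual parts of clep are uncorrelated with each other and together span the same space as sat and clep. A minimal sketch on simulated data, where the variable names are reused for readability and all values are assumptions:

```r
# Simulated stand-ins for sat, clep, gpa (not the tests_cor.csv data)
set.seed(4)
n <- 50
sat  <- rnorm(n, mean = 500, sd = 80)
clep <- 8 + 0.04 * sat + rnorm(n, sd = 2)
gpa  <- 1.2 + 0.06 * clep + rnorm(n, sd = 0.2)

fit_clep_sat <- lm(clep ~ sat)
clep_from_sat <- fitted(fit_clep_sat)  # part of clep predicted by sat
clep_unique   <- resid(fit_clep_sat)   # part of clep not explained by sat

r2_full   <- summary(lm(gpa ~ clep + sat))$r.squared
r2_via    <- summary(lm(gpa ~ clep_from_sat))$r.squared
r2_unique <- summary(lm(gpa ~ clep_unique))$r.squared

print(c(full = r2_full, via_sat = r2_via, unique = r2_unique,
        sum = r2_via + r2_unique))
```

Note that the R-squared of gpa on the fitted part equals the R-squared of gpa on sat alone, since the fitted values are a linear function of sat.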
+====== temp2 ======
+<code>
+> #############################
+> #############################
+> # https://data.library.virginia.edu/introduction-to-mediation-analysis/
+> # Download data online. This is a simulated dataset for this post.
+> myData <- read.csv('http://static.lib.virginia.edu/statlab/materials/data/mediationData.csv')
+>
+> model.0 <- lm(Y ~ X, myData)
+> summary(model.0)
+Call:
+lm(formula = Y ~ X, data = myData)
+Residuals:
+    Min      1Q  Median      3Q     Max
+-5.0262 -1.2340 -0.3282  1.5583  5.1622
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept)   2.8572     0.6932   4.122 7.88e-05 ***
+X             0.3961     0.1112   3.564 0.000567 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 1.929 on 98 degrees of freedom
+Multiple R-squared:  0.1147,	Adjusted R-squared:  0.1057
+F-statistic:  12.7 on 1 and 98 DF,  p-value: 0.0005671
+> # coefficient of the c path = 0.396, p = 0.0006
+>
+> model.M <- lm(M ~ X, myData)
+> summary(model.M)
+Call:
+lm(formula = M ~ X, data = myData)
+Residuals:
+    Min      1Q  Median      3Q     Max
+-4.3046 -0.8656  0.1344  1.1344  4.6954
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept)  1.49952    0.58920   2.545   0.0125 *
+X            0.56102    0.09448   5.938 4.39e-08 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 1.639 on 98 degrees of freedom
+Multiple R-squared:  0.2646,	Adjusted R-squared:  0.2571
+F-statistic: 35.26 on 1 and 98 DF,  p-value: 4.391e-08
+> # a path = 0.561, p < 0.001
+>
+> model.Y <- lm(Y ~ X + M, myData)
+> summary(model.Y)
+Call:
+lm(formula = Y ~ X + M, data = myData)
+Residuals:
+    Min      1Q  Median      3Q     Max
+-3.7631 -1.2393  0.0308  1.0832  4.0055
+Coefficients:
+            Estimate Std. Error t value Pr(>|t|)
+(Intercept)   1.9043     0.6055   3.145   0.0022 **
+X             0.0396     0.1096   0.361   0.7187
+M             0.6355     0.1005   6.321 7.92e-09 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Residual standard error: 1.631 on 97 degrees of freedom
+Multiple R-squared:  0.373,	Adjusted R-squared:  0.3601
+F-statistic: 28.85 on 2 and 97 DF,  p-value: 1.471e-10
+> #################################################
+> ## b path + c' path (c becomes c' because M has been added to the model)
+> ## c' = 0.0396 ---
+> ## b  = 0.6355 ***
+> #################################################
+>
+> library(mediation)
+> results <- mediate(model.M, model.Y,
++                    treat='X',
++                    mediator='M',
++                    boot=TRUE, sims=500)
+Running nonparametric bootstrap
+> summary(results)
+Causal Mediation Analysis
+Nonparametric Bootstrap Confidence Intervals with the Percentile Method
+               Estimate 95% CI Lower 95% CI Upper p-value
+ACME             0.3565       0.2174         0.54  <2e-16 ***
+ADE              0.0396      -0.1712         0.31    0.74
+Total Effect     0.3961       0.1890         0.64  <2e-16 ***
+Prop. Mediated   0.9000       0.4707         1.69  <2e-16 ***
+---
+Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
+Sample Size Used: 100
+Simulations: 500
+>
+> # OR with lavaan
+> library(lavaan)
+> specmod <- "
++ # path c' (direct effect)
++ Y ~ c*X
++
++ # path a
++ M ~ a*X
++
++ # path b
++ Y ~ b*M
++
++ # indirect effect (a*b)
++ # sobel test (Delta method)
++ ab := a*b
++ "
+>
+> # fit/estimate model
+> fitmod <- sem(specmod, data=myData)
+>
+> # summarize/result the output
+> summary(fitmod, fit.measures=TRUE, rsquare=TRUE)
+lavaan 0.6-12 ended normally after 1 iterations
+  Estimator                                         ML
+  Optimization method                           NLMINB
+  Number of model parameters                         5
+  Number of observations                           100
+Model Test User Model:
+  Test statistic                                 0.000
+  Degrees of freedom                                 0
+Model Test Baseline Model:
+  Test statistic                                77.413
+  Degrees of freedom                                 3
+  P-value                                        0.000
+User Model versus Baseline Model:
+  Comparative Fit Index (CFI)                    1.000
+  Tucker-Lewis Index (TLI)                       1.000
+Loglikelihood and Information Criteria:
+  Loglikelihood user model (H0)               -379.612
+  Loglikelihood unrestricted model (H1)       -379.612
+  Akaike (AIC)                                 769.225
+  Bayesian (BIC)                               782.250
+  Sample-size adjusted Bayesian (BIC)          766.459
+Root Mean Square Error of Approximation:
+  RMSEA                                          0.000
+  90 Percent confidence interval - lower         0.000
+  90 Percent confidence interval - upper         0.000
+  P-value RMSEA <= 0.05                             NA
+Standardized Root Mean Square Residual:
+  SRMR                                           0.000
+Parameter Estimates:
+  Standard errors                             Standard
+  Information                                 Expected
+  Information saturated (h1) model          Structured
+Regressions:
+                   Estimate  Std.Err  z-value  P(>|z|)
+  Y ~
+    X          (c)    0.040    0.108    0.367    0.714
+  M ~
+    X          (a)    0.561    0.094    5.998    0.000
+  Y ~
+    M          (b)    0.635    0.099    6.418    0.000
+Variances:
+                   Estimate  Std.Err  z-value  P(>|z|)
+   .Y                 2.581    0.365    7.071    0.000
+   .M                 2.633    0.372    7.071    0.000
+R-Square:
+                   Estimate
+    Y                 0.373
+    M                 0.265
+Defined Parameters:
+                   Estimate  Std.Err  z-value  P(>|z|)
+    ab                0.357    0.081    4.382    0.000
+>
+> # Resampling method
+> set.seed(2019)
+>
+> fitmod <- sem(specmod, data=myData, se="bootstrap", bootstrap=1000)
+> summary(fitmod, fit.measures=TRUE, rsquare=TRUE)
+lavaan 0.6-12 ended normally after 1 iterations
+  Estimator                                         ML
+  Optimization method                           NLMINB
+  Number of model parameters                         5
+  Number of observations                           100
+Model Test User Model:
+  Test statistic                                 0.000
+  Degrees of freedom                                 0
+Model Test Baseline Model:
+  Test statistic                                77.413
+  Degrees of freedom                                 3
+  P-value                                        0.000
+User Model versus Baseline Model:
+  Comparative Fit Index (CFI)                    1.000
+  Tucker-Lewis Index (TLI)                       1.000
+Loglikelihood and Information Criteria:
+  Loglikelihood user model (H0)               -379.612
+  Loglikelihood unrestricted model (H1)       -379.612
+  Akaike (AIC)                                 769.225
+  Bayesian (BIC)                               782.250
+  Sample-size adjusted Bayesian (BIC)          766.459
+Root Mean Square Error of Approximation:
+  RMSEA                                          0.000
+  90 Percent confidence interval - lower         0.000
+  90 Percent confidence interval - upper         0.000
+  P-value RMSEA <= 0.05                             NA
+Standardized Root Mean Square Residual:
+  SRMR                                           0.000
+Parameter Estimates:
+  Standard errors                            Bootstrap
+  Number of requested bootstrap draws             1000
+  Number of successful bootstrap draws            1000
+Regressions:
+                   Estimate  Std.Err  z-value  P(>|z|)
+  Y ~
+    X          (c)    0.040    0.124    0.319    0.750
+  M ~
+    X          (a)    0.561    0.095    5.888    0.000
+  Y ~
+    M          (b)    0.635    0.102    6.224    0.000
+Variances:
+                   Estimate  Std.Err  z-value  P(>|z|)
+   .Y                 2.581    0.334    7.731    0.000
+   .M                 2.633    0.362    7.271    0.000
+R-Square:
+                   Estimate
+    Y                 0.373
+    M                 0.265
+Defined Parameters:
+                   Estimate  Std.Err  z-value  P(>|z|)
+    ab                0.357    0.080    4.442    0.000
+> parameterEstimates(fitmod, ci=TRUE, level=.95, boot.ci.type="perc")
+  lhs op rhs label   est    se     z pvalue ci.lower ci.upper
+   Y  ~   X     c 0.040 0.124 0.319   0.75   -0.200    0.288
+   M  ~   X     a 0.561 0.095 5.888   0.00    0.391    0.765
+   Y  ~   M     b 0.635 0.102 6.224   0.00    0.439    0.835
+   Y ~~   Y       2.581 0.334 7.731   0.00    1.868    3.193
+   M ~~   M       2.633 0.362 7.271   0.00    1.890    3.311
+   X ~~   X       3.010 0.000    NA     NA    3.010    3.010
+  ab := a*b    ab 0.357 0.080 4.442   0.00    0.225    0.538
+>
+> summary(lm(Y~X+M,data=myData))$r.square
+[1] 0.3729992
+> summary(lm(M~X,data=myData))$r.square
+[1] 0.2645879
+> var(myData$Y)
+[1] 4.158687
+> var(myData$M)
+[1] 3.616566
+> var(myData$X)
+[1] 3.040303
+>
+</code>
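The pieces of the temp2 output fit together arithmetically: the total effect c should equal c' + a*b, and the "Prop. Mediated" row is the indirect effect divided by the total effect. The sketch below only recomputes these from the rounded coefficients printed above, so small rounding differences are to be expected.

```r
# Coefficients copied from the lm() output above (rounded as printed)
a       <- 0.56102   # M ~ X
b       <- 0.6355    # Y ~ X + M, coefficient of M
c_prime <- 0.0396    # Y ~ X + M, coefficient of X (direct effect)
c_total <- 0.3961    # Y ~ X (total effect)

indirect      <- a * b                 # compare with ACME = 0.3565 and lavaan ab = 0.357
total_rebuilt <- c_prime + indirect    # compare with Total Effect = 0.3961
prop_mediated <- indirect / c_total    # compare with Prop. Mediated = 0.9000

print(round(c(indirect = indirect, total = total_rebuilt,
              prop_mediated = prop_mediated), 4))
```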