--- r:path_analysis [2022/11/15 13:00] – hkimscil
+++ r:path_analysis [2023/11/27 16:57] (current) – [Lavaan in R: explanation] hkimscil
@@ Line 1: / Line 1: @@
 ====== Path Analysis ======
-===== Lavaan 2 =====
+{{:r:pasted:20230529-234519.png}}
+====== Introduction ======
-{{youtube>_tTPHt4cPwI}}
-<code>
-model <- '
-    # labeling path from mastery to interest
-    interest ~ a*mastery + perfgoal + ses
-    # labeling path from interest to achieve.
-    # Adding labeled path from
-    # mastery to achieve
-    achieve ~ e*anxiety + b*interest + c*mastery
-    # predicting anxiety and labeling path from mastery
-    anxiety ~ perfgoal + d*mastery
-    # estimtating the variances and covariances of
-    # the exogenous variables (ses, mastery,performance)
-    mastery~~mastery
-    perfgoal~~perfgoal
-    ses~~ses
-    mastery~~perfgoal+ses
-    perfgoal~~ses
-    # estimating the variances of residuals
-    # for endogenous variables
-    # (interest, anxiety, achieve)
-    interest~~interest
-    anxiety~~anxiety
-    achieve~~achieve
-    # estimating the covariance of residuals
-    # for interest and anxiety
-    interest~~anxiety
-    # calculating specific indirect effect
-    # of mastery on achieve via interest
-    SIE1:=a*b
-    # calculating specific indirect effect of
-    # mastery on achieve via anxiety
-    SIE2:=d*e
-    # calculating total indirect effect of
-    # mastery on achievement via mediators
-    TIE:=SIE1+SIE2
-    # calculating total effect of mastery on achieve
-    TE:=TIE+c'
-    # using naive bootstrap to obtain standard errors
-    fit <- sem(model, data=processdata, se="bootstrap")
-    summary(fit,fit.measures=TRUE)
-    # using 'parameterEstimates' function will give
-    # us confidence intervals based on naive bootstrap.
-    # A standard approach to testing indirect effects.
-    parameterEstimates(fit)
-</code>
-----
-===== Lavaan 3: Testing data normality =====
-{{youtube>HvYW_GeHpD8}}
-<code>
-processdata <- read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv")
-str(processdata)
-# install.packages("MVN")
-library(MVN)
-newdata <- processdata[c("achieve", "interest", "anxiety")]
-str(newdata)
-</code>
-Use the 'mvn' function to evalue normality
-Multivariate normality is evidenced by p-values associated with multivariate skewness and kurtosis statistics that are > .05. In those cases where both the skewness and kurtosis results are non-significant (p's > .05), then the data are assumed to follow a multivariate normal distribution where p > .05 (Korkmaz, Goksuluk, & Zarasiz, 2014, 2019).
-You can also use plots to explore possible multivariate outliers. Moreover, you can examine univariate tests of normality (the default is Shapiro-Wilk test, but can be changed if desired). A significant test result regarding a specific variable indicates a significant departure from normality.
-<code>
-mvn(newdata, mvnTest="mardia")
-mvn(newdata, multivariatePlot="qq")
-mvn(newdata, multivariateOutlierMethod="quan")
-</code>
-You can generate univariate plot as well to evaluate distribution of the endogenous variables for non-normality. Skewness values approaching 2 or kurtoisis values over 7 may be considered indicative of more "significant problems" with non-normality (Curran, et al., 1996).
-<code>
-mvn(newdata, univariatePlot="histogram")
-mvn(newdata, univariatePlot="box")
-model <- '
-    interest ~ mastery + perfgoal + ses
-    achieve ~ anxiety + interest + mastery
-    anxiety ~ perfgoal + mastery
-    # variances
-    mastery ~~ mastery
-    perfgoal ~~ perfgoal
-    ses ~~ ses
-    mastery ~~ perfgoal + ses
-    perfgoal ~~ ses
-    interest ~~ interest
-    anxiety ~~ anxiety
-    achieve ~~ achieve
-    interest~~anxiety
-'
-</code>
-We will fit the model using the 'estimator' argument at set it equal to "MLM." This will result in the Satorra-Bentler model chi-square being computed. We will also use the 'se' argument and set it to "roburst."
-<code>
-fit <- sem(model, data=processdata, estimator = "MLM", se="roburst")
-summary(fit,fit.measures=TRUE)
-</code>
-----
-reference
-{{youtube>8r9bUKUVecc?small}}
-see [[https://www.rensvandeschoot.com/tutorials/lme4/|lme4 tutorial]]
-====== Path Analysis 2 ======
 {{youtube>UGIVPtFKwc0}}
@@ Line 224: / Line 111: @@
 </code>
+----
+<code>
+# my own
+# pbt model
+specmod5 <- '
+    # Directional relations (path)
+    intention ~ a*attitude + b*norms + c*control
+    behavior ~ d*intention
+    # Covariances
+    attitude ~~ norms + control
+    norms ~~ control
+    ad := a*d
+    bd := b*d
+    cd := c*d
+'
+fitmod5 <- sem(specmod5, data=df)
+summary(fitmod5, fit.measures=TRUE, rsquare=TRUE)
+</code>
 ====== Output ======
 <code>
@@ Line 553: / Line 458: @@
 </code>
+===== specmod5 =====
+<code>
+> specmod5 <- "
++     # Directional relations (path)
++     intention ~ attitude + norms + control
++     behavior ~ intention + norms
++     # Covariances
++     attitude ~~ norms + control
++     norms ~~ control
++ "
+> fitmod5 <- sem(specmod5, data=df)
+> summary(fitmod5, fit.measures=TRUE, rsquare=TRUE)
+lavaan 0.6-12 ended normally after 18 iterations
+  Estimator                                         ML
+  Optimization method                           NLMINB
+  Number of model parameters                        13
+  Number of observations                           199
+Model Test User Model:
+  Test statistic                                 1.781
+  Degrees of freedom                                 2
+  P-value (Chi-square)                           0.410
+Model Test Baseline Model:
+  Test statistic                               182.295
+  Degrees of freedom                                10
+  P-value                                        0.000
+User Model versus Baseline Model:
+  Comparative Fit Index (CFI)                    1.000
+  Tucker-Lewis Index (TLI)                       1.006
+Loglikelihood and Information Criteria:
+  Loglikelihood user model (H0)              -1258.396
+  Loglikelihood unrestricted model (H1)      -1257.506
+  Akaike (AIC)                                2542.792
+  Bayesian (BIC)                              2585.605
+  Sample-size adjusted Bayesian (BIC)         2544.421
+Root Mean Square Error of Approximation:
+  RMSEA                                          0.000
+  90 Percent confidence interval - lower         0.000
+  90 Percent confidence interval - upper         0.136
+  P-value RMSEA <= 0.05                          0.569
+Standardized Root Mean Square Residual:
+  SRMR                                           0.018
+Parameter Estimates:
+  Standard errors                             Standard
+  Information                                 Expected
+  Information saturated (h1) model          Structured
+Regressions:
+                   Estimate  Std.Err  z-value  P(>|z|)
+  intention ~
+    attitude          0.352    0.058    6.068    0.000
+    norms             0.153    0.059    2.577    0.010
+    control           0.275    0.058    4.740    0.000
+  behavior ~
+    intention         0.443    0.068    6.525    0.000
+    norms             0.034    0.068    0.493    0.622
+Covariances:
+                   Estimate  Std.Err  z-value  P(>|z|)
+  attitude ~~
+    norms             0.200    0.064    3.128    0.002
+    control           0.334    0.070    4.748    0.000
+  norms ~~
+    control           0.220    0.065    3.411    0.001
+Variances:
+                   Estimate  Std.Err  z-value  P(>|z|)
+   .intention         0.530    0.053    9.975    0.000
+   .behavior          0.698    0.070    9.975    0.000
+    attitude          0.928    0.093    9.975    0.000
+    norms             0.830    0.083    9.975    0.000
+    control           0.939    0.094    9.975    0.000
+R-Square:
+                   Estimate
+    intention         0.369
+    behavior          0.199
+</code>
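The model printed in this output carries no path labels, so lavaan reports no `:=` rows for it. As a rough check, the indirect effects of attitude, norms and control on behavior via intention can be computed by hand from the printed estimates. The sketch below uses only base R and copies the estimates and standard errors from the Regressions table above. The standard errors use the first-order (Sobel) approximation and assume the two path estimates are uncorrelated; that is an assumption of this sketch, not something shown in the output.

```r
# Estimates and standard errors copied from the Regressions table above
a_est <- c(attitude = 0.352, norms = 0.153, control = 0.275)  # X -> intention
a_se  <- c(attitude = 0.058, norms = 0.059, control = 0.058)
d_est <- 0.443                                                # intention -> behavior
d_se  <- 0.068

# Product-of-coefficients indirect effect for each predictor
indirect <- a_est * d_est

# First-order (Sobel) standard error, assuming uncorrelated path estimates
sobel_se <- sqrt(a_est^2 * d_se^2 + d_est^2 * a_se^2)

# z statistic and two-sided p-value from the normal approximation
z_value <- indirect / sobel_se
p_value <- 2 * pnorm(-abs(z_value))

indirect_table <- data.frame(
  predictor = names(a_est),
  indirect  = round(indirect, 3),
  se        = round(sobel_se, 3),
  z         = round(z_value, 3),
  p         = round(p_value, 4),
  row.names = NULL
)
print(indirect_table)
```

For inference on indirect effects, bootstrap confidence intervals from a labeled model are generally preferred over this normal approximation.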
 ===== Lavaan in R: explanation =====
@@ Line 572: / Line 573: @@
 <code>
-processdata<-read.csv("path analysis dataN BinW.csv", header=TRUE, sep=",")
+# processdata<-read.csv("path analysis dataN BinW.csv", header=TRUE, sep=",")
+processdata<-read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv",
+                       header=TRUE, sep=",", fileEncoding="UTF-8-BOM")
 </code>
@@ Line 630: / Line 633: @@
   * Step 2: Use the 'lavaan' function to run the analysis. Here, the results are saved in an R object called 'fit' (arbitrarily named). Inside the parentheses are arguments separated by commas. The first argument is the object containing the model syntax (see above), here named 'model' (again, an arbitrary name). The 'data' argument identifies the object (i.e., data frame) containing the raw data.
 <code>
-fit<-lavaan(model,data=processdata)
+fit<-lavaan(model, data=processdata)
 </code>
   * The 'summary' function can be used to obtain various fit measures and the parameter estimates for the model
 <code>
-summary(fit,fit.measures=TRUE)
+summary(fit, fit.measures=TRUE)
 </code>
   * To obtain standardized estimates, use the 'standardized' argument (setting it to TRUE) when using the 'summary' function. You will need to interpret the Std.all column in the output, as it will provide standardized estimates for all measured variables in the model.
 <code>
-summary(fit,fit.measures=TRUE,standardized=TRUE,rsquare=TRUE)
+summary(fit, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE)
 </code>
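When only the standardized path coefficients are needed, they can be pulled into a data frame instead of being read off the printed summary. The sketch below is illustrative only: the data frame `toy` and its variables `x`, `m` and `y` are simulated here and are not part of the page's data set. `standardizedSolution()` is lavaan's extractor for the completely standardized solution, which corresponds to the Std.all column.

```r
library(lavaan)

# Hypothetical toy data: x -> m -> y, plus a direct path x -> y
set.seed(1)
n <- 200
x <- rnorm(n)
m <- 0.5 * x + rnorm(n)
y <- 0.4 * m + 0.2 * x + rnorm(n)
toy <- data.frame(x = x, m = m, y = y)

# lavaan() adds nothing automatically, so every variance is written out
toy_model <- '
  m ~ x
  y ~ m + x
  x ~~ x
  m ~~ m
  y ~~ y
'
toy_fit <- lavaan(toy_model, data = toy)

# Completely standardized solution (same scaling as Std.all in summary())
std <- standardizedSolution(toy_fit)

# Keep only the regression paths
std_paths <- std[std$op == "~", c("lhs", "op", "rhs", "est.std", "se", "pvalue")]
print(std_paths)
```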
@@ Line 688: / Line 691: @@
   interest~~anxiety'
-  fit<-lavaan(model,data=processdata,auto.var=TRUE)
+  fit<-lavaan(model, data=processdata, auto.var=TRUE)
-  summary(fit,fit.measures=TRUE,standardized=TRUE,rsquare=TRUE)
+  summary(fit, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE)
 </code>
@@ Line 748: / Line 751: @@
 CODING
 <code>
-processdata <- read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv")
+processdata<-read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv",
+                       header=TRUE, sep=",", fileEncoding="UTF-8-BOM")
 str(processdata)
 library(lavaan)
@@ Line 768: / Line 772: @@
     interest~~anxiety
 '
-fit <- lavaan(model. data=processdata)
+fit <- lavaan(model, data=processdata)
-fit <- sem(model. data=processdata)
+fit <- sem(model, data=processdata)
 summary(fit, fit.measures=TRUE)
@@ Line 797: / Line 801: @@
 </code>
 ----
+===== Lavaan 2 =====
+{{youtube>_tTPHt4cPwI}}
+<code>
+model <- '
+    # labeling path from mastery to interest
+    interest ~ a*mastery + perfgoal + ses
+    # labeling path from interest to achieve.
+    # Adding labeled path from
+    # mastery to achieve
+    achieve ~ e*anxiety + b*interest + c*mastery
+    # predicting anxiety and labeling path from mastery
+    anxiety ~ perfgoal + d*mastery
+    # estimating the variances and covariances of
+    # the exogenous variables (ses, mastery, perfgoal)
+    mastery~~mastery
+    perfgoal~~perfgoal
+    ses~~ses
+    mastery~~perfgoal+ses
+    perfgoal~~ses
+    # estimating the variances of residuals
+    # for endogenous variables
+    # (interest, anxiety, achieve)
+    interest~~interest
+    anxiety~~anxiety
+    achieve~~achieve
+    # estimating the covariance of residuals
+    # for interest and anxiety
+    interest~~anxiety
+    # calculating specific indirect effect
+    # of mastery on achieve via interest
+    SIE1:=a*b
+    # calculating specific indirect effect of
+    # mastery on achieve via anxiety
+    SIE2:=d*e
+    # calculating total indirect effect of
+    # mastery on achievement via mediators
+    TIE:=SIE1+SIE2
+    # calculating total effect of mastery on achieve
+    TE:=TIE+c
+'
+# using naive bootstrap to obtain standard errors
+fit <- sem(model, data=processdata, se="bootstrap")
+summary(fit, fit.measures=TRUE)
+# 'parameterEstimates' gives confidence intervals
+# based on the naive bootstrap,
+# a standard approach to testing indirect effects.
+parameterEstimates(fit)
+</code>
+----
+===== Lavaan 3: Testing data normality =====
+{{youtube>HvYW_GeHpD8}}
+<code>
+processdata <- read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv")
+str(processdata)
+# install.packages("MVN")
+library(MVN)
+newdata <- processdata[c("achieve", "interest", "anxiety")]
+str(newdata)
+</code>
+Use the 'mvn' function to evaluate normality.
+Multivariate normality is supported when the p-values for both the multivariate skewness and kurtosis statistics are non-significant (p > .05); in that case the data are assumed to follow a multivariate normal distribution (Korkmaz, Goksuluk, & Zararsiz, 2014, 2019).
+You can also use plots to explore possible multivariate outliers. Moreover, you can examine univariate tests of normality (the default is the Shapiro-Wilk test, but this can be changed if desired). A significant test result for a specific variable indicates a significant departure from normality.
+<code>
+mvn(newdata, mvnTest="mardia")
+mvn(newdata, multivariatePlot="qq")
+mvn(newdata, multivariateOutlierMethod="quan")
+</code>
+You can also generate univariate plots to check the distributions of the endogenous variables for non-normality. Skewness values approaching 2 or kurtosis values over 7 may be considered indicative of more "significant problems" with non-normality (Curran et al., 1996).
+<code>
+mvn(newdata, univariatePlot="histogram")
+mvn(newdata, univariatePlot="box")
+model <- '
+    interest ~ mastery + perfgoal + ses
+    achieve ~ anxiety + interest + mastery
+    anxiety ~ perfgoal + mastery
+    # variances
+    mastery ~~ mastery
+    perfgoal ~~ perfgoal
+    ses ~~ ses
+    mastery ~~ perfgoal + ses
+    perfgoal ~~ ses
+    interest ~~ interest
+    anxiety ~~ anxiety
+    achieve ~~ achieve
+    interest~~anxiety
+'
+</code>
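The skewness and kurtosis cutoffs mentioned above can also be checked by hand. The sketch below uses only base R and simulated data; the data frame `toy` and the helper functions `skewness()` and `excess_kurtosis()` are hypothetical names defined here, not part of MVN or lavaan. It assumes the cutoff of 7 refers to excess kurtosis (a normal distribution scores 0), and it reads "approaching 2" as an absolute skewness of 2 or more.

```r
# Moment-based skewness: third central moment over sd cubed
skewness <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Moment-based excess kurtosis: a normal distribution gives about 0
excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# Hypothetical data: one roughly normal and one heavily skewed variable
set.seed(123)
toy <- data.frame(
  normal_var = rnorm(500),
  skewed_var = rexp(500)^2
)

shape <- data.frame(
  variable = names(toy),
  skewness = sapply(toy, skewness),
  kurtosis = sapply(toy, excess_kurtosis),
  row.names = NULL
)

# Flag variables that reach either cutoff (assumed reading of the rule)
shape$flag <- abs(shape$skewness) >= 2 | shape$kurtosis > 7
print(shape)
```

To apply the same check to the page's data, replace `toy` with `newdata` once it has been read in.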
+We will fit the model using the 'estimator' argument and set it equal to "MLM". This results in the Satorra-Bentler scaled model chi-square being computed. We will also use the 'se' argument and set it to "robust".
+<code>
+fit <- sem(model, data=processdata, estimator = "MLM", se="robust")
+summary(fit,fit.measures=TRUE)
+</code>
+----
+reference
+{{youtube>8r9bUKUVecc?small}}
+see [[https://www.rensvandeschoot.com/tutorials/lme4/|lme4 tutorial]]
+===== Exercise =====
+Using mtcars in R
+<code>
+?mtcars
+mtcars
+str(mtcars)
+df <- mtcars
+</code>
+<code>
+# model specification
+library(lavaan)
+library(semPlot)   # provides semPaths()
+model <-'
+  mpg ~ hp + gear + cyl + disp + carb + am + wt
+  hp ~ cyl + disp + carb
+'
+# model fit: sem() is the usual wrapper for a path model
+fit <- sem(model, data = mtcars)
+summary(fit, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)
+semPaths(fit, 'std', layout = 'circle')
+</code>