Differences

This shows you the differences between two versions of the page.

--- r:path_analysis [2022/11/15 06:37] – hkimscil
+++ r:path_analysis [2024/11/04 10:28] (current) – [Introduction] hkimscil
@@ Line 1: / Line 1: @@
 ====== Path Analysis ======
-===== Lavaan in R: explanation =====
+{{:r:pasted:20230529-234519.png}}
+====== Introduction ======
-{{youtube>QP-v6RwsZjY?start=251}}
-Path analysis in R with Lavaan (introduction)
-By Mike Crowson, Ph.D.
-September 17, 2019
-  * Overview: There are two basic functions that allow you to run path analysis in Lavaan: the 'sem' and the 'lavaan' functions. This video will demonstrate how to specify a path model involving only manifest variables and how to estimate model parameters using the 'lavaan' function. A copy of this text file and a .csv file containing the raw data will be available for download underneath the video description. You will notice that I use the pound sign (#) in some of the syntax. The # sign marks comments, which are not read by the program. I use it in some of the syntax below to provide annotations.
-  * If you have not already done so, you will need to install Lavaan.
-<code>
-install.packages("lavaan")
-</code>
-  * Read the data into R and store it in a data object. Make sure you have R correctly pointed to the folder containing your data. Below is syntax to create a data frame called 'processdata' when reading the .csv file (referenced above) into R. This is the data frame we will be using when running our analyses.
-<code>
-processdata<-read.csv("path analysis dataN BinW.csv", header=TRUE, sep=",")
-</code>
-  * Using the 'str' function, you can look at the structure of the data.
-<code>
-str(processdata)
-</code>
-  * Use the 'library' function to load lavaan
-<code>
-library(lavaan)
-</code>
-  * 'lavaan' function
-  * Step 1: Use lavaan model syntax to specify the path model and store it in an R object. In our model, we will treat ses, mastery goals, and performance goals as predictors of student achievement. The effect of mastery on achievement will be both direct and indirect (via interest and anxiety). The effects of ses and performance goals will be treated as being fully mediated through anxiety and interest.
-  * When specifying predictive relationships in the model, we use the tilde sign ('~'), which separates the dependent variable in each equation from its predictors. Predictors are separated in each equation by the '+' sign. In our model, we will also allow the residuals for anxiety and interest to correlate (see '~~' in the syntax below).
-<code>
-# model specification
-model <- '
-  #equation where interest is predicted by ses
-  # & mastery and performance goals
-  interest ~ mastery + perfgoal + ses
-  # equation where achieve is predicted by
-  # interest and anxiety
-  achieve ~ anxiety + interest + mastery
-  # equation where anxiety is predicted
-  # by mastery and performance goals
-  anxiety ~ perfgoal + mastery
-  # estimating the variances of
-  # the exogenous variables (ses, mastery,performance)
-  mastery ~~ mastery
-  perfgoal ~~ perfgoal
-  ses ~~ ses
-  # estimating the covariances of the exogenous
-  # variables (ses, mastery,performance)
-  mastery ~~ perfgoal + ses
-  perfgoal ~~ ses
-  # estimating the residual variances
-  # for endogenous variables (interest, anxiety, achieve)
-  interest ~~ interest
-  anxiety ~~ anxiety
-  achieve ~~ achieve
-  # estimating the covariance of residuals
-  # for interest and anxiety
-  interest ~~ anxiety '
-</code>
-  * Step 2: Use the 'lavaan' function to run the analysis. Here, I will be saving the results in an R object called 'fit' (arbitrarily named). Inside the parentheses are arguments separated by commas. The first argument contains the name of the object containing the model syntax (see above). The object is named 'model' (again, arbitrarily named above). Next, we have the 'data' argument. This identifies the object (i.e., data frame) containing the raw data.
-<code>
-fit<-lavaan(model,data=processdata)
-</code>
-  * The 'summary' function can be used to obtain various fit measures and the parameter estimates for the model
-<code>
-summary(fit,fit.measures=TRUE)
-</code>
-  * To obtain standardized estimates, use the 'standardized' argument (setting it to TRUE) when using the 'summary' function. You will need to interpret the Std.all column in the output, as it will provide standardized estimates for all measured variables in the model.
-<code>
-summary(fit,fit.measures=TRUE,standardized=TRUE,rsquare=TRUE)
-</code>
-  * Using the 'parameterEstimates' function, you can obtain confidence intervals
-<code>
-parameterEstimates(fit)
-</code>
-  * For a more comprehensive set of fit measures, use the 'fitMeasures' function
-<code>
-fitMeasures(fit)
-</code>
-  * To obtain modification indices, you can use the 'modificationIndices' function
-<code>
-modificationIndices(fit)
-</code>
-  * Note: Modification indices represent the expected decrease in model chi-square after freeing a given parameter (Schumacker & Lomax, 2004). The EPC (expected parameter change) estimates how much that parameter would change if it were freed. An MI value of 3.84 or greater may be considered "significant" (at the .05 level). Warning: This is a purely empirical approach to model specification. Consult your theory when using these!
------------------------------
-  * Specification of model using auto.var argument...
-<code>
-# model specification
-model<-'
-  # equation where interest is predicted by ses & mastery and
-  # performance goals
-  interest ~ mastery + perfgoal + ses
-  # equation where achieve is predicted by interest and anxiety
-  achieve~anxiety+interest+mastery
-  #equation where anxiety is predicted by mastery and performance goals
-  anxiety~perfgoal+mastery
-  # estimating the variances of the exogenous variables (ses, mastery, performance)
-  mastery~~mastery
-  perfgoal~~perfgoal
-  ses~~ses
-  # estimating the covariances of the exogenous variables (ses, mastery, performance)
-  mastery~~perfgoal+ses
-  perfgoal~~ses
-  # The auto.var argument when fitting the model can be used so that
-  # you do not have to directly request estimation of residual variances
-  # Estimating the covariance of residuals for interest and anxiety
-  interest~~anxiety'
-  fit<-lavaan(model,data=processdata,auto.var=TRUE)
-  summary(fit,fit.measures=TRUE,standardized=TRUE,rsquare=TRUE)
-</code>
-  * There are a couple of ways you can obtain path diagrams (although they can be somewhat tricky to implement).
-  * One approach is to use the 'semPaths' function from the 'semPlot' package. Below, I provide a rough demo of this approach. Citations containing additional information are provided below the demo.
-<code>
-install.packages("semPlot")
-library("semPlot")
-semPaths(fit,what="paths",whatLabels="par",style="lisrel",layout="tree",
-rotation=2)
-</code>
-  * A second approach is to use the 'lavaanPlot' function from the 'lavaanPlot' package.
-<code>
-install.packages("lavaanPlot")
-library(lavaanPlot)
-lavaanPlot(model = fit,
-    node_options = list(shape = "box", fontname = "Helvetica"),
-    edge_options = list(color = "grey"),
-    coefs = TRUE,
-    covs = TRUE,
-    stars = c("regress"))
-</code>
-----
-Resources on the use of lavaan:
-  * http://lavaan.ugent.be/tutorial/tutorial.pdf
-  * https://www.jstatsoft.org/index.php/jss/article/view/v048i02/v48i02.pdf
-  * https://cran.r-project.org/web/packages/lavaan/lavaan.pdf
-  * http://lavaan.ugent.be/tutorial/index.html
-----
-Using the 'semPlot' package
-  * https://cran.r-project.org/web/packages/semPlot/semPlot.pdf
-  * https://www.rdocumentation.org/packages/semPlot/versions/1.1.2/topics/semPaths
-  * http://sachaepskamp.com/semPlot/examples
-----
-Using the 'lavaanPlot' package
-  * https://cran.r-project.org/web/packages/lavaanPlot/lavaanPlot.pdf
-  * https://cran.r-project.org/web/packages/lavaanPlot/vignettes/Intro_to_lavaanPlot.html
-  * https://cran.rstudio.com/web/packages/lavaanPlot/vignettes/Intro_to_lavaanPlot.html
-  * http://www.alexlishinski.com/post/2018-04-13-lavaanplot0.5/
-----
-Raw data for all examples can be downloaded at...
-  * https://drive.google.com/open?id=1Ge0kIn7-f6gSfL40mZ47zGFH5WNSXRBt
-A copy of the Powerpoint of the model specification can be downloaded at...
-  * https://drive.google.com/open?id=1Nvpz7RnBEfEzK1VJKZksy6PBO4mzbdLO
-Basics of path analysis using Lavaan.txt
-Displaying Basics of path analysis using Lavaan.txt.
-CODING
-<code>
-processdata <- read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv")
-str(processdata)
-library(lavaan)
-model <- '
-    interest ~ mastery + perfgoal + ses
-    achieve ~ anxiety + interest + mastery
-    anxiety ~ perfgoal + mastery
-    # variances
-    mastery ~~ mastery
-    perfgoal ~~ perfgoal
-    ses ~~ ses
-    mastery ~~ perfgoal + ses
-    perfgoal ~~ ses
-    interest ~~ interest
-    anxiety ~~ anxiety
-    achieve ~~ achieve
-    interest~~anxiety
-'
-fit <- lavaan(model, data=processdata)
-fit <- sem(model, data=processdata)
-summary(fit, fit.measures=TRUE)
-summary(fit, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE)
-parameterEstimates(fit)
-fitMeasures(fit)
-modificationIndices(fit)
-install.packages("semPlot")
-library("semPlot")
-semPaths(fit,what="paths",whatLabels="par",style="lisrel",layout="tree",
-rotation=2)
-install.packages("lavaanPlot")
-library(lavaanPlot)
-lavaanPlot(
-    model = fit,
-    node_options = list(shape = "box", fontname = "Helvetica"),
-    edge_options = list(color = "grey"),
-    coefs = TRUE, covs=TRUE,
-    stars = c("regress"))
-</code>
-----
-===== Lavaan 2 =====
-{{youtube>_tTPHt4cPwI}}
-<code>
-model <- '
-    # labeling path from mastery to interest
-    interest ~ a*mastery + perfgoal + ses
-    # labeling path from interest to achieve.
-    # Adding labeled path from
-    # mastery to achieve
-    achieve ~ e*anxiety + b*interest + c*mastery
-    # predicting anxiety and labeling path from mastery
-    anxiety ~ perfgoal + d*mastery
-    # estimating the variances and covariances of
-    # the exogenous variables (ses, mastery,performance)
-    mastery~~mastery
-    perfgoal~~perfgoal
-    ses~~ses
-    mastery~~perfgoal+ses
-    perfgoal~~ses
-    # estimating the variances of residuals
-    # for endogenous variables
-    # (interest, anxiety, achieve)
-    interest~~interest
-    anxiety~~anxiety
-    achieve~~achieve
-    # estimating the covariance of residuals
-    # for interest and anxiety
-    interest~~anxiety
-    # calculating specific indirect effect
-    # of mastery on achieve via interest
-    SIE1:=a*b
-    # calculating specific indirect effect of
-    # mastery on achieve via anxiety
-    SIE2:=d*e
-    # calculating total indirect effect of
-    # mastery on achievement via mediators
-    TIE:=SIE1+SIE2
-    # calculating total effect of mastery on achieve
-    TE:=TIE+c'
-    # using naive bootstrap to obtain standard errors
-    fit <- sem(model, data=processdata, se="bootstrap")
-    summary(fit,fit.measures=TRUE)
-    # using 'parameterEstimates' function will give
-    # us confidence intervals based on naive bootstrap.
-    # A standard approach to testing indirect effects.
-    parameterEstimates(fit)
-</code>
-----
-===== Lavaan 3: Testing data normality =====
-{{youtube>HvYW_GeHpD8}}
-<code>
-processdata <- read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv")
-str(processdata)
-# install.packages("MVN")
-library(MVN)
-newdata <- processdata[c("achieve", "interest", "anxiety")]
-str(newdata)
-</code>
-Use the 'mvn' function to evaluate normality.
-Multivariate normality is evidenced by p-values greater than .05 for the multivariate skewness and kurtosis statistics. When both the skewness and kurtosis results are non-significant (p's > .05), the data are assumed to follow a multivariate normal distribution (Korkmaz, Goksuluk, & Zarasiz, 2014, 2019).
-You can also use plots to explore possible multivariate outliers. Moreover, you can examine univariate tests of normality (the default is the Shapiro-Wilk test, but this can be changed if desired). A significant test result for a specific variable indicates a significant departure from normality.
-<code>
-mvn(newdata, mvnTest="mardia")
-mvn(newdata, multivariatePlot="qq")
-mvn(newdata, multivariateOutlierMethod="quan")
-</code>
-You can generate univariate plots as well to evaluate the distributions of the endogenous variables for non-normality. Skewness values approaching 2 or kurtosis values over 7 may be considered indicative of more "significant problems" with non-normality (Curran, et al., 1996).
-<code>
-mvn(newdata, univariatePlot="histogram")
-mvn(newdata, univariatePlot="box")
-model <- '
-    interest ~ mastery + perfgoal + ses
-    achieve ~ anxiety + interest + mastery
-    anxiety ~ perfgoal + mastery
-    # variances
-    mastery ~~ mastery
-    perfgoal ~~ perfgoal
-    ses ~~ ses
-    mastery ~~ perfgoal + ses
-    perfgoal ~~ ses
-    interest ~~ interest
-    anxiety ~~ anxiety
-    achieve ~~ achieve
-    interest~~anxiety
-'
-</code>
-We will fit the model using the 'estimator' argument and set it equal to "MLM". This will result in the Satorra-Bentler model chi-square being computed. We will also use the 'se' argument and set it to "robust".
-<code>
-fit <- sem(model, data=processdata, estimator = "MLM", se="robust")
-summary(fit,fit.measures=TRUE)
-</code>
-----
-reference
-{{youtube>8r9bUKUVecc?small}}
-see [[https://www.rensvandeschoot.com/tutorials/lme4/|lme4 tutorial]]
-====== Path Analysis 2 ======
 {{youtube>UGIVPtFKwc0}}
@@ Line 373: / Line 16: @@
     * The number of unique (non-redundant) sources of information
     * $p(p+1)/2$
-    * The number of parameters (paths) specified in model
+  * The number of parameters (paths) specified in model
-  * Just-identified (df = 0)
+    * Just-identified (df = 0)
-    * Model can be estimated, but cannot be assessed
+      * Model can be estimated, but cannot be assessed
-  * Over-identified (df > 0)
+    * Over-identified (df > 0)
-    * Model can be estimated and assessed
+      * Model can be estimated and assessed
-  * Under-identified (df < 0)
+    * Under-identified (df < 0)
-    * Model cannot be either estimated or assessed
+      * Model cannot be either estimated or assessed
   * Exogenous and
@@ Line 468: / Line 111: @@
 </code>
+----
+<code>
+# my own
+# pbt model
+specmod5 <- '
+    # Directional relations (path)
+    intention ~ a*attitude + b*norms + c*control
+    behavior ~ d*intention
+    # Covariances
+    attitude ~~ norms + control
+    norms ~~ control
+    ad := a*d
+    bd := b*d
+    cd := c*d
+'
+fitmod5 <- sem(specmod5, data=df)
+summary(fitmod5, fit.measures=TRUE, rsquare=TRUE)
+</code>
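The `:=` lines in the specification above define each indirect effect as the product of two labeled paths (for example, `ad` is the attitude → intention → behavior route). A minimal base-R sketch of that arithmetic follows. It uses two ordinary `lm()` regressions rather than lavaan, and the simulated data frame `sim`, its generating coefficients, and the seed are made up for illustration, so the numbers say nothing about the real `df` data.

```r
# Simulated stand-in data (hypothetical; not the data used on this page)
set.seed(1)
n <- 200
attitude  <- rnorm(n)
norms     <- rnorm(n)
control   <- rnorm(n)
intention <- 0.4 * attitude + 0.2 * norms + 0.3 * control + rnorm(n)
behavior  <- 0.5 * intention + rnorm(n)
sim <- data.frame(attitude, norms, control, intention, behavior)

# One regression per endogenous variable, mirroring the two '~' lines
m1 <- lm(intention ~ attitude + norms + control, data = sim)
m2 <- lm(behavior ~ intention, data = sim)

# Path a (attitude -> intention) and path d (intention -> behavior)
a <- coef(m1)["attitude"]
d <- coef(m2)["intention"]

# Indirect effect of attitude on behavior via intention: a * d
ad <- unname(a * d)
print(ad)
```

In lavaan the same product is estimated inside a single model, and the `:=` operator also supplies a standard error for it in the summary output.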
 ====== Output ======
 <code>
@@ Line 797: / Line 458: @@
 </code>
+===== specmod5 =====
+<code>
+> specmod5 <- "
++     # Directional relations (path)
++     intention ~ attitude + norms + control
++     behavior ~ intention + norms
++     # Covariances
++     attitude ~~ norms + control
++     norms ~~ control
++ "
+> fitmod5 <- sem(specmod5, data=df)
+> summary(fitmod5, fit.measures=TRUE, rsquare=TRUE)
+lavaan 0.6-12 ended normally after 18 iterations
+  Estimator                                         ML
+  Optimization method                           NLMINB
+  Number of model parameters                        13
+  Number of observations                           199
+Model Test User Model:
+  Test statistic                                 1.781
+  Degrees of freedom                                 2
+  P-value (Chi-square)                           0.410
+Model Test Baseline Model:
+  Test statistic                               182.295
+  Degrees of freedom                                10
+  P-value                                        0.000
+User Model versus Baseline Model:
+  Comparative Fit Index (CFI)                    1.000
+  Tucker-Lewis Index (TLI)                       1.006
+Loglikelihood and Information Criteria:
+  Loglikelihood user model (H0)              -1258.396
+  Loglikelihood unrestricted model (H1)      -1257.506
+  Akaike (AIC)                                2542.792
+  Bayesian (BIC)                              2585.605
+  Sample-size adjusted Bayesian (BIC)         2544.421
+Root Mean Square Error of Approximation:
+  RMSEA                                          0.000
+  90 Percent confidence interval - lower         0.000
+  90 Percent confidence interval - upper         0.136
+  P-value RMSEA <= 0.05                          0.569
+Standardized Root Mean Square Residual:
+  SRMR                                           0.018
+Parameter Estimates:
+  Standard errors                             Standard
+  Information                                 Expected
+  Information saturated (h1) model          Structured
+Regressions:
+                   Estimate  Std.Err  z-value  P(>|z|)
+  intention ~
+    attitude          0.352    0.058    6.068    0.000
+    norms             0.153    0.059    2.577    0.010
+    control           0.275    0.058    4.740    0.000
+  behavior ~
+    intention         0.443    0.068    6.525    0.000
+    norms             0.034    0.068    0.493    0.622
+Covariances:
+                   Estimate  Std.Err  z-value  P(>|z|)
+  attitude ~~
+    norms             0.200    0.064    3.128    0.002
+    control           0.334    0.070    4.748    0.000
+  norms ~~
+    control           0.220    0.065    3.411    0.001
+Variances:
+                   Estimate  Std.Err  z-value  P(>|z|)
+   .intention         0.530    0.053    9.975    0.000
+   .behavior          0.698    0.070    9.975    0.000
+    attitude          0.928    0.093    9.975    0.000
+    norms             0.830    0.083    9.975    0.000
+    control           0.939    0.094    9.975    0.000
+R-Square:
+                   Estimate
+    intention         0.369
+    behavior          0.199
+</code>
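The degrees of freedom reported above can be checked by hand with the counting rule from the identification section: $p(p+1)/2$ unique variances and covariances minus the number of free parameters. The sketch below is only that arithmetic; the helper name `model_df` is hypothetical, and it assumes a model with observed variables only and no mean structure.

```r
# Degrees of freedom for a path model of observed variables.
# p: number of observed variables, q: number of free parameters.
model_df <- function(p, q) {
  unique_moments <- p * (p + 1) / 2  # variances plus covariances
  unique_moments - q
}

# specmod5 output: 5 observed variables, 13 model parameters
df_specmod5 <- model_df(p = 5, q = 13)
print(df_specmod5)

# The six-variable achievement model on this page reports 18 parameters
df_achieve <- model_df(p = 6, q = 18)
print(df_achieve)
```

Both results are positive, so by the classification above both models are over-identified and their fit can be assessed.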
+===== Lavaan in R: explanation =====
+{{youtube>QP-v6RwsZjY?start=251}}
+Path analysis in R with Lavaan (introduction)
+By Mike Crowson, Ph.D.
+September 17, 2019
+  * Overview: There are two basic functions that allow you to run path analysis in Lavaan: the 'sem' and the 'lavaan' functions. This video will demonstrate how to specify a path model involving only manifest variables and how to estimate model parameters using the 'lavaan' function. A copy of this text file and a .csv file containing the raw data will be available for download underneath the video description. You will notice that I use the pound sign (#) in some of the syntax. The # sign marks comments, which are not read by the program. I use it in some of the syntax below to provide annotations.
+  * If you have not already done so, you will need to install Lavaan.
+<code>
+install.packages("lavaan")
+</code>
+  * Read the data into R and store it in a data object. Make sure you have R correctly pointed to the folder containing your data. Below is syntax to create a data frame called 'processdata' when reading the .csv file (referenced above) into R. This is the data frame we will be using when running our analyses.
+<code>
+# processdata<-read.csv("path analysis dataN BinW.csv", header=TRUE, sep=",")
+processdata<-read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv",
+                       header=TRUE, sep=",", fileEncoding="UTF-8-BOM")
+</code>
+  * Using the 'str' function, you can look at the structure of the data.
+<code>
+str(processdata)
+</code>
+  * Use the 'library' function to load lavaan
+<code>
+library(lavaan)
+</code>
+  * 'lavaan' function
+  * Step 1: Use lavaan model syntax to specify the path model and store it in an R object. In our model, we will treat ses, mastery goals, and performance goals as predictors of student achievement. The effect of mastery on achievement will be both direct and indirect (via interest and anxiety). The effects of ses and performance goals will be treated as being fully mediated through anxiety and interest.
+  * When specifying predictive relationships in the model, we use the tilde sign ('~'), which separates the dependent variable in each equation from its predictors. Predictors are separated in each equation by the '+' sign. In our model, we will also allow the residuals for anxiety and interest to correlate (see '~~' in the syntax below).
+<code>
+# model specification
+model <- '
+  #equation where interest is predicted by ses
+  # & mastery and performance goals
+  interest ~ mastery + perfgoal + ses
+  # equation where achieve is predicted by
+  # interest and anxiety
+  achieve ~ anxiety + interest + mastery
+  # equation where anxiety is predicted
+  # by mastery and performance goals
+  anxiety ~ perfgoal + mastery
+  # estimating the variances of
+  # the exogenous variables (ses, mastery,performance)
+  mastery ~~ mastery
+  perfgoal ~~ perfgoal
+  ses ~~ ses
+  # estimating the covariances of the exogenous
+  # variables (ses, mastery,performance)
+  mastery ~~ perfgoal + ses
+  perfgoal ~~ ses
+  # estimating the residual variances
+  # for endogenous variables (interest, anxiety, achieve)
+  interest ~~ interest
+  anxiety ~~ anxiety
+  achieve ~~ achieve
+  # estimating the covariance of residuals
+  # for interest and anxiety
+  interest ~~ anxiety '
+</code>
+  * Step 2: Use the 'lavaan' function to run the analysis. Here, I will be saving the results in an R object called 'fit' (arbitrarily named). Inside the parentheses are arguments separated by commas. The first argument contains the name of the object containing the model syntax (see above). The object is named 'model' (again, arbitrarily named above). Next, we have the 'data' argument. This identifies the object (i.e., data frame) containing the raw data.
+<code>
+fit<-lavaan(model, data=processdata)
+</code>
+  * The 'summary' function can be used to obtain various fit measures and the parameter estimates for the model
+<code>
+summary(fit, fit.measures=TRUE)
+</code>
+  * To obtain standardized estimates, use the 'standardized' argument (setting it to TRUE) when using the 'summary' function. You will need to interpret the Std.all column in the output, as it will provide standardized estimates for all measured variables in the model.
+<code>
+summary(fit, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE)
+</code>
+  * Using the 'parameterEstimates' function, you can obtain confidence intervals
+<code>
+parameterEstimates(fit)
+</code>
+  * For a more comprehensive set of fit measures, use the 'fitMeasures' function
+<code>
+fitMeasures(fit)
+</code>
+  * To obtain modification indices, you can use the 'modificationIndices' function
+<code>
+modificationIndices(fit)
+</code>
+  * Note: Modification indices represent the expected decrease in model chi-square after freeing a given parameter (Schumacker & Lomax, 2004). The EPC (expected parameter change) estimates how much that parameter would change if it were freed. An MI value of 3.84 or greater may be considered "significant" (at the .05 level). Warning: This is a purely empirical approach to model specification. Consult your theory when using these!
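The 3.84 cutoff in the note is the critical value of a chi-square distribution with 1 degree of freedom at the .05 level, since freeing one parameter uses one degree of freedom. A short base-R check is sketched below; the modification index `mi_example` is a made-up value for illustration, not one taken from this model.

```r
# Critical chi-square value for 1 df at alpha = .05
crit <- qchisq(0.95, df = 1)
print(round(crit, 2))

# p-value for a hypothetical modification index (illustrative value only)
mi_example <- 6.5
p_value <- pchisq(mi_example, df = 1, lower.tail = FALSE)
print(p_value)
```

An MI above the cutoff only says that freeing the parameter would probably improve fit, so whether the parameter belongs in the model remains a theoretical question.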
+output
+<code>
+> # install.packages("lavaan")
+>
+> # processdata<-read.csv("path analysis dataN BinW.csv", header=TRUE, sep=",")
+> processdata<-read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv",
++                        header=TRUE, sep=",", fileEncoding="UTF-8-BOM")
+>
+> str(processdata)
+'data.frame':	140 obs. of  9 variables:
+ $ id      : int  1 2 3 4 5 6 7 8 9 10 ...
+ $ ses     : int  1 0 0 1 1 1 0 0 1 1 ...
+ $ genderid: int  1 0 1 1 1 1 0 0 0 0 ...
+ $ perfgoal: num  29.5 29.5 30.4 33.5 28.7 ...
+ $ achieve : num  6.12 1.62 4.5 2.38 5.12 ...
+ $ mastery : num  5.71 1.43 1.29 2.29 4.57 ...
+ $ interest: num  6 4 2 4 5.5 4 4 5 4.5 4 ...
+ $ anxiety : num  1.67 6.33 3.67 3.67 3.67 ...
+ $ pgoal_MS: int  0 0 1 1 0 1 0 1 0 0 ...
+> library(lavaan)
+>
+> # model specification
+> model <- '
++   # equation where interest is predicted by ses
++   # & mastery and performance goals
++   interest ~ mastery + perfgoal + ses
++
++   # equation where achieve is predicted by
++   # interest and anxiety
++   achieve ~ anxiety + interest + mastery
++
++   # equation where anxiety is predicted
++   # by mastery and performance goals
++   anxiety ~ perfgoal + mastery
++
++   # estimating the variances of
++   # the exogenous variables (ses, mastery,performance)
++   mastery ~~ mastery
++   perfgoal ~~ perfgoal
++   ses ~~ ses
++
++   # estimating the covariances of the exogenous
++   # variables (ses, mastery,performance)
++   mastery ~~ perfgoal + ses
++   perfgoal ~~ ses
++
++   # estimating the residual variances
++   # for endogenous variables (interest, anxiety, achieve)
++   interest ~~ interest
++   anxiety ~~ anxiety
++   achieve ~~ achieve
++
++   # estimating the covariance of residuals
++   # for interest and anxiety
++   interest ~~ anxiety '
+>
+> fit<-lavaan(model, data=processdata)
+> summary(fit, fit.measures=TRUE)
+lavaan 0.6.16 ended normally after 27 iterations
+  Estimator                                         ML
+  Optimization method                           NLMINB
+  Number of model parameters                        18
+  Number of observations                           140
+Model Test User Model:
+  Test statistic                                29.231
+  Degrees of freedom                                 3
+  P-value (Chi-square)                           0.000
+Model Test Baseline Model:
+  Test statistic                               202.409
+  Degrees of freedom                                15
+  P-value                                        0.000
+User Model versus Baseline Model:
+  Comparative Fit Index (CFI)                    0.860
+  Tucker-Lewis Index (TLI)                       0.300
+Loglikelihood and Information Criteria:
+  Loglikelihood user model (H0)              -1391.274
+  Loglikelihood unrestricted model (H1)      -1376.659
+  Akaike (AIC)                                2818.548
+  Bayesian (BIC)                              2871.498
+  Sample-size adjusted Bayesian (SABIC)       2814.548
+Root Mean Square Error of Approximation:
+  RMSEA                                          0.250
+  90 Percent confidence interval - lower         0.172
+  90 Percent confidence interval - upper         0.336
+  P-value H_0: RMSEA <= 0.050                    0.000
+  P-value H_0: RMSEA >= 0.080                    1.000
+Standardized Root Mean Square Residual:
+  SRMR                                           0.074
+Parameter Estimates:
+  Standard errors                             Standard
+  Information                                 Expected
+  Information saturated (h1) model          Structured
+Regressions:
+                   Estimate  Std.Err  z-value  P(>|z|)
+  interest ~
+    mastery           0.708    0.088    8.066    0.000
+    perfgoal         -0.035    0.040   -0.879    0.380
+    ses               0.520    0.242    2.154    0.031
+  achieve ~
+    anxiety          -0.040    0.054   -0.747    0.455
+    interest          0.211    0.060    3.527    0.000
+    mastery           0.345    0.079    4.358    0.000
+  anxiety ~
+    perfgoal          0.025    0.045    0.556    0.578
+    mastery          -0.387    0.097   -4.009    0.000
+Covariances:
+                   Estimate  Std.Err  z-value  P(>|z|)
+  mastery ~~
+    perfgoal         -0.935    0.361   -2.590    0.010
+    ses               0.170    0.061    2.805    0.005
+  perfgoal ~~
+    ses              -0.226    0.128   -1.768    0.077
+ .interest ~~
+   .anxiety           0.059    0.181    0.329    0.742
+Variances:
+                   Estimate  Std.Err  z-value  P(>|z|)
+    mastery           1.944    0.232    8.367    0.000
+    perfgoal          8.936    1.068    8.367    0.000
+    ses               0.249    0.030    8.367    0.000
+   .interest          1.895    0.227    8.367    0.000
+   .anxiety           2.410    0.288    8.367    0.000
+   .achieve           0.988    0.118    8.367    0.000
+> summary(fit, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE)
+lavaan 0.6.16 ended normally after 27 iterations
+  Estimator                                         ML
+  Optimization method                           NLMINB
+  Number of model parameters                        18
+  Number of observations                           140
+Model Test User Model:
+  Test statistic                                29.231
+  Degrees of freedom                                 3
+  P-value (Chi-square)                           0.000
+Model Test Baseline Model:
+  Test statistic                               202.409
+  Degrees of freedom                                15
+  P-value                                        0.000
+User Model versus Baseline Model:
+  Comparative Fit Index (CFI)                    0.860
+  Tucker-Lewis Index (TLI)                       0.300
+Loglikelihood and Information Criteria:
+  Loglikelihood user model (H0)              -1391.274
+  Loglikelihood unrestricted model (H1)      -1376.659
+  Akaike (AIC)                                2818.548
+  Bayesian (BIC)                              2871.498
+  Sample-size adjusted Bayesian (SABIC)       2814.548
+Root Mean Square Error of Approximation:
+  RMSEA                                          0.250
+  90 Percent confidence interval - lower         0.172
+  90 Percent confidence interval - upper         0.336
+  P-value H_0: RMSEA <= 0.050                    0.000
+  P-value H_0: RMSEA >= 0.080                    1.000
+Standardized Root Mean Square Residual:
+  SRMR                                           0.074
+Parameter Estimates:
+  Standard errors                             Standard
+  Information                                 Expected
+  Information saturated (h1) model          Structured
+Regressions:
+                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
+  interest ~
+    mastery           0.708    0.088    8.066    0.000    0.708    0.558
+    perfgoal         -0.035    0.040   -0.879    0.380   -0.035   -0.060
+    ses               0.520    0.242    2.154    0.031    0.520    0.147
+  achieve ~
+    anxiety          -0.040    0.054   -0.747    0.455   -0.040   -0.053
+    interest          0.211    0.060    3.527    0.000    0.211    0.294
+    mastery           0.345    0.079    4.358    0.000    0.345    0.379
+  anxiety ~
+    perfgoal          0.025    0.045    0.556    0.578    0.025    0.045
+    mastery          -0.387    0.097   -4.009    0.000   -0.387   -0.327
+Covariances:
+                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
+  mastery ~~
+    perfgoal         -0.935    0.361   -2.590    0.010   -0.935   -0.224
+    ses               0.170    0.061    2.805    0.005    0.170    0.244
+  perfgoal ~~
+    ses              -0.226    0.128   -1.768    0.077   -0.226   -0.151
+ .interest ~~
+   .anxiety           0.059    0.181    0.329    0.742    0.059    0.028
+Variances:
+                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
+    mastery           1.944    0.232    8.367    0.000    1.944    1.000
+    perfgoal          8.936    1.068    8.367    0.000    8.936    1.000
+    ses               0.249    0.030    8.367    0.000    0.249    1.000
+   .interest          1.895    0.227    8.367    0.000    1.895    0.606
+   .anxiety           2.410    0.288    8.367    0.000    2.410    0.884
+   .achieve           0.988    0.118    8.367    0.000    0.988    0.613
+R-Square:
+                   Estimate
+    interest          0.394
+    anxiety           0.116
+    achieve           0.387
+>
+> parameterEstimates(fit)
+        lhs op      rhs    est    se      z pvalue ci.lower ci.upper
+   interest  ~  mastery  0.708 0.088  8.066  0.000    0.536    0.880
+   interest  ~ perfgoal -0.035 0.040 -0.879  0.380   -0.114    0.043
+   interest  ~      ses  0.520 0.242  2.154  0.031    0.047    0.994
+    achieve  ~  anxiety -0.040 0.054 -0.747  0.455   -0.146    0.066
+    achieve  ~ interest  0.211 0.060  3.527  0.000    0.094    0.328
+    achieve  ~  mastery  0.345 0.079  4.358  0.000    0.190    0.500
+    anxiety  ~ perfgoal  0.025 0.045  0.556  0.578   -0.063    0.113
+    anxiety  ~  mastery -0.387 0.097 -4.009  0.000   -0.576   -0.198
+    mastery ~~  mastery  1.944 0.232  8.367  0.000    1.488    2.399
+   perfgoal ~~ perfgoal  8.936 1.068  8.367  0.000    6.842   11.029
+        ses ~~      ses  0.249 0.030  8.367  0.000    0.191    0.308
+    mastery ~~ perfgoal -0.935 0.361 -2.590  0.010   -1.642   -0.227
+    mastery ~~      ses  0.170 0.061  2.805  0.005    0.051    0.288
+   perfgoal ~~      ses -0.226 0.128 -1.768  0.077   -0.476    0.024
+   interest ~~ interest  1.895 0.227  8.367  0.000    1.451    2.339
+    anxiety ~~  anxiety  2.410 0.288  8.367  0.000    1.845    2.974
+    achieve ~~  achieve  0.988 0.118  8.367  0.000    0.757    1.220
+   interest ~~  anxiety  0.059 0.181  0.329  0.742   -0.295    0.414
+> fitMeasures(fit)
+                 npar                  fmin                 chisq
+.000                 0.104                29.231
+                   df                pvalue        baseline.chisq
+.000                 0.000               202.409
+          baseline.df       baseline.pvalue                   cfi
+.000                 0.000                 0.860
+                  tli                  nnfi                   rfi
+.300                 0.300                 0.278
+                  nfi                  pnfi                   ifi
+.856                 0.171                 0.868
+                  rni                  logl     unrestricted.logl
+.860             -1391.274             -1376.659
+                  aic                   bic                ntotal
+.548              2871.498               140.000
+                 bic2                 rmsea        rmsea.ci.lower
+.548                 0.250                 0.172
+       rmsea.ci.upper        rmsea.ci.level          rmsea.pvalue
+.336                 0.900                 0.000
+       rmsea.close.h0 rmsea.notclose.pvalue     rmsea.notclose.h0
+.050                 1.000                 0.080
+                  rmr            rmr_nomean                  srmr
+.122                 0.122                 0.074
+         srmr_bentler   srmr_bentler_nomean                  crmr
+.074                 0.074                 0.088
+          crmr_nomean            srmr_mplus     srmr_mplus_nomean
+.088                 0.074                 0.074
+                cn_05                 cn_01                   gfi
+.428                55.335                 0.941
+                 agfi                  pgfi                   mfi
+.587                 0.134                 0.911
+                 ecvi
+.466
+> modificationIndices(fit)
+        lhs op      rhs     mi    epc sepc.lv sepc.all sepc.nox
+interest ~~  achieve 25.396 -2.899  -2.899   -2.118   -2.118
+  achieve ~~  anxiety  6.669  6.803   6.803    4.408    4.408
+  achieve ~~  mastery 22.476 -1.743  -1.743   -1.257   -1.257
+  achieve ~~ perfgoal  2.763 -0.406  -0.406   -0.137   -0.137
+  achieve ~~      ses 20.541  0.186   0.186    0.376    0.376
+  anxiety ~~  mastery  0.921  0.765   0.765    0.354    0.354
+  anxiety ~~ perfgoal  0.921 -3.576  -3.576   -0.771   -0.771
+  anxiety ~~      ses  0.921 -0.061  -0.061   -0.078   -0.078
+interest  ~  achieve 25.396 -2.933  -2.933   -2.106   -2.106
+  achieve  ~ perfgoal  4.551 -0.062  -0.062   -0.146   -0.146
+  achieve  ~      ses 22.431  0.837   0.837    0.329    0.329
+  anxiety  ~ interest  0.921 -0.502  -0.502   -0.538   -0.538
+  anxiety  ~  achieve  1.119  1.922   1.922    1.478    1.478
+  anxiety  ~      ses  0.921 -0.261  -0.261   -0.079   -0.079
+  mastery  ~ interest  0.923 12.889  12.889   16.348   16.348
+  mastery  ~  achieve 22.801 -1.774  -1.774   -1.616   -1.616
+  mastery  ~  anxiety  0.921  0.318   0.318    0.376    0.376
+perfgoal  ~  achieve  2.745 -0.410  -0.410   -0.174   -0.174
+perfgoal  ~  anxiety  0.921 -1.484  -1.484   -0.820   -0.820
+      ses  ~ interest  0.923 -1.021  -1.021   -3.618   -3.618
+      ses  ~  achieve 20.964  0.190   0.190    0.484    0.484
+      ses  ~  anxiety  0.921 -0.025  -0.025   -0.083   -0.083
+>
+>
+</code>
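+Several of the values reported above can be reproduced by hand, which helps when reading the output. The sketch below is an illustration, not lavaan's internal code: the numbers are copied from the output above, and the RMSEA formula assumes that the chi-square is scaled by N rather than N - 1.
+<code>
+# standardized residual variances (Std.all column of the Variances block)
+std_resid <- c(interest = 0.606, anxiety = 0.884, achieve = 0.613)
+# R-square is the share of variance that is not residual
+r2 <- 1 - std_resid
+print(r2)
+# degrees of freedom = unique sample moments - free parameters
+p    <- 6      # observed variables in the model
+npar <- 18     # rows in the parameterEstimates(fit) table
+df   <- p * (p + 1) / 2 - npar
+# RMSEA from the model chi-square (chisq and ntotal as reported by fitMeasures)
+chisq <- 29.231
+n     <- 140
+rmsea <- sqrt(max((chisq - df) / (df * n), 0))
+print(c(df = df, rmsea = round(rmsea, 3)))
+</code>
+The computed R-square values agree with the R-Square block, and the RMSEA agrees with the rmsea entry of fitMeasures(fit).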
+-----------------------------
+  * Specification of model using auto.var argument...
+<code>
+# model specification
+model<-'
+  # equation where interest is predicted by mastery goals,
+  # performance goals, and ses
+  interest ~ mastery + perfgoal + ses
+  # equation where achieve is predicted by anxiety, interest, and mastery
+  achieve~anxiety+interest+mastery
+  # equation where anxiety is predicted by performance goals and mastery
+  anxiety~perfgoal+mastery
+  # estimating the variances of the exogenous variables (ses, mastery, performance)
+  mastery~~mastery
+  perfgoal~~perfgoal
+  ses~~ses
+  # estimating the covariances of the exogenous variables (ses, mastery, performance)
+  mastery~~perfgoal+ses
+  perfgoal~~ses
+  # The auto.var argument when fitting the model can be used so that
+  # you do not have to directly request estimation of residual variances
+  # Estimating the covariance of residuals for interest and anxiety
+  interest~~anxiety'
+  fit<-lavaan(model, data=processdata, auto.var=TRUE)
+  summary(fit, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE)
+</code>
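+To see what auto.var does, the two ways of specifying residual variances can be compared on a small example. This is a minimal sketch on simulated data (the variables x, m, y and the object names are made up for illustration and are not part of the video's data set); it assumes that lavaan adds the residual variances of m and y automatically when auto.var=TRUE.
+<code>
+library(lavaan)
+set.seed(1)
+# simulated data: x -> m -> y, plus a direct path x -> y
+n <- 200
+x <- rnorm(n)
+m <- 0.5 * x + rnorm(n)
+y <- 0.4 * m + 0.3 * x + rnorm(n)
+toydata <- data.frame(x, m, y)
+# residual variances written out explicitly
+explicit <- '
+  m ~ x
+  y ~ m + x
+  x ~~ x
+  m ~~ m
+  y ~~ y
+'
+# residual variances left to auto.var
+implicit <- '
+  m ~ x
+  y ~ m + x
+  x ~~ x
+'
+fit_explicit <- lavaan(explicit, data = toydata)
+fit_auto     <- lavaan(implicit, data = toydata, auto.var = TRUE)
+# put both sets of estimates side by side, matched by parameter name
+nm <- names(coef(fit_auto))
+print(round(cbind(explicit = coef(fit_explicit)[nm], auto = coef(fit_auto)), 3))
+</code>
+If the two columns agree, the shorter syntax is only a convenience and does not change the model being estimated.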
+  * There are a couple of ways to obtain path diagrams, although they can be somewhat tricky to implement.
+  * One approach is to use the 'semPaths' function from the 'semPlot' package. Below, I provide a rough demo of this approach. Citations containing additional information are provided below the demo.
+<code>
+install.packages("semPlot")
+library("semPlot")
+semPaths(fit,what="paths",whatLabels="par",style="lisrel",layout="tree",
+rotation=2)
+</code>
+  * A second approach is to use the 'lavaanPlot' function from the 'lavaanPlot' package.
+<code>
+install.packages("lavaanPlot")
+library(lavaanPlot)
+lavaanPlot(model = fit,
+    node_options = list(shape = "box", fontname = "Helvetica"),
+    edge_options = list(color = "grey"),
+    coefs = TRUE,
+    covs = TRUE,
+    stars = c("regress"))
+</code>
+----
+Resources on the use of lavaan:
+  * http://lavaan.ugent.be/tutorial/tutorial.pdf
+  * https://www.jstatsoft.org/index.php/jss/article/view/v048i02/v48i02.pdf
+  * https://cran.r-project.org/web/packages/lavaan/lavaan.pdf
+  * http://lavaan.ugent.be/tutorial/index.html
+----
+Using the 'semPlot' package
+  * https://cran.r-project.org/web/packages/semPlot/semPlot.pdf
+  * https://www.rdocumentation.org/packages/semPlot/versions/1.1.2/topics/semPaths
+  * http://sachaepskamp.com/semPlot/examples
+----
+Using the 'lavaanPlot' package
+  * https://cran.r-project.org/web/packages/lavaanPlot/lavaanPlot.pdf
+  * https://cran.r-project.org/web/packages/lavaanPlot/vignettes/Intro_to_lavaanPlot.html
+  * https://cran.rstudio.com/web/packages/lavaanPlot/vignettes/Intro_to_lavaanPlot.html
+  * http://www.alexlishinski.com/post/2018-04-13-lavaanplot0.5/
+----
+Raw data for all examples can be downloaded at...
+  * https://drive.google.com/open?id=1Ge0kIn7-f6gSfL40mZ47zGFH5WNSXRBt
+A copy of the Powerpoint of the model specification can be downloaded at...
+  * https://drive.google.com/open?id=1Nvpz7RnBEfEzK1VJKZksy6PBO4mzbdLO
+CODING
+<code>
+processdata<-read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv",
+                       header=TRUE, sep=",", fileEncoding="UTF-8-BOM")
+str(processdata)
+library(lavaan)
+model <- '
+    interest ~ mastery + perfgoal + ses
+    achieve ~ anxiety + interest + mastery
+    anxiety ~ perfgoal + mastery
+    # variances
+    mastery ~~ mastery
+    perfgoal ~~ perfgoal
+    ses ~~ ses
+    mastery ~~ perfgoal + ses
+    perfgoal ~~ ses
+    interest ~~ interest
+    anxiety ~~ anxiety
+    achieve ~~ achieve
+    interest~~anxiety
+'
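+# lavaan() fits only what the syntax specifies
+fit <- lavaan(model, data=processdata)
+# sem() is a wrapper around lavaan() with defaults switched on;
+# this assignment overwrites the fit above, so the output below comes from sem()
+fit <- sem(model, data=processdata)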
+summary(fit, fit.measures=TRUE)
+summary(fit, fit.measures=TRUE, standardized=TRUE, rsquare=TRUE)
+parameterEstimates(fit)
+fitMeasures(fit)
+modificationIndices(fit)
+install.packages("semPlot")
+library("semPlot")
+semPaths(fit,what="paths",whatLabels="par",style="lisrel",layout="tree",
+rotation=2)
+install.packages("lavaanPlot")
+library(lavaanPlot)
+lavaanPlot(
+    model = fit,
+    node_options = list(shape = "box", fontname = "Helvetica"),
+    edge_options = list(color = "grey"),
+    coefs = TRUE, covs=TRUE,
+    stars = c("regress"))
+</code>
+----
+===== Lavaan 2 =====
+{{youtube>_tTPHt4cPwI}}
+<code>
+model <- '
+    # labeling path from mastery to interest
+    interest ~ a*mastery + perfgoal + ses
+    # labeling path from interest to achieve.
+    # Adding labeled path from
+    # mastery to achieve
+    achieve ~ e*anxiety + b*interest + c*mastery
+    # predicting anxiety and labeling path from mastery
+    anxiety ~ perfgoal + d*mastery
+    # estimating the variances and covariances of
+    # the exogenous variables (ses, mastery,performance)
+    mastery~~mastery
+    perfgoal~~perfgoal
+    ses~~ses
+    mastery~~perfgoal+ses
+    perfgoal~~ses
+    # estimating the variances of residuals
+    # for endogenous variables
+    # (interest, anxiety, achieve)
+    interest~~interest
+    anxiety~~anxiety
+    achieve~~achieve
+    # estimating the covariance of residuals
+    # for interest and anxiety
+    interest~~anxiety
+    # calculating specific indirect effect
+    # of mastery on achieve via interest
+    SIE1:=a*b
+    # calculating specific indirect effect of
+    # mastery on achieve via anxiety
+    SIE2:=d*e
+    # calculating total indirect effect of
+    # mastery on achievement via mediators
+    TIE:=SIE1+SIE2
+    # calculating total effect of mastery on achieve
+    TE:=TIE+c'
+    # using naive bootstrap to obtain standard errors
+    fit <- sem(model, data=processdata, se="bootstrap")
+    summary(fit,fit.measures=TRUE)
+    # using 'parameterEstimates' function will give
+    # us confidence intervals based on naive bootstrap.
+    # A standard approach to testing indirect effects.
+    parameterEstimates(fit)
+</code>
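+The defined parameters (SIE1, SIE2, TIE, TE) are plain products and sums of the labeled paths, so their point estimates can be checked by hand. The sketch below does this with the estimates copied from the parameterEstimates output earlier on this page; bootstrapping changes the standard errors and confidence intervals, not these point estimates.
+<code>
+# labeled path estimates, copied from the earlier output
+a <- 0.708    # mastery  -> interest
+b <- 0.211    # interest -> achieve
+c <- 0.345    # mastery  -> achieve (direct effect)
+d <- -0.387   # mastery  -> anxiety
+e <- -0.040   # anxiety  -> achieve
+SIE1 <- a * b          # specific indirect effect via interest
+SIE2 <- d * e          # specific indirect effect via anxiety
+TIE  <- SIE1 + SIE2    # total indirect effect
+TE   <- TIE + c        # total effect of mastery on achieve
+print(round(c(SIE1 = SIE1, SIE2 = SIE2, TIE = TIE, TE = TE), 3))
+</code>
+Most of the indirect effect runs through interest; the path through anxiety is small because the anxiety to achieve path is close to zero.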
+----
+===== Lavaan 3: Testing data normality =====
+{{youtube>HvYW_GeHpD8}}
+<code>
+processdata <- read.csv("http://commres.net/wiki/_media/r/path_analysis_datan_binw.csv")
+str(processdata)
+# install.packages("MVN")
+library(MVN)
+newdata <- processdata[c("achieve", "interest", "anxiety")]
+str(newdata)
+</code>
+Use the 'mvn' function to evaluate normality.
+Multivariate normality is supported when the p-values for both the multivariate skewness and kurtosis statistics are > .05. When both results are non-significant, the data are assumed to follow a multivariate normal distribution (Korkmaz, Goksuluk, & Zararsiz, 2014, 2019).
+You can also use plots to explore possible multivariate outliers. Moreover, you can examine univariate tests of normality (the default is the Shapiro-Wilk test, but this can be changed if desired). A significant test result for a specific variable indicates a significant departure from normality.
+<code>
+mvn(newdata, mvnTest="mardia")
+mvn(newdata, multivariatePlot="qq")
+mvn(newdata, multivariateOutlierMethod="quan")
+</code>
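+To make the Mardia test less of a black box, the two statistics can be computed directly. This is a sketch of Mardia's skewness and kurtosis as they are commonly defined, run on simulated data; the function name mardia_sketch is made up for illustration, and the MVN package may apply additional small-sample corrections, so the results need not match mvn() exactly.
+<code>
+mardia_sketch <- function(x) {
+  x  <- as.matrix(x)
+  n  <- nrow(x)
+  p  <- ncol(x)
+  xc <- scale(x, center = TRUE, scale = FALSE)   # center each column
+  S  <- crossprod(xc) / n                        # ML covariance matrix (divisor n)
+  D  <- xc %*% solve(S) %*% t(xc)                # n x n matrix of Mahalanobis products
+  b1 <- sum(D^3) / n^2                           # multivariate skewness
+  b2 <- sum(diag(D)^2) / n                       # multivariate kurtosis
+  skew_stat <- n * b1 / 6                        # approx. chi-square under normality
+  skew_df   <- p * (p + 1) * (p + 2) / 6
+  kurt_z    <- (b2 - p * (p + 2)) / sqrt(8 * p * (p + 2) / n)  # approx. standard normal
+  list(skew_stat = skew_stat, skew_df = skew_df,
+       skew_p = pchisq(skew_stat, skew_df, lower.tail = FALSE),
+       kurt_z = kurt_z,
+       kurt_p = 2 * pnorm(abs(kurt_z), lower.tail = FALSE))
+}
+set.seed(10)
+simdata <- data.frame(v1 = rnorm(150), v2 = rnorm(150), v3 = rnorm(150))
+res <- mardia_sketch(simdata)
+print(res)
+</code>
+With three variables the skewness test has 10 degrees of freedom. Both p-values above .05 would be read as no evidence against multivariate normality.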
+You can generate univariate plots as well to evaluate the distributions of the endogenous variables for non-normality. Skewness values approaching 2 or kurtosis values over 7 may be considered indicative of more "significant problems" with non-normality (Curran et al., 1996).
+<code>
+mvn(newdata, univariatePlot="histogram")
+mvn(newdata, univariatePlot="box")
+model <- '
+    interest ~ mastery + perfgoal + ses
+    achieve ~ anxiety + interest + mastery
+    anxiety ~ perfgoal + mastery
+    # variances
+    mastery ~~ mastery
+    perfgoal ~~ perfgoal
+    ses ~~ ses
+    mastery ~~ perfgoal + ses
+    perfgoal ~~ ses
+    interest ~~ interest
+    anxiety ~~ anxiety
+    achieve ~~ achieve
+    interest~~anxiety
+'
+</code>
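+The skewness and kurtosis guideline cited above (Curran et al., 1996) can also be checked numerically instead of by eye. The sketch below uses simulated variables and a made-up helper, skew_kurt; it assumes the cutoff of 7 refers to excess kurtosis (a normal distribution scoring 0), which is worth confirming against the original paper.
+<code>
+skew_kurt <- function(v) {
+  # standardize with the population (divisor n) standard deviation
+  z <- (v - mean(v)) / sqrt(mean((v - mean(v))^2))
+  c(skewness = mean(z^3), excess_kurtosis = mean(z^4) - 3)
+}
+set.seed(42)
+demo <- data.frame(normal_var = rnorm(500), skewed_var = rexp(500)^2)
+stats <- t(sapply(demo, skew_kurt))
+# flag variables that reach either cutoff
+flagged <- abs(stats[, "skewness"]) >= 2 | stats[, "excess_kurtosis"] > 7
+print(cbind(round(stats, 2), flagged = flagged))
+</code>
+The same helper can be applied to the endogenous variables, for example with sapply(newdata, skew_kurt).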
+We will fit the model using the 'estimator' argument and set it equal to "MLM". This will result in the Satorra-Bentler scaled model chi-square being computed. We will also use the 'se' argument and set it to "robust".
+<code>
+fit <- sem(model, data=processdata, estimator = "MLM", se="robust")
+summary(fit,fit.measures=TRUE)
+</code>
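+The effect of the MLM estimator is easiest to see next to an ordinary ML fit. The sketch below uses simulated, deliberately skewed data and a small made-up mediation model (x, m, y are not from the video's data set). It relies on lavaan reporting the scaled statistic under the fit measure names chisq.scaled and chisq.scaling.factor.
+<code>
+library(lavaan)
+set.seed(123)
+n <- 300
+x <- rexp(n)                  # skewed predictor
+m <- 0.5 * x + rexp(n)        # skewed errors
+y <- 0.4 * m + rexp(n)
+skewdata <- data.frame(x, m, y)
+# full mediation model, so there is one degree of freedom to test
+toy <- '
+  m ~ x
+  y ~ m
+'
+fit_ml  <- sem(toy, data = skewdata)
+fit_mlm <- sem(toy, data = skewdata, estimator = "MLM")
+# ordinary and Satorra-Bentler scaled chi-square
+fm <- fitMeasures(fit_mlm, c("chisq", "chisq.scaled", "chisq.scaling.factor"))
+print(fm)
+# same point estimates, different standard errors
+print(parameterEstimates(fit_ml)[1:2, c("lhs", "op", "rhs", "est", "se")])
+print(parameterEstimates(fit_mlm)[1:2, c("lhs", "op", "rhs", "est", "se")])
+</code>
+The scaled chi-square is the ordinary chi-square divided by the scaling correction factor, so a factor above 1 lowers the test statistic.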
+----
+reference
+{{youtube>8r9bUKUVecc?small}}
+see [[https://www.rensvandeschoot.com/tutorials/lme4/|lme4 tutorial]]
+===== Exercise =====
+Using mtcars in R
+<code>
+?mtcars
+mtcars
+str(mtcars)
+df <- mtcars
+</code>
+<code>
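+# model specification
+library(lavaan)
+library(semPlot)
+model <-'
+  mpg ~ hp + gear + cyl + disp + carb + am + wt
+  hp ~ cyl + disp + carb
+'
+# model fit: sem() is the usual wrapper for a path model;
+# df is the copy of mtcars created above
+fit <- sem(model, data = df)
+summary(fit, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)
+semPaths(fit, 'std', layout = 'circle')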
+</code>