{{:r:efa.csv}}
> efa <- read.csv("http://commres.net/wiki/_media/r/efa.csv", header = T)
> str(efa)
'data.frame': 90 obs. of 14 variables:
$ Price : int 4 3 4 4 5 4 3 4 5 4 ...
$ Safety : int 4 5 4 4 5 4 4 3 4 4 ...
$ Exterior_Looks : int 5 3 3 4 4 5 3 4 5 3 ...
$ Space_comfort : int 4 3 4 3 4 3 4 4 4 3 ...
$ Technology : int 3 4 5 3 5 4 3 5 3 5 ...
$ After_Sales_Service: int 4 4 5 4 4 5 5 4 5 4 ...
$ Resale_Value : int 5 3 5 5 5 3 3 5 5 5 ...
$ Fuel_Type : int 4 4 4 5 3 4 4 4 4 5 ...
$ Fuel_Efficiency : int 4 3 5 4 4 3 5 4 4 4 ...
$ Color : int 2 4 4 4 5 2 4 4 4 5 ...
$ Maintenance : int 4 3 5 4 5 3 3 5 4 5 ...
$ Test_drive : int 2 2 4 2 5 2 5 2 2 2 ...
$ Product_reviews : int 4 2 4 5 5 2 2 4 4 2 ...
$ Testimonials : int 3 2 3 3 2 3 4 4 4 4 ...
> names(efa)
[1] "Price" "Safety" "Exterior_Looks" "Space_comfort"
[5] "Technology" "After_Sales_Service" "Resale_Value" "Fuel_Type"
[9] "Fuel_Efficiency" "Color" "Maintenance" "Test_drive"
[13] "Product_reviews" "Testimonials"
>
>
Suppose that we want to run a factor analysis on the efa data with fa() from the psych package. We start with a rough first pass: no rotation and no specified number of factors, in which case fa() extracts a single factor.
> library(psych)
> efa.fa.rough <- fa(efa, rotate="none")
> efa.fa.rough
Factor Analysis using method = minres
Call: fa(r = efa, rotate = "none")
Standardized loadings (pattern matrix) based upon correlation matrix
MR1 h2 u2 com
Price 0.42 0.1755 0.82 1
Safety -0.10 0.0105 0.99 1
Exterior_Looks -0.07 0.0048 1.00 1
Space_comfort 0.25 0.0614 0.94 1
Technology 0.22 0.0479 0.95 1
After_Sales_Service 0.41 0.1687 0.83 1
Resale_Value 0.39 0.1497 0.85 1
Fuel_Type 0.22 0.0489 0.95 1
Fuel_Efficiency 0.72 0.5163 0.48 1
Color 0.39 0.1526 0.85 1
Maintenance 0.59 0.3527 0.65 1
Test_drive 0.29 0.0846 0.92 1
Product_reviews 0.49 0.2443 0.76 1
Testimonials 0.08 0.0060 0.99 1
MR1
SS loadings 2.02
Proportion Var 0.14
Mean item complexity = 1
Test of the hypothesis that 1 factor is sufficient.
The degrees of freedom for the null model are 91 and the objective function was 2.97 with Chi Square of 247.71
The degrees of freedom for the model are 77 and the objective function was 1.94
The root mean square of the residuals (RMSR) is 0.13
The df corrected root mean square of the residuals is 0.14
The harmonic number of observations is 90 with the empirical chi square 287.57 with prob < 4.7e-26
The total number of observations was 90 with Likelihood Chi Square = 160.6 with prob < 7.8e-08
Tucker Lewis Index of factoring reliability = 0.361
RMSEA index = 0.118 and the 90 % confidence intervals are 0.086 0.134
BIC = -185.88
Fit based upon off diagonal values = 0.52
Measures of factor score adequacy
MR1
Correlation of (regression) scores with factors 0.87
Multiple R square of scores with factors 0.76
Minimum correlation of possible factor scores 0.51
>
Next, check the eigenvalues, which are stored in the e.values element of the result.
> efa.fa.rough$e.values
[1] 2.7550607 2.1640701 1.4645469 1.3299030 1.0402907 0.9919870 0.8063453 0.6810294 0.6013657
[10] 0.5536899 0.5136469 0.4665307 0.3566072 0.2749266
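As a quick check, the eigenvalues printed above can be counted against the "eigenvalue greater than 1" rule (the Kaiser criterion). The sketch below re-enters the printed values by hand; the names ev, total, n.kaiser and share are illustrative and not part of the original session. With the data loaded, eigen(cor(efa))$values should give the same numbers, assuming e.values holds the eigenvalues of the original correlation matrix.

```r
# Eigenvalues copied from the efa.fa.rough$e.values output above
ev <- c(2.7550607, 2.1640701, 1.4645469, 1.3299030, 1.0402907,
        0.9919870, 0.8063453, 0.6810294, 0.6013657, 0.5536899,
        0.5136469, 0.4665307, 0.3566072, 0.2749266)

# Eigenvalues of a correlation matrix sum to the number of variables (14 here)
total <- sum(ev)

# Kaiser criterion: count the eigenvalues greater than 1
n.kaiser <- sum(ev > 1)

# Share of the total variance carried by each retained eigenvalue
share <- round(ev[ev > 1] / total, 3)

print(total)
print(n.kaiser)   # 5
print(share)
```

The sixth eigenvalue (0.99) falls just under the cut-off, so the choice is not clear-cut; a scree plot or parallel analysis (for example fa.parallel() in psych) could serve as a cross-check.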
Five eigenvalues are greater than 1, so up to five factors are plausible. We therefore extract five factors, this time with the varimax rotation method.
> efa.fa.5 <- fa(efa, nfactors = 5, rotate="varimax")
> efa.fa.5
Factor Analysis using method = minres
Call: fa(r = efa, nfactors = 5, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
MR1 MR2 MR3 MR5 MR4 h2 u2 com
Price 0.57 0.15 -0.05 -0.04 -0.02 0.35 0.645 1.2
Safety -0.28 0.30 -0.16 0.11 0.05 0.21 0.790 2.9
Exterior_Looks -0.01 0.05 0.20 0.01 -0.54 0.33 0.669 1.3
Space_comfort 0.02 0.87 0.21 0.00 -0.17 0.83 0.172 1.2
Technology 0.04 0.32 0.09 0.13 0.03 0.13 0.873 1.5
After_Sales_Service 0.06 0.37 0.06 0.88 0.04 0.92 0.085 1.4
Resale_Value 0.69 -0.22 -0.17 0.16 0.01 0.59 0.413 1.5
Fuel_Type 0.06 0.54 -0.02 0.08 -0.04 0.30 0.699 1.1
Fuel_Efficiency 0.46 0.09 0.28 0.37 0.23 0.49 0.512 3.2
Color 0.21 -0.05 0.26 0.07 0.74 0.67 0.329 1.4
Maintenance 0.61 0.07 0.07 0.07 0.25 0.45 0.549 1.4
Test_drive 0.09 0.06 0.43 0.23 -0.07 0.26 0.745 1.7
Product_reviews 0.38 0.18 0.41 -0.03 0.07 0.36 0.642 2.4
Testimonials -0.18 0.04 0.66 -0.05 0.02 0.47 0.535 1.2
MR1 MR2 MR3 MR5 MR4
SS loadings 1.72 1.50 1.09 1.03 1.00
Proportion Var 0.12 0.11 0.08 0.07 0.07
Cumulative Var 0.12 0.23 0.31 0.38 0.45
Proportion Explained 0.27 0.24 0.17 0.16 0.16
Cumulative Proportion 0.27 0.51 0.68 0.84 1.00
Mean item complexity = 1.7
Test of the hypothesis that 5 factors are sufficient.
The degrees of freedom for the null model are 91 and the objective function was 2.97 with Chi Square of 247.71
The degrees of freedom for the model are 31 and the objective function was 0.34
The root mean square of the residuals (RMSR) is 0.04
The df corrected root mean square of the residuals is 0.06
The harmonic number of observations is 90 with the empirical chi square 20.2 with prob < 0.93
The total number of observations was 90 with Likelihood Chi Square = 27.44 with prob < 0.65
Tucker Lewis Index of factoring reliability = 1.071
RMSEA index = 0 and the 90 % confidence intervals are 0 0.067
BIC = -112.06
Fit based upon off diagonal values = 0.97
Measures of factor score adequacy
MR1 MR2 MR3 MR5 MR4
Correlation of (regression) scores with factors 0.86 0.91 0.80 0.93 0.82
Multiple R square of scores with factors 0.75 0.82 0.64 0.87 0.67
Minimum correlation of possible factor scores 0.50 0.64 0.27 0.75 0.34
>
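The variance rows in the output above follow directly from the SS loadings. As a small worked check (values copied by hand from the output; the variable names are illustrative), each row can be reproduced as follows:

```r
# SS loadings copied from the five-factor output above
ss <- c(MR1 = 1.72, MR2 = 1.50, MR3 = 1.09, MR5 = 1.03, MR4 = 1.00)
p  <- 14  # number of variables in efa

prop.var  <- ss / p          # "Proportion Var": share of the total variance
cum.var   <- cumsum(ss) / p  # "Cumulative Var"
prop.expl <- ss / sum(ss)    # "Proportion Explained": share of the extracted variance

print(round(rbind(prop.var, cum.var, prop.expl), 2))
```

Read this way, the five factors together account for about 45% of the total variance of the 14 variables.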
Sort the loadings with fa.sort() so that the variables are grouped under the factor on which they load most strongly.
> fa.sort(efa.fa.5)
Factor Analysis using method = minres
Call: fa(r = efa, nfactors = 5, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
MR1 MR2 MR3 MR5 MR4 h2 u2 com
Resale_Value 0.69 -0.22 -0.17 0.16 0.01 0.59 0.413 1.5
Maintenance 0.61 0.07 0.07 0.07 0.25 0.45 0.549 1.4
Price 0.57 0.15 -0.05 -0.04 -0.02 0.35 0.645 1.2
Fuel_Efficiency 0.46 0.09 0.28 0.37 0.23 0.49 0.512 3.2
Space_comfort 0.02 0.87 0.21 0.00 -0.17 0.83 0.172 1.2
Fuel_Type 0.06 0.54 -0.02 0.08 -0.04 0.30 0.699 1.1
Technology 0.04 0.32 0.09 0.13 0.03 0.13 0.873 1.5
Safety -0.28 0.30 -0.16 0.11 0.05 0.21 0.790 2.9
Testimonials -0.18 0.04 0.66 -0.05 0.02 0.47 0.535 1.2
Test_drive 0.09 0.06 0.43 0.23 -0.07 0.26 0.745 1.7
Product_reviews 0.38 0.18 0.41 -0.03 0.07 0.36 0.642 2.4
After_Sales_Service 0.06 0.37 0.06 0.88 0.04 0.92 0.085 1.4
Color 0.21 -0.05 0.26 0.07 0.74 0.67 0.329 1.4
Exterior_Looks -0.01 0.05 0.20 0.01 -0.54 0.33 0.669 1.3
MR1 MR2 MR3 MR5 MR4
SS loadings 1.72 1.50 1.09 1.03 1.00
Proportion Var 0.12 0.11 0.08 0.07 0.07
Cumulative Var 0.12 0.23 0.31 0.38 0.45
Proportion Explained 0.27 0.24 0.17 0.16 0.16
Cumulative Proportion 0.27 0.51 0.68 0.84 1.00
Mean item complexity = 1.7
Test of the hypothesis that 5 factors are sufficient.
The degrees of freedom for the null model are 91 and the objective function was 2.97 with Chi Square of 247.71
The degrees of freedom for the model are 31 and the objective function was 0.34
The root mean square of the residuals (RMSR) is 0.04
The df corrected root mean square of the residuals is 0.06
The harmonic number of observations is 90 with the empirical chi square 20.2 with prob < 0.93
The total number of observations was 90 with Likelihood Chi Square = 27.44 with prob < 0.65
Tucker Lewis Index of factoring reliability = 1.071
RMSEA index = 0 and the 90 % confidence intervals are 0 0.067
BIC = -112.06
Fit based upon off diagonal values = 0.97
Measures of factor score adequacy
MR1 MR2 MR3 MR5 MR4
Correlation of (regression) scores with factors 0.86 0.91 0.80 0.93 0.82
Multiple R square of scores with factors 0.75 0.82 0.64 0.87 0.67
Minimum correlation of possible factor scores 0.50 0.64 0.27 0.75 0.34
>
Looking at the result, the communality (h2) of one variable, "Technology", is below 0.2, which is considered small. We drop this variable and run the factor analysis again. Also, in the five-factor solution only one variable, After_Sales_Service, is sorted under MR5, so that factor rests on a single variable. We therefore extract four factors instead of five.
MR1 MR2 MR3 MR5 MR4 h2 u2 com
Resale_Value 0.69 -0.22 -0.17 0.16 0.01 0.59 0.413 1.5
Maintenance 0.61 0.07 0.07 0.07 0.25 0.45 0.549 1.4
Price 0.57 0.15 -0.05 -0.04 -0.02 0.35 0.645 1.2
Fuel_Efficiency 0.46 0.09 0.28 0.37 0.23 0.49 0.512 3.2
Space_comfort 0.02 0.87 0.21 0.00 -0.17 0.83 0.172 1.2
Fuel_Type 0.06 0.54 -0.02 0.08 -0.04 0.30 0.699 1.1
Technology 0.04 0.32 0.09 0.13 0.03 0.13 0.873 1.5 **
Safety -0.28 0.30 -0.16 0.11 0.05 0.21 0.790 2.9
Testimonials -0.18 0.04 0.66 -0.05 0.02 0.47 0.535 1.2
Test_drive 0.09 0.06 0.43 0.23 -0.07 0.26 0.745 1.7
Product_reviews 0.38 0.18 0.41 -0.03 0.07 0.36 0.642 2.4
After_Sales_Service 0.06 0.37 0.06 0.88 0.04 0.92 0.085 1.4
Color 0.21 -0.05 0.26 0.07 0.74 0.67 0.329 1.4
Exterior_Looks -0.01 0.05 0.20 0.01 -0.54 0.33 0.669 1.3
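To see where the h2 and u2 columns come from, the communality of a variable in an orthogonal (varimax) solution is the sum of its squared loadings, and the uniqueness is one minus that. A minimal worked check for Technology, with the loadings typed in from the table above (tech, h2 and u2 are just local names):

```r
# Varimax loadings of Technology on MR1, MR2, MR3, MR5, MR4 (from the table above)
tech <- c(0.04, 0.32, 0.09, 0.13, 0.03)

h2 <- sum(tech^2)  # communality: variance shared with the factors
u2 <- 1 - h2       # uniqueness: variance the factors leave unexplained

print(round(h2, 2))  # 0.13
print(round(u2, 2))  # 0.87
```

Because u2 = 1 - h2, keeping variables with uniqueness of at most 0.8 is the same as keeping variables with h2 of at least 0.2.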
There are several ways to drop the variable, as shown below.
> efa_a <- subset(efa, select = (efa.fa.5$uniquenesses <= 0.8))
> names(efa_a)
[1] "Price" "Safety" "Exterior_Looks" "Space_comfort"
[5] "After_Sales_Service" "Resale_Value" "Fuel_Type" "Fuel_Efficiency"
[9] "Color" "Maintenance" "Test_drive" "Product_reviews"
[13] "Testimonials"
> efa_ab <- subset(efa, select = -(Technology))
> names(efa_a) == names(efa_ab)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> drops <- c("Technology")
> efa_ac <- efa[, !(names(efa) %in% drops)]
> names(efa_a) == names(efa_ac)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
>
We can also check the position of the "Technology" variable in the efa data set and drop it by index. It is the fifth column.
> names(efa)
[1] "Price" "Safety" "Exterior_Looks" "Space_comfort"
[5] "Technology" "After_Sales_Service" "Resale_Value" "Fuel_Type"
[9] "Fuel_Efficiency" "Color" "Maintenance" "Test_drive"
[13] "Product_reviews" "Testimonials"
> efa_ad <- efa[,-5]
> names(efa_a) == names(efa_ad)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
>
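The same column-dropping idioms can be tried on a small made-up data frame without downloading anything. In the sketch below, toy and its values are hypothetical; identical() and setdiff() are used as a more compact alternative to comparing names element by element.

```r
# Hypothetical three-column data frame standing in for efa
toy <- data.frame(Price = c(4, 3, 4),
                  Technology = c(3, 4, 5),
                  Color = c(2, 4, 4))

drops <- c("Technology")                        # columns to remove, by name

by.subset <- subset(toy, select = -Technology)  # drop by name with subset()
by.names  <- toy[, !(names(toy) %in% drops)]    # drop by name with %in%
by.index  <- toy[, -2]                          # drop by position

# identical() returns a single TRUE/FALSE instead of one value per column
print(identical(names(by.subset), names(by.names)))
print(identical(names(by.names), names(by.index)))

# setdiff() reports which column was removed
dropped <- setdiff(names(toy), names(by.index))
print(dropped)
```

Dropping by name is generally safer than dropping by position, since an index such as -5 silently removes the wrong column if the column order changes. When only one column would remain, adding drop = FALSE inside the brackets keeps the result a data frame.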
Then we run fa() again on the efa_a data, this time extracting four factors.
> efa_a.fa.4 <- fa(efa_a, nfactors=4, rotate="varimax")
> efa_a.fa.4
Factor Analysis using method = minres
Call: fa(r = efa_a, nfactors = 4, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
MR1 MR2 MR3 MR4 h2 u2 com
Price 0.54 0.11 0.01 -0.05 0.31 0.69 1.1
Safety -0.27 0.40 -0.16 0.07 0.26 0.74 2.2
Exterior_Looks -0.01 0.02 0.19 -0.56 0.35 0.65 1.2
Space_comfort -0.01 0.74 0.26 -0.22 0.67 0.33 1.4
After_Sales_Service 0.17 0.49 0.14 0.12 0.30 0.70 1.6
Resale_Value 0.72 -0.14 -0.13 0.06 0.55 0.45 1.2
Fuel_Type 0.06 0.55 0.03 -0.07 0.31 0.69 1.1
Fuel_Efficiency 0.50 0.23 0.33 0.30 0.50 0.50 3.0
Color 0.20 -0.06 0.27 0.69 0.59 0.41 1.5
Maintenance 0.60 0.02 0.12 0.22 0.43 0.57 1.4
Test_drive 0.09 0.13 0.42 -0.02 0.20 0.80 1.3
Product_reviews 0.33 0.12 0.44 0.03 0.32 0.68 2.0
Testimonials -0.24 -0.04 0.67 0.00 0.51 0.49 1.3
MR1 MR2 MR3 MR4
SS loadings 1.73 1.37 1.18 1.00
Proportion Var 0.13 0.11 0.09 0.08
Cumulative Var 0.13 0.24 0.33 0.41
Proportion Explained 0.33 0.26 0.22 0.19
Cumulative Proportion 0.33 0.59 0.81 1.00
Mean item complexity = 1.6
Test of the hypothesis that 4 factors are sufficient.
The degrees of freedom for the null model are 78 and the objective function was 2.76 with Chi Square of 231.56
The degrees of freedom for the model are 32 and the objective function was 0.48
The root mean square of the residuals (RMSR) is 0.05
The df corrected root mean square of the residuals is 0.07
The harmonic number of observations is 90 with the empirical chi square 29.46 with prob < 0.6
The total number of observations was 90 with Likelihood Chi Square = 38.68 with prob < 0.19
Tucker Lewis Index of factoring reliability = 0.889
RMSEA index = 0.06 and the 90 % confidence intervals are 0 0.097
BIC = -105.32
Fit based upon off diagonal values = 0.95
Measures of factor score adequacy
MR1 MR2 MR3 MR4
Correlation of (regression) scores with factors 0.86 0.85 0.81 0.80
Multiple R square of scores with factors 0.74 0.72 0.66 0.64
Minimum correlation of possible factor scores 0.49 0.44 0.32 0.28
>
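The degrees of freedom and BIC lines in the three fits shown so far can be checked by hand. The sketch below uses the standard degrees-of-freedom formula for an exploratory factor model; the BIC expression (chi square minus df times log N) is an assumption that happens to reproduce the printed values to within rounding. The helper names model.df, null.df and bic are hypothetical.

```r
# Degrees of freedom of a k-factor model on p variables
model.df <- function(p, k) ((p - k)^2 - (p + k)) / 2

# Null model: one free parameter per off-diagonal correlation
null.df <- function(p) p * (p - 1) / 2

# Assumed BIC form: likelihood chi square minus df * log(N)
bic <- function(chisq, df, n) chisq - df * log(n)

n <- 90  # number of observations

print(model.df(14, 1))  # 77
print(model.df(14, 5))  # 31
print(model.df(13, 4))  # 32
print(null.df(14))      # 91
print(null.df(13))      # 78

# Compare with the printed BIC values -185.88, -112.06 and -105.32
print(round(bic(160.60, 77, n), 2))
print(round(bic(27.44, 31, n), 2))
print(round(bic(38.68, 32, n), 2))
```

Note that the four-factor model is fitted to 13 variables and the five-factor model to 14, so their chi square and BIC values are not directly comparable as nested models.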
> fa.sort(efa_a.fa.4)
Factor Analysis using method = minres
Call: fa(r = efa_a, nfactors = 4, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
MR1 MR2 MR3 MR4 h2 u2 com
Resale_Value 0.72 -0.14 -0.13 0.06 0.55 0.45 1.2
Maintenance 0.60 0.02 0.12 0.22 0.43 0.57 1.4
Price 0.54 0.11 0.01 -0.05 0.31 0.69 1.1
Fuel_Efficiency 0.50 0.23 0.33 0.30 0.50 0.50 3.0
Space_comfort -0.01 0.74 0.26 -0.22 0.67 0.33 1.4
Fuel_Type 0.06 0.55 0.03 -0.07 0.31 0.69 1.1
After_Sales_Service 0.17 0.49 0.14 0.12 0.30 0.70 1.6
Safety -0.27 0.40 -0.16 0.07 0.26 0.74 2.2
Testimonials -0.24 -0.04 0.67 0.00 0.51 0.49 1.3
Product_reviews 0.33 0.12 0.44 0.03 0.32 0.68 2.0
Test_drive 0.09 0.13 0.42 -0.02 0.20 0.80 1.3
Color 0.20 -0.06 0.27 0.69 0.59 0.41 1.5
Exterior_Looks -0.01 0.02 0.19 -0.56 0.35 0.65 1.2
MR1 MR2 MR3 MR4
SS loadings 1.73 1.37 1.18 1.00
Proportion Var 0.13 0.11 0.09 0.08
Cumulative Var 0.13 0.24 0.33 0.41
Proportion Explained 0.33 0.26 0.22 0.19
Cumulative Proportion 0.33 0.59 0.81 1.00
Mean item complexity = 1.6
Test of the hypothesis that 4 factors are sufficient.
The degrees of freedom for the null model are 78 and the objective function was 2.76 with Chi Square of 231.56
The degrees of freedom for the model are 32 and the objective function was 0.48
The root mean square of the residuals (RMSR) is 0.05
The df corrected root mean square of the residuals is 0.07
The harmonic number of observations is 90 with the empirical chi square 29.46 with prob < 0.6
The total number of observations was 90 with Likelihood Chi Square = 38.68 with prob < 0.19
Tucker Lewis Index of factoring reliability = 0.889
RMSEA index = 0.06 and the 90 % confidence intervals are 0 0.097
BIC = -105.32
Fit based upon off diagonal values = 0.95
Measures of factor score adequacy
MR1 MR2 MR3 MR4
Correlation of (regression) scores with factors 0.86 0.85 0.81 0.80
Multiple R square of scores with factors 0.74 0.72 0.66 0.64
Minimum correlation of possible factor scores 0.49 0.44 0.32 0.28
>
We obtain four factors, which might be named as follows:
* MR1: economic factor
* MR2: convenience factor
* MR3: information (review) factor
* MR4: look factor
MR1 MR2 MR3 MR4 h2 u2 com
Resale_Value 0.72 -0.14 -0.13 0.06 0.55 0.45 1.2
Maintenance 0.60 0.02 0.12 0.22 0.43 0.57 1.4
Price 0.54 0.11 0.01 -0.05 0.31 0.69 1.1
Fuel_Efficiency 0.50 0.23 0.33 0.30 0.50 0.50 3.0
----
Space_comfort -0.01 0.74 0.26 -0.22 0.67 0.33 1.4
Fuel_Type 0.06 0.55 0.03 -0.07 0.31 0.69 1.1
After_Sales_Service 0.17 0.49 0.14 0.12 0.30 0.70 1.6
Safety -0.27 0.40 -0.16 0.07 0.26 0.74 2.2
----
Testimonials -0.24 -0.04 0.67 0.00 0.51 0.49 1.3
Product_reviews 0.33 0.12 0.44 0.03 0.32 0.68 2.0
Test_drive 0.09 0.13 0.42 -0.02 0.20 0.80 1.3
----
Color 0.20 -0.06 0.27 0.69 0.59 0.41 1.5
Exterior_Looks -0.01 0.02 0.19 -0.56 0.35 0.65 1.2
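The grouping above can be reproduced by assigning each variable to the factor on which its absolute loading is largest. The following is a minimal sketch with the four-factor loadings typed in by hand from the output; the object names (vars, L, assigned, groups, cross) and the 0.3 cross-loading threshold are illustrative choices, not part of the original analysis.

```r
vars <- c("Price", "Safety", "Exterior_Looks", "Space_comfort",
          "After_Sales_Service", "Resale_Value", "Fuel_Type",
          "Fuel_Efficiency", "Color", "Maintenance", "Test_drive",
          "Product_reviews", "Testimonials")

# Loadings copied from the efa_a.fa.4 output (one row per variable)
L <- matrix(c(
   0.54,  0.11,  0.01, -0.05,
  -0.27,  0.40, -0.16,  0.07,
  -0.01,  0.02,  0.19, -0.56,
  -0.01,  0.74,  0.26, -0.22,
   0.17,  0.49,  0.14,  0.12,
   0.72, -0.14, -0.13,  0.06,
   0.06,  0.55,  0.03, -0.07,
   0.50,  0.23,  0.33,  0.30,
   0.20, -0.06,  0.27,  0.69,
   0.60,  0.02,  0.12,  0.22,
   0.09,  0.13,  0.42, -0.02,
   0.33,  0.12,  0.44,  0.03,
  -0.24, -0.04,  0.67,  0.00),
  ncol = 4, byrow = TRUE,
  dimnames = list(vars, c("MR1", "MR2", "MR3", "MR4")))

# Assign each variable to the factor with the largest absolute loading
assigned <- colnames(L)[apply(abs(L), 1, which.max)]
names(assigned) <- rownames(L)

# List the variables under each factor
groups <- split(names(assigned), assigned)
print(groups)

# Flag cross-loadings: more than one absolute loading of 0.3 or above
cross <- rownames(L)[rowSums(abs(L) >= 0.3) > 1]
print(cross)
```

Two points are worth keeping in mind when naming the factors. Exterior_Looks loads negatively on MR4 while Color loads positively, so the two move in opposite directions on the "look" factor. Fuel_Efficiency and Product_reviews are flagged as cross-loading, which matches their higher com values in the output.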