{{:r:efa.csv}}

> efa <- read.csv("http://commres.net/wiki/_media/r/efa.csv", header = T)
> str(efa)
'data.frame': 90 obs. of 14 variables:
 $ Price              : int 4 3 4 4 5 4 3 4 5 4 ...
 $ Safety             : int 4 5 4 4 5 4 4 3 4 4 ...
 $ Exterior_Looks     : int 5 3 3 4 4 5 3 4 5 3 ...
 $ Space_comfort      : int 4 3 4 3 4 3 4 4 4 3 ...
 $ Technology         : int 3 4 5 3 5 4 3 5 3 5 ...
 $ After_Sales_Service: int 4 4 5 4 4 5 5 4 5 4 ...
 $ Resale_Value       : int 5 3 5 5 5 3 3 5 5 5 ...
 $ Fuel_Type          : int 4 4 4 5 3 4 4 4 4 5 ...
 $ Fuel_Efficiency    : int 4 3 5 4 4 3 5 4 4 4 ...
 $ Color              : int 2 4 4 4 5 2 4 4 4 5 ...
 $ Maintenance        : int 4 3 5 4 5 3 3 5 4 5 ...
 $ Test_drive         : int 2 2 4 2 5 2 5 2 2 2 ...
 $ Product_reviews    : int 4 2 4 5 5 2 2 4 4 2 ...
 $ Testimonials       : int 3 2 3 3 2 3 4 4 4 4 ...
> names(efa)
 [1] "Price"           "Safety"      "Exterior_Looks"      "Space_comfort"
 [5] "Technology"      "After_Sales_Service" "Resale_Value" "Fuel_Type"
 [9] "Fuel_Efficiency" "Color"       "Maintenance"         "Test_drive"
[13] "Product_reviews" "Testimonials"
>

Suppose we want to run a factor analysis on the efa data. The fa() and fa.sort() functions used below come from the psych package. We start with a rough first pass: no rotation and no specified number of factors, in which case fa() extracts a single factor.

> library(psych)
> efa.fa.rough <- fa(efa, rotate="none")
> efa.fa.rough
Factor Analysis using method = minres
Call: fa(r = efa, rotate = "none")
Standardized loadings (pattern matrix) based upon correlation matrix
                      MR1     h2   u2 com
Price                0.42 0.1755 0.82   1
Safety              -0.10 0.0105 0.99   1
Exterior_Looks      -0.07 0.0048 1.00   1
Space_comfort        0.25 0.0614 0.94   1
Technology           0.22 0.0479 0.95   1
After_Sales_Service  0.41 0.1687 0.83   1
Resale_Value         0.39 0.1497 0.85   1
Fuel_Type            0.22 0.0489 0.95   1
Fuel_Efficiency      0.72 0.5163 0.48   1
Color                0.39 0.1526 0.85   1
Maintenance          0.59 0.3527 0.65   1
Test_drive           0.29 0.0846 0.92   1
Product_reviews      0.49 0.2443 0.76   1
Testimonials         0.08 0.0060 0.99   1

                MR1
SS loadings    2.02
Proportion Var 0.14

Mean item complexity = 1
Test of the hypothesis that 1 factor is sufficient.

The degrees of freedom for the null model are 91 and the objective function was 2.97 with Chi Square of 247.71
The degrees of freedom for the model are 77 and the objective function was 1.94

The root mean square of the residuals (RMSR) is 0.13
The df corrected root mean square of the residuals is 0.14

The harmonic number of observations is 90 with the empirical chi square 287.57 with prob < 4.7e-26
The total number of observations was 90 with Likelihood Chi Square = 160.6 with prob < 7.8e-08

Tucker Lewis Index of factoring reliability = 0.361
RMSEA index = 0.118 and the 90 % confidence intervals are 0.086 0.134
BIC = -185.88
Fit based upon off diagonal values = 0.52
Measures of factor score adequacy
                                                 MR1
Correlation of (regression) scores with factors 0.87
Multiple R square of scores with factors        0.76
Minimum correlation of possible factor scores   0.51
>

Next, check the eigenvalues, which are stored in the e.values element of the result.

> efa.fa.rough$e.values
 [1] 2.7550607 2.1640701 1.4645469 1.3299030 1.0402907 0.9919870 0.8063453 0.6810294 0.6013657
[10] 0.5536899 0.5136469 0.4665307 0.3566072 0.2749266

Five eigenvalues are greater than 1, which suggests up to five factors. We therefore extract five factors with varimax rotation.
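The eigenvalue rule used here (keep factors whose eigenvalue exceeds 1, often called the Kaiser criterion) can be checked in a few lines of base R. The sketch below simply re-enters the e.values printed above; the names `n.factors` and `prop.var` are illustrative and not part of the original session. With the data loaded, `eigen(cor(efa))$values` should give the same numbers, since e.values are the eigenvalues of the correlation matrix.

```r
# Eigenvalues copied from the efa.fa.rough$e.values output above
e.values <- c(2.7550607, 2.1640701, 1.4645469, 1.3299030, 1.0402907,
              0.9919870, 0.8063453, 0.6810294, 0.6013657, 0.5536899,
              0.5136469, 0.4665307, 0.3566072, 0.2749266)

# Kaiser criterion: count the eigenvalues greater than 1
n.factors <- sum(e.values > 1)
print(n.factors)

# Eigenvalues of a correlation matrix sum to the number of variables (14 here),
# so each eigenvalue divided by 14 is a share of the total variance
prop.var <- e.values / length(e.values)
print(round(cumsum(prop.var), 2))
```

The count is only a starting point: the sixth eigenvalue (0.99) is very close to the cutoff, so the number of factors is still worth revisiting after looking at the loadings.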
> efa.fa.5 <- fa(efa, nfactors = 5, rotate="varimax")
> efa.fa.5
Factor Analysis using method = minres
Call: fa(r = efa, nfactors = 5, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
                      MR1   MR2   MR3   MR5   MR4   h2    u2 com
Price                0.57  0.15 -0.05 -0.04 -0.02 0.35 0.645 1.2
Safety              -0.28  0.30 -0.16  0.11  0.05 0.21 0.790 2.9
Exterior_Looks      -0.01  0.05  0.20  0.01 -0.54 0.33 0.669 1.3
Space_comfort        0.02  0.87  0.21  0.00 -0.17 0.83 0.172 1.2
Technology           0.04  0.32  0.09  0.13  0.03 0.13 0.873 1.5
After_Sales_Service  0.06  0.37  0.06  0.88  0.04 0.92 0.085 1.4
Resale_Value         0.69 -0.22 -0.17  0.16  0.01 0.59 0.413 1.5
Fuel_Type            0.06  0.54 -0.02  0.08 -0.04 0.30 0.699 1.1
Fuel_Efficiency      0.46  0.09  0.28  0.37  0.23 0.49 0.512 3.2
Color                0.21 -0.05  0.26  0.07  0.74 0.67 0.329 1.4
Maintenance          0.61  0.07  0.07  0.07  0.25 0.45 0.549 1.4
Test_drive           0.09  0.06  0.43  0.23 -0.07 0.26 0.745 1.7
Product_reviews      0.38  0.18  0.41 -0.03  0.07 0.36 0.642 2.4
Testimonials        -0.18  0.04  0.66 -0.05  0.02 0.47 0.535 1.2

                       MR1  MR2  MR3  MR5  MR4
SS loadings           1.72 1.50 1.09 1.03 1.00
Proportion Var        0.12 0.11 0.08 0.07 0.07
Cumulative Var        0.12 0.23 0.31 0.38 0.45
Proportion Explained  0.27 0.24 0.17 0.16 0.16
Cumulative Proportion 0.27 0.51 0.68 0.84 1.00

Mean item complexity = 1.7
Test of the hypothesis that 5 factors are sufficient.

The degrees of freedom for the null model are 91 and the objective function was 2.97 with Chi Square of 247.71
The degrees of freedom for the model are 31 and the objective function was 0.34

The root mean square of the residuals (RMSR) is 0.04
The df corrected root mean square of the residuals is 0.06

The harmonic number of observations is 90 with the empirical chi square 20.2 with prob < 0.93
The total number of observations was 90 with Likelihood Chi Square = 27.44 with prob < 0.65

Tucker Lewis Index of factoring reliability = 1.071
RMSEA index = 0 and the 90 % confidence intervals are 0 0.067
BIC = -112.06
Fit based upon off diagonal values = 0.97
Measures of factor score adequacy
                                                 MR1  MR2  MR3  MR5  MR4
Correlation of (regression) scores with factors 0.86 0.91 0.80 0.93 0.82
Multiple R square of scores with factors        0.75 0.82 0.64 0.87 0.67
Minimum correlation of possible factor scores   0.50 0.64 0.27 0.75 0.34
>

Sort the loadings with fa.sort() so that the variables are grouped by the factor on which they load most strongly.

> fa.sort(efa.fa.5)
Factor Analysis using method = minres
Call: fa(r = efa, nfactors = 5, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
                      MR1   MR2   MR3   MR5   MR4   h2    u2 com
Resale_Value         0.69 -0.22 -0.17  0.16  0.01 0.59 0.413 1.5
Maintenance          0.61  0.07  0.07  0.07  0.25 0.45 0.549 1.4
Price                0.57  0.15 -0.05 -0.04 -0.02 0.35 0.645 1.2
Fuel_Efficiency      0.46  0.09  0.28  0.37  0.23 0.49 0.512 3.2
Space_comfort        0.02  0.87  0.21  0.00 -0.17 0.83 0.172 1.2
Fuel_Type            0.06  0.54 -0.02  0.08 -0.04 0.30 0.699 1.1
Technology           0.04  0.32  0.09  0.13  0.03 0.13 0.873 1.5
Safety              -0.28  0.30 -0.16  0.11  0.05 0.21 0.790 2.9
Testimonials        -0.18  0.04  0.66 -0.05  0.02 0.47 0.535 1.2
Test_drive           0.09  0.06  0.43  0.23 -0.07 0.26 0.745 1.7
Product_reviews      0.38  0.18  0.41 -0.03  0.07 0.36 0.642 2.4
After_Sales_Service  0.06  0.37  0.06  0.88  0.04 0.92 0.085 1.4
Color                0.21 -0.05  0.26  0.07  0.74 0.67 0.329 1.4
Exterior_Looks      -0.01  0.05  0.20  0.01 -0.54 0.33 0.669 1.3

                       MR1  MR2  MR3  MR5  MR4
SS loadings           1.72 1.50 1.09 1.03 1.00
Proportion Var        0.12 0.11 0.08 0.07 0.07
Cumulative Var        0.12 0.23 0.31 0.38 0.45
Proportion Explained  0.27 0.24 0.17 0.16 0.16
Cumulative Proportion 0.27 0.51 0.68 0.84 1.00

Mean item complexity = 1.7
Test of the hypothesis that 5 factors are sufficient.

The degrees of freedom for the null model are 91 and the objective function was 2.97 with Chi Square of 247.71
The degrees of freedom for the model are 31 and the objective function was 0.34

The root mean square of the residuals (RMSR) is 0.04
The df corrected root mean square of the residuals is 0.06

The harmonic number of observations is 90 with the empirical chi square 20.2 with prob < 0.93
The total number of observations was 90 with Likelihood Chi Square = 27.44 with prob < 0.65

Tucker Lewis Index of factoring reliability = 1.071
RMSEA index = 0 and the 90 % confidence intervals are 0 0.067
BIC = -112.06
Fit based upon off diagonal values = 0.97
Measures of factor score adequacy
                                                 MR1  MR2  MR3  MR5  MR4
Correlation of (regression) scores with factors 0.86 0.91 0.80 0.93 0.82
Multiple R square of scores with factors        0.75 0.82 0.64 0.87 0.67
Minimum correlation of possible factor scores   0.50 0.64 0.27 0.75 0.34
>

Looking at the result, the communality (h2) of one variable, "Technology", is 0.13. This is below 0.2, which is considered small, so we remove this variable and run the factor analysis again. In addition, in the five-factor solution one factor (MR5) is defined by a single variable, After_Sales_Service. We therefore extract 4 factors instead of 5.
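To make the h2 check concrete: for an orthogonal rotation such as varimax, a variable's communality is the sum of its squared loadings across the factors, and the uniqueness is u2 = 1 - h2. The sketch below recomputes both for four of the variables, using the rounded loadings printed in the five-factor output, so the results agree with the reported h2 only to about two decimals. The matrix `L` and the vector `low` are illustrative names, not objects from the original session.

```r
# Rounded varimax loadings copied from the five-factor output (four variables only)
L <- matrix(c( 0.57, 0.15, -0.05, -0.04, -0.02,   # Price
              -0.28, 0.30, -0.16,  0.11,  0.05,   # Safety
               0.02, 0.87,  0.21,  0.00, -0.17,   # Space_comfort
               0.04, 0.32,  0.09,  0.13,  0.03),  # Technology
            ncol = 5, byrow = TRUE,
            dimnames = list(c("Price", "Safety", "Space_comfort", "Technology"),
                            c("MR1", "MR2", "MR3", "MR5", "MR4")))

h2 <- rowSums(L^2)   # communality: variance shared with the factors
u2 <- 1 - h2         # uniqueness: variance left unexplained
print(round(cbind(h2, u2), 2))

# Flag variables whose communality falls below the 0.2 cutoff used in the text
low <- names(h2)[h2 < 0.2]
print(low)
```

Because h2 + u2 = 1, selecting variables with uniqueness at or below 0.8 is the same filter as keeping those with communality of at least 0.2.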
                      MR1   MR2   MR3   MR5   MR4   h2    u2 com
Resale_Value         0.69 -0.22 -0.17  0.16  0.01 0.59 0.413 1.5
Maintenance          0.61  0.07  0.07  0.07  0.25 0.45 0.549 1.4
Price                0.57  0.15 -0.05 -0.04 -0.02 0.35 0.645 1.2
Fuel_Efficiency      0.46  0.09  0.28  0.37  0.23 0.49 0.512 3.2
Space_comfort        0.02  0.87  0.21  0.00 -0.17 0.83 0.172 1.2
Fuel_Type            0.06  0.54 -0.02  0.08 -0.04 0.30 0.699 1.1
Technology           0.04  0.32  0.09  0.13  0.03 0.13 0.873 1.5 **
Safety              -0.28  0.30 -0.16  0.11  0.05 0.21 0.790 2.9
Testimonials        -0.18  0.04  0.66 -0.05  0.02 0.47 0.535 1.2
Test_drive           0.09  0.06  0.43  0.23 -0.07 0.26 0.745 1.7
Product_reviews      0.38  0.18  0.41 -0.03  0.07 0.36 0.642 2.4
After_Sales_Service  0.06  0.37  0.06  0.88  0.04 0.92 0.085 1.4
Color                0.21 -0.05  0.26  0.07  0.74 0.67 0.329 1.4
Exterior_Looks      -0.01  0.05  0.20  0.01 -0.54 0.33 0.669 1.3

We can drop the variable in several ways, as shown below.

> efa_a <- subset(efa, select = (efa.fa.5$uniquenesses <= 0.8))
> names(efa_a)
 [1] "Price"               "Safety"       "Exterior_Looks" "Space_comfort"
 [5] "After_Sales_Service" "Resale_Value" "Fuel_Type"      "Fuel_Efficiency"
 [9] "Color"               "Maintenance"  "Test_drive"     "Product_reviews"
[13] "Testimonials"
> efa_ab <- subset(efa, select = -(Technology))
> names(efa_a) == names(efa_ab)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
> drops <- c("Technology")
> efa_ac <- efa[, !(names(efa) %in% drops)]
> names(efa_a) == names(efa_ac)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
>

We can also check where the "Technology" variable is located in the efa data set and drop it by position. It is the fifth column.

> names(efa)
 [1] "Price"           "Safety"      "Exterior_Looks"      "Space_comfort"
 [5] "Technology"      "After_Sales_Service" "Resale_Value" "Fuel_Type"
 [9] "Fuel_Efficiency" "Color"       "Maintenance"         "Test_drive"
[13] "Product_reviews" "Testimonials"
> efa_ad <- efa[,-5]
> names(efa_a) == names(efa_ad)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
>

Then we run fa() again on the efa_a data.
> efa_a.fa.4 <- fa(efa_a, nfactors=4, rotate="varimax")
> efa_a.fa.4
Factor Analysis using method = minres
Call: fa(r = efa_a, nfactors = 4, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
                      MR1   MR2   MR3   MR4   h2   u2 com
Price                0.54  0.11  0.01 -0.05 0.31 0.69 1.1
Safety              -0.27  0.40 -0.16  0.07 0.26 0.74 2.2
Exterior_Looks      -0.01  0.02  0.19 -0.56 0.35 0.65 1.2
Space_comfort       -0.01  0.74  0.26 -0.22 0.67 0.33 1.4
After_Sales_Service  0.17  0.49  0.14  0.12 0.30 0.70 1.6
Resale_Value         0.72 -0.14 -0.13  0.06 0.55 0.45 1.2
Fuel_Type            0.06  0.55  0.03 -0.07 0.31 0.69 1.1
Fuel_Efficiency      0.50  0.23  0.33  0.30 0.50 0.50 3.0
Color                0.20 -0.06  0.27  0.69 0.59 0.41 1.5
Maintenance          0.60  0.02  0.12  0.22 0.43 0.57 1.4
Test_drive           0.09  0.13  0.42 -0.02 0.20 0.80 1.3
Product_reviews      0.33  0.12  0.44  0.03 0.32 0.68 2.0
Testimonials        -0.24 -0.04  0.67  0.00 0.51 0.49 1.3

                       MR1  MR2  MR3  MR4
SS loadings           1.73 1.37 1.18 1.00
Proportion Var        0.13 0.11 0.09 0.08
Cumulative Var        0.13 0.24 0.33 0.41
Proportion Explained  0.33 0.26 0.22 0.19
Cumulative Proportion 0.33 0.59 0.81 1.00

Mean item complexity = 1.6
Test of the hypothesis that 4 factors are sufficient.

The degrees of freedom for the null model are 78 and the objective function was 2.76 with Chi Square of 231.56
The degrees of freedom for the model are 32 and the objective function was 0.48

The root mean square of the residuals (RMSR) is 0.05
The df corrected root mean square of the residuals is 0.07

The harmonic number of observations is 90 with the empirical chi square 29.46 with prob < 0.6
The total number of observations was 90 with Likelihood Chi Square = 38.68 with prob < 0.19

Tucker Lewis Index of factoring reliability = 0.889
RMSEA index = 0.06 and the 90 % confidence intervals are 0 0.097
BIC = -105.32
Fit based upon off diagonal values = 0.95
Measures of factor score adequacy
                                                 MR1  MR2  MR3  MR4
Correlation of (regression) scores with factors 0.86 0.85 0.81 0.80
Multiple R square of scores with factors        0.74 0.72 0.66 0.64
Minimum correlation of possible factor scores   0.49 0.44 0.32 0.28
>
> fa.sort(efa_a.fa.4)
Factor Analysis using method = minres
Call: fa(r = efa_a, nfactors = 4, rotate = "varimax")
Standardized loadings (pattern matrix) based upon correlation matrix
                      MR1   MR2   MR3   MR4   h2   u2 com
Resale_Value         0.72 -0.14 -0.13  0.06 0.55 0.45 1.2
Maintenance          0.60  0.02  0.12  0.22 0.43 0.57 1.4
Price                0.54  0.11  0.01 -0.05 0.31 0.69 1.1
Fuel_Efficiency      0.50  0.23  0.33  0.30 0.50 0.50 3.0
Space_comfort       -0.01  0.74  0.26 -0.22 0.67 0.33 1.4
Fuel_Type            0.06  0.55  0.03 -0.07 0.31 0.69 1.1
After_Sales_Service  0.17  0.49  0.14  0.12 0.30 0.70 1.6
Safety              -0.27  0.40 -0.16  0.07 0.26 0.74 2.2
Testimonials        -0.24 -0.04  0.67  0.00 0.51 0.49 1.3
Product_reviews      0.33  0.12  0.44  0.03 0.32 0.68 2.0
Test_drive           0.09  0.13  0.42 -0.02 0.20 0.80 1.3
Color                0.20 -0.06  0.27  0.69 0.59 0.41 1.5
Exterior_Looks      -0.01  0.02  0.19 -0.56 0.35 0.65 1.2

                       MR1  MR2  MR3  MR4
SS loadings           1.73 1.37 1.18 1.00
Proportion Var        0.13 0.11 0.09 0.08
Cumulative Var        0.13 0.24 0.33 0.41
Proportion Explained  0.33 0.26 0.22 0.19
Cumulative Proportion 0.33 0.59 0.81 1.00

Mean item complexity = 1.6
Test of the hypothesis that 4 factors are sufficient.

The degrees of freedom for the null model are 78 and the objective function was 2.76 with Chi Square of 231.56
The degrees of freedom for the model are 32 and the objective function was 0.48

The root mean square of the residuals (RMSR) is 0.05
The df corrected root mean square of the residuals is 0.07

The harmonic number of observations is 90 with the empirical chi square 29.46 with prob < 0.6
The total number of observations was 90 with Likelihood Chi Square = 38.68 with prob < 0.19

Tucker Lewis Index of factoring reliability = 0.889
RMSEA index = 0.06 and the 90 % confidence intervals are 0 0.097
BIC = -105.32
Fit based upon off diagonal values = 0.95
Measures of factor score adequacy
                                                 MR1  MR2  MR3  MR4
Correlation of (regression) scores with factors 0.86 0.85 0.81 0.80
Multiple R square of scores with factors        0.74 0.72 0.66 0.64
Minimum correlation of possible factor scores   0.49 0.44 0.32 0.28
>

We obtain four factors, which might be named as follows:
  * MR1: economic factor
  * MR2: convenience factor
  * MR3: information (review) factor
  * MR4: looks factor

                      MR1   MR2   MR3   MR4   h2   u2 com
Resale_Value         0.72 -0.14 -0.13  0.06 0.55 0.45 1.2
Maintenance          0.60  0.02  0.12  0.22 0.43 0.57 1.4
Price                0.54  0.11  0.01 -0.05 0.31 0.69 1.1
Fuel_Efficiency      0.50  0.23  0.33  0.30 0.50 0.50 3.0
----
Space_comfort       -0.01  0.74  0.26 -0.22 0.67 0.33 1.4
Fuel_Type            0.06  0.55  0.03 -0.07 0.31 0.69 1.1
After_Sales_Service  0.17  0.49  0.14  0.12 0.30 0.70 1.6
Safety              -0.27  0.40 -0.16  0.07 0.26 0.74 2.2
----
Testimonials        -0.24 -0.04  0.67  0.00 0.51 0.49 1.3
Product_reviews      0.33  0.12  0.44  0.03 0.32 0.68 2.0
Test_drive           0.09  0.13  0.42 -0.02 0.20 0.80 1.3
----
Color                0.20 -0.06  0.27  0.69 0.59 0.41 1.5
Exterior_Looks      -0.01  0.02  0.19 -0.56 0.35 0.65 1.2
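
The grouping shown in the table can be reproduced by assigning each variable to the factor on which its absolute loading is largest. The following is a minimal base-R sketch that re-enters the rounded loadings from the four-factor output; `vars`, `L`, `best`, and `groups` are illustrative names and not objects from the original session.

```r
# Rounded varimax loadings copied from the four-factor output above
vars <- c("Resale_Value", "Maintenance", "Price", "Fuel_Efficiency",
          "Space_comfort", "Fuel_Type", "After_Sales_Service", "Safety",
          "Testimonials", "Product_reviews", "Test_drive",
          "Color", "Exterior_Looks")
L <- matrix(c( 0.72, -0.14, -0.13,  0.06,
               0.60,  0.02,  0.12,  0.22,
               0.54,  0.11,  0.01, -0.05,
               0.50,  0.23,  0.33,  0.30,
              -0.01,  0.74,  0.26, -0.22,
               0.06,  0.55,  0.03, -0.07,
               0.17,  0.49,  0.14,  0.12,
              -0.27,  0.40, -0.16,  0.07,
              -0.24, -0.04,  0.67,  0.00,
               0.33,  0.12,  0.44,  0.03,
               0.09,  0.13,  0.42, -0.02,
               0.20, -0.06,  0.27,  0.69,
              -0.01,  0.02,  0.19, -0.56),
            ncol = 4, byrow = TRUE,
            dimnames = list(vars, c("MR1", "MR2", "MR3", "MR4")))

# For each variable, pick the factor with the largest absolute loading
best <- colnames(L)[apply(abs(L), 1, which.max)]

# Collect the variable names under each factor
groups <- split(rownames(L), best)
print(groups)
```

Taking the absolute value matters for Exterior_Looks, whose loading on MR4 is negative (-0.56). A largest-loading rule also hides cross-loadings: Fuel_Efficiency is assigned to MR1 but has loadings of 0.23 to 0.33 on the other three factors, which is what its complexity (com) of 3.0 reflects.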