We have already discussed factor analysis, and how to conduct it using SPSS, in the previous article (Factor Analysis using SPSS). In this article we discuss how the output of a factor analysis can be interpreted.
The first output from the analysis is a table of descriptive statistics for all the variables under investigation. Typically, the mean, standard deviation and number of respondents (N) who participated in the survey are given. Looking at the means, one can conclude that respectability of product is the most important variable influencing customers to buy the product: it has the highest mean, 6.08 (Table 1).
The correlation matrix
The next output from the analysis is the correlation matrix. A correlation matrix is simply a square array of numbers giving the correlation coefficient between each variable and every other variable in the investigation. The correlation coefficient between a variable and itself is always 1, so the principal diagonal of the correlation matrix contains 1s (see the red line in Table 2 below). The correlation coefficients above and below the principal diagonal are the same, i.e. the matrix is symmetric. The determinant of the correlation matrix is shown at the foot of the table below.
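SPSS builds this table automatically, but the properties described above are easy to verify by hand. A minimal sketch in Python, using made-up response data since the article's survey data is not reproduced here:

```python
import numpy as np

# Hypothetical survey responses: 10 respondents x 3 variables
# (think of the columns as e.g. "cost", "quality", "availability").
rng = np.random.default_rng(42)
X = rng.normal(size=(10, 3))

# Correlation matrix; rowvar=False because variables are in columns.
R = np.corrcoef(X, rowvar=False)

print(np.allclose(np.diag(R), 1.0))  # principal diagonal is all 1s
print(np.allclose(R, R.T))           # symmetric above/below the diagonal
print(np.linalg.det(R))              # the determinant shown at the foot of the table
```

For a valid correlation matrix the determinant lies between 0 and 1; values very close to 0 warn of multicollinearity among the variables.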
With respect to the correlation matrix, if any pair of variables has a value less than 0.5, consider dropping one of them from the analysis (repeat the factor analysis in SPSS after removing variables whose value is less than 0.5). The off-diagonal elements (the values to the left and right of the diagonal in the table below) should all be very small (close to zero) in a good model.
Kaiser-Meyer-Olkin (KMO) and Bartlett's Test (measures of the strength of the relationships among the variables)
The KMO statistic measures sampling adequacy (whether the responses given by the sample are adequate), and should be greater than 0.5 for a satisfactory factor analysis to proceed. Kaiser (1974) recommends 0.5 as the bare minimum, values between 0.7 and 0.8 as acceptable, and values above 0.9 as superb. Looking at the table below, the KMO measure is 0.417, which is close to 0.5 and can therefore be barely accepted (Table 3).
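SPSS reports KMO directly. For readers curious how it is computed, here is a sketch of the standard formula, which compares squared correlations against squared partial correlations; the correlation matrix below is made up for illustration, not the article's data:

```python
import numpy as np

def kmo(R):
    """Kaiser-Meyer-Olkin measure of sampling adequacy for a correlation
    matrix R. A sketch of the textbook formula, not SPSS's exact routine."""
    S = np.linalg.inv(R)
    # Partial correlations are derived from the inverse correlation matrix.
    d = np.sqrt(np.outer(np.diag(S), np.diag(S)))
    P = -S / d
    off = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.sum(R[off] ** 2)   # sum of squared correlations
    p2 = np.sum(P[off] ** 2)   # sum of squared partial correlations
    return r2 / (r2 + p2)

# Toy correlation matrix for three hypothetical variables.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
print(round(kmo(R), 3))
```

A KMO near 1 means the partial correlations are small relative to the raw correlations, which is what a factorable correlation matrix looks like.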
There is no definitive answer to the question "How many respondents do I need for factor analysis?", and methodologies differ. A common rule of thumb is that a researcher should have at least 10-15 participants per variable. Fiedel (2005) says that, in general, over 300 respondents is probably adequate. There is universal agreement that factor analysis is inappropriate when the sample size is below 50.
Bartlett’s test is another indication of the strength of the relationship among variables. It tests the null hypothesis that the correlation matrix is an identity matrix, i.e. a matrix in which all of the diagonal elements are 1 and all off-diagonal elements are 0. We want to reject this null hypothesis. From the same table, we can see that Bartlett’s Test of Sphericity is significant: its significance level is 0.012, which is less than 0.05 and therefore small enough to reject the null hypothesis. This means that the correlation matrix is not an identity matrix.
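The test statistic behind this table is straightforward to reproduce. A sketch using the usual chi-square approximation, on a made-up correlation matrix and an assumed sample size of n = 100:

```python
import numpy as np
from scipy import stats

def bartlett_sphericity(R, n):
    """Bartlett's test of sphericity. H0: the population correlation matrix
    is an identity matrix. Standard chi-square approximation."""
    p = R.shape[0]
    chi2 = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    p_value = stats.chi2.sf(chi2, df)
    return chi2, df, p_value

# Hypothetical correlation matrix for three variables.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
chi2, df, p_value = bartlett_sphericity(R, n=100)
print(p_value < 0.05)  # a small p-value rejects H0 (R is not an identity matrix)
```

Note that for an exact identity matrix the determinant is 1, the statistic is 0, and the test can never reject, which matches the intuition that uncorrelated variables have no common factors to extract.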
The next item in the output is a table of communalities, which shows how much of the variance in each variable has been accounted for by the extracted factors. A communality should be greater than 0.5 for the variable to be considered for further analysis; otherwise the variable is removed and the factor analysis repeated. For instance, over 90% of the variance in “Quality of product” is accounted for, while 73.5% of the variance in “Availability of product” is accounted for (Table 4).
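A communality is simply the sum of a variable's squared loadings across the extracted factors. A short illustration with made-up loadings (not the article's Table 4):

```python
import numpy as np

# Hypothetical loadings of 4 variables on 3 extracted factors.
loadings = np.array([
    [0.90, 0.20, 0.10],   # e.g. "Quality of product"
    [0.30, 0.80, 0.05],
    [0.10, 0.15, 0.85],
    [0.40, 0.35, 0.30],
])

# Communality of each variable = sum of its squared loadings.
communalities = np.sum(loadings ** 2, axis=1)
print(np.round(communalities, 3))
```

Here the first variable's communality is 0.81 + 0.04 + 0.01 = 0.86, i.e. the three factors account for 86% of its variance; the last row falls below the 0.5 cut-off and would be a candidate for removal.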
Total variance explained
The next item shows all the factors extractable from the analysis along with their eigenvalues. An eigenvalue reflects the amount of variance a factor accounts for; the eigenvalues sum to the number of items subjected to the factor analysis.
The eigenvalue table is divided into three sub-sections: Initial Eigenvalues, Extraction Sums of Squared Loadings and Rotation Sums of Squared Loadings. For analysis and interpretation purposes we are concerned only with the Extraction Sums of Squared Loadings. Notice that the first factor accounts for 46.367% of the variance, the second for 18.471% and the third for 17.013%. All the remaining factors are not significant (Table 5).
- Component: As in the Communalities table above, there are 8 components, listed in the first column.
- Initial Eigenvalues Total: Total variance.
- Initial Eigenvalues % of variance: The percent of variance attributable to each factor.
- Initial Eigenvalues Cumulative %: Cumulative variance of the factor when added to the previous factors.
- Extraction Sums of Squared Loadings Total: Total variance after extraction.
- Extraction Sums of Squared Loadings % of variance: The percent of variance attributable to each factor after extraction. This value is of particular significance to us: here we determine that there are three factors contributing to why someone would buy a particular product.
- Extraction Sums of Squared Loadings Cumulative %: Cumulative variance of the factor when added to the previous factors, after extraction.
- Rotation Sums of Squared Loadings Total: Total variance after rotation.
- Rotation Sums of Squared Loadings % of variance: The percent of variance attributable to each factor after rotation.
- Rotation Sums of Squared Loadings Cumulative %: Cumulative variance of the factor when added to the previous factors, after rotation.
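The arithmetic behind this table can be sketched with NumPy. This is an illustration on a small made-up correlation matrix, not the article's 8-variable data:

```python
import numpy as np

# Hypothetical correlation matrix for three variables.
R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])

# Eigenvalues of the correlation matrix, sorted in descending order.
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# The eigenvalues sum to the number of variables...
print(np.isclose(eigenvalues.sum(), R.shape[0]))

# ...so eigenvalue / number of variables gives each factor's % of variance,
# and a running total gives the Cumulative % column.
pct_variance = 100 * eigenvalues / R.shape[0]
cumulative = np.cumsum(pct_variance)

# Kaiser criterion: retain only factors with an eigenvalue greater than 1.
retained = int(np.sum(eigenvalues > 1))
print(retained)
```

For this toy matrix only the first eigenvalue exceeds 1, so only one factor would be retained; in the article's data three eigenvalues exceed 1, giving three factors.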
The scree plot is a graph of the eigenvalues against all the factors, and is useful for determining how many factors to retain. The point of interest is where the curve starts to flatten. It can be seen that the curve begins to flatten between factors 3 and 4. Note also that factors 4 onwards have eigenvalues of less than 1, so only three factors have been retained.
Table 6 below shows the loadings of the eight variables on the three factors extracted. The higher the absolute value of the loading, the more the factor contributes to the variable. (The eight items are grouped into three factors according to the component on which each loads most strongly.) The gaps (empty cells) in the table represent loadings that are less than 0.5; we suppressed all loadings below 0.5 to make the table easier to read (Table 6).
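The suppression SPSS applies (via the "Suppress small coefficients" option) is just a display mask. A sketch of the same idea, with made-up loadings:

```python
import numpy as np

# Hypothetical loadings of 3 variables on 3 components.
loadings = np.array([
    [0.82, 0.31, 0.12],
    [0.15, 0.76, 0.40],
    [0.44, 0.22, 0.69],
])

# Blank out (here: NaN) every loading below 0.5, as in the published table.
display = np.where(np.abs(loadings) >= 0.5, np.round(loadings, 2), np.nan)
print(display)
```

Only the suppressed display changes; the underlying loadings, and hence the communalities, are unaffected.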
Rotated component matrix
The idea of rotation is to reduce the number of factors on which the variables under investigation have high loadings. Rotation does not actually change anything, but it makes the interpretation of the analysis easier. Looking at the table below, we can see that availability of product and cost of product are substantially loaded on Factor (Component) 3, while experience with product, popularity of product, and quantity of product are substantially loaded on Factor 2. All the remaining variables are substantially loaded on Factor 1. These factors can be used as variables for further analysis (Table 7).
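The article does not name the rotation method; assuming Varimax (SPSS's default orthogonal rotation), a textbook implementation shows concretely that rotation "does not change anything": the communalities are preserved, and the loadings are merely redistributed so each variable loads highly on fewer factors.

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-6):
    """Varimax rotation of a loading matrix L. A common textbook algorithm;
    SPSS may differ in details such as Kaiser normalization."""
    p, k = L.shape
    R = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # Gradient step for the varimax criterion, solved via SVD.
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag(np.sum(Lr ** 2, axis=0)) / p)
        )
        R = u @ vt
        new_var = s.sum()
        if new_var - var < tol:
            break
        var = new_var
    return L @ R

# Made-up unrotated loadings: variables load on mixtures of two factors.
L = np.array([[0.7, 0.5],
              [0.8, 0.4],
              [0.5, -0.6],
              [0.6, -0.7]])
rotated = varimax(L)

# Rotation leaves the communalities (row sums of squared loadings) unchanged.
print(np.allclose((L ** 2).sum(axis=1), (rotated ** 2).sum(axis=1)))
```

Because the rotation matrix is orthogonal, the total variance explained is also unchanged; only how it is shared out among the factors differs, which is what makes the rotated solution easier to interpret.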