We have already discussed factor analysis in the previous article (Factor Analysis using SPSS), including how to conduct it in SPSS. This article explains how the output of a factor analysis can be interpreted.
The first output from the analysis is a table of descriptive statistics for all the variables under investigation. Typically, the mean, standard deviation, and number of respondents (N) who participated in the survey are given. The mean describes the average response to each variable in the dataset, so no minimum value is required. Looking at the mean values in Table 1 below, one can conclude that ‘respectability of product’ is the most important variable influencing customers to buy the product. The lowest mean, 2.42 for ‘cost of product’, indicates that respondents on average disagreed about the role of cost of product. The role of every other variable in consumers’ decision to buy a product can be interpreted in a similar way.
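The three columns of the descriptive statistics table can be reproduced outside SPSS with a few lines of Python. This is only a minimal sketch: the responses are invented for illustration, the item names are placeholders, and a 5-point agreement scale is assumed because the article does not state the scale.

```python
import statistics

# Hypothetical responses on an assumed 5-point scale
# (1 = strongly disagree, 5 = strongly agree). Not the article's data.
responses = {
    "item A": [2, 3, 2, 1, 3, 2],
    "item B": [4, 5, 4, 4, 5, 3],
}

def describe(values):
    """Return (mean, sample standard deviation, N) for one variable."""
    return statistics.mean(values), statistics.stdev(values), len(values)

for name, values in responses.items():
    mean, sd, n = describe(values)
    print(f"{name}: mean={mean:.2f}, sd={sd:.2f}, N={n}")
```

The sample standard deviation (dividing by N - 1) is used here, which is the usual convention for a descriptive statistics table.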
The correlation matrix
The next output from the analysis is the correlation matrix. A correlation matrix is simply a square array of numbers that gives the correlation coefficients between each variable and every other variable in the investigation. The correlation coefficient between a variable and itself is always 1, hence the principal diagonal of the correlation matrix contains 1s (see the red line in Table 2 below). The matrix is symmetric, so the correlation coefficients above and below the principal diagonal are the same. The determinant of the correlation matrix is shown at the foot of the table below.
With respect to the correlation matrix, if any pair of variables has a value less than 0.5, consider dropping one of them from the analysis. The factor analysis then needs to be re-run with the excluded variables left out. The off-diagonal elements (the values on either side of the diagonal in the table below) should all be very small (close to zero) in a good model.
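The two structural properties described above (1s on the diagonal and identical values above and below it) follow directly from how the coefficients are computed. The sketch below builds a small Pearson correlation matrix in plain Python; the three columns of data are hypothetical and the function names are our own.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def correlation_matrix(columns):
    """Square matrix of correlations between every pair of columns."""
    return [[pearson(a, b) for b in columns] for a in columns]

# Hypothetical survey columns, invented for illustration only.
columns = [
    [2, 3, 2, 1, 3, 2],
    [4, 5, 4, 4, 5, 3],
    [3, 2, 4, 3, 5, 3],
]

for row in correlation_matrix(columns):
    print([round(value, 3) for value in row])
```

Each variable correlates perfectly with itself, which gives the diagonal of 1s, and `pearson(a, b)` equals `pearson(b, a)`, which gives the mirror image across the diagonal.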
Kaiser Meyer Olkin (KMO) and Bartlett’s Test (measures the strength of relationship among the variables)
The KMO statistic measures sampling adequacy (whether the responses in the sample are adequate for the analysis) and should be greater than 0.5 for a satisfactory factor analysis to proceed. Kaiser (1974) recommends 0.5 as the minimum KMO value (barely acceptable), regards values between 0.7 and 0.8 as acceptable, and values above 0.9 as superb. Looking at Table 3 below, the KMO measure is 0.417, which falls just short of 0.5 and is treated here as barely acceptable.
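For readers who want to see where the KMO value comes from, it compares the squared correlations between variables with their squared partial correlations. The sketch below is a minimal illustration of that standard formula, not SPSS's own code. It takes the partial correlations from the inverse of the correlation matrix, and the helper names `invert` and `kmo` are our own.

```python
import math

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def kmo(corr):
    """Overall KMO: sum of r^2 divided by (sum of r^2 + sum of partial r^2)."""
    inv = invert(corr)
    n = len(corr)
    sum_r2 = 0.0
    sum_partial2 = 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                partial = -inv[i][j] / math.sqrt(inv[i][i] * inv[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_partial2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_partial2)

# Hypothetical correlation matrix in which every pair correlates at 0.5.
example = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
print(round(kmo(example), 3))
```

A KMO near 1 means the partial correlations are small relative to the raw correlations, so the variables share enough common variance for factors to be extracted.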
There is no definitive answer to the question “How many respondents do I need for factor analysis?”, and methodologies differ. A common rule of thumb is to have at least 10-15 participants per variable. Field (2005) says that, in general, over 300 respondents is probably adequate. There is universal agreement that factor analysis is inappropriate when the sample size is below 50.
Bartlett’s test is another indication of the strength of the relationship among variables. It tests the null hypothesis that the correlation matrix is an identity matrix. An identity matrix is a matrix in which all of the diagonal elements are 1 (see Table 2) and all off-diagonal elements (explained above) are 0. You want to reject this null hypothesis. From the same table as the KMO measure (Table 3), we can see that Bartlett’s Test of Sphericity is significant: the significance value is 0.012, which is less than 0.05 and therefore small enough to reject the null hypothesis. This means that the correlation matrix is not an identity matrix.
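The chi-square value behind Bartlett’s test is computed from the determinant of the correlation matrix, the number of observations, and the number of variables. The sketch below applies the standard formula. The 12 observations and 8 variables match the example in this article, but the determinant of 0.001 is a made-up value, because the actual determinant is not quoted in the text.

```python
import math

def bartlett_sphericity(determinant, n_observations, n_variables):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    multiplier = (n_observations - 1) - (2 * n_variables + 5) / 6
    chi_square = -multiplier * math.log(determinant)
    degrees_of_freedom = n_variables * (n_variables - 1) // 2
    return chi_square, degrees_of_freedom

# Hypothetical determinant; 12 observations and 8 variables as in the article.
chi_square, df = bartlett_sphericity(0.001, n_observations=12, n_variables=8)
print(f"chi-square = {chi_square:.3f}, df = {df}")
```

An identity matrix has a determinant of 1, whose logarithm is 0, so the statistic is 0 when the variables are completely uncorrelated. The further the determinant falls below 1, the larger the statistic and the smaller the significance value reported by SPSS. Turning the statistic into that significance value requires the chi-square distribution, which is left out of this sketch.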
The next item from the output is a table of communalities, which shows how much of the variance in each variable has been accounted for by the extracted factors. A communality should be more than 0.5 for the variable to be considered for further analysis; otherwise the variable is removed from the subsequent steps of the factor analysis. For instance, over 90% of the variance in “Quality of product” is accounted for, while 73.5% of the variance in “Availability of product” is accounted for (Table 4).
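A communality is the sum of a variable’s squared loadings on the extracted factors. The following sketch shows the arithmetic and applies the 0.5 rule mentioned above; the loadings and item names are hypothetical and are not taken from the article’s tables.

```python
# Hypothetical loadings of three items on three extracted factors.
loadings = {
    "item A": (0.8, 0.4, 0.3),
    "item B": (0.5, 0.3, 0.2),
    "item C": (0.2, 0.7, 0.5),
}

def communalities(loadings):
    """Sum of squared loadings per variable across the extracted factors."""
    return {name: sum(value ** 2 for value in row) for name, row in loadings.items()}

def below_cutoff(values, cutoff=0.5):
    """Variables whose communality does not exceed the cutoff."""
    return [name for name, value in values.items() if value <= cutoff]

result = communalities(loadings)
for name, value in result.items():
    print(f"{name}: communality = {value:.2f}")
print("Candidates for removal:", below_cutoff(result))
```

In this toy data only "item B" falls below the cutoff, so it would be the one removed before re-running the analysis.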
Total variance explained
An eigenvalue reflects the amount of variance explained by a factor, and the eigenvalues should sum to the number of items subjected to factor analysis. The next item shows all the factors extractable from the analysis along with their eigenvalues.
The Eigenvalue table has been divided into three sub-sections:
- Initial Eigenvalues
- Extraction Sums of Squared Loadings
- Rotation Sums of Squared Loadings
For analysis and interpretation we are concerned only with the Initial Eigenvalues and the Extraction Sums of Squared Loadings. The requirement for retaining a component or factor is an eigenvalue greater than 1. Table 5 shows that the eigenvalue is 3.709 > 1 for the 1st component, 1.478 > 1 for the 2nd, 1.361 > 1 for the 3rd, and 0.600 < 1 for the 4th. Thus, the set of 8 variables with 12 observations represents three components. Further, the % of Variance column under Extraction Sums of Squared Loadings shows that the first factor accounts for 46.367% of the variance in the observations, the second for 18.471%, and the third for 17.013% (Table 5). Thus, 3 components are sufficient to represent the characteristics captured by the 8 variables.
- Component: The eight components listed in column 1, one for each of the 8 variables shown in the Communalities table (Table 4) above.
- Initial Eigenvalues Total: Total variance.
- Initial Eigenvalues % of variance: The percent of variance attributable to each factor.
- Initial Eigenvalues Cumulative %: Cumulative variance of the factor when added to the previous factors.
- Extraction Sums of Squared Loadings Total: Total variance after extraction.
- Extraction Sums of Squared Loadings % of variance: The percent of variance attributable to each factor after extraction. This value is of significance to us, and in this step we therefore determine that there are three factors which contribute to why someone would buy a particular product.
- Extraction Sums of Squared Loadings Cumulative %: Cumulative variance of the factor when added to the previous factors after extraction.
- Rotation Sums of Squared Loadings Total: Total variance after rotation.
- Rotation Sums of Squared Loadings % of variance: The percent of variance attributable to each factor after rotation.
- Rotation Sums of Squared Loadings Cumulative %: Cumulative variance of the factor when added to the previous factors after rotation.
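The relationship between the eigenvalue, % of variance, and cumulative % columns can be checked with simple arithmetic: each percentage is the eigenvalue divided by the number of variables. The sketch below uses the four eigenvalues quoted in this article. The remaining four are not quoted, so they are left out rather than guessed, and the function names are our own.

```python
from itertools import accumulate

# First four eigenvalues as quoted in the text for Table 5.
eigenvalues = [3.709, 1.478, 1.361, 0.600]
n_variables = 8  # total variance equals the number of variables analysed

def percent_of_variance(eigenvalue, n_variables):
    """Share of the total variance attributable to one component, in percent."""
    return 100 * eigenvalue / n_variables

def kaiser_retained(eigenvalues):
    """Keep components whose eigenvalue exceeds 1, the criterion used in the text."""
    return [value for value in eigenvalues if value > 1]

percents = [percent_of_variance(value, n_variables) for value in eigenvalues]
cumulative = list(accumulate(percents))

for index, (value, pct, cum) in enumerate(zip(eigenvalues, percents, cumulative), start=1):
    status = "retain" if value > 1 else "drop"
    print(f"Component {index}: eigenvalue={value:.3f}, "
          f"% of variance={pct:.2f}, cumulative %={cum:.2f} ({status})")

print("Components retained:", len(kaiser_retained(eigenvalues)))
```

The percentages agree with those in Table 5 to within the rounding of the eigenvalues, and three components pass the eigenvalue-greater-than-1 test.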
The scree plot is a graph of the eigenvalues against all the factors. The graph is useful for determining how many factors to retain. The point of interest is where the curve starts to flatten. It can be seen that the curve begins to flatten between factors 3 and 4. Note also that factors 4 onwards have eigenvalues of less than 1, so only three factors have been retained.
Table 6 below shows the loadings of the eight variables on the three factors extracted. The higher the absolute value of the loading, the more the factor contributes to the variable. The eight items are grouped under the three extracted factors according to where they load most strongly, so items with similar response patterns fall together under component 1, component 2, or component 3. The gaps (empty cells) in the table represent loadings of less than 0.5, which we suppressed to make the table easier to read. Ideally each variable should measure exactly one component, but Table 6 shows cross-loading, i.e. a single variable loading on more than one component. The cross-loading in Table 6 is extensive: cost of product, popularity of product, prestige of product, and quality of product all cross-load. To obtain more adequate results, these cross-loadings need to be eliminated. The solution is to redistribute the factor loadings through rotation, and hence the rotated component matrix is examined to identify the components.
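How rotation redistributes loadings can be seen in a toy two-factor case. This is only a sketch under simplifying assumptions: the loadings are hypothetical, two factors are used instead of three, and the 45-degree angle is picked by hand, whereas a rotation procedure in a statistics package chooses the angle automatically.

```python
import math

# Hypothetical unrotated loadings on two factors (not taken from Table 6).
# Both variables load at 0.5 or more, in absolute value, on both factors.
unrotated = {
    "variable A": (0.6, 0.6),
    "variable B": (0.6, -0.6),
}

def rotate(row, degrees):
    """Orthogonally rotate one variable's two-factor loadings by an angle."""
    theta = math.radians(degrees)
    c, s = math.cos(theta), math.sin(theta)
    a, b = row
    return (a * c + b * s, -a * s + b * c)

def communality(row):
    """Sum of squared loadings, i.e. the variance explained for one variable."""
    return sum(value ** 2 for value in row)

for name, row in unrotated.items():
    rotated = rotate(row, 45)  # angle chosen by hand for this toy example
    print(name,
          "rotated loadings:", [round(value, 3) for value in rotated],
          "communality before/after:",
          round(communality(row), 3), round(communality(rotated), 3))
```

After rotation each variable loads strongly on a single factor, while its communality is the same as before. The rotation moves the loadings between factors without changing how much of each variable’s variance is explained.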
Rotated component matrix
The idea of rotation is to reduce the number of factors on which the variables under investigation have high loadings. Rotation does not change the underlying solution but makes the interpretation of the analysis easier. Looking at the table below, we can see that availability of product and cost of product are substantially loaded on Factor (Component) 3, while experience with the product, popularity of the product, and quantity of product are substantially loaded on Factor 2. Sometimes a variable loads on two or more components, so the factor loading values need to be checked.
If a variable’s loading reaches the required value of 0.5 (or another cutoff, such as 0.6, chosen by the researcher) on only one component, the variable can be retained for further analysis. A loading of more than 0.5 (or 0.6) on more than one component, however, means that the variable represents two components and is therefore not effective in measuring a specific category, so it needs to be excluded. In Table 7, experience with the product and quality of the product each measure more than one component, so they cannot be considered for further analysis. Hence, further processing, i.e. impact analysis or any other statistical analysis, should include all variables except experience with the product and quality of the product (Table 7).
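The retain-or-exclude rule can be written as a short check on the rotated loadings. The sketch below uses hypothetical loadings and item names rather than the values in Table 7. The cutoff is a parameter, so either 0.5 or 0.6 can be applied.

```python
# Hypothetical rotated loadings on three components (not taken from Table 7).
rotated = {
    "item A": (0.82, 0.10, 0.05),
    "item B": (0.15, 0.71, 0.62),
    "item C": (0.20, 0.12, 0.77),
}

def high_components(row, cutoff=0.5):
    """Component numbers (1-based) where the absolute loading reaches the cutoff."""
    return [index for index, value in enumerate(row, start=1) if abs(value) >= cutoff]

def split_variables(loadings, cutoff=0.5):
    """Separate variables loading on exactly one component from cross-loaded ones.

    Variables that reach the cutoff on no component land in neither group,
    because the rule described in the text does not cover that case.
    """
    retained, cross_loaded = {}, []
    for name, row in loadings.items():
        components = high_components(row, cutoff)
        if len(components) == 1:
            retained[name] = components[0]
        elif len(components) > 1:
            cross_loaded.append(name)
    return retained, cross_loaded

retained, cross_loaded = split_variables(rotated, cutoff=0.5)
print("Retained (variable -> component):", retained)
print("Excluded for cross-loading:", cross_loaded)
```

Raising the cutoff changes the outcome, which is why it should be fixed before reading the table: with a cutoff of 0.65, "item B" in this toy data reaches the cutoff on only one component and would be retained.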