The previous article covered how to compute joint frequencies and vocabulary lists using Hamlet II. Hamlet II also offers further analyses such as correspondence analysis, cluster analysis, multidimensional scaling, and multiple text comparisons. This article focuses on hierarchical cluster analysis. Hierarchical clustering groups texts according to how similar their vocabularies are, so that similar words or contexts end up in the same cluster. Together with non-hierarchical clustering and multidimensional scaling, it helps in assessing the similarity matrix.
Steps for hierarchical clustering using Hamlet II
- Once the results for Joint Frequencies have been derived, save the output file.
- Hamlet will prompt “Try a cluster analysis of the matrix?”; click “Yes”.
- The cluster analysis dialogue box will appear. Click the “Diameter” button to switch between the two clustering methods that can be applied to the matrix of similarities.
- Click “Display the cluster dendrogram” to create and save a dendrogram plot for the selected clustering method. A window showing the dendrogram will appear.
- Enter a minimum similarity value between 0.0 and 1.0 in the box labelled “Show clusters for a level of inclusion”. In this example, 0.4 is used as the minimum similarity value.
- Now click “Continue” to proceed, and save the matrices by clicking “Yes” in the pop-up that follows.
- Save the matrix file in your destination folder in *.mat format, as this output will be used later for the non-hierarchical cluster analysis.
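To make the steps above more concrete, here is a minimal Python sketch of what clustering a similarity matrix at a chosen level of inclusion could look like. It is not Hamlet II's own code. It assumes that the two methods toggled by the “Diameter” button correspond to the diameter method (complete linkage) and the connectedness method (single linkage). The function name `cluster_at_level`, the word labels, and the similarity values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def cluster_at_level(sim, labels, level, method="diameter"):
    """Merge clusters until no pair is at least `level` similar.

    method="diameter": similarity of two clusters is the LOWEST pairwise
    similarity between their members (complete linkage, assumed).
    method="connectedness": the HIGHEST pairwise similarity (single linkage).
    """
    pick = min if method == "diameter" else max
    clusters = [[i] for i in range(len(labels))]  # start with singletons

    def cluster_sim(a, b):
        return pick(sim[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_sim(clusters[p[0]], clusters[p[1]]),
        )
        if cluster_sim(clusters[x], clusters[y]) < level:
            break  # nothing left above the minimum similarity
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)

    return sorted(sorted(labels[i] for i in c) for c in clusters)


# Hypothetical similarity matrix for four vocabulary entries.
labels = ["king", "queen", "ghost", "death"]
sim = [
    [1.0, 0.7, 0.2, 0.3],
    [0.7, 1.0, 0.1, 0.5],
    [0.2, 0.1, 1.0, 0.6],
    [0.3, 0.5, 0.6, 1.0],
]

print(cluster_at_level(sim, labels, 0.4, "diameter"))
# [['death', 'ghost'], ['king', 'queen']]
print(cluster_at_level(sim, labels, 0.4, "connectedness"))
# [['death', 'ghost', 'king', 'queen']]
```

With the same minimum similarity of 0.4, the two methods give different clusters on this toy matrix, which is why it can be worth viewing the dendrogram for both before saving.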
Hierarchical and non-hierarchical clustering
Hierarchical clustering, as mentioned at the beginning, considers only similar word lists and vocabularies when forming hierarchy-based clusters. Non-hierarchical clustering, on the other hand, does not build its clusters directly from the word list or from similarities in the vocabularies. In the dendrogram, clusters of similar words can be seen forming at equal distances from each other. Hierarchical clustering helps in assessing the similarity between individual vocabulary entries, as well as the similarity between an unassigned item and the texts in the existing clusters.
Non-hierarchical cluster analysis simply comprises a list of all the partitions generated from the hierarchical analysis. The next article presents the non-hierarchical analysis along with its interpretation.
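As a rough illustration of what "a list of all the partitions generated from the hierarchical analysis" could look like, the sketch below records the partition after every merge. It is an assumption-based example rather than Hamlet II's actual output format: the function name `list_partitions`, the labels, and the similarity values are hypothetical, and the diameter method is assumed to mean complete linkage.

```python
from itertools import combinations


def list_partitions(sim, labels, method="diameter"):
    """Return [(similarity_level, partition), ...], one entry per merge step."""
    pick = min if method == "diameter" else max  # assumed linkage rules
    clusters = [[i] for i in range(len(labels))]

    def snapshot():
        return sorted(sorted(labels[i] for i in c) for c in clusters)

    history = [(1.0, snapshot())]  # every item on its own
    while len(clusters) > 1:
        # Score every pair of clusters and take the most similar one.
        scored = [
            (pick(sim[i][j] for i in clusters[x] for j in clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        ]
        level, x, y = max(scored)
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
        history.append((level, snapshot()))
    return history


# Hypothetical similarity matrix for four vocabulary entries.
labels = ["king", "queen", "ghost", "death"]
sim = [
    [1.0, 0.7, 0.2, 0.3],
    [0.7, 1.0, 0.1, 0.5],
    [0.2, 0.1, 1.0, 0.6],
    [0.3, 0.5, 0.6, 1.0],
]

for level, partition in list_partitions(sim, labels):
    print(f"{level:.1f}  {partition}")
```

Each printed row pairs a similarity level with the partition that holds at that level, from all items separate down to a single cluster.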