How to perform hierarchical clustering using Hamlet II?

By Avishek Majumder & Priya Chetty on October 22, 2018

The previous article covered joint frequencies and vocabulary lists in Hamlet II. From those results, Hamlet II offers further analyses such as correspondence analysis, cluster analysis, multi-dimensional scaling, and multiple text comparisons. This article focuses on hierarchical cluster analysis. Hierarchical clustering segregates the texts according to similar vocabularies, so that similar words or contexts are clustered together. Like non-hierarchical clustering, it is used alongside multidimensional scaling to assess the similarities matrix.
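To make the idea of a similarities matrix concrete, here is a minimal Python sketch that derives one from joint frequencies. It assumes a Jaccard coefficient (text units containing both words divided by units containing at least one of them); this article does not state which coefficient Hamlet II applies, so the formula, the function name `jaccard_matrix`, and the sample sentences are illustrative assumptions only.

```python
from itertools import combinations


def jaccard_matrix(units, vocabulary):
    """Return {(word_a, word_b): similarity} from joint frequencies.

    Assumption: similarity is the Jaccard coefficient. This is an
    illustration, not necessarily the coefficient Hamlet II computes.
    """
    # For each vocabulary word, the set of text units in which it occurs.
    occurs = {
        word: {i for i, unit in enumerate(units) if word in unit.lower().split()}
        for word in vocabulary
    }
    sims = {}
    for a, b in combinations(vocabulary, 2):
        joint = len(occurs[a] & occurs[b])    # units containing both words
        either = len(occurs[a] | occurs[b])   # units containing at least one
        sims[(a, b)] = joint / either if either else 0.0
    return sims


# Hypothetical text units, invented for this example.
units = [
    "the king spoke to the queen",
    "the queen answered the king",
    "the ghost appeared at night",
    "the king saw the ghost",
]
sims = jaccard_matrix(units, ["king", "queen", "ghost"])
for pair, value in sims.items():
    print(pair, round(value, 2))
```

Words that tend to appear in the same units, such as "king" and "queen" here, receive a high similarity, and these are the values a cluster analysis would then group.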

Steps for hierarchical clustering using Hamlet II

  1. Once the results for Joint Frequencies have been derived, save the output file.
  2. Hamlet will prompt “Try a cluster analysis of the matrix?”. Click “Yes”.
    Pop-up box for Hierarchical cluster analysis
  3. The dialogue box shown above will appear. Click the “Diameter” button to switch between the two clustering methods that can be applied to the matrix of similarities.
  4. Click “Display the cluster dendrogram” to create and save a dendrogram plot for the selected clustering method. A window like the one in the figure below will appear.
    Dendrogram achieved from the Hierarchical cluster analysis
  5. Enter a minimum similarity value between 0.0 and 1.0 in the box labelled “Show clusters for a level of inclusion”. This example uses 0.4 as the minimum similarity value.
  6. Click “Continue” to proceed, then save the matrices by clicking “Yes” in the pop-up shown below.
    Save button for matrix file
  7. Save the matrix file in your destination folder in *.mat format; this output will be used later for the non-hierarchical cluster analysis.
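The clustering performed in the steps above can be approximated outside Hamlet II with a short Python sketch. It rests on stated assumptions rather than on Hamlet II's actual implementation: "diameter" is taken to mean complete linkage (two clusters are as similar as their least similar pair of members), the alternative method is assumed to be single linkage and is called "connectedness" here, and merging stops once no pair of clusters reaches the minimum similarity of 0.4. The matrix values and word labels are invented for illustration.

```python
def cluster(labels, sim, min_similarity=0.4, method="diameter"):
    """Agglomerative clustering on a symmetric similarity matrix.

    method="diameter":      complete linkage (lowest pairwise similarity)
    method="connectedness": single linkage (highest pairwise similarity)
    Both method names are assumptions made for this sketch.
    """
    link = min if method == "diameter" else max
    clusters = [[i] for i in range(len(labels))]  # start with singletons
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        # Find the pair of clusters with the highest linkage similarity.
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = link(sim[i][j] for i in clusters[x] for j in clusters[y])
                if s > best:
                    best, best_pair = s, (x, y)
        if best < min_similarity:
            break  # no remaining pair meets the level of inclusion
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [[labels[i] for i in c] for c in clusters]


# Hypothetical similarity matrix for four words.
labels = ["king", "queen", "ghost", "night"]
sim = [
    [1.0, 0.7, 0.5, 0.1],
    [0.7, 1.0, 0.3, 0.1],
    [0.5, 0.3, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
]
print(cluster(labels, sim, 0.4, "diameter"))
# [['king', 'queen'], ['ghost', 'night']]
print(cluster(labels, sim, 0.4, "connectedness"))
# [['king', 'queen', 'ghost', 'night']]
```

With the same 0.4 threshold the two methods give different clusters: complete linkage keeps the two pairs apart because "queen" and "night" are dissimilar, while single linkage joins everything through the 0.5 link between "king" and "ghost". This is why switching the method in step 3 can change the dendrogram.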

Hierarchical and non-hierarchical clustering

As mentioned at the beginning, hierarchical clustering considers only similar word lists and vocabularies when forming hierarchy-based clusters. Non-hierarchical clustering, on the other hand, does not build its clusters from the word list or from similarities in the vocabularies. The dendrogram above shows that clusters of similar words form at equivalent distances from each other. Hierarchical clustering helps in assessing individual entries with similar vocabularies, as well as the similarity between an unassigned item and the texts in the existing clusters.
Non-hierarchical cluster analysis simply comprises a list of all the partitions generated from the hierarchical analysis. The next article presents the non-hierarchical analysis along with its interpretation.
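To illustrate what "a list of all the partitions" can look like, the following Python sketch records the partition after every merge of a hierarchical clustering. It assumes complete linkage and an invented similarity matrix. The function name `partitions` and the choice to report the starting partition at level 1.0 are conventions of this sketch, not Hamlet II output.

```python
def partitions(labels, sim):
    """Return [(level, partition), ...] for each stage of a complete-linkage
    clustering, from all singletons down to a single cluster."""
    clusters = [[i] for i in range(len(labels))]
    # Starting partition: every item alone (level 1.0 by convention here).
    history = [(1.0, [[labels[i] for i in c] for c in clusters])]
    while len(clusters) > 1:
        # Score each pair of clusters by its lowest pairwise similarity
        # and pick the pair with the highest score.
        level, x, y = max(
            (min(sim[i][j] for i in clusters[a] for j in clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
        history.append((level, [[labels[i] for i in c] for c in clusters]))
    return history


# Hypothetical similarity matrix for four words.
labels = ["king", "queen", "ghost", "night"]
sim = [
    [1.0, 0.7, 0.5, 0.1],
    [0.7, 1.0, 0.3, 0.1],
    [0.5, 0.3, 1.0, 0.6],
    [0.1, 0.1, 0.6, 1.0],
]
history = partitions(labels, sim)
for level, partition in history:
    print(f"{level:.1f}", partition)
```

Each printed line is one partition of the words, paired with the similarity level at which it was formed, so reading down the list retraces the dendrogram from the leaves to the root.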
