How to perform hierarchical clustering using Hamlet II?

The previous article covered joint frequencies and vocabulary lists in Hamlet II. Hamlet II also offers further analyses such as correspondence analysis, cluster analysis, multi-dimensional scaling, and multiple text comparisons. This article focuses on hierarchical cluster analysis. Hierarchical clustering groups the items of the vocabulary list according to their similarity, so that words used in similar contexts are clustered together. Both hierarchical and non-hierarchical clustering work on the matrix of similarities, which can also be assessed with a multidimensional scaling technique.
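
To make the idea of a similarity matrix concrete, the sketch below builds one from a handful of context units. It assumes a Jaccard coefficient (shared context units divided by context units containing either word); the article does not say which coefficient Hamlet II applies, and the function name, vocabulary, and context units are all hypothetical.

```python
def jaccard_matrix(vocabulary, contexts):
    """Similarity matrix for `vocabulary` over a list of context units (sets of words)."""
    # For each word, record the indices of the context units that contain it.
    occurrences = {
        word: {k for k, unit in enumerate(contexts) if word in unit}
        for word in vocabulary
    }
    matrix = []
    for a in vocabulary:
        row = []
        for b in vocabulary:
            union = occurrences[a] | occurrences[b]
            shared = occurrences[a] & occurrences[b]
            # Jaccard coefficient: joint frequency over combined frequency.
            row.append(len(shared) / len(union) if union else 0.0)
        matrix.append(row)
    return matrix


# Hypothetical context units (for example sentences) and vocabulary list.
contexts = [
    {"king", "queen", "crown"},
    {"king", "queen"},
    {"ghost", "night"},
    {"king", "ghost"},
]
vocabulary = ["king", "queen", "ghost"]
matrix = jaccard_matrix(vocabulary, contexts)
for word, row in zip(vocabulary, matrix):
    print(word, [round(value, 2) for value in row])
```

Each cell lies between 0.0 (the two words never share a context unit) and 1.0 (they always occur together), which is the kind of matrix a cluster analysis takes as input.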

Steps for hierarchical clustering using Hamlet II

  1. Once the joint frequencies results have been derived, save the output file.
  2. Hamlet will prompt “Try a cluster analysis of the matrix?”. Click “Yes”.
    Pop-up box for Hierarchical cluster analysis

  3. The dialogue box shown above will appear. Click the “Diameter” button to switch between the two clustering methods that can be applied to the matrix of similarities.
  4. Click “Display the cluster dendrogram” to create and save a dendrogram plot for the selected clustering method. A window like the one shown in the figure below will appear.
    Dendrogram obtained from the hierarchical cluster analysis

  5. Enter a minimum similarity value between 0.0 and 1.0 in the box labelled “Show clusters for a level of inclusion”. In this example, 0.4 is used as the minimum similarity value.
  6. Click “Continue” to proceed, then save the matrices by clicking “Yes” on the pop-up shown below.
    Save button for matrix file

  7. Save the matrix file in your destination folder in *.mat format, as this output will be used later for the non-hierarchical cluster analysis.
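
For readers who want to see what happens behind the dialogue box, the steps above can be approximated outside Hamlet II with a small agglomerative clustering sketch. It assumes that the two methods correspond to the usual complete-linkage ("diameter") and single-linkage schemes, which the article does not state, and it uses a made-up similarity matrix. The function name and data are hypothetical; only the 0.4 level of inclusion comes from the example above.

```python
from itertools import combinations


def cluster_at_level(labels, sim, level, method="complete"):
    """Merge clusters by similarity until no pair reaches `level`."""
    # Complete linkage scores two clusters by their least similar pair of
    # members; single linkage scores them by their most similar pair.
    link = min if method == "complete" else max
    clusters = [[i] for i in range(len(labels))]

    def score(a, b):
        return link(sim[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest linkage similarity.
        a, b = max(combinations(clusters, 2), key=lambda pair: score(*pair))
        if score(a, b) < level:
            break  # nothing left that meets the level of inclusion
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted([labels[i] for i in cluster] for cluster in clusters)


# Hypothetical similarity matrix for five vocabulary items.
labels = ["king", "queen", "crown", "ghost", "night"]
sim = [
    [1.0, 0.7, 0.5, 0.1, 0.2],
    [0.7, 1.0, 0.3, 0.1, 0.1],
    [0.5, 0.3, 1.0, 0.2, 0.1],
    [0.1, 0.1, 0.2, 1.0, 0.6],
    [0.2, 0.1, 0.1, 0.6, 1.0],
]
print(cluster_at_level(labels, sim, 0.4, "complete"))
print(cluster_at_level(labels, sim, 0.4, "single"))
```

With this toy matrix, the complete-linkage run leaves "crown" on its own at level 0.4, while single linkage pulls it into the king/queen cluster. This illustrates why the choice between the two methods can change the dendrogram.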

Hierarchical and non-hierarchical clustering

As mentioned at the beginning, hierarchical clustering relies on the similarities between wordlist and vocabulary entries to form hierarchy-based clusters. Non-hierarchical clustering, on the other hand, does not build its clusters directly from the wordlist or the similarities in the vocabulary. As the dendrogram above shows, similar words are joined into clusters at comparable distances from each other. Hierarchical clustering helps in assessing the similarity between individual vocabulary entries, and between an unassigned item and the entries in the existing clusters.
Non-hierarchical cluster analysis simply comprises a list of all the partitions generated from the hierarchical analysis. The next article will present the non-hierarchical analysis along with its interpretation.
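
The notion of "a list of all the partitions generated from the hierarchical analysis" can be illustrated with a short sketch that records the partition after every merge. It assumes complete linkage and a made-up similarity matrix. The function name and data are hypothetical, and this is not a description of how Hamlet II computes its output.

```python
from itertools import combinations


def partition_list(labels, sim):
    """Return (similarity level, partition) after every complete-linkage merge."""
    clusters = [[i] for i in range(len(labels))]
    history = []

    def score(a, b):
        # Complete linkage: similarity of the least similar pair of members.
        return min(sim[i][j] for i in a for j in b)

    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda pair: score(*pair))
        level = score(a, b)
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(sorted(a + b))
        partition = sorted([labels[i] for i in cluster] for cluster in clusters)
        history.append((level, partition))
    return history


# Hypothetical similarity matrix for four vocabulary items.
labels = ["king", "queen", "ghost", "night"]
sim = [
    [1.0, 0.7, 0.1, 0.2],
    [0.7, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.6],
    [0.2, 0.1, 0.6, 1.0],
]
history = partition_list(labels, sim)
for level, partition in history:
    print(level, partition)
```

Each line of the output is one partition of the vocabulary together with the similarity level at which it was formed, from the finest grouping down to a single cluster.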

Avishek Majumder

Research Analyst at Project Guru
Avishek holds a Master's degree in Biotechnology and previously worked with Lifecell International Private Limited. Apart from data analysis and biological research, he enjoys photography and reading. In his spare time he plays football and basketball, and he has an avid interest in adventure and nature. He was also a member of the Scouts at school and has attended military training.

