How to perform hierarchical clustering using Hamlet II?

By Avishek Majumder & Priya Chetty on October 22, 2018

The previous article covered how to produce joint frequencies and vocabulary lists using Hamlet II. Once the joint frequencies are available, Hamlet II offers further analyses such as correspondence analysis, cluster analysis, multi-dimensional scaling, and multiple text comparisons. This article focuses on hierarchical cluster analysis. Hierarchical clustering groups vocabulary entries according to their similarity, so that similar words or contexts end up in the same cluster. Hierarchical clustering, along with non-hierarchical clustering, is used together with multidimensional scaling to assess the similarities matrix.

Steps for hierarchical clustering using Hamlet II

  1. Once the results for joint frequencies have been derived, save the output file.
  2. Hamlet will prompt “Try a cluster analysis of the matrix?”. Click “Yes”.
    Pop-up box for Hierarchical cluster analysis
  3. The dialogue box shown above will appear. Click the “Diameter” button to switch between the two clustering methods that can be applied to the matrix of similarities.
  4. Click on “Display the cluster dendrogram” to create and save a dendrogram plot for the selected clustering method. A window like the one in the figure below will appear.
    Dendrogram achieved from the Hierarchical cluster analysis
  5. Enter a minimum similarity value between 0.0 and 1.0 in the box labelled “Show clusters for a level of inclusion”. In this example, 0.4 is used as the minimum similarity value.
  6. Click “Continue” to proceed, then save the matrices by clicking “Yes” in the pop-up shown below.
    Save button for matrix file
  7. Save the matrix file in your destination folder in *.mat format, as this output will be used later for the non-hierarchical cluster analysis.
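
To make the effect of the clustering method and the level of inclusion more concrete, the following is a minimal Python sketch, not Hamlet II's own code. It assumes that the two methods are the single-link ("connectedness") and complete-link ("diameter") variants of agglomerative clustering; the four words, the similarity values and the function name `cluster_at_level` are all invented for illustration.

```python
from itertools import combinations

# Hypothetical vocabulary entries and a made-up symmetric similarity matrix.
labels = ["ghost", "king", "queen", "sword"]
sim = [
    [1.00, 0.50, 0.30, 0.10],
    [0.50, 1.00, 0.70, 0.20],
    [0.30, 0.70, 1.00, 0.15],
    [0.10, 0.20, 0.15, 1.00],
]


def cluster_at_level(labels, sim, level, method="diameter"):
    """Merge clusters step by step until no pair reaches `level`."""
    # Assumption: "connectedness" scores two clusters by their most similar
    # members (single link), "diameter" by their least similar (complete link).
    pick = max if method == "connectedness" else min
    clusters = [[i] for i in range(len(labels))]

    def score(a, b):
        return pick(sim[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the pair of clusters with the highest similarity score.
        a, b = max(combinations(clusters, 2), key=lambda pair: score(*pair))
        if score(a, b) < level:
            break  # no remaining pair meets the minimum similarity
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted([labels[i] for i in c] for c in clusters)


for method in ("diameter", "connectedness"):
    print(method, cluster_at_level(labels, sim, 0.4, method))
```

With this toy matrix and a level of 0.4, the diameter method keeps "ghost" apart from the "king"/"queen" cluster, while the connectedness method pulls it in, which is why it can be worth comparing both dendrograms before settling on a level of inclusion.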

Hierarchical and non-hierarchical clustering

As mentioned at the beginning, hierarchical clustering considers only similar wordlist entries and vocabularies in order to form hierarchy-based clusters. Non-hierarchical clustering, on the other hand, does not build its clusters directly from the wordlist or from similarities in the vocabularies. As the dendrogram above shows, clusters of similar words are formed at equivalent distances from each other. Hierarchical clustering helps in assessing the similarity between individual vocabulary entries, and between an unassigned item and the texts in the existing clusters.
Non-hierarchical cluster analysis simply comprises a list of all the partitions generated from the hierarchical analysis. The next article will present the non-hierarchical analysis along with its interpretation.
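
The idea of a list of partitions generated from the hierarchical analysis can be sketched as follows. This is an illustrative Python example under stated assumptions, not a reproduction of Hamlet II's output: it uses complete-link ("diameter") merging, a made-up similarity matrix, and a hypothetical helper named `list_partitions` that records the partition after every merge together with the similarity level at which the merge happened.

```python
from itertools import combinations

# Hypothetical vocabulary entries and a made-up symmetric similarity matrix.
labels = ["ghost", "king", "queen", "sword"]
sim = [
    [1.00, 0.50, 0.30, 0.10],
    [0.50, 1.00, 0.70, 0.20],
    [0.30, 0.70, 1.00, 0.15],
    [0.10, 0.20, 0.15, 1.00],
]


def list_partitions(labels, sim):
    """Return (level, partition) pairs, one per step of the hierarchy."""
    clusters = [[i] for i in range(len(labels))]

    def named(cs):
        # Convert index clusters into sorted lists of words.
        return sorted(sorted(labels[i] for i in c) for c in cs)

    def score(pair):
        # Complete link: similarity of the least similar members.
        return min(sim[i][j] for i in pair[0] for j in pair[1])

    # Start with every word in its own cluster (similarity 1.0 with itself).
    steps = [(1.0, named(clusters))]
    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=score)
        level = score((a, b))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(a + b)
        steps.append((level, named(clusters)))
    return steps


steps = list_partitions(labels, sim)
for level, partition in steps:
    print(level, partition)
```

Each printed row is one partition of the same four words, from all singletons down to a single cluster, so reading the rows in order is equivalent to reading the dendrogram from its leaves to its root.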
