Text frequency or wordlist analysis in Hamlet II

By Avishek Majumder and Priya Chetty on November 15, 2018

In the previous article, the introductory aspects and the interface of Hamlet II were presented. Text frequency analysis is a common and important text-based analysis, performed in Hamlet II with the Wordlist tool. In this analysis, the transcript or text file is assessed for the frequency with which words occur. This is known as wordlist analysis in Hamlet II. Wordlist creates a list of the most frequently repeated words in a text file, thereby showing the importance of the recurring words.

Text analysis is the technique of assessing text from a draft or, more commonly, a transcript. It also helps to create an overview of the most common words and to show their relevance to a research study.
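To make the idea of a word frequency list concrete, here is a minimal Python sketch of the same kind of count. It is not how Hamlet II works internally; the function name `word_frequencies`, the tokenizing rule, and the sample sentence are assumptions made for illustration.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Count how often each word occurs, ignoring case."""
    # Assumption: a "word" is a run of letters, optionally with internal apostrophes.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words)

# Hypothetical sample text, not taken from the article's transcript.
sample = "Medical tourism is growing. Tourism brings patients; patients need hospitals."
freqs = word_frequencies(sample)

# Print the three most frequent words with their counts.
for word, count in freqs.most_common(3):
    print(word, count)
```

Sorting by count, as `most_common` does, is what brings the recurring words to the top of the list.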

Steps for performing wordlist analysis in Hamlet II

Follow the steps below to perform wordlist analysis using Hamlet II.

  • Step 1: Transcribe the interview responses in a Word document and convert it from .docx/.doc to .txt format.
  • Step 2: Copy the .txt file and paste it into the source folder of Hamlet II. The source folder is the folder on the C drive where the Hamlet II software is installed.
  • Step 3: Open a file in Hamlet II using the following steps:
    1. Click on ‘File’
    2. Click on ‘Select’
    3. Select the appropriate file and click on ‘Open’.
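The conversion in step 1 is normally done with 'Save As' in Word, but it can also be scripted. The sketch below uses only the Python standard library and handles .docx files only (not the older .doc format). The function names and file paths are hypothetical, and the UTF-8 output encoding is an assumption; check which encoding your Hamlet II installation expects.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def docx_to_text(docx_path):
    """Return the text of a .docx file, one paragraph per line."""
    # A .docx file is a zip archive; the body text lives in word/document.xml.
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Join all text runs (w:t elements) that belong to this paragraph.
        paragraphs.append("".join(node.text or "" for node in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)

def convert_docx_to_txt(docx_path, txt_path):
    """Write the paragraph text of a .docx file to a plain .txt file."""
    with open(txt_path, "w", encoding="utf-8") as out:
        out.write(docx_to_text(docx_path))

# Hypothetical usage:
# convert_docx_to_txt("interview.docx", "interview.txt")
```

Formatting, images, and footnotes are dropped; only paragraph text is kept, which is all a frequency count needs.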

The image below shows the process of step 3.

Figure 1: How to open a file for wordlist analysis in Hamlet II

Now, Hamlet II will import the .txt file as shown in the image below.

    Figure 2: Imported .TXT file
    • Step 4: To run the wordlist analysis in Hamlet II, do the following:
    1. On the main toolbar or taskbar click on ‘Tools’.
    2. Click on ‘Wordlist’.
    3. In the new window, select the file.
    4. Click on ‘create a word list’.

    The figure below shows the whole of step 4.

    Figure 3: Steps for wordlist analysis in Hamlet II
    • Step 5: A popup box asking for confirmation of the analysis will appear. This popup box usually asks whether numerals are to be ignored in the wordlist analysis. Click on ‘Yes’ (figure below).
    Figure 3: Popup box for ignoring numerals

    The software will tabulate the results of the frequency analysis as a wordlist (figure below).

    Figure 4: Result of wordlist analysis in Hamlet II
    • Step 6: Close the results popup box and, in the warning popup box that follows, click ‘Yes’ to save the wordlist.
    • Step 7: Save the output with the .LST file extension.
    Figure 4: Frequencies from wordlist analysis in Hamlet II
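The effect of ignoring numerals (step 5) and saving the list (steps 6 and 7) can be sketched in Python as follows. This is an illustration, not Hamlet II's implementation: the function names are hypothetical, and the output is a plain tab-separated text file, not Hamlet II's .LST format, whose layout is not described here.

```python
import re
from collections import Counter

def wordlist(text, ignore_numerals=True):
    """Return (word, count) pairs sorted from most to least frequent."""
    # Assumption: tokens are runs of letters and digits.
    tokens = re.findall(r"[A-Za-z0-9]+", text.lower())
    if ignore_numerals:
        # Drop tokens made up only of digits, such as years or amounts.
        tokens = [t for t in tokens if not t.isdigit()]
    return Counter(tokens).most_common()

def save_wordlist(rows, path):
    """Save the list as tab-separated text, one 'word<TAB>count' per line."""
    with open(path, "w", encoding="utf-8") as out:
        for word, count in rows:
            out.write(f"{word}\t{count}\n")

# Hypothetical usage:
# save_wordlist(wordlist(open("interview.txt", encoding="utf-8").read()), "interview_wordlist.txt")
```

With `ignore_numerals=False`, numbers such as years would appear in the list alongside words, which is usually noise in an interview transcript.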

    Creating a stop-list

    The stop-list popup box will appear as shown in the image below.

    Stop-list for wordlist analysis in Hamlet II
    • Step 8: Now, either select one from the list of stop-lists or make a new one from stoplist_template.stp.
    • Step 9: When creating a new one, choose any file. A couple of popup boxes will follow (read them, then dismiss them), after which the stop-list entries appear in the list as shown in the figure below.
    Figure 6: Sample stop-list for wordlist analysis in Hamlet II

    Wordlist using a stop-list

    You can either list the words in a text and save them as they are, or lemmatize them first. Lemmatization groups the inflected forms of a word under a common base form (its lemma), using a vocabulary and morphological analysis of words, so that unwanted variants are removed from the list.
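As a rough illustration of what lemmatization does to a frequency list, the sketch below merges inflected forms into one entry before counting. The lemma table is a tiny hand-made example and purely hypothetical; real lemmatizers rely on a full vocabulary and morphological rules, and this is not a description of Hamlet II's own procedure.

```python
from collections import Counter

# Hypothetical, hand-made lemma table for illustration only.
LEMMAS = {
    "hospitals": "hospital",
    "patients": "patient",
    "treated": "treat",
    "treating": "treat",
}

def lemmatized_counts(tokens, lemmas=LEMMAS):
    """Count tokens after mapping each one to its lemma (unknown tokens stay as-is)."""
    return Counter(lemmas.get(token, token) for token in tokens)

tokens = ["hospital", "hospitals", "treated", "treating", "patients"]
counts = lemmatized_counts(tokens)
```

Here 'hospital' and 'hospitals' end up as one entry with a count of 2, instead of two entries with a count of 1 each.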

    • Step 10: Using the stop-list changes the result of the wordlist analysis. A stop-list is a list of words to exclude during wordlist assessment (figure below).
    Figure 7: Using filter for wordlist analysis in Hamlet II

    The figure below presents the wordlist result after applying the stop-list. The words in the stop-list are ignored when the wordlist is generated. The stop-list comprised words such as: a, an, are, as, be, by, does, for, has, have, if, in, is, it, of, that, the, this, to, was, what, which, and whose. The stop-list helped restrict the analysis to the most important words in the text file.

    Figure 8: Results from filtered wordlist analysis in Hamlet II
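The filtering performed by a stop-list can be sketched in a few lines of Python. The stop words are the ones listed above; the function name, the tokenizing rule, and the sample sentence are assumptions for illustration, not Hamlet II's implementation.

```python
import re
from collections import Counter

# Stop words as listed in the article's sample stop-list.
STOP_WORDS = {
    "a", "an", "are", "as", "be", "by", "does", "for", "has", "have", "if", "in",
    "is", "it", "of", "that", "the", "this", "to", "was", "what", "which", "whose",
}

def filtered_wordlist(text, stop_words=STOP_WORDS):
    """Count words in the text, skipping any word found in the stop-list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)

# Hypothetical sample sentence.
counts = filtered_wordlist("The hospital is in India and the hospital has patients")
```

Function words such as 'the' and 'is' no longer appear in the result, so content words like 'hospital' rise to the top of the list.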

    Interpreting the results from Wordlist in textual analysis

    The word with the highest frequency in the transcript is ‘tourism’ with 117 counts, followed by ‘medical’ with 112 counts. Together they have a combined frequency of 3.2%, indicating the topic of ‘medical tourism’ or ‘tourism in the medical sector’. Other important words identified in the assessment were ‘patients’ (95), ‘hospital’ (93), and ‘India’ (64).
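The arithmetic behind a combined frequency can be sketched as follows. The article does not state the total word count of the transcript, so the total below is only back-computed from the rounded 3.2% figure, on the assumption that 3.2% is the share of all counted words. Treat it as an approximation, not as a reported result.

```python
def share_percent(count, total_words):
    """Percentage of all counted words accounted for by `count` occurrences."""
    return 100.0 * count / total_words

# Counts reported in the article for 'tourism' and 'medical'.
combined = 117 + 112

# Assumption: total implied by the rounded 3.2% share; not stated in the article.
implied_total = round(combined / 0.032)

print(combined, implied_total, round(share_percent(combined, implied_total), 1))
```

The same function gives the share of any single word once the real total is read from the Hamlet II output.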

    In this case, the interpretation of the topic is different, since the subject is already familiar. The words tourism (117), medical (112), patients (95), hospital (93), and India (64) are relevant to the main aim of the study. Their high counts indicate relevance to the study topic, which is why these words were repeated the most.

    The main motive here was to link the occurrence of words to the main aim or objectives of the study. This type of interpretation helps establish the relevance of the text to the study or research. The above instructions help in the successful assessment of a wordlist in Hamlet II. The next article covers other word-based analyses: KWIC and text profile.
