Analytical bibliometric study and its challenges

By Avishek Majumder on February 26, 2018

Bibliometric studies are not a recent trend. In simple terms, bibliometrics is the measurement of bibliography: such studies aim to understand research trends in a particular field of study or across diverse fields (Hicks, Wouters, Waltman, De Rijcke, & Rafols, 2015). Bibliometrics is the statistical analysis of written publications such as books, journals or theses, and is frequently used in the field of library and information science, including scientometrics (Arsenova, 2013; Leydesdorff, 2015).

The process of bibliometric study

The flowchart below depicts the process of bibliometric analysis.

Figure: The bibliometric study process

General challenges

However, in general, the majority of academics in this field face problems with collecting data and using suitable software (Drew, Pettibone, Finch, Giles, & Jordan, 2016; Kurtz & Bollen, 2010). The challenges are not limited to data collection or software usage; they also include:

  • Identification of correct file formats for various analyses.
  • Exporting and importing of file formats.
  • Manual data collection.
  • Lack of general format for manual data collection.
  • Interpretation of various technical terms.
  • Usage of Google Scholar and the Scopus index.
  • Usage of citation software.
  • Mapping analyses.

According to Hicks et al. (2015), every bibliometric study faces problems of data collection, data analysis, and the particular trend of previous studies. Moreover, the Scopus index comprises numerous journals from diverse fields of study. Furthermore, no field of study has yet been completely assessed through bibliometric study. Arsenova (2013) and Drew et al. (2016), on the other hand, mentioned that choosing the right techniques and a standard base of assessment is very important for arriving at correct bibliometric analyses.

In practice, however, no such standard exists. Every bibliometric study done to date has used or adopted different methods of data collection and analysis. In addition, the software, or combination of software, used is not the same from one study to the next. Every bibliometric study therefore differs in its methods of analysis and interpretation, and the studies are not related to each other.

Challenges of selecting a subject topic or journal publications

A major challenge in bibliometrics is the selection of a subject area, such as arts, engineering, biology, veterinary science, medicine, law, management, mathematics and many others. There are also numerous journal publishers in every field, locally and internationally. Thus, it becomes difficult to choose a subject area and to link it to the relevant journals.

Another challenge in this part of bibliometrics is that most journals are either not indexed in Scopus or Web of Science because of a low impact factor, or are not published online by publishers such as Taylor & Francis, Springer and others (Kurtz & Bollen, 2010). Papers that are not published online do not receive enough citations on Google Scholar. This undermines the purpose of the study, as the results are not reliable and data collection also becomes troublesome (Drew et al., 2016).

Figure: Problem to subject/journal selection


Choose specific topics such as engineering theories, criminal law, humanities, molecular biology, genomics, fine arts, and similar. Then narrow down the list of journals published on the chosen topic, restricting it to journals in the Scopus index, which lists the relevant and internationally accepted journals. Although the Scopus index provides limited access, its database of journals is available for free after registration. This helps in narrowing down data collection from online libraries to relevant sources and in interpreting relevant results.

Figure: Solution to subject/journal selection

Challenge of data collection or extraction

Researchers use three modes of data collection. The first is downloading bibliometric files from the online libraries of Web of Science, the Scopus index or Google Scholar. The second is downloading from online publishers such as Taylor & Francis, Springer Nature, ScienceDirect, Elsevier, Wiley, and others. The third is manual extraction of data into spreadsheets (Hicks et al., 2015; Leydesdorff, 2015). However, both Web of Science and the Scopus index provide limited access, restricted to authorized users only (Harzing & Alakangas, 2016).

In the second method, the challenge is that publishers only allow downloads in the .ris or .bibtex file formats, and/or saving to RefWorks. In addition, some online publishers do not provide open access to citation data (Mongeon & Paul-Hus, 2016). Manual data collection is lengthy and time-consuming. Moreover, the chance of mistakes grows with the volume of data collected, which may run to approximately 20,000 cell entries in MS Excel or any other spreadsheet.

Figure: Problem to data collection


After assessing the most relevant techniques of data extraction, combining the second and the third methods helps achieve appropriate data collection.

  1. The first step is to download the .ris or .bibtex files and keep them in one folder.
  2. Then download the free software Zotero or Mendeley, automated citation software that is helpful in bibliometric studies (Fernandez, 2011; Lestari Trisasti, 2014).
  3. Using either of them, it is possible to export to various other file formats such as .csv, .txt and .bib. The data thus reaches an editable, spreadsheet-like form and diverse analyses become possible, although manual edits to the extracted bibliographic data remain important.
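The export in step 3 is normally done through the menus of Zotero or Mendeley. As a rough illustration of what such a conversion involves, the sketch below reads RIS text and writes selected fields to CSV using only the Python standard library. It is not how either tool works internally. The sample records, the function names and the column mapping are hypothetical, and the parser assumes the common RIS layout of a two-character tag followed by two spaces and a hyphen, ignoring continuation lines.

```python
import csv
import io
import re

# Common RIS layout: two-character tag, two spaces, a hyphen, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")

# Hypothetical mapping from RIS tags to CSV column headers.
COLUMNS = {"AU": "authors", "TI": "title", "PY": "year", "JO": "journal"}

# Made-up records, only for illustration.
SAMPLE_RIS = """TY  - JOUR
AU  - Alpha, A.
AU  - Beta, B.
TI  - Sample title one
PY  - 2015
JO  - Sample Journal A
ER  -
TY  - JOUR
AU  - Gamma, C.
TI  - Sample title two
PY  - 2016
JO  - Sample Journal B
ER  -
"""


def parse_ris(text):
    """Return one dict per record; repeated tags such as AU are kept as lists."""
    records, current = [], {}
    for line in text.splitlines():
        match = RIS_LINE.match(line.lstrip("\ufeff").rstrip())
        if not match:
            continue  # blank or continuation lines are skipped in this sketch
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # "ER" marks the end of a record
            if current:
                records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    return records


def records_to_csv(records, columns):
    """Write the chosen tags as CSV columns, joining repeated values with '; '."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(list(columns.values()))
    for record in records:
        writer.writerow(["; ".join(record.get(tag, [])) for tag in columns])
    return buffer.getvalue()


csv_text = records_to_csv(parse_ris(SAMPLE_RIS), COLUMNS)
```

Fields missing from a record come out as empty cells, which is where the manual edits mentioned above would typically be needed.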

However, manual bibliometric data collection is recommended for frequency analyses. These include the number of papers published, the number of in-text citations, cross-tabulation of papers published per issue, papers published per subject, and other similar analyses.
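Once the data sits in a spreadsheet, frequency analyses of this kind reduce to counting rows. The following is a minimal sketch, assuming the spreadsheet has been saved as CSV with one row per paper. The column names (year, issue, subject) and the sample rows are hypothetical, not taken from any real dataset.

```python
import csv
import io
from collections import Counter

# Made-up rows standing in for a manually compiled spreadsheet saved as CSV.
SAMPLE_CSV = """year,issue,subject
2015,1,genomics
2015,1,criminal law
2015,2,genomics
2016,1,genomics
"""


def load_rows(csv_text):
    """Read CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def frequency(rows, column):
    """Number of papers for each distinct value in one column."""
    return Counter(row[column] for row in rows)


def cross_tab(rows, row_key, col_key):
    """Paper counts for every (row_key, col_key) pair, e.g. papers per year and issue."""
    return Counter((row[row_key], row[col_key]) for row in rows)


rows = load_rows(SAMPLE_CSV)
papers_per_year = frequency(rows, "year")
papers_per_subject = frequency(rows, "subject")
papers_per_year_issue = cross_tab(rows, "year", "issue")
```

Because `csv.DictReader` returns every cell as text, the years and issue numbers are counted as strings here. Convert them to integers first if they need to be sorted numerically.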

Figure: Solution to data collection

The various challenges arising in data analysis and descriptive analysis are presented, along with their solutions, in the next article.