Information retrieval: Solving mismatching vocabulary in closed document collections
Abstract
During a search, phrase-terms expressed in queries are presented to an information retrieval system (IRS) to find documents relevant to a topic. The IRS makes relevance judgements by attempting to match the vocabulary in queries to the vocabulary in documents. When the two do not match, the problem of vocabulary mismatch occurs. The aim of this study was to examine ways of searching for documents more effectively in order to minimise mismatches. A further aim was to understand the mechanisms of, and the differences between, human and machine-assisted retrieval. The objective was to determine whether IRS-H (an IRS using the hybrid indexing method) and human participants agree or disagree on relevance judgements, and whether the problem of mismatching vocabulary can be solved. A collection of eighty research documents and sixty-five phrase-terms was presented to (i) IRS-H and four participants in Test 1, and (ii) IRS-H and one participant (aided by search software) in Test 2. Statistical analysis was performed using the Kappa coefficient. The judgements of IRS-H and the four participants disagreed, whereas the judgements of IRS-H and the participant aided by search software agreed. IRS-H solves the problem of mismatching vocabulary between a query and a document.
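As an illustration of the agreement statistic named in the abstract, the following is a minimal sketch of a Kappa calculation for two raters. It assumes Cohen's kappa with binary relevance labels (1 = relevant, 0 = not relevant); the abstract does not state which Kappa variant the study used, and the function name `cohen_kappa` and the toy judgement lists are hypothetical, not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: each list holds one label per item, in the same order.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must judge the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label proportions.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label throughout.
        return 1.0

    return (observed - expected) / (1 - expected)


# Toy judgements for eight query-document pairs (illustrative values only).
irs_judgements = [1, 1, 0, 0, 1, 0, 1, 1]
human_judgements = [1, 0, 0, 0, 1, 0, 1, 1]

print(cohen_kappa(irs_judgements, human_judgements))
```

A value near 1 indicates agreement well beyond chance, a value near 0 indicates agreement no better than chance, and a negative value indicates systematic disagreement.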
Copyright (c) 2022 Kyle Andrew Fitzgerald, Andre Charles de la Harpe, Corrie Susanna Uys, Andrew John Bytheway
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
This journal is an open access journal, and the authors (copyright owners) should be properly acknowledged when works are cited. Authors retain publishing rights without any restrictions.
South African Journal of Libraries and Information Science is an Open Access journal which means that all content is freely available without charge to the user or his/her institution. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, without asking prior permission from the publisher or the author. This is in accordance with the BOAI definition of Open Access.