Can LLMs Assist with Ambiguity? A Quantitative Evaluation of Various Large Language Models on Word Sense Disambiguation
Ambiguous words are common in modern digital communication. Lexical ambiguity challenges traditional Word Sense Disambiguation (WSD) methods, which suffer from limited training data; as a result, the efficiency of translation, information retrieval, and question-answering systems is hindered. This study investigates the use of Large Language Models (LLMs) to improve WSD through a novel approach that combines a systematic prompt-augmentation mechanism with a knowledge base (KB) of different sense interpretations. The proposed method incorporates a human-in-the-loop approach to prompt augmentation, in which the prompt is enriched with Part-of-Speech (POS) tags, synonyms of the ambiguous word, aspect-based sense filtering, and few-shot examples to guide the LLM. Using few-shot Chain-of-Thought (CoT) prompting, this work demonstrates a substantial improvement in performance.
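The prompt-augmentation steps described above can be illustrated with a minimal sketch of a prompt builder that combines the POS tag, synonyms, candidate senses, and few-shot CoT examples into a single LLM prompt. The function name `build_wsd_prompt` and the exact field layout are illustrative assumptions, not the authors' implementation:

```python
def build_wsd_prompt(sentence, target, pos_tag, synonyms, candidate_senses, examples):
    """Assemble an augmented few-shot CoT prompt for WSD (illustrative sketch).

    examples: list of dicts with 'sentence', 'reasoning', and 'sense' keys,
    serving as the few-shot Chain-of-Thought demonstrations.
    """
    lines = []
    # Few-shot CoT demonstrations precede the query.
    for ex in examples:
        lines.append(f"Sentence: {ex['sentence']}")
        lines.append(f"Reasoning: {ex['reasoning']}")
        lines.append(f"Sense: {ex['sense']}")
        lines.append("")
    # Query, augmented with POS tag and synonyms of the ambiguous word.
    lines.append(f"Sentence: {sentence}")
    lines.append(f"Target word: {target} (POS: {pos_tag}; synonyms: {', '.join(synonyms)})")
    # Candidate senses, e.g. pre-filtered by aspect from a knowledge base.
    lines.append("Candidate senses:")
    for i, sense in enumerate(candidate_senses, 1):
        lines.append(f"  {i}. {sense}")
    lines.append("Reason step by step, then choose the best-fitting sense.")
    return "\n".join(lines)


prompt = build_wsd_prompt(
    sentence="He sat on the bank of the river.",
    target="bank",
    pos_tag="NOUN",
    synonyms=["shore", "riverside"],
    candidate_senses=["sloping land beside a body of water", "a financial institution"],
    examples=[{
        "sentence": "She deposited the check at the bank.",
        "reasoning": "The context mentions a check and a deposit, which are financial actions.",
        "sense": "a financial institution",
    }],
)
```

In practice, the returned string would be sent to the LLM, whose step-by-step answer is then parsed for the chosen sense index.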
However, research shows that a word's meaning is closely linked to the words around it, demonstrating that analyzing a word in isolation is insufficient for correct sense identification (Luo et al., 2018). Accurate models must therefore consider the word's position, its POS tag, and the aspect of the sentence. LLMs and generative AI, which are based on transformers, show promising results in the contextual understanding of words (Dettmers et al., 2023). These models handle complex language tasks well because of extensive training on vast amounts of data. Fine-tuning such base models for downstream tasks, such as question answering and domain-specific knowledge generation, has also shown promising results (Guo et al., 2023).
Lexically ambiguous words such as ‘post’, ‘brake’, ‘part’, ‘bat’, and ‘try’ exhibit significant diversity, each possessing more than ten distinct senses across noun and verb forms. Pasini et al. (2021) found that currently proposed architectures cannot confidently predict the sense of highly ambiguous words that admit multiple interpretations. Drawing on insights from the existing literature, our work aims to evaluate the impact of pre-trained language models on sense prediction for diverse ambiguous words and to identify the key factors influencing the performance of sense prediction for highly ambiguous words.