Semantic Change Characterization with LLMs using Rhetorics

Paper · arXiv 2407.16624 · Published July 23, 2024

Languages continually evolve in response to societal events, resulting in new terms and shifts in meanings. These changes have significant implications for computer applications, including automatic translation and chatbots, making it essential to characterize them accurately. The recent development of Large Language Models (LLMs) has notably advanced natural language understanding, particularly in sense inference and reasoning. In this paper, we investigate the potential of LLMs in characterizing three types of semantic change: dimension, relation, and orientation. We achieve this by combining LLMs’ Chain-of-Thought with rhetorical devices and conducting an experimental assessment of our approach using newly created datasets. Our results highlight the effectiveness of LLMs in capturing and analyzing semantic changes, providing valuable insights to improve computational linguistic applications.

In this paper, we propose a method for automating the characterization of semantic change across different corpora. To this end, we rely on the following set of predominant typologies defined in the literature (Traugott, 2017; Juvonen and Koptjevskaja-Tamm, 2016):

• Broadening: a word gains a new meaning, related or not to its previous meaning, so that it comes to represent more concepts, e.g., ‘cloud,’ which now also denotes a computing infrastructure.

• Narrowing: a restriction of meaning in which a word represents fewer concepts than previously, e.g., ‘gay,’ which historically meant festive or happy but is now predominantly used to refer to homosexuality.

• Amelioration: a word gains a more positive sense relative to its previous sense, e.g., ‘nice’ shifted from ‘foolish, innocent’ to ‘pleasant.’

• Pejoration: a word acquires a more negative connotation than its previous usage, e.g., Old English ‘stincan,’ ‘to smell (sweet or bad),’ changed to ‘stink.’

• Metonymization: a shift based on association between concepts, e.g., ‘board’ (‘table’) changed to mean “the people sitting around a table; a governing body.”

• Metaphorization: conceptualizing one thing in terms of another, e.g., in ‘head of the company,’ the word ‘head’ conveys “command or control.”
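The six change types above can be represented as a simple enumeration for downstream labeling. This is an illustrative sketch, not the paper's code; the class and value names are our own:

```python
from enum import Enum


class ChangeType(Enum):
    """The six semantic change types from the typology above."""
    BROADENING = "broadening"          # dimension pole: more senses
    NARROWING = "narrowing"            # dimension pole: fewer senses
    AMELIORATION = "amelioration"      # orientation pole: more positive
    PEJORATION = "pejoration"          # orientation pole: more negative
    METONYMIZATION = "metonymization"  # relation pole: association
    METAPHORIZATION = "metaphorization"  # relation pole: conceptual similarity
```

Grouping the types into a single enum makes it straightforward to validate LLM outputs against the closed label set.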

In the dimension pole, we compute the “number of senses” a word can have. This pole is self-complementary: an increase represents broadening, and a decrease represents a narrowing of senses. After identifying the number of senses in each corpus, we can compare the differences between corpora.
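The dimension comparison reduces to comparing per-corpus sense counts. A minimal sketch, assuming the sense counts have already been obtained (e.g., via an LLM sense-induction prompt):

```python
def classify_dimension(senses_old: int, senses_new: int) -> str:
    """Map a change in sense count between two corpora to a
    dimension-pole label: broadening, narrowing, or stable."""
    if senses_new > senses_old:
        return "broadening"   # the word now covers more concepts
    if senses_new < senses_old:
        return "narrowing"    # the word now covers fewer concepts
    return "stable"

# e.g., 'cloud': one sense in an older corpus, two in a modern one
label = classify_dimension(senses_old=1, senses_new=2)
```

The hard part in practice is the sense inference itself, which the paper delegates to the LLM; the comparison step is this simple.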

Metaphorical and metonymical changes are classified under the relation category, as both changes concern the connection between one sense of a word and its other senses. In this framework, a word’s meaning relies on the link established through either conceptual similarity (an abstract relation) or material association (a physical relation) between concepts. We identify which senses are used figuratively in relation to other senses of the same word.
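For the relation pole, the metaphor-versus-metonymy decision can be posed directly to the LLM. The template below is an assumed illustration of such a query, not the paper's actual prompt:

```python
# Illustrative (assumed) prompt for the relation pole: does the target
# sense relate to the basic sense by conceptual similarity (metaphor)
# or by material/physical association (metonymy)?
RELATION_PROMPT = (
    "Word: {word}\n"
    "Basic sense: {basic_sense}\n"
    "Target sense: {target_sense}\n"
    "Is the target sense related to the basic sense by conceptual "
    "similarity (metaphor) or by physical association (metonymy)? "
    "Explain your reasoning, then answer 'metaphor' or 'metonymy'."
)

prompt = RELATION_PROMPT.format(
    word="board",
    basic_sense="a long flat piece of wood; a table",
    target_sense="the people sitting around a table; a governing body",
)
```

The filled prompt would then be sent to the LLM, whose final token-level answer is parsed against the two labels.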

The orientation pole groups the processes of amelioration and pejoration of a meaning. In this pole, words are analyzed according to the contextual sentiment captured from each corpus, and we then analyze how the sentiment changes across corpora. In this study, we explore only positive, negative, and neutral sentiment values for words.
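Given the three-value sentiment scale, the cross-corpus comparison can be sketched as an ordering over labels. This is our own minimal sketch, assuming per-corpus sentiment labels have already been produced:

```python
# Ordered from most negative to most positive, matching the paper's
# three sentiment values.
SENTIMENTS = ["negative", "neutral", "positive"]


def classify_orientation(sent_old: str, sent_new: str) -> str:
    """Compare contextual sentiment labels from two corpora and map
    the shift to an orientation-pole label."""
    delta = SENTIMENTS.index(sent_new) - SENTIMENTS.index(sent_old)
    if delta > 0:
        return "amelioration"  # sentiment moved in the positive direction
    if delta < 0:
        return "pejoration"    # sentiment moved in the negative direction
    return "stable"

# e.g., 'nice': from a negative historical sense to a positive one
label = classify_orientation("negative", "positive")
```

A finer-grained sentiment scale would slot into the same comparison without changing the logic.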

LLMs have exhibited significant progress in natural language comprehension. This includes reasoning by analogy (Webb et al., 2022), understanding metaphors (Liu et al., 2022), argumentation (Chen et al., 2023), and acquiring cultural knowledge (Petroni et al., 2019). Additionally, instructing an LLM to generate a rationale, which is a natural language explanation for its reasoning process, before providing an answer has been shown to improve performance on many NLP tasks that require logical reasoning (Wei et al., 2022; Kavumba et al., 2023). This rationale generation step is believed to inject more information retrieved from the LLM’s internal knowledge store into the prompt. This enriched prompt allows the LLM to consider a broader range of knowledge during the final decision-making process (Dasgupta et al., 2022).
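The rationale-first pattern described above can be illustrated with a prompt template that asks the model to reason before labeling. The template, function name, and example usages are our own illustrative assumptions, not the paper's prompt:

```python
def build_cot_prompt(word: str, usage_old: str, usage_new: str) -> str:
    """Build a Chain-of-Thought prompt that elicits a rationale
    before the final semantic change label."""
    return (
        f"Word: {word}\n"
        f"Usage in the older corpus: {usage_old}\n"
        f"Usage in the newer corpus: {usage_new}\n"
        "First, explain step by step how the word's meaning differs "
        "between the two usages. Then give exactly one final label: "
        "broadening, narrowing, amelioration, pejoration, "
        "metonymization, or metaphorization."
    )


prompt = build_cot_prompt(
    "cloud",
    "Dark clouds gathered before the storm.",
    "We migrated our servers to the cloud.",
)
```

Because the model writes the rationale first, the tokens of the explanation become part of the context conditioning its final label, which is the mechanism the cited work credits for the performance gains.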