Meanings are like Onions: a Layered Approach to Metaphor Processing
Abstract
Metaphorical meaning is not a flat mapping between concepts, but a complex cognitive phenomenon that integrates multiple levels of interpretation. In this paper, we propose a stratified model of metaphor processing that treats meaning as an onion: a multi-layered structure comprising (1) content analysis, (2) conceptual blending, and (3) pragmatic intentionality. This three-dimensional framework allows for a richer and more cognitively grounded approach to metaphor interpretation in computational systems. At the first level, metaphors are annotated through basic conceptual elements. At the second level, we model conceptual combinations, linking components to emergent meanings. Finally, at the third level, we introduce a pragmatic vocabulary to capture speaker intent, communicative function, and contextual effects, aligning metaphor understanding with pragmatic theories. By unifying these layers into a single formal framework, our model lays the groundwork for computational methods capable of representing metaphorical meaning beyond surface associations—toward deeper, more context-sensitive reasoning.
Metaphors pervade human communication and cognition, extending far beyond mere linguistic decoration. As cognitive tools, they grant privileged access to implicit knowledge structures that might otherwise remain hidden [1]. By mapping relationships between concepts, metaphors serve as bridges that both reveal and reshape our conceptual frameworks—think of how we say that we spend, save, or waste time, implicitly assuming it is a finite resource.
Despite rapid progress in natural language processing, computational metaphor analysis continues to face five intertwined challenges rooted in the very knowledge structures metaphors invoke:
Data scarcity and representational gaps. Datasets accounting for many metaphorical phenomena are scarce, and building new ones is (i) resource-intensive, and (ii) hindered by frameworks that go no further than simple domain mappings.
Contextual insensitivity. While the Conceptual Metaphor Theory (CMT) developed by Lakoff and Johnson [2] dominates in computational accounts of metaphors, it often fails to capture, among other things, how context shifts a metaphor’s meaning in discourse.
Evaluation and standardization. Definitions, metrics, and supported linguistic forms (nominal, verbal, adjectival) in metaphor processing vary wildly across studies [3].
Theoretical fragmentation. Competing accounts (e.g., CMT vs. interactional or embodiment theories) illuminate different aspects of metaphorical phenomena but rarely integrate pragmatics. What a metaphor does in conversation often goes unmodeled, even though Speech Act Theory is a long-standing account of these processes and has been widely adopted in the literature [4].
Limits of current computational works. Even large language models (LLMs), the state of the art in metaphor processing, struggle to distinguish deep relational mappings from mere associations, particularly in complex or multimodal metaphors [5, 6].
Specifically, we put forward an operational framework designed to support the processing of metaphorical meaning in a way that can effectively represent and interweave both conceptual and pragmatic aspects.
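As a rough illustration of how the three layers named in the abstract (content analysis, conceptual blending, pragmatic intentionality) could be held together in a single record, the following is a minimal sketch and not the framework's actual formalization. All class and field names are hypothetical, and the example values for the time-as-resource metaphor are hand-written for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentLayer:
    """Layer 1: basic conceptual elements of the metaphor (hypothetical schema)."""
    expression: str
    source: str
    target: str


@dataclass
class BlendLayer:
    """Layer 2: properties carried over from the source, plus the emergent meaning."""
    projected_properties: List[str]
    emergent_meaning: str


@dataclass
class PragmaticLayer:
    """Layer 3: speaker intent, communicative function, and contextual effects."""
    speaker_intent: str
    communicative_function: str
    contextual_effects: List[str] = field(default_factory=list)


@dataclass
class MetaphorAnnotation:
    """One metaphor, annotated from the innermost layer outwards."""
    content: ContentLayer
    blend: Optional[BlendLayer] = None
    pragmatics: Optional[PragmaticLayer] = None

    def depth(self) -> int:
        """Return how many consecutive layers are filled in (1, 2, or 3)."""
        if self.blend is None:
            return 1
        if self.pragmatics is None:
            return 2
        return 3


# Illustrative values only; they are assumptions, not annotations from a dataset.
example = MetaphorAnnotation(
    content=ContentLayer(
        expression="You are wasting my time",
        source="finite resource",
        target="time",
    ),
    blend=BlendLayer(
        projected_properties=["limited", "can be used up"],
        emergent_meaning="time is something that can be squandered",
    ),
    pragmatics=PragmaticLayer(
        speaker_intent="complain",
        communicative_function="ask the hearer to stop",
    ),
)
print(example.depth())  # 3
```

Keeping the outer layers optional reflects the onion image: an annotation can stop at the content layer and be enriched later without changing its shape.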
2.1. Metaphor theories
Conceptual Metaphor Theory (CMT), developed by G. Lakoff and M. Johnson, posits that metaphors map a source domain onto a target domain via systematic correspondences, enabling abstract reasoning through familiar experiential structures [2]. Conceptual Blending Theory (CBT) extends this view by introducing a generic “blend” space that selectively inherits elements from both input domains according to a blending criterion or key property, such as yellow in “golden hair”, which evokes hair that is yellow and shiny like gold [8]. Categorization theory also regards CBT as a valid computational approach [9]. This account sees metaphors as category statements in which the source acquires a categorical meaning that is more abstract than its literal meaning. For example, “golden” in “golden hair” denotes the category of “shiny, yellow things”. In this view, conceptual blending can be used to extract the abstract meaning of the source and combine it with the meaning of the target. These frameworks emphasize that metaphor comprehension relies on shared background knowledge (or frames), which Fillmore’s frame semantics formalizes by associating lexical items with structured role–filler expectations [10]. The complementary relationship between CMT, CBT, and frame semantics highlights that metaphorical meaning emerges not merely from lexical similarity but from dynamic frame activation and role alignment within a community’s commonsense knowledge [4]. Beyond commonsense or prototypical knowledge, recent theories of metaphor have noted that the personal and contextual aspects that influence knowledge acquisition and exchange are often left out. In fact, the experiential dimension of metaphor has traditionally been downplayed, with research focusing primarily on metaphor as a mental and individual achievement. Researchers have so far paid little attention to context and to the collaborative production of metaphoric language [11].
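To make the “golden hair” example concrete, the selective inheritance described above can be approximated as a filter over property sets. This is a deliberately simplified sketch, not an implementation of CBT or of Categorization theory: the function name `blend` and both feature sets are hypothetical and hand-written, and real systems would need a principled source for such properties.

```python
from typing import Set


def blend(source_props: Set[str], target_compatible: Set[str]) -> Set[str]:
    """Keep only the source properties that the target can plausibly take on."""
    return source_props & target_compatible


# Toy feature sets (assumptions for illustration, not drawn from any lexicon).
GOLDEN = {"yellow", "shiny", "metallic", "valuable"}
HAIR_CAN_BE = {"yellow", "shiny", "long", "curly", "dark"}

# The surviving properties play the role of the abstract category
# "shiny, yellow things" that the source denotes in the metaphor.
category = blend(GOLDEN, HAIR_CAN_BE)
print(sorted(category))  # ['shiny', 'yellow']
```

Set intersection is the crudest possible blending criterion. It ignores frames, roles, and context, which is exactly the gap the remainder of this section discusses.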
The communicative aspect of metaphor is fundamental to viewing metaphor as a multidimensional phenomenon [12]. Metaphor systems are not neutral but reflect underlying belief systems that justify social actions and representations. In this view, language, and metaphor in particular, plays a key role in realizing these social and political values: texts are always “oriented social action” [13]. Linell’s notion of an “interworld” provides a valuable theoretical framework for understanding these social dimensions of metaphor [14]. Unlike traditional cognitive approaches that locate metaphor primarily in individual minds, the interworld concept emphasizes how metaphorical meaning emerges through interaction in a shared communicative space. As an example, long-standing views of metaphor such as the one advanced by CMT presuppose universal bodily experiences, excluding the experiences of disabled people [15]. For this reason, recent studies argue for a view of metaphor that is not just embodied, but inter-bodily. Indeed, contrary to the standard assumption within CMT that the source domains of conceptual metaphors are primarily based on direct sensorimotor experiences, Gibbs [16] argues that metaphorical meanings do not necessarily arise from the mapping of purely embodied knowledge onto abstract concepts. Instead, the source domains are themselves metaphorical in nature.
Connected to inter-bodily, multidimensional accounts of metaphor is the phenomenon of metaphor resistance, which has only recently been studied, along with the various reasons why it happens. For instance, people resist metaphors that lack explanatory power, or because they prefer alternative metaphorical concepts over normative ones. However, without a comprehensive study of metaphor it is not possible to know why some metaphors aren’t picked up [17]. Thus, we aim to draw on these theoretical studies to account for a multidimensional view of metaphor that can also be reflected in a new strand of computational metaphor processing studies.
2.5. The pragmatics perspective
Pragmatics, as the study of the meaning of linguistic signs in context, that is, in their actual use, deals primarily with implicit linguistic knowledge. The type of meaning it investigates, the pragmatic message, is not encoded in any direct way in the literal utterance. One can say something that literally means one thing while actually intending something entirely different. For instance, someone who says “You’re standing on my foot” to a fellow passenger on the subway is most likely not describing the situation, but asking them to move [37]. The utterance may contain only hints as to how the pragmatic meaning should be interpreted, such as tone of voice, and its interpretation depends on a variety of extra-textual and extra-linguistic elements, including context, linguistic conventions, and socio-cultural norms. These pragmatic cues, along with contextual dependence, must be correctly interpreted, something that requires what is known as pragmatic competence. Utterances that contain metaphors can, of course, be described in pragmatic terms, and can further be understood as speech acts. Speech Act Theory (SAT) [38] [39] considers language and linguistic utterances not merely as expressions of mental operations, like articulations of thoughts, but as actions in themselves. As actions, they produce effects in the world while pursuing their speaker’s intentions. These are acts such as describing, ordering, pleading, mocking, or, through conventional formulas, marrying two people or sentencing someone.
Computational pragmatics deals mainly with how types of speech acts, conceived as categories of illocutionary acts (that is, acts defined by the speaker’s intention, by how they mean the literal sentence they utter), are assigned to utterances, framing this as a problem of context dependence. This is far from a trivial task: there is no deterministic relationship between clause types and illocutionary force. Imperative clauses are not always commands, interrogative clauses are not always queries, and declarative clauses are not always assertions. Depending on factors like the power dynamics between speaker and addressee, and their communicative goals, an utterance may emerge as a complex blend of request, suggestion, and command. A “metaphorical speech act” would presumably involve a similar blend of illocutionary forces. According to certain frameworks, such as those proposed by Popa-Wyatt, one can distinguish primary and secondary illocutionary acts, sometimes nested within each other, as, for instance, in the case of an ironic act embedded inside a metaphorical one [40]. However, speech act labeling is not valuable as an exercise in itself, but only insofar as it helps identify the communicative purposes of an interaction; likewise, the categorization of illocutionary acts should be interpreted functionally, rather than as a set of fixed descriptive categories. Other main challenges in computational pragmatics include identifying and tracking the effects that utterances have on context, modeling context itself, and formalizing cultural conventions and intentionality, all of which are crucial for a pragmatic study of metaphor as well. Indeed, in computational pragmatics, the pragmatic understanding of metaphor (why it is used, in what context, and with what effect) remains only partially addressed, despite the advances of contextual neural models in recent years.
As we have seen, metaphor is often treated within the realm of automatic recognition and of pragmatic-inferential interpretive approaches in the tradition of CMT, which aim to capture conceptual mapping. However, to be truly robust, these approaches require a clear representation of communicative goals. There are also attempts to address metaphor from the perspective of intention and dialogue, such as models of Theory of Mind based on beliefs and intentions, sometimes inspired by Relevance Theory (RT) and adapted computationally [41]. In this view, metaphor is treated as a pragmatic tool conveying implicatures and implicit evaluations. Relevance Theory offers several strengths in capturing implicit meaning, particularly in its emphasis on inferential comprehension and the cognitive principle that human communication aims at maximal relevance with minimal processing effort. As argued by Gibbs [42], RT provides a flexible framework for understanding metaphor as an expression of speaker intention that relies on the hearer’s ability to infer non-literal meanings from contextual cues and assumptions. Unlike Speech Act Theory, which classifies utterances according to fixed illocutionary forces, RT focuses on the dynamic interplay between context, cognition, and communicative intention. While SAT offers a useful structural categorization of communicative acts, RT excels in modeling how listeners derive meaning beyond what is explicitly said. A fruitful direction for research lies in integrating these two perspectives: leveraging the inferential depth of Relevance Theory to handle non-literal and context-sensitive meaning, while also drawing on the intentional structure of Speech Act Theory to account for the performative dimension of metaphorical utterances. This is precisely what we aim to pursue, in order to better model metaphor as a communicative act endowed with pragmatic force.