Grounding ‘Grounding’ in NLP
In contrast, Cognitive Science more formally defines “grounding” as the process of establishing what mutual information is required for successful communication between two interlocutors – a definition which might implicitly capture the NLP usage but differs in intent and scope.
The first and most important dimension that bridges the gap between the two definitions of grounding is coordination, which can alternatively be viewed as the difference between static and dynamic grounding (Figure 2).
Static grounding is the most common type and assumes that the evidence for common ground, or the gold truth for grounding, is given or attained pseudo-automatically. This is demonstrated in Figure 2 (a). The sequence for this form of interaction is: (1) the human querying the agent, (2) the agent querying the data or the knowledge it has acquired, (3) the agent retrieving and framing a response, and (4) the agent delivering it to the human. In this setting the common ground is the ground-truth KB/data. The human and the agent share common ground by assuming that it is universal (i.e., there are no external references). Therefore, successfully grounding the query in this case relies solely on the agent being able to link the query to the data. For instance, when a human wants to know the weather report, the accuracy of the database itself is axiomatic, and we build a model for the agent to accurately retrieve the queried information and present it in natural language.
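The four static steps can be illustrated with a minimal sketch of the weather scenario. Everything here is an assumption made for illustration: the toy knowledge base `WEATHER_KB`, the helper `link_query`, and the naive substring matching stand in for whatever data source and linking model a real system would use.

```python
from typing import Dict, Optional

# Hypothetical knowledge base. In static grounding it is treated as
# ground truth, so its accuracy is never questioned by the agent.
WEATHER_KB: Dict[str, Dict[str, object]] = {
    "pittsburgh": {"condition": "cloudy", "high_c": 12},
    "lisbon": {"condition": "sunny", "high_c": 24},
}


def link_query(query: str, kb: Dict[str, Dict[str, object]]) -> Optional[str]:
    """Step 2: link the query to a KB entry (naive substring match)."""
    text = query.lower()
    for city in kb:
        if city in text:
            return city
    return None


def static_grounding(query: str, kb: Dict[str, Dict[str, object]] = WEATHER_KB) -> str:
    """Steps 1-4: receive the query, link it, frame a response, deliver it."""
    city = link_query(query, kb)
    if city is None:
        # No clarification is requested: static grounding has no repair loop.
        return "Sorry, I could not find that location."
    record = kb[city]
    return (
        f"The weather in {city.title()} is {record['condition']} "
        f"with a high of {record['high_c']} C."
    )


print(static_grounding("What is the weather in Lisbon?"))
```

Note that the only point of failure in this sketch is `link_query`, which mirrors the claim that static grounding succeeds or fails solely on linking the query to the data.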
Dynamic grounding posits that common ground is built via interactions and clarifications. The mutual information needed to communicate successfully is built through interactions that include requesting and providing clarifications, acknowledging or confirming the clarifications, enacting or demonstrating to receive confirmations, and so forth. This dynamically established grounding guides the rest of the interaction by course-correcting any misunderstandings. The sequence of actions in dynamic grounding is demonstrated in Figure 2 (b). The steps for establishing grounding are part of the interaction itself and include: (1) the human querying the agent, (2) the agent requesting clarification or acknowledging, and (3) the human clarifying or confirming. These three steps loop until common ground is established. The remaining steps of (4) querying the data, (5) retrieving or framing a response, and (6) delivering the response are the same as those in static grounding. The agent and the human may not initially share common ground, but steps 2 and 3 repeat as the conversation progresses to build it. Successfully grounding the query therefore relies not only on the agent's ability to link the query to the data but also on its ability to construct the common ground from the information mutually shared with the human. Although there have been efforts on clarification questioning (), the coverage of phenomena is still far from comprehensive (Benotti and Blackburn, 2021b).
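The clarification loop in steps 1 to 3, followed by the retrieval steps 4 to 6, can be sketched as follows. This is a toy illustration under stated assumptions, not a model from the literature: the knowledge base `KB`, the helpers `candidates` and `dynamic_grounding`, the `ask_human` callback standing in for the human interlocutor, and the `max_turns` cutoff are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Key = Tuple[str, str]  # (city, region)

# Hypothetical KB in which a city name alone can be ambiguous.
KB: Dict[Key, str] = {
    ("springfield", "illinois"): "rainy",
    ("springfield", "missouri"): "sunny",
    ("lisbon", "portugal"): "clear",
}


def candidates(context: str, kb: Dict[Key, str]) -> List[Key]:
    """Return the KB entries compatible with everything said so far."""
    text = context.lower()
    by_city = [key for key in kb if key[0] in text]
    by_region = [key for key in by_city if key[1] in text]
    return by_region or by_city


def dynamic_grounding(
    query: str,
    ask_human: Callable[[str], str],
    kb: Dict[Key, str] = KB,
    max_turns: int = 3,
) -> Tuple[str, List[Tuple[str, str]]]:
    context = query  # Step 1: the human's query starts the shared context.
    transcript: List[Tuple[str, str]] = []
    for _ in range(max_turns):
        options = candidates(context, kb)
        if len(options) == 1:
            break  # Common ground established; leave the loop.
        if options:
            regions = ", ".join(region.title() for _, region in options)
            question = f"Which one do you mean: {regions}?"
        else:
            question = "Which city do you mean?"
        answer = ask_human(question)  # Steps 2-3: clarify, then confirm.
        transcript.append((question, answer))
        context = f"{context} {answer}"
    options = candidates(context, kb)
    if len(options) != 1:
        return "Sorry, I could not establish what you meant.", transcript
    city, region = options[0]
    # Steps 4-6: query the data, frame the response, deliver it.
    response = f"The weather in {city.title()}, {region.title()} is {kb[options[0]]}."
    return response, transcript


reply, turns = dynamic_grounding(
    "What is the weather in Springfield?", ask_human=lambda question: "Missouri"
)
print(reply)
```

The accumulated `context` plays the role of the common ground here: each clarification narrows the set of compatible entries, and retrieval proceeds only once a single interpretation remains.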