What are the Goals of Distributional Semantics?
As Harnad (1990) discusses, if the meanings of words are defined only in terms of other words, these definitions are circular. One goal for a semantic model is to capture how language relates to the world, including sensory perception and motor control – this process of connecting language to the world is called grounding.5
A purely distributional model is not grounded, as it is only trained on text, with no direct link to the world. There are several ways we could try to ground a distributional model (for an overview, see: Baroni, 2016). The simplest way is to train a distributional model as normal, then combine it with a grounded model. For example, Bruni et al. (2011) concatenate distributional vectors and image feature vectors.
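The concatenation strategy described above can be sketched as follows. This is a minimal illustration under invented assumptions, not Bruni et al.'s actual implementation: the toy vectors, the L2 normalisation, and the `alpha` weighting are all choices made for the example.

```python
import math

def l2_normalise(vec):
    """Scale a vector to unit length, so that neither modality dominates."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec

def multimodal_vector(text_vec, image_vec, alpha=0.5):
    """Concatenate a distributional (text) vector with an image feature
    vector, weighting the two modalities by alpha and (1 - alpha).
    (alpha is a hypothetical mixing parameter for this sketch.)"""
    text_part = [alpha * x for x in l2_normalise(text_vec)]
    image_part = [(1 - alpha) * x for x in l2_normalise(image_vec)]
    return text_part + image_part

# Toy vectors (invented): the result carries both text and image features.
v = multimodal_vector([1.0, 2.0, 2.0], [3.0, 4.0])
```

The combined vector simply juxtaposes the two modalities, so any downstream similarity measure sees both distributional and visual information at once.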
How do meanings relate to the world? In truth-conditional semantics, the answer is that meaning is defined in terms of truth.6 If an agent understands a language, then in any given situation, they know how to evaluate whether a sentence is true or false of that situation.7 An advantage of this approach is that it supports logical reasoning, which I will discuss in §5.2. One goal for a semantic theory is to be able to generalise to new situations. This is difficult for traditional truth-conditional semantics, with classical theories challenged on both philosophical grounds (for example: Wittgenstein, 1953, §§66–71) and empirical grounds (for example: Rosch, 1975, 1978). However, a machine learning approach seems promising, since generalising to new data is a central aim of machine learning.
For a semantic model to be compatible with truth-conditional semantics, it is necessary to distinguish a concept (the meaning of a word) from a referent (an entity the word can refer to).8 The importance of this distinction has been noted for some time (for example: Ogden and Richards, 1923). A concept’s set of referents is called its extension.9
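The concept–referent–extension distinction can be made concrete with a schematic sketch. Here the toy domain of entities and the predicate are invented for illustration: a concept is modelled as a truth-valued function over entities, and its extension is the set of entities the function is true of.

```python
# Toy domain of entities (invented for this illustration).
entities = [
    {"name": "Rex", "species": "dog"},
    {"name": "Felix", "species": "cat"},
    {"name": "Fido", "species": "dog"},
]

def concept_dog(entity):
    """A concept: a function from entities to truth values."""
    return entity["species"] == "dog"

def extension(concept, domain):
    """An extension: the set of entities in the domain that the
    concept is true of (here identified by name)."""
    return {e["name"] for e in domain if concept(e)}

# The concept is the function itself; its referents are Rex and Fido,
# and the extension is the set containing both.
dogs = extension(concept_dog, entities)
```

On this picture, the concept (the function) is distinct from any particular referent (an entity it applies to), which is what allows the same concept to pick out different extensions in different situations.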