The Partner Modelling Questionnaire: A validated self-report measure of perceptions toward machines as dialogue partners

Paper · arXiv 2308.07164 · Published August 14, 2023
Tags: Psychology · Chatbots · Conversation

“The basic tenet of partner modelling is that people form a mental representation of their dialogue partner as a communicative and social entity [13, 30]. Originating in psycholinguistics, the concept proposes that this mental representation informs what people say to a given interlocutor, how they say it, and the types of tasks someone might entrust their partner to carry out [13, 15]. Hence, partner models might also be understood as a heuristic account of a partner’s communicative ability and social relevance that guides a speaker toward interaction and language behaviours that are appropriate for a given interlocutor. In this sense it is similar to accounts of mental models in cognitive psychology [e.g., 45, 46] and Norman’s explanation of mental models in human-computer interaction (HCI) [65]. Indeed, a partner model can be broadly understood as a mental model of the dialogue partner. Informed by research examining the role of partner models in HHD and HMD interactions [13, 15, 23, 28], and established explanations of closely related concepts such as mental models [45, 46, 65, 84] and theory of mind [2, 3, 20], recent work proposed a working definition of partner models, defining them as: "an interlocutor’s cognitive representation of beliefs about their dialogue partner’s communicative ability. These perceptions are multidimensional and include judgements about cognitive, empathetic and/or functional capabilities of a dialogue partner. Initially informed by previous experience, assumptions and stereotypes, partner models are dynamically updated based on a dialogue partner’s behaviour and/or events during dialogue" [33]

Competent/Incompetent

Dependable/Unreliable

Capable/Incapable

Consistent/Inconsistent

Reliable/Uncertain

Clear/Ambiguous

Direct/Meandering

Expert/Amateur

Efficient/Inefficient

Honest/Misleading

Precise/Vague

Cooperative/Uncooperative

Human-like/Machine-like

Life-like/Tool-like

Warm/Cold

Empathetic/Apathetic

Personal/Generic

Authentic/Fake

Social/Transactional

Flexible/Inflexible

Interactive/Stop-Start

Interpretive/Literal

Spontaneous/Predetermined”
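The 23 semantic-differential items above group into the three factors discussed later in the paper. As a minimal scoring sketch (assumptions: a 1–7 rating per item and simple subscale means; the instrument's actual response scale, anchoring direction, and scoring procedure may differ, and the names below are hypothetical, not from the paper):

```python
# Hypothetical PMQ scoring sketch. Subscale names and the 1-7 scale
# are assumptions for illustration; item groupings follow the paper's
# three factors.
SUBSCALES = {
    "competence_dependability": [
        "competent", "dependable", "capable", "consistent", "reliable",
        "clear", "direct", "expert", "efficient", "honest", "precise",
        "cooperative",
    ],
    "human_likeness": [
        "human-like", "life-like", "warm", "empathetic", "personal",
        "authentic", "social",
    ],
    "flexibility": [
        "flexible", "interactive", "interpretive", "spontaneous",
    ],
}

def score_pmq(responses):
    """Return the mean rating per subscale for one respondent.

    responses: dict mapping an item's left-hand anchor (e.g. "competent")
    to a 1-7 rating.
    """
    return {
        name: sum(responses[item] for item in items) / len(items)
        for name, items in SUBSCALES.items()
    }
```

For example, a respondent who answers every item with the scale midpoint would score the midpoint on all three subscales.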

“Speech interfaces may include a broad range of technologies, from Voice Assistants such as Amazon’s Alexa, Apple’s Siri, Google Assistant or Microsoft’s Cortana, to various speech-based chatbots like Eviebot and BoiBot, to telephony systems used in telephone banking or ticket booking systems. Essentially, when referring to speech interfaces we mean any computer system you have interacted with using speech. You may have accessed these using, among other things, a smartphone, smart speaker, desktop or laptop computer, and/or in-car.”

“4.5.1 Factor Naming and Internal Reliability. The unidimensional form of the PMQ (PMQ Total) demonstrated good overall internal reliability (𝛼=0.77). Naming conventions for questionnaire construction dictate that factors are named based on the items contained within a factor, with particular emphasis given to items with the highest factor loadings. As such, PMQ factors appear to represent perceptions of communicative competence and dependability (Factor 1), perceptions of human-likeness in communication (Factor 2), and perceptions of communicative flexibility (Factor 3) respectively.

The communicative competence and dependability factor consisted of 12 items that accounted for 49% of the variance within the model and demonstrated strong internal reliability (𝛼=0.88). The strongest loading items were competent/incompetent, dependable/unreliable and capable/incapable. Collectively, these items reflect perceptions towards whether the machine is a dependable and competent dialogue partner.

The human-likeness in communication factor contained 7 items that accounted for 32% of the variance within the model, and also demonstrated strong internal reliability (𝛼=0.8). The strongest loading items were human-like/machine-like, life-like/tool-like and warm/cold, which were accompanied by other items that reflect on how alike or unlike humans a system is seen to be in the way it communicates. This supports previous intuitions that humans act as an archetype for people when reasoning about or evaluating speech interface systems [34].

Finally, the communicative flexibility factor contained four items that accounted for 19% of the variance within the model and also had good internal reliability (𝛼=0.72). Items within factor 3 included flexible/inflexible, interactive/stop-start, interpretive/literal and spontaneous/predetermined, coalescing around the concept of how flexible or predetermined dialogue agent capabilities are perceived to be.”
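The internal reliability values quoted above (𝛼=0.77, 0.88, 0.8, 0.72) are Cronbach's alpha coefficients, which can be computed from raw item responses. A minimal sketch of that computation (the toy data below are invented for illustration and are not from the paper):

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a set of items.

    item_scores: one inner list per item, each holding that item's
    ratings across respondents (all inner lists the same length).
    """
    k = len(item_scores)
    # Each respondent's total score across all items.
    total_scores = [sum(ratings) for ratings in zip(*item_scores)]
    sum_item_var = sum(variance(item) for item in item_scores)
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    return (k / (k - 1)) * (1 - sum_item_var / variance(total_scores))

# Toy data: 3 items rated by 4 respondents (invented values).
items = [
    [1, 2, 3, 4],
    [2, 2, 3, 5],
    [1, 3, 3, 4],
]
alpha = cronbach_alpha(items)  # items agree closely, so alpha is high
```

The sample variance is used throughout; since alpha depends only on the ratio of variances, using population variance consistently would give the same result.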