Proactive behavior in voice assistants: A systematic review and conceptual model


Yet, there is a lack of review studies synthesizing the current knowledge on how proactive behavior has been implemented in VAs and under what conditions proactivity has been found more or less suitable. To this end, we conducted a systematic review.

novel conceptual model encompassing context, initiation, and action components: activity/status emerged as the primary contextual element, direct initiation was more common than indirect initiation, and suggestions were the primary action observed. Second, proactive behavior in VAs was predominantly explored in domestic and in-vehicle contexts, with only safety-critical and emergency situations demonstrating clear benefits for proactivity, compared to mixed findings for other scenarios.

Given that the VA takes the lead in initiating interactions, there is a risk of users perceiving these actions as interruptions, especially since voice assistants are usually not embodied and lack visual or non-verbal cues or any kind of tangible “presence”, and are therefore, for instance, not as socially present as social robots. If executed with poor timing, proactive initiatives could easily be perceived as inappropriate or misaligned, which could degrade the user-VA rapport and erode trust in the system.

present multiple options and long responses through speech. It demands more time from users, and basic operations that can be done easily via GUIs, such as undoing or browsing different options, are harder to perform with VAs.

Building on Eric Horvitz’s principles of mixed-initiative user interfaces (Horvitz, 1999, pp. 159–166), Yorke-Smith, Saadati, Myers, and Morley (2012) set out nine design principles for proactive CAs: valuable for the user; pertinent to the situation; competent with respect to the system’s abilities and knowledge; unobtrusive; transparent; controllable; deferent to the user; anticipatory about current and future needs and opportunities; and safe. Ultimately, the degree of proactivity should be tailored to the specific context and use case (Meurisch, Ionescu, Schmidt, & Mühlhauser, 2017), ranging from reactive responses (awaiting user prompts) to fully autonomous actions (independent of user input).
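
As a rough illustration of how the nine principles and the reactive-to-autonomous range might be represented in software, the following Python sketch encodes the principles as a checklist and caps the degree of proactivity when any principle is unmet. It is not taken from the reviewed papers: the names `Principle`, `ProactivityLevel`, `ProposedAction`, `unmet_principles`, and `allowed_level` are hypothetical, the intermediate `SUGGEST` level is an assumption (the text names only the two endpoints), and the "fall back to reactive" rule is one possible policy chosen for simplicity.

```python
from dataclasses import dataclass, field
from enum import Enum, IntEnum


class Principle(Enum):
    """The nine design principles listed above (Yorke-Smith et al., 2012)."""
    VALUABLE = "valuable"
    PERTINENT = "pertinent"
    COMPETENT = "competent"
    UNOBTRUSIVE = "unobtrusive"
    TRANSPARENT = "transparent"
    CONTROLLABLE = "controllable"
    DEFERENT = "deferent"
    ANTICIPATORY = "anticipatory"
    SAFE = "safe"


class ProactivityLevel(IntEnum):
    """Ordered degrees of proactivity; only the endpoints come from the text."""
    REACTIVE = 0    # await a user prompt
    SUGGEST = 1     # assumed intermediate level: offer a suggestion
    AUTONOMOUS = 2  # act independently of user input


@dataclass
class ProposedAction:
    """A candidate proactive action and the principles it is judged to satisfy."""
    description: str
    satisfied: set = field(default_factory=set)


def unmet_principles(action):
    """Return the principles the action does not satisfy, in declaration order."""
    return [p for p in Principle if p not in action.satisfied]


def allowed_level(action, requested):
    """Hypothetical gating policy: stay reactive unless all principles are met."""
    if unmet_principles(action):
        return ProactivityLevel.REACTIVE
    return requested


if __name__ == "__main__":
    # Example: a reminder that satisfies everything except "unobtrusive".
    reminder = ProposedAction(
        "announce a calendar reminder",
        satisfied=set(Principle) - {Principle.UNOBTRUSIVE},
    )
    print(unmet_principles(reminder))
    print(allowed_level(reminder, ProactivityLevel.AUTONOMOUS).name)
```

How each principle would actually be assessed (for example, judging whether an interruption is unobtrusive in a given context) is left open here; the sketch only shows where such judgments could plug in.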