Dept. of Biomedical Engineering
Oregon Health & Science University
When engaged in a conversation, speakers use both verbal and non-verbal mechanisms to help coordinate the dialogue, ensuring that, at each point, the other party is engaged in the dialogue and is capable of hearing, understanding and responding to the speaker. Current Spoken Dialog Systems (SDSs) do not take full advantage of these dialogue coordination mechanisms, which can lead to interactions that are unnatural and inefficient. We posit that an SDS should anticipate, recognize and potentially emulate the full richness of dialogue coordination mechanisms. In this dissertation research, we aim to further understand dialogue coordination mechanisms, and to assess how they might be used to ease human-computer interaction. We start by investigating what cues a human speaker uses to differentiate computer-directed speech from self-directed speech, and from human-directed speech, finding that in both cases speech directed to the computer is much louder. We next conduct a perceptual study to determine what cues people attend to when determining whether a speaker is addressing a computer or a nearby human. Here we found that people tended to rely on the direction of the speaker's gaze, although this led to systematic errors in their judgments of addressee. We next investigate whether 'um' and 'uh' result from the same or different cognitive processes, using human-human interaction data collected while clinicians interacted with children with typical development, autism, or developmental language disorder. Here we found that 'um' appears to be listener-oriented, and 'uh' speaker-oriented. Next, using the same data, we investigated what factors impact the length of inter-turn gaps, and whether there is an interaction between gaps, disfluencies and social pressure to respond.
Here we found that, after a question, speakers tend to respond more quickly and are more likely to start their speech with a disfluency, and that the likelihood of a disfluency increased with the length of the gap. Finally, we conduct a simulation study, using Reinforcement Learning, to demonstrate that dialogue policies can be created that take advantage of dialogue coordination mechanisms.
Center for Spoken Language Understanding
School of Medicine
Lunsford, Rebecca, "Toward improving dialogue coordination in spoken dialogue systems" (2012). Scholar Archive. 718.