Continuous Listening for Unconstrained Spoken Dialog

6th International Conference on Spoken Language Processing (ICSLP 2000), Beijing

A major obstacle to building spoken dialog systems that can listen continuously, without requiring a push-to-talk device, is the problem of distinguishing speech intended for the system from speech that is merely overheard. We present a decision-theoretic approach to this problem that exploits Bayesian models of spoken dialog at four levels of analysis within a domain-independent, multimodal computational architecture called Quartet. We applied Quartet to the task of navigating PowerPoint slide shows during a spoken presentation in a prototype system called Presenter. We describe the runtime behavior of Presenter as well as the results of an experimental study comparing its performance to that of human subjects in discriminating arbitrarily phrased spoken requests for slide navigation during a recorded lecture.
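As an illustrative sketch only, not a reproduction of the paper's models or utilities, a decision-theoretic discrimination rule of this general kind weighs the probability that an utterance $e$ was intended for the system against the consequences of acting on it or ignoring it:

\[
  a^{*} \;=\; \arg\max_{a \in \{\mathrm{act},\,\mathrm{ignore}\}}
  \sum_{h \in \{\mathrm{intended},\,\mathrm{overheard}\}}
  P(h \mid e)\, U(a, h)
\]

Here $P(h \mid e)$ stands in for the kind of probability a Bayesian dialog model could supply, and $U(a, h)$ for assumed, domain-specific utilities; both symbols are introduced here for illustration rather than taken from the paper.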