Detecting Actionable Items in Meetings by Convolutional Deep Structured Semantic Models

  • Yun-Nung Chen,
  • Dilek Hakkani-Tür,
  • Xiaodong He

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2015)

Published by IEEE - Institute of Electrical and Electronics Engineers

The recent success of voice interaction with smart devices (human-machine genre) and improvements in speech recognition for conversational speech show the potential of conversation-related applications. This paper investigates the task of actionable item detection in meetings (human-human genre), where an intelligent assistant dynamically provides the participants with access to information (e.g., scheduling a meeting, taking notes) without interrupting the meetings. A convolutional deep structured semantic model (CDSSM) is applied to learn the latent semantics of human actions and utterances from human-machine (source genre) and human-human (target genre) interactions. Furthermore, considering the mismatch between the source and target genres and the scarcity of annotated data sets for the target genre, we develop adaptation techniques that adjust the learned embeddings to better fit the target genre. Experiments show that CDSSM performs better for actionable item detection than baselines using lexical features (27.5% relative) and other semantic features (15.9% relative) when the source and target genres match. When the target genre mismatches the source genre, our proposed adaptation techniques further improve performance. The discussion and analysis of the experiments provide a reasonable direction for this actionable item detection task.
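The core idea of matching utterances to actions via learned embeddings can be sketched as follows. This is an illustrative assumption, not the paper's actual model: the embedding vectors, action names (`schedule_meeting`, `take_note`), and decision threshold are hypothetical stand-ins for what a trained CDSSM would produce; the comparison itself uses cosine similarity, the standard scoring function in the DSSM family.

```python
import numpy as np

def cosine_similarity(u, a):
    """Cosine similarity between an utterance embedding and an action embedding."""
    return float(np.dot(u, a) / (np.linalg.norm(u) * np.linalg.norm(a)))

def detect_actionable(utterance_emb, action_embs, threshold=0.5):
    """Score an utterance against each candidate action embedding.

    Returns the best-matching action and its score if the score clears the
    threshold, otherwise (None, best_score). The threshold is a hypothetical
    tuning parameter, not a value from the paper.
    """
    scores = {name: cosine_similarity(utterance_emb, emb)
              for name, emb in action_embs.items()}
    best = max(scores, key=scores.get)
    if scores[best] >= threshold:
        return best, scores[best]
    return None, scores[best]

# Toy embeddings standing in for CDSSM output (purely illustrative).
actions = {
    "schedule_meeting": np.array([0.9, 0.1, 0.0]),
    "take_note":        np.array([0.0, 1.0, 0.0]),
}
utterance = np.array([1.0, 0.0, 0.0])  # e.g., "let's set up a meeting next week"
print(detect_actionable(utterance, actions))
```

With these toy vectors, the utterance aligns most closely with `schedule_meeting`; in the paper's setting the embeddings would instead come from CDSSM encoders trained on human-machine interactions and adapted to the human-human genre.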