Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads
- Ji He
- Mari Ostendorf
- Xiaodong He
- Jianshu Chen
- Jianfeng Gao
- Lihong Li
- Li Deng
EMNLP 2016
We introduce an online popularity prediction and tracking task as a benchmark for reinforcement learning with a combinatorial, natural language action space. The task is to recommend a specified number of discussion threads predicted to be popular, chosen from a fixed window of recent comments to track. We study novel deep reinforcement learning architectures for effectively modeling the value function associated with actions composed of interdependent sub-actions. The proposed model, which represents the dependence between sub-actions through a bi-directional LSTM, gives the best performance across different experimental configurations and domains, and it also generalizes well to varying numbers of recommendation requests.
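To make the phrase "combinatorial action space" concrete, the following minimal Python sketch enumerates every way of choosing k comments from a window of candidates and picks the set with the highest value. It is an illustration only, not the paper's model: the names `enumerate_actions`, `toy_value`, and `best_action`, the example scores, and the hand-written adjacency penalty are all hypothetical stand-ins. The paper models the dependence between sub-actions with a bi-directional LSTM, which this toy does not attempt to reproduce.

```python
from itertools import combinations
from math import comb


def enumerate_actions(num_candidates, k):
    """Return every way to pick k of num_candidates comments.

    Each tuple of indices is one combinatorial action made of k sub-actions.
    """
    return list(combinations(range(num_candidates), k))


def toy_value(action, scores, overlap_penalty):
    """Hypothetical value function (an assumption, not the paper's).

    Sums per-comment scores, then subtracts a penalty for each pair of
    neighbouring indices chosen together. The penalty is what makes the
    sub-actions interdependent in this toy.
    """
    total = sum(scores[i] for i in action)
    for a, b in combinations(action, 2):
        if abs(a - b) == 1:
            total -= overlap_penalty
    return total


def best_action(scores, k, overlap_penalty=0.0):
    """Exhaustively search the action space for the highest-valued set."""
    actions = enumerate_actions(len(scores), k)
    return max(actions, key=lambda act: toy_value(act, scores, overlap_penalty))


# Made-up scores for a window of five candidate comments.
scores = [0.9, 0.8, 0.1, 0.7, 0.2]

print(comb(len(scores), 2))                         # 10 possible actions
print(best_action(scores, 2))                       # (0, 1)
print(best_action(scores, 2, overlap_penalty=0.5))  # (0, 3)
```

Without the penalty, the best set is simply the two highest-scoring comments. With the penalty, the best set changes, so the value of an action cannot be recovered by scoring each sub-action independently. The number of actions also grows as a binomial coefficient in the window size, so exhaustive search of this kind is only practical for very small windows.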