Title: Evaluating Effectiveness and Portability of Reinforcement Learned Dialogue Strategies with real users: the TALK TownInfo Evaluation
Authors: Oliver Lemon ; Kallirroi Georgila ; James Henderson
Date: Dec 2006
Publication Title: IEEE/ACL Spoken Language Technology
Publication Type: Conference Paper
Publication Status: Pre-print
We report evaluation results for real users of a learned dialogue management policy versus a hand-coded policy in the TALK project's "TownInfo" tourist information system. The learned policy, for filling and confirming information slots, was derived from COMMUNICATOR (flight-booking) data using Reinforcement Learning (RL) as described in Henderson et al. (2005), ported to the tourist information domain (using a general method that we propose here), and tested with 18 human users in 180 dialogues. The same users also used a state-of-the-art hand-coded dialogue policy embedded in an otherwise identical system. We found that users of the (ported) learned policy had an average gain in perceived task completion of 14.2% (from 67.6% to 81.8%, p < .03), that the hand-coded policy dialogues had on average 3.3 more system turns (p < .01), and that the user satisfaction results were comparable, even though the policy was learned for a different domain. Combining these in a dialogue reward score, we found a 14.4% increase for the learned policy (a 23.8% relative increase, p < .03). These results are important because they show (a) that results for real users are consistent with results for automatic evaluation of learned policies using simulated users, (b) that a policy learned using linear function approximation over a very large policy space is effective for real users, and (c) that policies learned using data for one domain can be used successfully in other domains. We also present a qualitative discussion of the learned policy.
BibTeX format
author = { Oliver Lemon and Kallirroi Georgila and James Henderson },
title = {Evaluating Effectiveness and Portability of Reinforcement Learned Dialogue Strategies with real users: the TALK TownInfo Evaluation},
booktitle = {IEEE/ACL Spoken Language Technology},
year = 2006,
month = {Dec},
url = {},

