Machine-learning based dialogue managers are able to learn complex behaviors in order to complete a task, but it is not straightforward to extend their capabilities to new domains. We investigate different policies' ability to handle uncooperative user behavior, and how well expertise in completing one task (such as restaurant reservations) can be reapplied when learning a new one (e.g. booking a hotel). We introduce the Recurrent Embedding Dialogue Policy (REDP), which embeds system actions and dialogue states in the same vector space. REDP contains a memory component and attention mechanism based on a modified Neural Turing Machine, and significantly outperforms a baseline LSTM classifier on this task. We also show that both our architecture and baseline solve the bAbI dialogue task, achieving 100% test accuracy.
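
To make the shared-embedding idea concrete, the following is a minimal sketch and not the REDP implementation. It assumes that dialogue states and system actions are each mapped into a common space by a linear projection, and that the next action is the one whose embedding is most similar to the state embedding under cosine similarity. The abstract does not specify the similarity measure, so that choice is an assumption, as are all names, feature vectors, and weights below. The recurrent, memory, and attention components are omitted entirely.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def embed(features, weights):
    """Linear projection; `weights` holds one row per embedding dimension."""
    return [dot(row, features) for row in weights]


def select_action(state_features, action_features, w_state, w_action):
    """Embed the state and every candidate action in the same space,
    then return the action most similar to the state, plus all scores."""
    state_vec = embed(state_features, w_state)
    scores = {
        name: cosine(state_vec, embed(feats, w_action))
        for name, feats in action_features.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical hand-set weights; a trained model would learn these.
w_state = [[1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]   # 3 state features -> 2-d embedding
w_action = [[1.0, 0.0],
            [0.0, 1.0]]       # 2 action features -> 2-d embedding

# Hypothetical action names with one-hot features.
actions = {"ask_cuisine": [1.0, 0.0], "book_table": [0.0, 1.0]}

best, scores = select_action([1.0, 0.0, 1.0], actions, w_state, w_action)
```

Because states and actions live in the same space, a new domain's actions can in principle be scored by the same similarity function, provided they can be embedded. This is one way to read the transfer setting described above.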