Abstract
Utterance rewriting aims to recover coreferences and omitted information from the latest turn of a multi-turn dialogue. Recently, methods that tag rather than linearly generate sequences have proven stronger in both in- and out-of-domain rewriting settings. This is due to a tagger's smaller search space, as it can only copy tokens from the dialogue context. However, these methods may suffer from low coverage when phrases that must be added to a source utterance cannot be covered by a single context span. This can occur in languages like English that introduce tokens such as prepositions into the rewrite for grammaticality. We propose a hierarchical context tagger (HCT) that mitigates this issue by predicting slotted rules (e.g., "besides_") whose slots are later filled with context spans. HCT (i) tags the source string with token-level edit actions and slotted rules and (ii) fills in the resulting rule slots with spans from the dialogue context. This rule tagging allows HCT to add out-of-context tokens and multiple spans at once; we further cluster the rules to truncate the long tail of the rule distribution. Experiments on several benchmarks show that HCT can outperform state-of-the-art rewriting systems by ~2 BLEU points.
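
To make the two-step procedure concrete, the following is a minimal sketch of how predicted tags and slotted rules could be applied to produce a rewrite. It is an illustration under stated assumptions, not the HCT implementation: the names `TokenTag`, `fill_rule`, and `apply_tags` are hypothetical, the rule is written as whitespace-separated tokens with `_` marking a slot, the filled rule is assumed to be inserted before the tagged source token, and the tags are given by hand rather than predicted by a model.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# A context span as (start, end) token indices, end exclusive (assumed convention).
Span = Tuple[int, int]


@dataclass
class TokenTag:
    """Hypothetical per-token prediction: an edit action plus an optional slotted rule."""
    keep: bool                      # edit action: keep (True) or delete (False) the source token
    rule: str = ""                  # slotted rule, e.g. "besides _"; empty means no insertion
    spans: List[Span] = field(default_factory=list)  # one context span per slot, in order


def fill_rule(rule: str, spans: List[Span], context: List[str]) -> List[str]:
    """Step (ii): replace each "_" slot in the rule with the tokens of a context span."""
    filled: List[str] = []
    remaining = iter(spans)
    for piece in rule.split():
        if piece == "_":
            start, end = next(remaining)
            filled.extend(context[start:end])
        else:
            # Out-of-context token supplied by the rule itself (e.g. a preposition).
            filled.append(piece)
    return filled


def apply_tags(source: List[str], tags: List[TokenTag], context: List[str]) -> List[str]:
    """Step (i) applied: walk the source, insert filled rules, keep or drop each token."""
    rewrite: List[str] = []
    for token, tag in zip(source, tags):
        if tag.rule:
            # Assumption: the filled rule is placed before the tagged token.
            rewrite.extend(fill_rule(tag.rule, tag.spans, context))
        if tag.keep:
            rewrite.append(token)
    return rewrite


# Toy example with hand-written tags (no model involved).
context = "i enjoy jazz music".split()
source = "what else do you like ?".split()
tags = [TokenTag(keep=True) for _ in source]
tags[-1] = TokenTag(keep=True, rule="besides _", spans=[(2, 4)])

print(" ".join(apply_tags(source, tags, context)))
# prints: what else do you like besides jazz music ?
```

In this toy case the word "besides" never appears in the dialogue context, so a tagger that can only copy a single context span could not produce the rewrite; the slotted rule supplies the preposition and the slot supplies the span "jazz music".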