Automatic Grammar Augmentation for Robust Voice Command Recognition

Abstract

This paper proposes a novel pipeline for automatic grammar augmentation that provides a significant improvement in voice command recognition accuracy for systems with a small-footprint acoustic model (AM). The improvement is achieved by augmenting the user-defined voice command set, also called the grammar set, with alternate grammar expressions. For a given grammar set, a set of potential grammar expressions (the candidate set) for augmentation is constructed from an AM-specific statistical pronunciation dictionary that captures the consistent patterns and errors in the AM's decoding induced by variations in pronunciation, pitch, tempo, accent, ambiguous spellings, and noise conditions. Using this candidate set, greedy-optimization-based and cross-entropy-method (CEM) based algorithms are considered to search for an augmented grammar set with improved recognition accuracy, utilizing a command-specific dataset. Our experiments show that the proposed pipeline, along with the algorithms considered in this paper, significantly reduces the mis-detection and mis-classification rates without increasing the false-alarm rate. Experiments also demonstrate the consistently superior performance of the CEM-based algorithm over greedy-based algorithms.
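
To make the search step concrete, the following is a minimal sketch of a generic cross-entropy method over binary inclusion vectors, where each bit decides whether one candidate expression joins the augmented grammar set. It is not the paper's implementation: the function names (`cem_select`, `toy_score`), the example expressions, the hyperparameters, and the scoring rule are all hypothetical. In the pipeline described above, the score would instead be a recognition-accuracy measure computed on the command-specific dataset.

```python
import random

# Hypothetical candidate expressions for a command such as "play music".
CANDIDATES = ["play musik", "lay music", "pause music", "plain music"]
# Toy assumption: these candidates help recognition, the others hurt it.
HELPFUL = {"play musik", "lay music"}


def toy_score(subset):
    """Stand-in objective: +1 per helpful candidate, -1 per harmful one."""
    return sum(1 if c in HELPFUL else -1 for c in subset)


def cem_select(candidates, score, n_iters=30, pop_size=100,
               elite_frac=0.2, smoothing=0.7, seed=0):
    """Generic CEM: sample subsets, keep the elite, refit inclusion probabilities."""
    rng = random.Random(seed)
    n = len(candidates)
    probs = [0.5] * n                      # Bernoulli inclusion probability per candidate
    n_elite = max(1, int(pop_size * elite_frac))
    best_mask, best_score = [False] * n, score([])

    for _ in range(n_iters):
        samples = []
        for _ in range(pop_size):
            mask = [rng.random() < p for p in probs]
            subset = [c for c, keep in zip(candidates, mask) if keep]
            samples.append((score(subset), mask))
        # Rank by score only, highest first, and keep the top fraction.
        samples.sort(key=lambda t: t[0], reverse=True)
        elite = samples[:n_elite]
        if elite[0][0] > best_score:
            best_score, best_mask = elite[0]
        # Move each probability toward the elite inclusion frequency.
        for i in range(n):
            freq = sum(mask[i] for _, mask in elite) / n_elite
            probs[i] = smoothing * freq + (1 - smoothing) * probs[i]

    best = [c for c, keep in zip(candidates, best_mask) if keep]
    return best, best_score


best, best_score = cem_select(CANDIDATES, toy_score)
print(best, best_score)
```

A greedy alternative would add one candidate at a time, keeping it only if the score improves. The sampling-based search shown here evaluates whole subsets at once, which is one plausible reason it can avoid the local optima a greedy pass may settle into.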
