Abstract
We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
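To make "all potential words that match a lexicon" concrete, the following is a minimal sketch of the matching step only, not of the gated LSTM cells themselves. The function name `match_lexicon_words`, the `max_len` cap, the example sentence, and the toy lexicon are all illustrative assumptions rather than details taken from the paper.

```python
from typing import List, Set, Tuple


def match_lexicon_words(
    chars: str, lexicon: Set[str], max_len: int = 4
) -> List[Tuple[int, int, str]]:
    """Return (start, end, word) for every multi-character substring of
    `chars` that appears in `lexicon`. `end` is the inclusive index of the
    word's last character. `max_len` is a hypothetical cap on word length."""
    matches = []
    n = len(chars)
    for start in range(n):
        # Only consider candidates of length 2..max_len; single characters
        # are already covered by the character sequence itself.
        for end in range(start + 2, min(n, start + max_len) + 1):
            word = chars[start:end]
            if word in lexicon:
                matches.append((start, end - 1, word))
    return matches


# Toy example (assumed data): overlapping matches are all kept, so no
# single segmentation has to be chosen up front.
sentence = "南京市长江大桥"
toy_lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
lattice_words = match_lexicon_words(sentence, toy_lexicon)
for start, end, word in lattice_words:
    print(start, end, word)
```

In this sketch, conflicting candidates such as "南京市" and "市长" both survive the matching step; choosing among them is left to the model, which is the role the abstract assigns to the gated recurrent cells.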