Analysis of an RNN-based Transfer Learning Technique for a Low-Resource Language (Pagsusuri ng RNN-based Transfer Learning Technique sa Low-Resource Language)

  • 2020-10-13 15:06:07
  • Dan John Velasco

Abstract

Low-resource languages such as Filipino suffer from data scarcity, which makes it challenging to develop NLP applications for the Filipino language. Transfer Learning (TL) techniques alleviate this problem in low-resource settings. In recent years, transformer-based models have proven effective on low-resource tasks, but their high compute and memory requirements limit their accessibility. There is a need for a cheaper but effective alternative. This paper makes three contributions. First, it releases a pre-trained AWD-LSTM language model for Filipino to serve as a foundation for building NLP applications in the Filipino language. Second, it benchmarks AWD-LSTM on the Hate Speech classification task and shows that it performs on par with transformer-based models. Third, it analyzes the degradation rate of AWD-LSTM on smaller amounts of data using a degradation test and compares it with that of transformer-based models.

 
