Abstract
Large language models (LLMs) have shown impressive performance on a wide variety of real-world tasks. The current knowledge learning paradigm of LLMs is mainly based on learning from examples, in which LLMs learn the underlying rule implicitly from a certain number of supervised examples. However, this paradigm may fail to learn complicated rules well, especially when training examples are limited. We are inspired by the observation that humans can learn new tasks or knowledge in another way: by learning from rules. That is, humans can grasp new tasks or knowledge quickly and generalize well given only a detailed rule and a few optional examples. Therefore, in this paper, we explore the feasibility of this new learning paradigm, which encodes rule-based knowledge into LLMs. We propose rule distillation, which first uses the strong in-context abilities of LLMs to extract the knowledge from the textual rules and then explicitly encodes that knowledge into the LLMs' parameters by learning from the in-context signals produced inside the model. Our experiments show that making LLMs learn from rules with our method is much more efficient than example-based learning in terms of both sample size and generalization ability.