BriLLM: Brain-inspired Large Language Model

Abstract

This paper reports the first brain-inspired large language model (BriLLM). BriLLM is a generative language model that is not a Transformer, not a GPT, and not a traditional input-output controlled machine learning model. It is built on the Signal Fully-connected flowing (SiFu) definition, formulated on a directed graph in neural network terms. Every node on the graph of the whole model is interpretable, unlike traditional machine learning models, which offer only limited interpretability at the input and output ends. In the language model scenario, each token is defined as a node in the graph. A randomly shaped or user-defined signal flows between nodes along paths, following the principle of "least resistance". The next token (node) to be predicted or generated is the target of the signal flow. As a language model, BriLLM theoretically supports infinitely long $n$-gram models, since the model size is independent of the input and predicted lengths. The model's working signal flow opens the possibility of recall activation and innate multi-modal support, similar to the cognitive patterns of the human brain. We have released the first BriLLM version, in Chinese, with 4000 tokens, a 32-dimensional node width, 16-token sequence prediction ability, and language model prediction performance comparable to GPT-1. More computing power will help us explore the possibilities outlined above.
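To make the signal-flow idea concrete, the following is a minimal toy sketch and not the released BriLLM implementation. It assumes that each directed edge carries a small weight matrix, that the signal is a vector of the node width, and that "least resistance" can be read as choosing the outgoing edge whose transformed signal has the largest norm. All of these choices, along with the names `make_toy_graph`, `propagate`, `energy`, `next_token`, and `generate`, are illustrative assumptions that the abstract does not specify.

```python
import math
import random


def make_toy_graph(vocab_size, width, seed=0):
    """Build one random width x width matrix per directed edge (u, v).

    Hypothetical parameterisation: the abstract does not say how edges are stored.
    """
    rng = random.Random(seed)
    return {
        (u, v): [[rng.uniform(-1.0, 1.0) for _ in range(width)] for _ in range(width)]
        for u in range(vocab_size)
        for v in range(vocab_size)
    }


def propagate(matrix, signal):
    """Send a signal across one edge: matrix-vector product followed by tanh."""
    return [math.tanh(sum(w * s for w, s in zip(row, signal))) for row in matrix]


def energy(signal):
    """Euclidean norm of a signal, used here as a stand-in for 'least resistance'."""
    return math.sqrt(sum(s * s for s in signal))


def next_token(edges, current, signal, vocab_size):
    """Pick the node whose incoming signal has the highest energy (an assumption)."""
    candidates = {v: propagate(edges[(current, v)], signal) for v in range(vocab_size)}
    best = max(candidates, key=lambda v: energy(candidates[v]))
    return best, candidates[best]


def generate(edges, start, signal, steps, vocab_size):
    """Follow the signal from node to node; each visited node is a generated token."""
    path = [start]
    for _ in range(steps):
        token, signal = next_token(edges, path[-1], signal, vocab_size)
        path.append(token)
    return path


if __name__ == "__main__":
    # Toy sizes only; the released model reports 4000 tokens and a node width of 32.
    vocab_size, width = 5, 4
    edges = make_toy_graph(vocab_size, width)
    print(generate(edges, start=0, signal=[1.0] * width, steps=6, vocab_size=vocab_size))
```

In this sketch the parameters live on the edges of the graph, so the number of parameters depends on the vocabulary size and node width but not on how many steps `generate` runs, which mirrors the abstract's point that model size is independent of sequence length.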
