Abstract
While the volume of scholarly publications has increased at a frenetic pace, accessing and consuming useful candidate papers in very large digital libraries is becoming an essential and challenging task for scholars. Unfortunately, because of the language barrier, some scientists (especially junior researchers or graduate students who have not mastered other languages) cannot efficiently locate publications hosted in a foreign-language repository. In this study, we propose a novel solution, cross-language citation recommendation via Hierarchical Representation Learning on Heterogeneous Graph (HRLHG), to address this new problem. HRLHG learns a representation function that maps publications from multilingual repositories to a low-dimensional joint embedding space, drawing on the various kinds of vertices and relations in a heterogeneous graph. By leveraging both global (task-specific) and local (task-independent) information, together with a novel supervised hierarchical random walk algorithm, the proposed method optimizes the publication representations by maximizing the likelihood of locating the important cross-language neighborhoods on the graph. Experimental results show that the proposed method not only outperforms state-of-the-art baseline models but also improves the interpretability of the representation model for the cross-language citation recommendation task.