Abstract
This paper introduces a novel framework that leverages large language models (LLMs) for machine translation (MT). We start with one conjecture: an ideal translation should contain complete and accurate information for a strong enough LLM to recover the original sentence. We generate multiple translation candidates from a source language A to a target language B, and subsequently translate these candidates back to the original language A. By evaluating the cycle consistency between the original and back-translated sentences using metrics such as token-level precision and accuracy, we implicitly estimate the translation quality in language B without knowing its ground truth. This also makes it possible to evaluate an LLM's translation capability using only monolingual corpora. For each source sentence, we select the translation candidate with the highest cycle consistency with the original sentence as the final answer. Our experiments demonstrate that larger LLMs, or the same LLM with more forward passes during inference, exhibit increased cycle consistency, aligning with the LLM model size scaling law and the test-time computation scaling law. This work provides methods to 1) implicitly evaluate the translation quality of a sentence in the target language, 2) evaluate the capability of an LLM for any-to-any-language translation, and 3) generate a better translation with a specific LLM.
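The selection procedure described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the whitespace tokenization, the clipped-overlap definition of token-level precision, and the names `token_precision`, `select_by_cycle_consistency`, `toy_forward`, and `toy_backward` are all assumptions made for illustration. The two toy functions are hard-coded stand-ins for LLM calls in the A-to-B and B-to-A directions.

```python
from collections import Counter
from typing import Callable, List, Tuple


def token_precision(original: str, back_translated: str) -> float:
    """Fraction of back-translated tokens that also occur in the original.

    Assumption: lowercase whitespace tokenization with clipped (multiset)
    matching. The paper's exact metric definition may differ.
    """
    orig_counts = Counter(original.lower().split())
    back_tokens = back_translated.lower().split()
    if not back_tokens:
        return 0.0
    back_counts = Counter(back_tokens)
    overlap = sum(min(n, orig_counts[tok]) for tok, n in back_counts.items())
    return overlap / len(back_tokens)


def select_by_cycle_consistency(
    source: str,
    forward: Callable[[str], List[str]],
    backward: Callable[[str], str],
) -> Tuple[str, float]:
    """Return the candidate whose back-translation best matches the source.

    `forward` maps a sentence in language A to several candidates in B;
    `backward` maps one candidate in B back to A. Both are hypothetical
    callables standing in for LLM translation calls.
    """
    candidates = forward(source)
    if not candidates:
        raise ValueError("forward() returned no translation candidates")
    scored = [(cand, token_precision(source, backward(cand))) for cand in candidates]
    # max() keeps the first candidate when several share the top score.
    return max(scored, key=lambda pair: pair[1])


# Toy stand-ins for the two translation directions (illustrative data only).
def toy_forward(sentence: str) -> List[str]:
    return ["le chien dort", "le chat dort"]


def toy_backward(candidate: str) -> str:
    table = {
        "le chat dort": "the cat sleeps",
        "le chien dort": "the dog sleeps",
    }
    return table[candidate]


best, score = select_by_cycle_consistency("the cat sleeps", toy_forward, toy_backward)
```

In this toy run the candidate whose back-translation reproduces the source exactly is selected. No reference translation in language B is consulted at any point, which is what allows the score to be computed from monolingual data alone.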