Fine-tuning pretrained language models (LMs) without making any architectural changes has become a norm for learning various language downstream tasks. However, for non-language downstream tasks, a common practice is to employ task-specific designs for input, output layers, and loss functions. For instance, it is possible to fine-tune an LM into an MNIST classifier by replacing the word embedding layer with an image patch embedding layer, the word token output layer with a 10-way output layer, and the word prediction loss with a 10-way classification loss, respectively. A natural question arises: can LM fine-tuning solve non-language downstream tasks without changing the model architecture or loss function? To answer this, we propose Language-Interfaced Fine-Tuning (LIFT) and study its efficacy and limitations by conducting an extensive empirical study on a suite of non-language classification and regression tasks. LIFT does not make any changes to the model architecture or loss function, and it solely relies on the natural language interface, enabling "no-code machine learning with LMs." We find that LIFT performs relatively well across a wide range of low-dimensional classification and regression tasks, matching the performances of the best baselines in many cases, especially for the classification tasks. We report the experimental results on the fundamental properties of LIFT, including its inductive bias, sample efficiency, ability to extrapolate, robustness to outliers and label noise, and generalization. We also analyze a few properties/techniques specific to LIFT, e.g., context-aware learning via appropriate prompting, quantification of predictive uncertainty, and two-stage fine-tuning. Our code is available at https://github.com/UW-Madison-Lee-Lab/LanguageInterfacedFineTuning.
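
To make the idea of a natural language interface concrete, the following is a minimal sketch of how a numeric feature vector might be turned into a text prompt and a text target, so that an unmodified LM can be fine-tuned on it with its usual next-token loss. The sentence template, the function names `row_to_prompt` and `make_example`, and the `prompt`/`completion` record layout are assumptions made for illustration only; they are not taken from the released code.

```python
from typing import Sequence


def row_to_prompt(features: Sequence[float], names: Sequence[str]) -> str:
    """Serialize one feature vector into a sentence (hypothetical template)."""
    if len(features) != len(names):
        raise ValueError("features and names must have the same length")
    # Describe each feature as "<name> is <value>" using compact number formatting.
    parts = [f"{name} is {value:g}" for name, value in zip(names, features)]
    return "Given that " + ", ".join(parts) + ", what is the label?"


def make_example(features: Sequence[float], names: Sequence[str], label) -> dict:
    """Pair the prompt with its target text as a plain fine-tuning record."""
    # The label is written out as text, so no task-specific output layer is needed.
    return {"prompt": row_to_prompt(features, names), "completion": str(label)}


if __name__ == "__main__":
    record = make_example([5.1, 3.5], ["sepal length", "sepal width"], "setosa")
    print(record["prompt"])
    print(record["completion"])
```

Under this sketch, a classification label and a regression target are handled the same way: both are rendered as text in the `completion` field, which is what allows the model architecture and loss function to stay unchanged.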