Abstract
Large Language Models (LLMs) have shown strong performance on programming tasks, but can they generate code the way real students do: imperfect, iterative, and stylistically diverse? We present ParaStudent, a systematic study of LLM-based "student-like" code generation in an introductory programming course setting. Using a dataset of timestamped student submissions across multiple semesters, we design low- and high-resolution experiments to model student progress and evaluate code outputs along semantic, functional, and stylistic dimensions. Our results show that fine-tuning significantly improves alignment with real student trajectories and captures error patterns, incremental improvements, and stylistic variations more faithfully. This study shows that modeling realistic student code requires capturing learning dynamics through context-aware generation, temporal modeling, and multi-dimensional evaluation. Code for experiments and evaluation is available at https://github.com/mmiroyan/ParaStudent.