Abstract
Accurately predicting enzyme functionality remains one of the major challenges in computational biology, particularly for enzymes with limited structural annotations or sequence homology. We present a novel multimodal Quantum Machine Learning (QML) framework that enhances Enzyme Commission (EC) classification by integrating four complementary biochemical modalities: protein sequence embeddings, quantum-derived electronic descriptors, molecular graph structures, and 2D molecular image representations. The framework is built on a Quantum Vision Transformer (QVT) backbone equipped with modality-specific encoders and a unified cross-attention fusion module. By integrating graph features and spatial patterns, our method captures key stereoelectronic interactions behind enzyme function. Experimental results demonstrate that our multimodal QVT model achieves a top-1 accuracy of 85.1%, outperforming sequence-only baselines by a substantial margin and also outperforming other QML models.