A Hardware-Efficient Photonic Tensor Core: Accelerating Deep Neural Networks with Structured Compression

Abstract

Recent advancements in artificial intelligence (AI) and deep neural networks (DNNs) have revolutionized numerous fields, enabling complex tasks by extracting intricate features from large datasets. However, the exponential growth in computational demands has outstripped the capabilities of traditional electrical hardware accelerators. Optical computing offers a promising alternative due to its inherent advantages of parallelism, high computational speed, and low power consumption. Yet, current photonic integrated circuits (PICs) designed for general matrix multiplication (GEMM) are constrained by large footprints, high costs of electro-optical (E-O) interfaces, and high control complexity, limiting their scalability. To overcome these challenges, we introduce a block-circulant photonic tensor core (CirPTC) for a structure-compressed optical neural network (StrC-ONN) architecture. By applying a structured compression strategy to weight matrices, StrC-ONN significantly reduces model parameters and hardware requirements while preserving the universal representability of networks and maintaining comparable expressivity. Additionally, we propose a hardware-aware training framework that compensates for on-chip nonidealities to improve model robustness and accuracy. We experimentally demonstrate image processing and classification tasks, achieving up to a 74.91% reduction in trainable parameters while maintaining competitive accuracies. Performance analysis projects a computational density of 5.84 tera operations per second (TOPS) per mm^2 and a power efficiency of 47.94 TOPS/W, marking a 6.87-times improvement achieved through the hardware-software co-design approach. By reducing both hardware requirements and control complexity across multiple dimensions, this work explores a new pathway to push the limits of optical computing in the pursuit of high efficiency and scalability.
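
To make the block-circulant compression named in the abstract concrete, the following is a minimal software sketch, not the authors' implementation. It assumes a weight matrix partitioned into a p x q grid of k x k circulant blocks, each defined by a single length-k primary vector; the function names, the array layout (p, q, k), and the FFT-based product are illustrative assumptions, and the abstract does not state the block size, layer shapes, or how the photonic hardware performs the product.

```python
import numpy as np


def circulant(first_col):
    """Build a k x k circulant matrix whose first column is `first_col`."""
    k = len(first_col)
    # Column j is the first column cyclically shifted down by j positions,
    # so entry (i, j) equals first_col[(i - j) mod k].
    return np.stack([np.roll(first_col, j) for j in range(k)], axis=1)


def block_circulant_dense(w):
    """Expand primary vectors w of shape (p, q, k) into a dense (p*k, q*k) matrix.

    Hypothetical helper for checking results; a compressed model would
    store only `w`, never the dense matrix.
    """
    p, q, _ = w.shape
    return np.block([[circulant(w[i, j]) for j in range(q)] for i in range(p)])


def block_circulant_matvec(w, x):
    """Multiply the block-circulant matrix defined by `w` with vector `x`.

    A circulant block times a vector is a circular convolution, which
    becomes an elementwise product in the Fourier domain.
    """
    p, q, k = w.shape
    x_blocks = x.reshape(q, k)
    w_hat = np.fft.fft(w, axis=-1)         # shape (p, q, k)
    x_hat = np.fft.fft(x_blocks, axis=-1)  # shape (q, k)
    # Sum the per-block products over the q input blocks.
    y_hat = np.einsum("pqk,qk->pk", w_hat, x_hat)
    return np.fft.ifft(y_hat, axis=-1).real.reshape(p * k)


# Example with assumed sizes: 2 x 3 blocks of size 4.
rng = np.random.default_rng(0)
w = rng.standard_normal((2, 3, 4))
x = rng.standard_normal(12)

dense = block_circulant_dense(w)
y_fast = block_circulant_matvec(w, x)

# Stored parameters: p*q*k for the compressed form vs. p*q*k*k when dense.
compressed_params = w.size
dense_params = dense.size
```

In this sketch each k x k block stores k values instead of k^2, so the parameter count of a fully block-circulant layer drops by a factor of k. The reduction reported in the abstract depends on the authors' network and block configuration, which this example does not attempt to reproduce.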
