DPQ-HD: Post-Training Compression for Ultra-Low Power Hyperdimensional Computing

Abstract

Hyperdimensional Computing (HDC) is emerging as a promising approach for edge AI, offering a balance between accuracy and efficiency. However, current HDC-based applications often rely on high-precision models and/or encoding matrices to achieve competitive performance, which imposes significant computational and memory demands, especially for ultra-low power devices. While recent efforts use techniques like precision reduction and pruning to increase efficiency, most require retraining to maintain performance, making them expensive and impractical. To address this issue, we propose a novel Post-Training Compression algorithm, Decomposition-Pruning-Quantization (DPQ-HD), which aims at compressing the end-to-end HDC system, achieving near floating point performance without the need for retraining. DPQ-HD reduces computational and memory overhead by uniquely combining the above three compression techniques and efficiently adapts to hardware constraints. Additionally, we introduce an energy-efficient inference approach that progressively evaluates similarity scores such as cosine similarity and performs early exit to reduce the computation, accelerating prediction inference while maintaining accuracy. We demonstrate that DPQ-HD achieves up to 20-100x reduction in memory for image and graph classification tasks with only a 1-2% drop in accuracy compared to uncompressed workloads. Lastly, we show that DPQ-HD outperforms the existing post-training compression methods and performs better than or on par with retraining-based state-of-the-art techniques, requiring significantly less overall optimization time (up to 100x) and faster inference (up to 56x) on a microcontroller.
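
To make the idea of progressive similarity evaluation with early exit concrete, the following is a minimal sketch and not the procedure used in DPQ-HD, whose exact exit criterion is not given in the abstract. It assumes bipolar (+1/-1) hypervectors, for which ranking by dot product matches ranking by cosine similarity, and it assumes at least two classes. The function name `progressive_classify`, the `chunk` parameter, and the exit rule are hypothetical: the loop stops once the leading class can no longer be overtaken, because each remaining dimension can change the gap between two classes by at most 2.

```python
import random


def progressive_classify(query, class_hvs, chunk=64):
    """Return (predicted class index, number of dimensions evaluated).

    Assumes bipolar (+1/-1) vectors of equal length and >= 2 classes.
    """
    dim = len(query)
    scores = [0] * len(class_hvs)
    done = 0
    while done < dim:
        end = min(done + chunk, dim)
        # Accumulate the partial dot product for every class on this chunk.
        for c, hv in enumerate(class_hvs):
            scores[c] += sum(q * h for q, h in zip(query[done:end], hv[done:end]))
        done = end
        ranked = sorted(scores, reverse=True)
        remaining = dim - done
        # Each unseen dimension shifts the gap between two classes by at
        # most 2, so a larger lead cannot be reversed: exit early.
        if ranked[0] - ranked[1] > 2 * remaining:
            break
    return scores.index(max(scores)), done


# Illustrative data (not from the paper): 4 random class hypervectors and
# a query that is a noisy copy of class 2 with about 10% of entries flipped.
rng = random.Random(0)
dim = 2048
class_hvs = [[rng.choice((-1, 1)) for _ in range(dim)] for _ in range(4)]
query = [-v if rng.random() < 0.1 else v for v in class_hvs[2]]

pred, used = progressive_classify(query, class_hvs)

# Reference prediction from the full, non-progressive dot product.
full_scores = [sum(q * h for q, h in zip(query, hv)) for hv in class_hvs]
full_pred = full_scores.index(max(full_scores))
print(pred, full_pred, used, dim)
```

Because this exit rule is conservative, the early-exit prediction always agrees with the full evaluation; a looser, threshold-based rule would save more computation at the cost of that guarantee.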
