Abstract
Hyperspectral image classification is gaining popularity for high-precision vision tasks in remote sensing, thanks to the ability of hyperspectral images to capture visual information across a wide continuum of spectra. Researchers have been working on automating hyperspectral image classification, with recent efforts leveraging vision transformers. However, most existing work models only spectral information and pays little attention to locality (i.e., neighboring pixels); spectral information alone may not be sufficiently discriminative, resulting in performance limitations. To address this, we present three contributions: i) we introduce the Hyperspectral Locality-aware Image TransformEr (HyLITE), a vision transformer that models both local and spectral information; ii) we propose a novel regularization function that promotes the integration of local-to-global information; and iii) we show that the proposed approach outperforms competing baselines by a significant margin, achieving gains of up to 10% in accuracy. The trained models and the code are available at HyLITE.