Abstract
This paper compares the performance of BiLSTM and CNN+GRU deep learning models for Human Activity Recognition (HAR) on two WiFi-based Channel State Information (CSI) datasets: UT-HAR and NTU-Fi HAR. The findings indicate that the CNN+GRU model achieves higher accuracy on the UT-HAR dataset (95.20%) thanks to its ability to extract spatial features. In contrast, the BiLSTM model performs better on the high-resolution NTU-Fi HAR dataset (92.05%) by capturing long-term temporal dependencies more effectively. These findings emphasize the critical role of dataset characteristics and preprocessing techniques in model performance. We also show the real-world applicability of such models in domains such as healthcare and intelligent home systems, highlighting their potential for unobtrusive activity recognition.