Developing systems that detect offensive language online is important to the health and security of online users. Studies have shown that cyberhate, online harassment, and other misuses of technology are on the rise, particularly during the global Coronavirus pandemic in 2020. According to the latest report by the Anti-Defamation League (ADL), 35% of online users reported online harassment related to their identity-based characteristics, a 3% increase over 2019. Applying advanced techniques from the Natural Language Processing (NLP) field to support the development of an online hate-free community is a critical task for social justice. Transfer learning enhances the performance of a classifier by allowing knowledge to be transferred from one domain or dataset to others that have not been seen before, thus helping the classifier generalize. In our study, we apply the principles of transfer learning across multiple Arabic offensive language datasets to compare the effects on system performance. This study investigates the effects of fine-tuning and training a Bidirectional Encoder Representations from Transformers (BERT) model on each of several Arabic offensive language datasets individually and testing it on each of the other datasets individually. Our experiment starts with a comparison among multiple BERT models to guide the selection of the main model used in our study. The study also investigates the effects of concatenating all datasets for fine-tuning and training the BERT model. Our results demonstrate the limited effects of transfer learning on the performance of the classifiers, particularly for highly dialectal comments.
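The cross-dataset protocol described above (train on one dataset, test on every dataset, then repeat with all training sets concatenated) can be sketched as a simple evaluation loop. This is an illustrative sketch only: the names `cross_dataset_matrix` and `train_keyword_model` are hypothetical, the sketch assumes each dataset is already split into training and test portions, and the keyword classifier is a toy stand-in for fine-tuning BERT, not the model used in the study.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, int]        # (text, label), where 1 = offensive and 0 = not
Dataset = List[Example]
Model = Callable[[str], int]     # maps a text to a predicted label


def cross_dataset_matrix(
    train_sets: Dict[str, Dataset],
    test_sets: Dict[str, Dataset],
    train_fn: Callable[[Dataset], Model],
    include_concatenated: bool = True,
) -> Dict[Tuple[str, str], float]:
    """Train once per source and report accuracy on every target test set."""
    sources = dict(train_sets)
    if include_concatenated:
        # Assumption: "ALL" pools every training split into one source.
        sources["ALL"] = [ex for ds in train_sets.values() for ex in ds]

    results: Dict[Tuple[str, str], float] = {}
    for src_name, src_data in sources.items():
        model = train_fn(src_data)  # in the study this step is BERT fine-tuning
        for tgt_name, tgt_data in test_sets.items():
            correct = sum(model(text) == label for text, label in tgt_data)
            results[(src_name, tgt_name)] = correct / len(tgt_data)
    return results


def train_keyword_model(train: Dataset) -> Model:
    """Toy stand-in for fine-tuning: flag tokens seen only in offensive texts."""
    offensive = {tok for text, y in train if y == 1 for tok in text.split()}
    clean = {tok for text, y in train if y == 0 for tok in text.split()}
    flagged = offensive - clean
    return lambda text: int(any(tok in flagged for tok in text.split()))
```

Each key of the returned dictionary is a (training source, test target) pair, so in-domain scores sit on the diagonal and transfer scores off it. Replacing `train_fn` with a real fine-tuning routine leaves the loop unchanged.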