Recent advancements in natural language processing have led to the proliferation of large language models (LLMs). These models have been shown to yield good performance through in-context learning, even on unseen tasks and languages. They have also been widely adopted as language-model-as-a-service commercial APIs, such as the GPT-4 API. However, their performance on African languages is largely unknown. We present an analysis of three popular large language models (mT0, LLaMa 2, and GPT-4) on five tasks (news topic classification, sentiment classification, machine translation, question answering, and named entity recognition) across 30 African languages, spanning different language families and geographical regions. Our results suggest that all three LLMs produce below-par performance on African languages, and that there is a large gap in performance compared to high-resource languages like English on most tasks. We find that GPT-4 has average to impressive performance on classification tasks but very poor results on generative tasks like machine translation. Surprisingly, we find that mT0 had the best overall performance on cross-lingual QA, better than both the state-of-the-art supervised model (i.e., fine-tuned mT5) and GPT-4 on African languages. Overall, LLaMa 2 records the worst performance due to its limited multilingual capabilities and English-centric pre-training corpus. In general, our findings present a call to action to ensure African languages are well represented in large language models, given their growing popularity.