In the United States, primary open-angle glaucoma (POAG) is the leading cause of blindness, especially among African American and Hispanic individuals. Deep learning has been widely used to detect POAG from fundus images, as its performance is comparable to, or even surpasses, diagnosis by clinicians. However, human bias in clinical diagnosis may be reflected and amplified in widely used deep learning models, thus affecting their performance. Biases may cause (1) underdiagnosis, which increases the risk of delayed or inadequate treatment, and (2) overdiagnosis, which may increase individuals' stress and fear, harm their well-being, and lead to unnecessary or costly treatment. In this study, we examined underdiagnosis and overdiagnosis when applying deep learning to POAG detection, based on the Ocular Hypertension Treatment Study (OHTS), conducted at 22 centers across 16 states in the United States. Our results show that a widely used deep learning model can underdiagnose or overdiagnose underserved populations. The most underdiagnosed group is younger (< 60 years) female individuals, and the most overdiagnosed group is older (>= 60 years) Black individuals. Biased diagnosis by traditional deep learning methods may delay disease detection and treatment and create burdens among underserved populations, thereby raising ethical concerns about using deep learning models in ophthalmology clinics.
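To make the subgroup comparison concrete, the following is a minimal sketch of one common way to quantify underdiagnosis and overdiagnosis per subgroup. It assumes, as the abstract does not state the exact metric, that underdiagnosis is measured as the false negative rate and overdiagnosis as the false positive rate within each group. The function name `subgroup_error_rates`, the group labels, and the toy records are hypothetical and are not OHTS data.

```python
from collections import defaultdict


def subgroup_error_rates(records):
    """Compute per-group error rates from (group, y_true, y_pred) triples.

    Assumption: labels are binary, with 1 = POAG and 0 = no POAG.
    Underdiagnosis rate = false negatives / actual positives (FNR).
    Overdiagnosis rate  = false positives / actual negatives (FPR).
    """
    counts = defaultdict(lambda: {"fn": 0, "pos": 0, "fp": 0, "neg": 0})
    for group, y_true, y_pred in records:
        c = counts[group]
        if y_true == 1:
            c["pos"] += 1
            c["fn"] += int(y_pred == 0)  # missed POAG case
        else:
            c["neg"] += 1
            c["fp"] += int(y_pred == 1)  # healthy eye flagged as POAG

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            # None marks a group with no cases of that class
            "underdiagnosis_rate": c["fn"] / c["pos"] if c["pos"] else None,
            "overdiagnosis_rate": c["fp"] / c["neg"] if c["neg"] else None,
        }
    return rates


# Toy, made-up records: (hypothetical group label, true label, model prediction)
toy_records = [
    ("group_a", 1, 0), ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 0, 0),
    ("group_b", 1, 1), ("group_b", 1, 1), ("group_b", 0, 1), ("group_b", 0, 0),
]

rates = subgroup_error_rates(toy_records)
for group, r in sorted(rates.items()):
    print(group, r)
```

Comparing these rates across intersectional groups (for example, sex crossed with age band) would show which group is missed most often and which is flagged most often. In practice, the predictions would come from thresholding the model's output probabilities, and the choice of threshold affects both rates.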