Detecting objects in aerial images is a challenging task because 1) objects are often small and densely packed relative to the image, 2) object scale varies over a large range, and 3) the number of objects per class is imbalanced. Most current solutions adopt a cropping approach: splitting high-resolution images into a series of subregions (chips) and running detection on them. However, few works note that problems such as scale variation and object sparsity arise when a network is trained directly on chips. In this work, three augmentation methods are introduced. Specifically, we propose a scale-adaptive module that is compatible with all existing cropping methods. It dynamically adjusts the cropping size to balance the proportion of each chip covered by objects, which narrows object scale variation during training and improves performance without bells and whistles. In addition, we introduce mosaic augmentation, which effectively addresses the object sparsity and background similarity problems in aerial datasets. To balance categories, we present mask resampling in chips, which provides higher-quality training samples. Our model achieves state-of-the-art performance on two popular aerial image datasets, VisDrone and UAVDT. Remarkably, each method can be applied independently to detectors, steadily increasing performance without sacrificing inference efficiency.
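
The idea of adapting the cropping size to the objects it contains can be illustrated with a minimal sketch. This is not the module proposed in this work; the function name `adaptive_chip_size`, the `target_ratio` parameter, the use of the median object size, and the clamping bounds are all assumptions chosen for illustration. The sketch picks a chip side length so that a typical object occupies roughly a fixed fraction of the chip side, which is one simple way to narrow scale variation across chips.

```python
import math
from statistics import median


def adaptive_chip_size(boxes, target_ratio=0.1, min_size=128, max_size=1024):
    """Pick a square chip side length from the objects in a region.

    boxes: list of (x1, y1, x2, y2) boxes in pixels.
    target_ratio: assumed desired ratio of object size to chip side.
    min_size, max_size: assumed bounds on the chip side in pixels.
    All names and defaults are hypothetical, not taken from the paper.
    """
    if not boxes:
        # No objects to adapt to: fall back to the largest chip.
        return max_size

    # Object size is measured as the square root of the box area.
    sizes = [
        math.sqrt(max(x2 - x1, 0) * max(y2 - y1, 0))
        for x1, y1, x2, y2 in boxes
    ]

    # Small objects lead to small chips, large objects to large chips,
    # so the object-to-chip ratio stays close to target_ratio.
    side = median(sizes) / target_ratio

    # Clamp to the allowed range and return an integer pixel count.
    return round(min(max(side, min_size), max_size))


if __name__ == "__main__":
    small_objects = [(0, 0, 20, 20), (50, 50, 70, 70), (100, 10, 120, 30)]
    print(adaptive_chip_size(small_objects))
```

After cropping with the chosen side length, each chip would typically be resized to the detector's fixed input resolution, so objects of different original sizes end up at a similar scale during training.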