Developing flocking behavior for a dynamic squad of fixed-wing UAVs is still a challenge due to kinematic complexity and environmental uncertainty. In this paper, we address the decentralized flocking and collision avoidance problem through deep reinforcement learning (DRL). Specifically, we formulate a decentralized DRL-based decision-making framework from the perspective of each follower, in which a collision avoidance mechanism is integrated into the flocking controller. We then propose a novel reinforcement learning algorithm, PS-CACER, for training a control policy shared by all followers. In addition, we design a plug-and-play embedding module based on convolutional neural networks and the attention mechanism. As a result, the variable-length system state can be encoded into a fixed-length embedding vector, which makes the learned DRL policy independent of the number and order of followers. Finally, numerical simulation results demonstrate the effectiveness of the proposed method, and the learned policies can be directly transferred to semi-physical simulation without any parameter fine-tuning.
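
To illustrate how a variable-length system state can be reduced to a fixed-length, order-independent vector, the following is a minimal sketch of attention-weighted pooling in NumPy. It is not the embedding module proposed in this paper: the convolutional layers are omitted, and the function name `attention_pool`, the weights `w_score` and `w_value`, and all dimensions are hypothetical choices made only for illustration.

```python
import numpy as np

def attention_pool(neighbor_states, w_score, w_value):
    """Pool an (n, d_in) array of per-follower states into one (d_out,) vector.

    Hypothetical sketch: a single linear scoring layer stands in for the
    attention mechanism; no convolutional layers are modeled.
    """
    # One scalar attention score per follower, shape (n,).
    scores = neighbor_states @ w_score
    # Numerically stable softmax over the followers.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    # Per-follower feature vectors, shape (n, d_out).
    values = np.tanh(neighbor_states @ w_value)
    # The weighted sum removes the dependence on n and on row order.
    return weights @ values

rng = np.random.default_rng(0)
d_in, d_out = 4, 8  # assumed dimensions, for illustration only
w_score = rng.normal(size=d_in)
w_value = rng.normal(size=(d_in, d_out))

three_followers = rng.normal(size=(3, d_in))
five_followers = rng.normal(size=(5, d_in))

emb3 = attention_pool(three_followers, w_score, w_value)
emb5 = attention_pool(five_followers, w_score, w_value)
emb3_reordered = attention_pool(three_followers[::-1], w_score, w_value)
```

In this sketch, `emb3` and `emb5` have the same length although the number of followers differs, and `emb3_reordered` matches `emb3` up to floating-point error because the softmax-weighted sum is invariant to the order of its inputs.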