Recent advances in adversarial attacks have made reinforcement learning more vulnerable, and various approaches exist for deploying attacks against it; the key question is how to choose the right timing for an attack. Some work designs an attack evaluation function to select critical points, which are attacked if the function's value exceeds a certain threshold. Because this approach does not consider long-term impact, it has difficulty finding the right place to deploy an attack. In addition, appropriate metrics for assessing attacks are lacking. To make attacks more intelligent and to remedy these problems, we propose a reinforcement learning-based attacking framework that considers effectiveness and stealthiness simultaneously, and we propose a new metric to evaluate the performance of an attack model in these two aspects. Experimental results show the effectiveness of our proposed model and the soundness of our proposed evaluation metric. Furthermore, we validate the transferability of the model and its robustness under adversarial training.