We explore an online reinforcement learning (RL) paradigm for optimizing parallel particle tracing performance in distributed-memory systems. Our method combines three novel components to dynamically optimize the performance of data-parallel particle tracing: (1) a workload donation model, (2) a high-order workload estimation model, and (3) a communication cost model. First, we design an RL-based workload donation model. It monitors the workload of processes and creates RL agents that donate particles and data blocks from high-workload processes to low-workload processes to minimize the execution time. The agents learn the donation strategy on the fly from reward and cost functions, which account for the change in the processes' workload and the data transfer cost of every donation action. Second, we propose an online workload estimation model that helps our RL model estimate the workload distribution of processes in future computations. Third, we design a communication cost model that considers both block and particle data exchange costs, helping the agents make effective decisions with minimal communication cost. We demonstrate that our algorithm adapts to different flow behaviors in large-scale fluid dynamics, ocean, and weather simulation data. Our algorithm improves parallel particle tracing performance in terms of parallel efficiency, load balance, and the costs of I/O and communication in evaluations on up to 16,384 processors.
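To make the trade-off between workload change and data transfer cost concrete, the following is a minimal illustrative sketch and not the reward or cost functions used in this work. It assumes a hypothetical linear transfer cost (latency plus bytes over bandwidth) and a hypothetical reward equal to the reduction in the slowest process's estimated time minus that cost. The names `Donation`, `transfer_cost`, and `reward`, and the default bandwidth and latency values, are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Donation:
    """A hypothetical donation action from one process to another."""
    src: int             # rank of the donating (high-workload) process
    dst: int             # rank of the receiving (low-workload) process
    work: float          # estimated workload moved, in seconds of compute
    particle_bytes: int  # size of the particle data sent
    block_bytes: int     # size of the data block sent (0 if already held by dst)


def transfer_cost(d, bandwidth=1e9, latency=1e-5):
    """Assumed linear cost model: latency plus bytes over bandwidth, in seconds."""
    return latency + (d.particle_bytes + d.block_bytes) / bandwidth


def reward(workloads, d):
    """Assumed reward: drop in the maximum per-process workload minus transfer cost."""
    after = list(workloads)
    after[d.src] -= d.work
    after[d.dst] += d.work
    return (max(workloads) - max(after)) - transfer_cost(d)
```

Under these assumptions, moving 4 seconds of work from a process with 10 seconds of estimated work to one with 2 seconds yields a positive reward when little data is transferred, while the same donation becomes unattractive if it requires shipping a data block whose transfer time exceeds the workload reduction.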