Leveraging Data Augmentation and Siamese Learning for Predictive Process Monitoring

Abstract

Predictive Process Monitoring (PPM) enables forecasting future events or outcomes of ongoing business process instances based on event logs. However, deep learning PPM approaches are often limited by the low variability and small size of real-world event logs. To address this, we introduce SiamSA-PPM, a novel self-supervised learning framework that combines Siamese learning with Statistical Augmentation for Predictive Process Monitoring. It employs three novel statistically grounded transformation methods that leverage control-flow semantics and frequent behavioral patterns to generate realistic, semantically valid new trace variants. These augmented views are used within a Siamese learning setup to learn generalizable representations of process prefixes without the need for labeled supervision. Extensive experiments on real-life event logs demonstrate that SiamSA-PPM achieves competitive or superior performance compared to the SOTA in both next activity and final outcome prediction tasks. Our results further show that statistical augmentation significantly outperforms random transformations and improves variability in the data, highlighting SiamSA-PPM as a promising direction for training data enrichment in process prediction.
