Abstract
Feature generation involves creating new features from raw data to capture complex relationships among the original features, improving model robustness and machine learning performance. Current methods using reinforcement learning for feature generation have made feature exploration more flexible and efficient. However, several challenges remain: first, during feature expansion, a large number of redundant features are generated. When removing them, current methods only retain the best features each round, neglecting those that perform poorly initially but could improve later. Second, the state representation used by current methods fails to fully capture complex feature relationships. Third, there are significant differences between discrete and continuous features in tabular data, requiring different operations for each type. To address these challenges, we propose a novel dual-agent reinforcement learning method for feature generation. Two agents are designed: the first generates new features, and the second determines whether they should be preserved. A self-attention mechanism enhances state representation, and diverse operations distinguish interactions between discrete and continuous features. The experimental results on multiple datasets demonstrate that the proposed method is effective. The code is available at https://github.com/extess0/DARL.
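As a rough illustration of the generate-then-preserve loop with type-specific operations, the following minimal Python sketch may help. It is not the authors' implementation: both agents are replaced by random placeholder policies, the self-attention state representation is omitted, and the operation set (`add`, `multiply`, `cross`, `group_mean`) as well as the names `generate`, `select`, and `run` are hypothetical choices made only for this example.

```python
import random
from statistics import mean

# Hypothetical operations; the abstract does not list the actual operation set.
def add(a, b):
    return [x + y for x, y in zip(a, b)]

def multiply(a, b):
    return [x * y for x, y in zip(a, b)]

def cross(a, b):
    # Combine two discrete columns into one joint category.
    return [f"{x}_{y}" for x, y in zip(a, b)]

def group_mean(d, c):
    # Replace each row with the mean of continuous column c within its group in d.
    groups = {}
    for key, value in zip(d, c):
        groups.setdefault(key, []).append(value)
    means = {key: mean(values) for key, values in groups.items()}
    return [means[key] for key in d]

# Different operations for each combination of feature types.
OPS = {
    ("cont", "cont"): [add, multiply],
    ("disc", "disc"): [cross],
    ("disc", "cont"): [group_mean],
}

def kind(col):
    # Assumption for this sketch: string-valued columns are discrete.
    return "disc" if all(isinstance(v, str) for v in col) else "cont"

def generate(features, rng):
    """Stand-in for the first agent: pick two features and a type-appropriate operation."""
    (name_a, a), (name_b, b) = rng.sample(sorted(features.items()), 2)
    if (kind(a), kind(b)) == ("cont", "disc"):
        name_a, a, name_b, b = name_b, b, name_a, a  # normalize to (disc, cont)
    op = rng.choice(OPS[(kind(a), kind(b))])
    return f"{op.__name__}({name_a},{name_b})", op(a, b)

def select(name, col, rng, keep_prob=0.5):
    """Stand-in for the second agent: a random keep/discard decision."""
    return rng.random() < keep_prob

def run(features, steps=5, seed=0):
    rng = random.Random(seed)
    features = dict(features)  # do not mutate the caller's table
    for _ in range(steps):
        name, col = generate(features, rng)
        if name not in features and select(name, col, rng):
            features[name] = col
    return features

if __name__ == "__main__":
    table = {
        "age": [20, 30, 40, 50],
        "income": [1.0, 2.0, 3.0, 4.0],
        "city": ["a", "b", "a", "b"],
    }
    print(sorted(run(table, steps=10)))
```

In a learned version, `generate` and `select` would be policies trained from a reward signal such as downstream model performance; the random choices here only mark where those decisions would sit.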