client2vec: Towards Systematic Baselines for Banking Applications

Abstract

The workflow of data scientists normally involves potentially inefficient processes such as data mining, feature engineering and model selection. Recent research has focused on automating this workflow, partly or in its entirety, to improve productivity. We choose the former approach and in this paper share our experience in designing client2vec, an internal library to rapidly build baselines for banking applications. Client2vec uses marginalized stacked denoising autoencoders on current account transaction data to create vector embeddings which represent the behaviors of our clients. These representations can then be used in, and optimized against, a variety of tasks such as client segmentation, profiling and targeting. Here we detail how we selected the algorithmic machinery of client2vec and the data it works on, and present experimental results on several business cases.
