Abstract
This paper proposes a foundation model called "CLaSP" that can search time series signals using natural language queries that describe the characteristics of the signals. Previous efforts to represent time series signal data in natural language have faced challenges in designing a conventional class of time series signal characteristics, formulating their quantification, and creating a dictionary of synonyms. To overcome these limitations, the proposed method introduces a neural network based on contrastive learning. This network is first trained using the datasets TRUCE and SUSHI, which consist of time series signals and their corresponding natural language descriptions. Previous studies have proposed vocabularies that data analysts use to describe signal characteristics, and SUSHI was designed to cover these terms. We believe that a neural network trained on these datasets will enable data analysts to search using natural language vocabulary. Furthermore, our method does not require a dictionary of predefined synonyms, and it leverages common sense knowledge embedded in a large-scale language model (LLM). Experimental results demonstrate that CLaSP enables natural language search of time series signal data and can accurately learn the points at which signal data changes.
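To make the contrastive-learning idea concrete, the following is a minimal sketch of how paired signal and text embeddings could be trained and then searched. It assumes a CLIP-style symmetric contrastive objective and cosine-similarity retrieval; the abstract does not specify CLaSP's actual loss, encoders, or hyperparameters, so the function names (`symmetric_contrastive_loss`, `search`), the temperature value, and the use of precomputed embeddings are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_contrastive_loss(signal_emb, text_emb, temperature=0.07):
    """Assumed CLIP-style loss: row i of each input is a matching pair.

    signal_emb, text_emb: arrays of shape (batch, dim), hypothetically
    produced by a time series encoder and a text encoder.
    """
    s = l2_normalize(signal_emb)
    t = l2_normalize(text_emb)
    logits = s @ t.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(logits.shape[0])
    # Signal-to-text: each row should put its mass on the diagonal entry.
    loss_s2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-signal: same, normalizing over columns instead of rows.
    loss_t2s = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_s2t + loss_t2s)


def search(query_emb, signal_emb, top_k=3):
    """Return indices of the top_k signals most similar to one text query."""
    scores = l2_normalize(signal_emb) @ l2_normalize(query_emb)
    return np.argsort(-scores)[:top_k]
```

Under these assumptions, a natural language query is embedded once by the text encoder and compared against stored signal embeddings, which is why no hand-built synonym dictionary is needed at search time: semantically similar phrasings are expected to land near each other in the shared embedding space.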