Concept-drifting Data Streams are Time Series; The Case for Continuous Adaptation

Abstract

Learning from data streams is an increasingly important topic in data mining, machine learning, and artificial intelligence in general. A major focus in the data stream literature is on designing methods that can deal with concept drift, a challenge where the generating distribution changes over time. A common assumption in most of this literature is that instances are independently distributed in the stream. In this work we show that, in the context of concept drift, this assumption is contradictory: the presence of concept drift necessarily implies temporal dependence, and thus some form of time series. This has important implications for model design and deployment. We explore and highlight these implications, and show that Hoeffding-tree based ensembles, which are very popular for learning in streams, are not naturally suited to learning \emph{within} drift, and can perform in this scenario only at the significant computational cost of destructive adaptation. On the other hand, we develop and parameterize gradient-descent methods and demonstrate how they can perform \emph{continuous} adaptation with no explicit drift-detection mechanism, offering major advantages in terms of accuracy and efficiency. As a consequence of our theoretical discussion and empirical observations, we outline a number of recommendations for deploying methods in concept-drifting streams.
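
As a rough illustration of continuous adaptation without explicit drift detection, the sketch below runs test-then-train (prequential) logistic regression with plain stochastic gradient descent on a synthetic stream whose decision boundary rotates slowly. It is a toy under stated assumptions, not the method or experiments of the paper: the stream generator, the learning rate of 0.5, the 500-instance warm-up, and the names `drifting_stream` and `prequential_accuracy` are all hypothetical choices made for this example. A model frozen after the warm-up serves only as a point of comparison.

```python
import math
import random


def drifting_stream(n, seed=0):
    """Yield ((x1, x2), y) pairs from a hypothetical toy stream.

    The true linear concept rotates by 180 degrees over the n instances,
    so the label of an instance depends on when it arrives.
    """
    rng = random.Random(seed)
    for t in range(n):
        angle = math.pi * t / n
        wx, wy = math.cos(angle), math.sin(angle)
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        yield (x1, x2), int(wx * x1 + wy * x2 > 0)


def prequential_accuracy(stream, lr=0.5, freeze_after=None):
    """Test-then-train logistic regression; returns prequential accuracy.

    Each instance is first used for prediction and then for one SGD step.
    If freeze_after is given, updates stop after that many instances.
    """
    w = [0.0, 0.0]
    correct = total = 0
    for t, (x, y) in enumerate(stream):
        z = w[0] * x[0] + w[1] * x[1]
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        p = 1.0 / (1.0 + math.exp(-z))
        correct += int((p > 0.5) == (y == 1))
        total += 1
        if freeze_after is None or t < freeze_after:
            g = p - y  # gradient of the log loss with respect to z
            w[0] -= lr * g * x[0]
            w[1] -= lr * g * x[1]
    return correct / total


adaptive = prequential_accuracy(drifting_stream(5000))
frozen = prequential_accuracy(drifting_stream(5000), freeze_after=500)
print(f"continuous SGD: {adaptive:.3f}, frozen after warm-up: {frozen:.3f}")
```

The constant learning rate is what lets the model keep tracking the moving concept; there is no detector and no model reset. How large the rate should be depends on the speed of the drift and is an assumption here.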
