Identifying Predictions That Influence the Future: Detecting Performative Concept Drift in Data Streams

Sergey Nikolayevich Dragomiretskiy

AAAI 2025 • February 27, 2025 • Philadelphia, United States

keywords: ml, data streams, time series

Concept drift has been studied extensively in the context of stream learning. However, it is often assumed that the deployed model's predictions play no role in the concept drift the system experiences. Closer inspection reveals that this is not always the case. Automated trading might be prone to self-fulfilling feedback loops, a potential cause of the Flash Crash of May 6, 2010. Likewise, in the adversarial setting, malicious entities might adapt to evade detectors, resulting in a self-negating feedback loop that requires detectors to be retrained constantly. Settings where a model may induce concept drift are called performative. In this work we investigate this phenomenon.
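To make the idea of a self-fulfilling feedback loop concrete, the following is a minimal toy sketch, not the method or data from this work. It assumes a hypothetical stream whose positive-class rate is pulled toward the deployed model's prediction by a hypothetical `strength` parameter; the function name, the update rule, and all parameter values are illustrative assumptions.

```python
def simulate_stream(steps, strength, base_rate=0.6):
    """Return the positive-class rate over time under a toy feedback loop.

    Hypothetical illustration only: strength = 0.0 means predictions have
    no effect on the stream (no performative drift); larger values pull
    the label distribution toward whatever the model predicts.
    """
    rate = base_rate
    history = [rate]
    for _ in range(steps):
        # Toy "deployed model": predict the currently more likely class.
        prediction = 1.0 if rate >= 0.5 else 0.0
        # Assumed self-fulfilling feedback: the label distribution moves
        # toward the prediction in proportion to `strength`.
        rate = rate + strength * (prediction - rate)
        history.append(rate)
    return history


no_feedback = simulate_stream(steps=50, strength=0.0)
with_feedback = simulate_stream(steps=50, strength=0.1)

print(f"no feedback:   {no_feedback[0]:.2f} -> {no_feedback[-1]:.2f}")
print(f"with feedback: {with_feedback[0]:.2f} -> {with_feedback[-1]:.2f}")
```

With `strength=0.0` the class rate never moves, while any positive `strength` lets the model's own predictions shift the distribution it is later evaluated on. That shift is the kind of model-induced drift the abstract calls performative.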

Our contributions are as follows. First, we define performative drift within a stream learning setting and distinguish it from other causes of drift. We introduce a novel type of drift detection task, aimed at identifying potential performative concept drift in data streams. We propose a first such performative drift detection approach, called CheckerBoard Performative Drift Detection (CB-PDD). We apply CB-PDD to both synthetic and semi-synthetic datasets that exhibit varying degrees of self-fulfilling feedback loops. Results are positive, with CB-PDD showing high efficacy, low false detection rates, resilience to intrinsic drift, comparability to other drift detection techniques, and an ability to effectively detect performative drift in semi-synthetic datasets. Second, we highlight the role intrinsic (traditional) drift plays in obfuscating performative drift and discuss the implications of these findings as well as the limitations of CB-PDD.