An interesting phenomenon arises: Empirical Risk Minimization (ERM) sometimes outperforms methods specifically designed for out-of-distribution tasks. This motivates an investigation into the reasons for such behavior beyond algorithmic design. In this study, we find that one such reason lies in the distribution shift across training domains: a large degree of distribution shift can lead to better performance even under ERM. Specifically, we derive several theoretical and empirical findings demonstrating that distribution shift plays a crucial role in model learning and benefits learning invariant prediction. First, the proposed upper bounds indicate that the degree of distribution shift directly affects the generalization ability of the learned models. When the shift is large, generalization ability can increase, and the learned models approximate invariant prediction models, which make stable predictions across arbitrary known or unseen domains; when the shift is small, the opposite holds. Moreover, we prove that under certain data conditions, ERM solutions can achieve performance comparable to that of invariant prediction models. Second, the empirical validation results demonstrate that the predictions of the trained models more closely approximate the ground-truth labels as the degree of distribution shift in the training data increases.
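The intuition can be illustrated with a minimal synthetic sketch. This is not the experimental setup of the study. It is a toy linear model built on assumptions introduced here: one invariant feature `x_inv` that generates the label, one spurious feature `x_sp` whose correlation with the label is set by a hypothetical per-domain coefficient `c`, and ERM implemented as ordinary least squares on the pooled training domains. All function names and parameter values are illustrative.

```python
import numpy as np


def make_domain(rng, n, c, noise_y=1.0, noise_sp=1.0):
    """Sample one domain. `c` (an assumed knob) sets the spurious correlation."""
    x_inv = rng.normal(size=n)                      # invariant cause of y
    y = x_inv + noise_y * rng.normal(size=n)        # same mechanism in every domain
    x_sp = c * y + noise_sp * rng.normal(size=n)    # domain-dependent spurious feature
    return np.column_stack([x_inv, x_sp]), y


def erm_fit(domains):
    """ERM with squared loss: least squares on the pooled training domains."""
    X = np.vstack([X for X, _ in domains])
    y = np.concatenate([y for _, y in domains])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))


def run(train_cs, test_c=-2.0, n=20000, seed=0):
    """Train on domains with coefficients `train_cs`, test on an unseen domain."""
    rng = np.random.default_rng(seed)
    train = [make_domain(rng, n, c) for c in train_cs]
    X_test, y_test = make_domain(rng, n, test_c)
    w = erm_fit(train)
    return w, mse(w, X_test, y_test)


# Small shift across training domains: spurious coefficients 1.0 and 0.8.
w_small, mse_small = run([1.0, 0.8])
# Large shift across training domains: spurious coefficients 1.0 and -1.0.
w_large, mse_large = run([1.0, -1.0])

print("small shift: weights", w_small, "unseen-domain MSE", mse_small)
print("large shift: weights", w_large, "unseen-domain MSE", mse_large)
```

In this toy setting, when the spurious coefficients of the training domains cancel on average, the pooled least-squares solution places almost no weight on `x_sp` and relies on `x_inv`, so its error on the unseen domain stays close to the label noise level. With a small shift, ERM leans on the spurious feature and degrades on the unseen domain. The sketch only mirrors the qualitative claim; it does not reproduce the bounds or conditions derived in the study.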