Understanding when a pre-trained model generalizes well to a new task remains a key challenge in transfer learning. Classical theories bound target risk using divergences such as total variation, MMD, or Wasserstein distance, yet tasks with similar divergence values often show very different transfer performance. We propose a structural framework that explains transferability through two factors: the Feature Overlap Rate (FOR), which measures how much of the target representation lies in the source-induced subspace, and the Effective Task Complexity (ETC), which quantifies the entropy of latent subtasks. We derive a PAC-Bayesian bound in which target risk depends on FOR and ETC, and show that larger models attenuate their negative effects. Experiments on six GLUE transfer pairs estimate FOR and ETC from encoder representations and compare them to classical divergences. Results show that FOR and ETC together explain over 80% of the variance in transfer risk, while the divergences fail to do so. Our findings provide a geometry-aware perspective for diagnosing and guiding transfer learning.
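The abstract does not spell out the estimators for FOR and ETC; the sketch below is only an illustration under assumed definitions, not the authors' method. It takes FOR to be the fraction of target-feature "energy" captured by the top-k principal subspace of the source features, and ETC to be the entropy of k-means cluster assignments over target features standing in for latent subtasks. The function names, the choice of k, and the clustering proxy are all assumptions.

    # Illustrative estimators for FOR and ETC from pooled encoder features.
    # Definitions here are assumptions for the sake of a concrete example.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    def feature_overlap_rate(source_feats, target_feats, k=32):
        """Share of target-feature energy in the source top-k principal subspace."""
        pca = PCA(n_components=k).fit(source_feats)          # source-induced subspace
        centered = target_feats - pca.mean_                   # center with the source mean
        projected = centered @ pca.components_.T @ pca.components_
        return float(np.sum(projected ** 2) / np.sum(centered ** 2))

    def effective_task_complexity(target_feats, n_subtasks=8, seed=0):
        """Entropy (in nats) of cluster assignments used as a proxy for latent subtasks."""
        labels = KMeans(n_clusters=n_subtasks, random_state=seed, n_init=10).fit_predict(target_feats)
        probs = np.bincount(labels, minlength=n_subtasks) / len(labels)
        probs = probs[probs > 0]
        return float(-np.sum(probs * np.log(probs)))

    # Example with random stand-in features; in practice these would be
    # pooled encoder outputs for a source and a target GLUE task.
    rng = np.random.default_rng(0)
    src = rng.normal(size=(1000, 768))
    tgt = rng.normal(size=(1000, 768))
    print(feature_overlap_rate(src, tgt), effective_task_complexity(tgt))

Under this kind of estimator, a higher FOR means the target representation is largely reachable within the geometry already induced by the source task, while a higher ETC signals a more fragmented target task; the paper's claim is that these two quantities, not the classical divergences, track transfer risk.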