Performance collapse is a notorious issue in Differentiable Architecture Search (DAS): the searched architectures degrade severely when DAS is trained on different search spaces or datasets. We theoretically analyze the issue from the information bottleneck (IB) perspective, and show that the problem can be overcome by seeking the bifurcation point of the IB tradeoff between compression and prediction in the supernet. To this end, we propose a simple yet highly effective method, Batch Entropy-decay Regularization (BER), to guide the learning of DAS; it restricts compression in DAS by imposing a penalty on the architecture parameters. Comprehensive theoretical analyses demonstrate that BER resolves the performance collapse issue of DAS. Compared with a number of state-of-the-art DAS variants, BER achieves consistently superior performance on 7 search spaces (i.e., NAS-Bench-201, DARTS, S1-S4, and a MobileNet-like space) and 5 popular datasets (i.e., CIFAR-10, CIFAR-100, ImageNet-1k, PASCAL VOC 2007, and MS COCO 2017).
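To make the regularization idea concrete, below is a minimal sketch of an entropy-based penalty on DARTS-style architecture parameters. It assumes BER's effect can be approximated by keeping the per-edge operation distributions from collapsing too early (i.e., restricting compression); the function names, the `(num_edges, num_ops)` parameter layout, and the fixed coefficient `lam` are illustrative assumptions, not the paper's exact BER formulation or decay schedule.

```python
import torch
import torch.nn.functional as F


def arch_entropy(alpha: torch.Tensor) -> torch.Tensor:
    """Mean Shannon entropy of the per-edge operation distributions.

    alpha: (num_edges, num_ops) architecture parameters, DARTS-style.
    """
    log_p = F.log_softmax(alpha, dim=-1)  # numerically stable log-probs
    p = log_p.exp()
    return -(p * log_p).sum(dim=-1).mean()


def regularized_loss(task_loss: torch.Tensor,
                     alpha: torch.Tensor,
                     lam: float) -> torch.Tensor:
    """Task loss plus an entropy term that discourages over-compression.

    Subtracting the entropy penalizes distributions that concentrate
    prematurely on a single operation (e.g., skip-connect), a hypothetical
    stand-in for the paper's BER penalty on architecture parameters.
    """
    return task_loss - lam * arch_entropy(alpha)


# Usage sketch: 14 edges, 8 candidate operations, as in a DARTS cell.
alpha = torch.randn(14, 8, requires_grad=True)
task_loss = torch.tensor(1.25)  # placeholder for the supernet's validation loss
loss = regularized_loss(task_loss, alpha, lam=0.1)
loss.backward()
```

In practice, a method named "entropy-decay" would presumably anneal the coefficient over the search rather than hold it fixed, so that compression is restricted early and permitted as the search converges; the constant `lam` above is the simplest placeholder for such a schedule.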