While Transformers have revolutionized time series forecasting, they remain constrained by manual architecture design: every model reuses the same attention mechanism, normalization, and activation choices. What if we could automatically discover the right architectural recipe for each dataset? This work introduces STrans (Spontaneous Transformer), a comprehensive neural architecture search framework for time series Transformers that simultaneously explores attention variants, normalization techniques, activation functions, and encoding operations. Using differentiable architecture search, STrans automatically discovers architectures that outperform manually designed baselines. However, the experiments reveal a surprising and counterintuitive finding: complex searched architectures often fail catastrophically, while simpler configurations generalize better. This "search overfitting" phenomenon challenges fundamental assumptions about automated architecture design in time series domains. The work not only advances automated model design but also uncovers critical insights that may reshape how we think about neural architecture search for temporal data.
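To make the core idea concrete, here is a minimal sketch of how differentiable architecture search typically relaxes a discrete design choice (such as which activation or normalization to use) into a softmax-weighted blend of candidates, in the style of DARTS. The names `MixedOp` and `alpha` are illustrative assumptions, not identifiers from the STrans implementation.

```python
import numpy as np

def softmax(a):
    """Numerically stable softmax over architecture parameters."""
    e = np.exp(a - a.max())
    return e / e.sum()

class MixedOp:
    """Blend candidate operations with weights softmax(alpha).

    During search, alpha is trained jointly with model weights by
    gradient descent; afterwards the candidate with the largest
    alpha is kept as the discrete architectural choice.
    """
    def __init__(self, candidates):
        self.candidates = candidates
        # One architecture parameter per candidate (learnable in practice).
        self.alpha = np.zeros(len(candidates))

    def __call__(self, x):
        w = softmax(self.alpha)
        return sum(wi * op(x) for wi, op in zip(w, self.candidates))

# Example: search over activation functions at one encoder position.
acts = [
    lambda x: np.maximum(x, 0.0),      # ReLU
    np.tanh,                           # Tanh
    lambda x: x / (1 + np.exp(-x)),    # SiLU
]
op = MixedOp(acts)
y = op(np.array([-1.0, 0.0, 2.0]))     # weighted blend during search
chosen = int(np.argmax(op.alpha))      # discretization step after search
```

The same relaxation applies to any of the choices mentioned in the abstract (attention variant, normalization, encoding operation): each becomes a `MixedOp` over its candidate set, so the entire architecture selection is optimized end to end rather than hand-picked.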