Recent advances in LLM-based recommendation have shown promising results, yet generalization across domains remains limited due to the mismatch between language pretraining objectives and the recommendation task. Existing methods rely primarily on language-level knowledge transfer and fail to model the sequential dependencies among items, an essential factor for effective recommendation. To address this limitation, we propose RecBase, a domain-agnostic foundation model pretrained with a recommendation-oriented objective. RecBase leverages a large-scale, heterogeneous, cross-domain corpus with unified textual representations and feature mappings to enhance cross-domain generalization. To further align item semantics across domains, we introduce a unified item tokenizer that encodes items into hierarchical concept identifiers, enabling structured representation and efficient vocabulary sharing. The model is trained with an autoregressive objective to capture complex item-level sequential patterns. Experiments on eight real-world datasets show that RecBase, with only 1.5B parameters, consistently outperforms LLM-based baselines of up to 7B parameters in zero-shot and cross-domain recommendation scenarios. Code and pretrained checkpoints are included in the supplementary material.
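To make the tokenization idea concrete, here is a minimal, hypothetical sketch (not the paper's actual implementation) of how items might be mapped to hierarchical concept identifiers via residual quantization and how a user's interaction history could then be flattened into a single token stream for autoregressive next-token training; the codebooks, vocabulary sizes, and function names are illustrative assumptions.

```python
# Illustrative sketch only: hierarchical concept IDs (coarse -> fine) via
# greedy residual quantization, plus flattening of per-item codes into one
# shared-vocabulary token sequence. Codebooks here are toy assumptions.

def hierarchical_tokenize(embedding, codebooks):
    """At each level, pick the nearest codeword, then quantize the residual
    at the next (finer) level. Returns a tuple of per-level concept IDs."""
    codes = []
    residual = list(embedding)
    for book in codebooks:
        # Nearest codeword by squared Euclidean distance.
        idx = min(range(len(book)),
                  key=lambda i: sum((r - c) ** 2
                                    for r, c in zip(residual, book[i])))
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, book[idx])]
    return tuple(codes)

def flatten_history(item_codes, level_vocab_sizes):
    """Flatten per-item hierarchical codes into one token stream, giving each
    level a disjoint ID range so a single vocabulary is shared."""
    offsets = [sum(level_vocab_sizes[:l]) for l in range(len(level_vocab_sizes))]
    return [offsets[l] + c for codes in item_codes for l, c in enumerate(codes)]

# Toy two-level codebooks (2 coarse x 2 fine concepts, 2-dim embeddings).
codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],  # level 0: coarse concepts
    [[0.1, 0.0], [0.0, 0.1]],  # level 1: fine concepts
]
item_a = hierarchical_tokenize([1.1, 0.0], codebooks)  # (0, 0)
item_b = hierarchical_tokenize([0.0, 1.1], codebooks)  # (1, 1)
tokens = flatten_history([item_a, item_b], [2, 2])     # [0, 2, 1, 3]
print(item_a, item_b, tokens)
```

An autoregressive model would then be trained to predict each token in such a sequence from its prefix, so that predicting the next item amounts to generating its coarse-to-fine concept IDs in order.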