Tandem mass spectrometry (MS/MS) is a critical tool for identifying molecular structures. By efficiently separating molecular fragments according to their mass-to-charge (m/z) ratios, it facilitates molecular generation and subsequent scientific discovery. However, de novo molecular generation from MS/MS spectra remains fundamentally constrained by two challenges: the vast chemical space requires effective structural constraints, and the absence of fine-grained substructural generation weakens the correspondence between spectral features and molecular structures. In this work, we propose MSAnchor, a novel two-stage framework for MS/MS-based molecular structure generation. We mitigate the search-space challenge by introducing the Anchor-Extended Molecular Scaffold (AEMS) representation, which explicitly encodes side-chain anchoring points and thereby dramatically reduces combinatorial complexity. Leveraging the explicit attachment sites provided by AEMS, we develop anchor-specific priors that establish effective alignments between spectral features and molecular substructures. This fine-grained substructural correspondence is further enhanced by a modified Conditional Information Bottleneck (CIB) module that extracts the most informative spectral components in a structure-aware manner. Together, these innovations enable MSAnchor to generate molecular structures that closely reflect spectral characteristics while constraining combinatorial complexity. Extensive experiments on the CANOPUS and MassSpecGym datasets demonstrate that MSAnchor achieves state-of-the-art performance in molecular structure prediction from MS/MS spectra, with improvements that are particularly pronounced for more complex molecules.
