Large Language Models excel at semantic reasoning through high-dimensional geometry but systematically struggle with numerical reasoning because tokenization destroys geometric continuity. Traditional tokenization fragments numerical values into arbitrary tokens, undermining their inherent geometric and topological relationships. We introduce GeoNum, a geometrically coherent numerical embedding that addresses this fundamental challenge through polar coordinate representation. GeoNum employs polar decomposition to naturally decouple discrete ordinality (for classification) from continuous periodicity (for regression), enabling unified discrete-continuous learning that preserves the dual nature of numerical cognition. Through three-stage progressive training, GeoNum first learns continuous numerical representations via self-supervised reconstruction, then aligns these embeddings with textual representations through projection learning, and is finally integrated into pre-trained LLMs via parameter-efficient fine-tuning. Empirical evaluations demonstrate that GeoNum consistently surpasses baseline and state-of-the-art numerical encoding methods across multiple datasets, achieving substantial performance gains particularly in high-precision arithmetic tasks (e.g., ACC@0.1 improvements of up to 48.6%). GeoNum transforms numerical processing from fragmented tokenization to coherent geometric representation, enabling principled numerical understanding in language models.
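To make the polar-decomposition idea concrete, below is a minimal illustrative sketch of how a number might be mapped to a radial (ordinal) component plus periodic angular components. The function name, frequency scheme, and all parameters are hypothetical choices for illustration; the abstract does not specify GeoNum's actual architecture, and this is not the paper's implementation.

```python
import math

def polar_num_embed(x: float, num_freqs: int = 4, base: float = 10.0) -> list:
    """Hypothetical polar-style numerical embedding (illustration only).

    Decouples a scalar into:
      - a radial part (log-magnitude), a roughly discrete/ordinal signal
        indicating which order of magnitude x falls in, and
      - angular parts (sin/cos at several scales), continuous periodic
        signals locating x within each scale.
    """
    sign = 1.0 if x >= 0 else -1.0
    mag = abs(x)
    # Radial component: log-magnitude approximates discrete ordinality
    # (which "decade" the value belongs to).
    r = math.log(mag + 1.0, base)
    feats = [sign, r]
    # Angular components: periodic phase of the value at several scales,
    # giving a continuous representation suitable for regression-style use.
    for k in range(num_freqs):
        theta = 2.0 * math.pi * mag / (base ** (k + 1))
        feats.extend([math.sin(theta), math.cos(theta)])
    return feats
```

With `num_freqs=4` this yields a 10-dimensional feature vector per scalar; in practice such features would be projected into the model's embedding space during the alignment stage the abstract describes.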