Geodesic Semantic Search: Learning Local Riemannian Metrics for Citation Graph Retrieval
Brandon Yee, Lucas Wang, Kundana Kommini, Krishna Sharma
Abstract
We present Geodesic Semantic Search (GSS), a retrieval system that learns node-specific Riemannian metrics on citation graphs to enable geometry-aware semantic search. Unlike standard embedding-based retrieval, which relies on fixed Euclidean distances, GSS learns a low-rank metric tensor $\mathbf{L}_i \in \mathbb{R}^{d \times r}$ at each node, inducing a local positive semi-definite metric $\mathbf{G}_i = \mathbf{L}_i \mathbf{L}_i^\top + \epsilon \mathbf{I}$. This parameterization guarantees valid metrics while keeping the model tractable. Retrieval proceeds via multi-source Dijkstra on the learned geodesic distances, followed by Maximal Marginal Relevance reranking and path coherence filtering. On citation prediction benchmarks with 169K papers, GSS achieves a 23\% relative improvement in Recall@20 over SPECTER+FAISS baselines while providing interpretable citation paths. Our hierarchical coarse-to-fine search with k-means pooling reduces computational cost by 4$\times$ compared to flat geodesic search while maintaining 97\% retrieval quality. We provide theoretical analysis of when geodesic distances outperform direct similarity, characterize the approximation quality of low-rank metrics, and validate predictions empirically. Code and trained models are available at https://github.com/YCRG-Labs/geodesic-search.
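As a rough illustration of the two ingredients named in the abstract, the sketch below measures a displacement under a low-rank local metric and runs multi-source Dijkstra over weighted edges. It is not taken from the released code: the function names, the default value of `eps`, the adjacency-list format, and the suggestion that an edge from node $i$ to node $j$ be weighted by the length of the embedding difference under node $i$'s metric are all assumptions made for this example.

```python
import heapq

import numpy as np


def local_metric_length(L_i, diff, eps=1e-3):
    """Length of `diff` under G_i = L_i L_i^T + eps * I (hypothetical helper).

    Uses diff^T G_i diff = ||L_i^T diff||^2 + eps * ||diff||^2, so the
    d x d matrix G_i is never formed explicitly.
    """
    proj = L_i.T @ diff  # shape (r,)
    return float(np.sqrt(proj @ proj + eps * (diff @ diff)))


def multi_source_dijkstra(adj, sources):
    """Shortest distance from any source node to every reachable node.

    adj:     dict mapping node -> list of (neighbor, non-negative weight)
    sources: dict mapping source node -> initial cost (assumed format)
    """
    dist = dict(sources)
    heap = [(cost, node) for node, cost in sources.items()]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

In this sketch, edge weights would be computed once with `local_metric_length(L[i], x[j] - x[i])` and stored in `adj`, and the seed nodes for a query would be passed as `sources` with their initial costs. Because each local metric adds `eps` times the identity, every edge weight is non-negative, which is the condition Dijkstra's algorithm requires.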