SOTAVerified

Spatial Reasoning

Papers

Showing 351–400 of 453 papers

Title | Status | Hype
FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts |  | 0
FollowEval: A Multi-Dimensional Benchmark for Assessing the Instruction-Following Capability of Large Language Models |  | 0
Following Instructions by Imagining and Reaching Visual Goals |  | 0
Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization |  | 0
FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors |  | 0
From 2D to 3D Cognition: A Brief Survey of General World Models |  | 0
From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes |  | 0
From Patches to Objects: Exploiting Spatial Reasoning for Better Visual Representations |  | 0
From Spatial Relations to Spatial Configurations |  | 0
From Templates to Natural Language: Generalization Challenges in Instruction-Tuned LLMs for Spatial Reasoning |  | 0
Generating Human Motion in 3D Scenes from Text Descriptions |  | 0
Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems with Meta In-Context Learning |  | 0
Geometric Feature Enhanced Knowledge Graph Embedding and Spatial Reasoning |  | 0
Geometry of 3D Environments and Sum of Squares Polynomials |  | 0
Global Information Guided Video Anomaly Detection |  | 0
GPT-4o System Card |  | 0
Graph Relation Transformer: Incorporating pairwise object features into the Transformer architecture |  | 0
GRASP: A Grid-Based Benchmark for Evaluating Commonsense Spatial Reasoning |  | 0
Grounded Reinforcement Learning for Visual Reasoning |  | 0
GSR-BENCH: A Benchmark for Grounded Spatial Reasoning Evaluation via Multimodal LLMs |  | 0
HAMMR: HierArchical MultiModal React agents for generic VQA |  | 0
Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation |  | 0
History-Aware Question Answering in a Blocks World Dialogue System |  | 0
How to Enable LLM with 3D Capacity? A Survey of Spatial Reasoning in LLM |  | 0
Hyperdimensional Computing with Spiking-Phasor Neurons |  | 0
I Know About "Up"! Enhancing Spatial Reasoning in Visual Language Models Through 3D Reconstruction |  | 0
ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies |  | 0
Improved Algorithms for Allen's Interval Algebra by Dynamic Programming with Sublinear Partitioning |  | 0
Incentivizing Multimodal Reasoning in Large Models for Direct Robot Manipulation |  | 0
Integrating Symbolic Reasoning into Neural Generative Models for Design Generation |  | 0
Intelligence of Things: A Spatial Context-Aware Control System for Smart Devices |  | 0
Jigsaw-Puzzles: From Seeing to Understanding to Reasoning in Vision-Language Models |  | 0
JSTR: Joint Spatio-Temporal Reasoning for Event-based Moving Object Detection |  | 0
LABNet: Local Graph Aggregation Network with Class Balanced Loss for Vehicle Re-Identification |  | 0
LanguageRefer: Spatial-Language Model for 3D Visual Grounding |  | 0
Large Language-Geometry Model: When LLM meets Equivariance |  | 0
Large Language Models and Mathematical Reasoning Failures |  | 0
Learning event representation: As sparse as possible, but not sparser |  | 0
Learning to encode spatial relations from natural language |  | 0
LEGO-Puzzles: How Good Are MLLMs at Multi-Step Spatial Reasoning? |  | 0
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding |  | 0
Location-Aware Self-Supervised Transformers for Semantic Segmentation |  | 0
Long Range Arena : A Benchmark for Efficient Transformers |  | 0
LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation |  | 0
M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning |  | 0
Manhattan Junction Catalogue for Spatial Reasoning of Indoor Scenes |  | 0
Map Learning with Indistinguishable Locations |  | 0
Mathematical Definition and Systematization of Puzzle Rules |  | 0
MEBench: A Novel Benchmark for Understanding Mutual Exclusivity Bias in Vision-Language Models |  | 0
MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation |  | 0

No leaderboard results yet.