SOTAVerified

Spatial Reasoning

Papers

Showing 301–325 of 453 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Survey for Foundation Models in Autonomous Driving | | 0 |
| Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data | Code | 0 |
| SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities | | 0 |
| Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery | Code | 2 |
| StarCraftImage: A Dataset For Prototyping Spatial Reasoning Methods For Multi-Agent Environments | | 0 |
| Distortions in Judged Spatial Relations in Large Language Models | | 0 |
| Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark | Code | 1 |
| Location Aware Modular Biencoder for Tourism Question Answering | Code | 0 |
| LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding | | 0 |
| Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation | | 0 |
| Inherent limitations of LLMs regarding spatial information | Code | 0 |
| Exploring and Improving the Spatial Reasoning Abilities of Large Language Models | | 0 |
| FollowEval: A Multi-Dimensional Benchmark for Assessing the Instruction-Following Capability of Large Language Models | | 0 |
| On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving | Code | 2 |
| What's "up" with vision-language models? Investigating their struggle with spatial reasoning | Code | 1 |
| Disentangling Extraction and Reasoning in Multi-hop Spatial Reasoning | Code | 0 |
| DepWiGNN: A Depth-wise Graph Neural Network for Multi-hop Spatial Reasoning in Text | Code | 0 |
| Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning | Code | 1 |
| Evaluating Robustness of Visual Representations for Object Assembly Task Requiring Spatio-Geometrical Reasoning | | 0 |
| Integrating Symbolic Reasoning into Neural Generative Models for Design Generation | | 0 |
| SlotGNN: Unsupervised Discovery of Multi-Object Representations and Visual Dynamics | | 0 |
| Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning | Code | 1 |
| Improved Baselines with Visual Instruction Tuning | Code | 6 |
| Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving | Code | 1 |
| SmartPlay: A Benchmark for LLMs as Intelligent Agents | Code | 1 |

No leaderboard results yet.