SOTAVerified

Natural Language Queries

Papers

Showing 1–25 of 337 papers

Title | Status | Hype
SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding | | 0
Towards Probabilistic Question Answering Over Tabular Data | | 0
A Modular Multitask Reasoning Framework Integrating Spatio-temporal Models and LLMs | | 0
Invocable APIs derived from NL2SQL datasets for LLM Tool-Calling Evaluation | | 0
Improving Personalized Search with Regularized Low-Rank Parameter Updates | Code | 0
Technical Report for Argoverse2 Scenario Mining Challenges on Iterative Error Correction and Spatially-Aware Prompting | | 0
MLVTG: Mamba-Based Feature Alignment and LLM-Driven Purification for Multi-Modal Video Temporal Grounding | | 0
SEED: Enhancing Text-to-SQL Performance and Practical Usability Through Automatic Evidence Generation | Code | 1
OSGNet @ Ego4D Episodic Memory Challenge 2025 | Code | 1
DGMO: Training-Free Audio Source Separation through Diffusion-Guided Mask Optimization | | 0
DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes | Code | 2
A Graph-Retrieval-Augmented Generation Framework Enhances Decision-Making in the Circular Economy | | 0
ACCESS DENIED INC: The First Benchmark Environment for Sensitivity Awareness | Code | 0
CoRet: Improved Retriever for Code Editing | | 0
MGS3: A Multi-Granularity Self-Supervised Code Search Framework | | 0
StreamLink: Large-Language-Model Driven Distributed Data Engineering System | | 0
Text-Queried Audio Source Separation via Hierarchical Modeling | | 0
Complex System Diagnostics Using a Knowledge Graph-Informed and Large Language Model-Enhanced Framework | | 0
RefAV: Towards Planning-Centric Scenario Mining | Code | 1
Structuring the Unstructured: A Multi-Agent System for Extracting and Querying Financial KPIs and Guidance | | 0
LLM-Powered Agents for Navigating Venice's Historical Cadastre | | 0
DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos | Code | 1
CRAFT: Training-Free Cascaded Retrieval for Tabular QA | | 0
RAZER: Robust Accelerated Zero-Shot 3D Open-Vocabulary Panoptic Reconstruction with Spatio-Temporal Aggregation | | 0
Meta-Learning an In-Context Transformer Model of Human Higher Visual Cortex | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | EgoVideo | R@1 Mean(0.3 and 0.5) | 23.68 | | Unverified
2 | DeCafNet-100% | R@1 Mean(0.3 and 0.5) | 18.86 | | Unverified
3 | DeCafNet-50% | R@1 Mean(0.3 and 0.5) | 17.93 | | Unverified
4 | RGNet | R@1 Mean(0.3 and 0.5) | 16.55 | | Unverified
5 | DeCafNet-50% (no NaQ) | R@1 Mean(0.3 and 0.5) | 15.32 | | Unverified
6 | InternVideo | R@1 Mean(0.3 and 0.5) | 13.26 | | Unverified
7 | EgoVLPv2 | R@1 IoU=0.3 | 12.95 | | Unverified
8 | UniMD+Sync. | R@1 Mean(0.3 and 0.5) | 12.11 | | Unverified
9 | ReLER@ZJU-Alibaba | R@1 Mean(0.3 and 0.5) | 10.52 | | Unverified
10 | EgoVLP | R@1 Mean(0.3 and 0.5) | 8.35 | | Unverified