SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers, 248,326 code links, 4,818 tasks

Papers

Showing 7901-7950 of 661,570 papers

Title | Status | Hype
Progressive Pretext Task Learning for Human Trajectory Prediction | Code | 2
Scientific QA System with Verifiable Answers | Code | 2
Digital Twin Vehicular Edge Computing Network: Task Offloading and Resource Allocation | Code | 2
Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes | Code | 2
Does Refusal Training in LLMs Generalize to the Past Tense? | Code | 2
SPINACH: SPARQL-Based Information Navigation for Challenging Real-World Questions | Code | 2
Monocular Occupancy Prediction for Scalable Indoor Scenes | Code | 2
TeethDreamer: 3D Teeth Reconstruction from Five Intra-oral Photographs | Code | 2
Towards High-Quality 3D Motion Transfer with Realistic Apparel Animation | Code | 2
Deep Diffusion Image Prior for Efficient OOD Adaptation in 3D Inverse Problems | Code | 2
IDOL: Unified Dual-Modal Latent Diffusion for Human-Centric Joint Video-Depth Generation | Code | 2
OPEN: Object-wise Position Embedding for Multi-view 3D Object Detection | Code | 2
DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems | Code | 2
Differentiable Voxelization and Mesh Morphing | Code | 2
Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation | Code | 2
FinDKG: Dynamic Knowledge Graphs with Large Language Models for Detecting Global Trends in Financial Markets | Code | 2
Representation Learning and Identity Adversarial Training for Facial Behavior Understanding | Code | 2
DataDream: Few-shot Guided Dataset Generation | Code | 2
AccDiffusion: An Accurate Method for Higher-Resolution Image Generation | Code | 2
iHuman: Instant Animatable Digital Humans From Monocular Videos | Code | 2
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients | Code | 2
Accessing Vision Foundation Models at ImageNet-level Costs | Code | 2
Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? | Code | 2
PolyRoom: Room-aware Transformer for Floorplan Reconstruction | Code | 2
Target conversation extraction: Source separation using turn-taking dynamics | Code | 2
Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning | Code | 2
SEED: A Simple and Effective 3D DETR in Point Clouds | Code | 2
TokenSHAP: Interpreting Large Language Models with Monte Carlo Shapley Value Estimation | Code | 2
xLSTMTime : Long-term Time Series Forecasting With xLSTM | Code | 2
AutoGRAMS: Autonomous Graphical Agent Modeling Software | Code | 2
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset | Code | 2
Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models | Code | 2
Noise Calibration: Plug-and-play Content-Preserving Video Enhancement using Pre-trained Video Diffusion Models | Code | 2
PARE-Net: Position-Aware Rotation-Equivariant Networks for Robust Point Cloud Registration | Code | 2
Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV | Code | 2
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers | Code | 2
Arbitrary-Scale Video Super-Resolution with Structural and Textural Priors | Code | 2
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis | Code | 2
An Autonomous GIS Agent Framework for Geospatial Data Retrieval | Code | 2
DiffRect: Latent Diffusion Label Rectification for Semi-supervised Medical Image Segmentation | Code | 2
Image Compression for Machine and Human Vision with Spatial-Frequency Adaptation | Code | 2
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers | Code | 2
Flash normalization: fast RMSNorm for LLMs | Code | 2
GOFA: A Generative One-For-All Model for Joint Graph Language Modeling | Code | 2
Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba | Code | 2
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training | Code | 2
PersonaRAG: Enhancing Retrieval-Augmented Generation Systems with User-Centric Agents | Code | 2
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models | Code | 2
PID: Physics-Informed Diffusion Model for Infrared Image Generation | Code | 2
GTA: A Benchmark for General Tool Agents | Code | 2
Page 159 of 13,232