SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 7651–7700 of 661,570 papers

Title | Status | Hype
Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler | Code | 2
DeTPP: Leveraging Object Detection for Robust Long-Horizon Event Prediction | Code | 2
ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM | Code | 2
UMERegRobust -- Universal Manifold Embedding Compatible Features for Robust Point Cloud Registration | Code | 2
Scalable Autoregressive Image Generation with Mamba | Code | 2
MuMA-ToM: Multi-modal Multi-Agent Theory of Mind | Code | 2
Towards Evaluating and Building Versatile Large Language Models for Medicine | Code | 2
Personality Alignment of Large Language Models | Code | 2
KAN4TSF: Are KAN and KAN-based models Effective for Time Series Forecasting? | Code | 2
VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment | Code | 2
Critique-out-Loud Reward Models | Code | 2
BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation | Code | 2
Pano2Room: Novel View Synthesis from a Single Indoor Panorama | Code | 2
HMT-UNet: A hybird Mamba-Transformer Vision UNet for Medical Image Segmentation | Code | 2
biorecap: an R package for summarizing bioRxiv preprints with a local LLM | Code | 2
RaNDT SLAM: Radar SLAM Based on Intensity-Augmented Normal Distributions Transform | Code | 2
UNetMamba: An Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images | Code | 2
Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks | Code | 2
MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning | Code | 2
PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis | Code | 2
ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks | Code | 2
Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search | Code | 2
BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model | Code | 2
deepmriprep: Voxel-based Morphometry (VBM) Preprocessing via Deep Neural Networks | Code | 2
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding | Code | 2
GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting | Code | 2
DEGAS: Detailed Expressions on Full-Body Gaussian Avatars | Code | 2
FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | Code | 2
PartGS:Learning Part-aware 3D Representations by Fusing 2D Gaussians and Superquadrics | Code | 2
PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting | Code | 2
LegalBench-RAG: A Benchmark for Retrieval-Augmented Generation in the Legal Domain | Code | 2
TraDiffusion: Trajectory-Based Training-Free Image Generation | Code | 2
C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection | Code | 2
PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image Understanding | Code | 2
SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama | Code | 2
Selective Prompt Anchoring for Code Generation | Code | 2
TC-RAG:Turing-Complete RAG's Case study on Medical LLM Systems | Code | 2
Gaussian in the Dark: Real-Time View Synthesis From Inconsistent Dark Images Using Gaussian Splatting | Code | 2
Segment Anything with Multiple Modalities | Code | 2
An Open-Source American Sign Language Fingerspell Recognition and Semantic Pose Retrieval Interface | Code | 2
Accelerating Giant Impact Simulations with Machine Learning | Code | 2
A Survey on Benchmarks of Multimodal Large Language Models | Code | 2
MIA-Tuner: Adapting Large Language Models as Pre-training Text Detector | Code | 2
OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction | Code | 2
RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search | Code | 2
ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis | Code | 2
PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders | Code | 2
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models | Code | 2
EasyRec: Simple yet Effective Language Models for Recommendation | Code | 2
Efficient Autoregressive Audio Modeling via Next-Scale Prediction | Code | 2
Page 154 of 13,232