SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 3301-3350 of 659,983 papers

Title | Status | Hype
Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders | Code | 3
Scikit-fingerprints: easy and efficient computation of molecular fingerprints in Python | Code | 3
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models | Code | 3
E5-V: Universal Embeddings with Multimodal Large Language Models | Code | 3
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases | Code | 3
Relation DETR: Exploring Explicit Position Relation Prior for Object Detection | Code | 3
VISA: Reasoning Video Object Segmentation via Large Language Models | Code | 3
TCFormer: Visual Recognition via Token Clustering Transformer | Code | 3
Scaling Diffusion Transformers to 16 Billion Parameters | Code | 3
The Oscars of AI Theater: A Survey on Role-Playing with Language Models | Code | 3
OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Code | 3
PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition | Code | 3
Evaluating Large Language Models with fmeval | Code | 3
An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases | Code | 3
Fast Matrix Multiplications for Lookup Table-Quantized LLMs | Code | 3
Learning Dynamics of LLM Finetuning | Code | 3
Restoring Images in Adverse Weather Conditions via Histogram Transformer | Code | 3
Human-like Episodic Memory for Infinite Context LLMs | Code | 3
A Unified Anomaly Synthesis Strategy with Gradient Ascent for Industrial Anomaly Detection and Localization | Code | 3
LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models | Code | 3
Single-Image Shadow Removal Using Deep Learning: A Comprehensive Survey | Code | 3
A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights | Code | 3
Unifying 3D Representation and Control of Diverse Robots with a Single Camera | Code | 3
WildGaussians: 3D Gaussian Splatting in the Wild | Code | 3
Video Diffusion Alignment via Reward Gradients | Code | 3
OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion | Code | 3
Inference Performance Optimization for Large Language Models on CPUs | Code | 3
Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation | Code | 3
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Code | 3
BiGym: A Demo-Driven Mobile Bi-Manual Manipulation Benchmark | Code | 3
Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models | Code | 3
Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective | Code | 3
Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts | Code | 3
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore | Code | 3
Revisiting, Benchmarking and Understanding Unsupervised Graph Domain Adaptation | Code | 3
A Survey on LoRA of Large Language Models | Code | 3
WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks | Code | 3
Unified Approach for Hedging Impermanent Loss of Liquidity Provision | Code | 3
LoRA-GA: Low-Rank Adaptation with Gradient Approximation | Code | 3
LaRa: Efficient Large-Baseline Radiance Fields | Code | 3
CountGD: Multi-Modal Open-World Counting | Code | 3
Better by Default: Strong Pre-Tuned MLPs and Boosted Trees on Tabular Data | Code | 3
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation | Code | 3
Simplifying Deep Temporal Difference Learning | Code | 3
OneRestore: A Universal Restoration Framework for Composite Degradation | Code | 3
On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards | Code | 3
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models | Code | 3
Consistency Flow Matching: Defining Straight Flows with Velocity Consistency | Code | 3
What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models | Code | 3
TokenPacker: Efficient Visual Projector for Multimodal LLM | Code | 3