SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,142 code links | 4,818 tasks

Papers

Showing 3901-3950 of 661,570 papers

Title | Status | Hype
LoRA+: Efficient Low Rank Adaptation of Large Models | Code | 3
ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models | Code | 3
3D Diffuser Actor: Policy Diffusion with 3D Scene Representations | Code | 3
EventRL: Enhancing Event Extraction with Outcome Supervision for Large Language Models | Code | 3
GenAD: Generative End-to-End Autonomous Driving | Code | 3
OneBit: Towards Extremely Low-bit Large Language Models | Code | 3
LLMDFA: Analyzing Dataflow in Code with Large Language Models | Code | 3
3D Diffuser Actor: Policy Diffusion with 3D Scene Representations | Code | 3
Discovering and exploring cases of educational source code plagiarism with Dolos | Code | 3
BitDelta: Your Fine-Tune May Only Be Worth One Bit | Code | 3
Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips | Code | 3
QuRating: Selecting High-Quality Data for Training Language Models | Code | 3
Data Engineering for Scaling Language Models to 128K Context | Code | 3
OptiMUS: Scalable Optimization Modeling with (MI)LP Solvers and Large Language Models | Code | 3
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning | Code | 3
GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering | Code | 3
Traj-LIO: A Resilient Multi-LiDAR Multi-IMU State Estimator Through Sparse Gaussian Process | Code | 3
Magic-Me: Identity-Specific Video Customized Diffusion | Code | 3
PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers | Code | 3
VerMCTS: Synthesizing Multi-Step Programs using a Verifier, a Large Language Model, and Tree Search | Code | 3
SPO: Sequential Monte Carlo Policy Optimisation | Code | 3
PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models | Code | 3
Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models | Code | 3
Scaling Laws for Fine-Grained Mixture of Experts | Code | 3
Q-Bench+: A Benchmark for Multi-modal Foundation Models on Low-level Vision from Single Images to Pairs | Code | 3
X-LoRA: Mixture of Low-Rank Adapter Experts, a Flexible Framework for Large Language Models with Applications in Protein Mechanics and Molecular Design | Code | 3
OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning | Code | 3
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models | Code | 3
ResumeFlow: An LLM-facilitated Pipeline for Personalized Resume Generation and Refinement | Code | 3
FNSPID: A Comprehensive Financial News Dataset in Time Series | Code | 3
ForestColl: Throughput-Optimal Collective Communications on Heterogeneous Network Fabrics | Code | 3
HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting | Code | 3
The boundary of neural network trainability is fractal | Code | 3
Noise Contrastive Alignment of Language Models with Explicit Rewards | Code | 3
Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey | Code | 3
Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents | Code | 3
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design | Code | 3
Anatomically-Controllable Medical Image Generation with Segmentation-Guided Diffusion Models | Code | 3
MEMORYLLM: Towards Self-Updatable Large Language Models | Code | 3
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory | Code | 3
Temporal Graph Analysis with TGX | Code | 3
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation | Code | 3
Does confidence calibration improve conformal prediction? | Code | 3
OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving | Code | 3
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations | Code | 3
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls | Code | 3
DistiLLM: Towards Streamlined Distillation for Large Language Models | Code | 3
The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry | Code | 3
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | Code | 3
Deep Learning for Multivariate Time Series Imputation: A Survey | Code | 3