SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 4,151-4,200 of 661,570 papers

Title | Status | Hype
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Code | 3
OctoPack: Instruction Tuning Code Large Language Models | Code | 3
Learning Smooth Humanoid Locomotion through Lipschitz-Constrained Policies | Code | 3
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape | Code | 3
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts | Code | 3
On the use of deep learning for phase recovery | Code | 3
Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models | Code | 3
NVS-Solver: Video Diffusion Model as Zero-Shot Novel View Synthesizer | Code | 3
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model | Code | 3
MAPIE: an open-source library for distribution-free uncertainty quantification | Code | 3
PhysX: Physical-Grounded 3D Asset Generation | Code | 3
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation | Code | 3
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale | Code | 3
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages | Code | 3
DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes | Code | 3
Automatic Intrinsic Reward Shaping for Exploration in Deep Reinforcement Learning | Code | 3
LLM4CP: Adapting Large Language Models for Channel Prediction | Code | 3
Universal Actions for Enhanced Embodied Foundation Models | Code | 3
ChatRex: Taming Multimodal LLM for Joint Perception and Understanding | Code | 3
DROID-Splat: Combining end-to-end SLAM with 3D Gaussian Splatting | Code | 3
OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection | Code | 3
Relaxing Accurate Initialization Constraint for 3D Gaussian Splatting | Code | 3
MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition | Code | 3
DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders | Code | 3
PCDCNet: A Surrogate Model for Air Quality Forecasting with Physical-Chemical Dynamics and Constraints | Code | 3
MACE: Mass Concept Erasure in Diffusion Models | Code | 3
MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction Prediction via Microenvironment-Aware Protein Embedding | Code | 3
TopoTune : A Framework for Generalized Combinatorial Complex Neural Networks | Code | 3
FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations | Code | 3
DoWhy: An End-to-End Library for Causal Inference | Code | 3
Relative Pose Estimation through Affine Corrections of Monocular Depth Priors | Code | 3
DistiLLM: Towards Streamlined Distillation for Large Language Models | Code | 3
TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos | Code | 3
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization | Code | 3
Music2Latent: Consistency Autoencoders for Latent Audio Compression | Code | 3
Advanced Video Inpainting Using Optical Flow-Guided Efficient Diffusion | Code | 3
MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection | Code | 3
A Survey on the Memory Mechanism of Large Language Model based Agents | Code | 3
ACEGEN: Reinforcement learning of generative chemical agents for drug discovery | Code | 3
Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning | Code | 3
RiNALMo: General-Purpose RNA Language Models Can Generalize Well on Structure Prediction Tasks | Code | 3
Embodied Understanding of Driving Scenarios | Code | 3
Personalized Image Generation with Deep Generative Models: A Decade Survey | Code | 3
R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO | Code | 3
Datasheet for the Pile | Code | 3
UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition | Code | 3
imitation: Clean Imitation Learning Implementations | Code | 3
Efficient Video Action Detection with Token Dropout and Context Refinement | Code | 3
Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization | Code | 3
LLM-Pruner: On the Structural Pruning of Large Language Models | Code | 3