SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 6301–6350 of 177340 papers

Title | Status | Hype
SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models | Code | 2
QFFT, Question-Free Fine-Tuning for Adaptive Reasoning | Code | 2
NeRFusion: Fusing Radiance Fields for Large-Scale Scene Reconstruction | Code | 2
CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching | Code | 2
Pantograph: A Machine-to-Machine Interaction Interface for Advanced Theorem Proving, High Level Reasoning, and Data Extraction in Lean 4 | Code | 2
FluidLab: A Differentiable Environment for Benchmarking Complex Fluid Manipulation | Code | 2
BEFUnet: A Hybrid CNN-Transformer Architecture for Precise Medical Image Segmentation | Code | 2
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning | Code | 2
Towards Effective Multiple-in-One Image Restoration: A Sequential and Prompt Learning Strategy | Code | 2
Dense Optical Tracking: Connecting the Dots | Code | 2
Fast Inner-Product Algorithms and Architectures for Deep Neural Network Accelerators | Code | 2
YOLO-Pose: Enhancing YOLO for Multi Person Pose Estimation Using Object Keypoint Similarity Loss | Code | 2
Deep Learning for Camera Calibration and Beyond: A Survey | Code | 2
BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection | Code | 2
SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users | Code | 2
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation | Code | 2
Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes | Code | 2
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression | Code | 2
VRSBench: A Versatile Vision-Language Benchmark Dataset for Remote Sensing Image Understanding | Code | 2
Structured Attention Composition for Temporal Action Localization | Code | 2
DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature | Code | 2
ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation | Code | 2
Uni-Perceiver-MoE: Learning Sparse Generalist Models with Conditional MoEs | Code | 2
ImMesh: An Immediate LiDAR Localization and Meshing Framework | Code | 2
MidiCaps: A large-scale MIDI dataset with text captions | Code | 2
MACRec: a Multi-Agent Collaboration Framework for Recommendation | Code | 2
Content-Style Decoupling for Unsupervised Makeup Transfer without Generating Pseudo Ground Truth | Code | 2
Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding | Code | 2
Integrating Reinforcement Learning with Foundation Models for Autonomous Robotics: Methods and Perspectives | Code | 2
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models | Code | 2
Infinite Recommendation Networks: A Data-Centric Approach | Code | 2
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback | Code | 2
Efficient LLM Inference on CPUs | Code | 2
Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations | Code | 2
VMAS: A Vectorized Multi-Agent Simulator for Collective Robot Learning | Code | 2
Democratizing Contrastive Language-Image Pre-training: A CLIP Benchmark of Data, Model, and Supervision | Code | 2
Deep Learning Methods for Partial Differential Equations and Related Parameter Identification Problems | Code | 2
Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model | Code | 2
Arbitrary-Scale Point Cloud Upsampling by Voxel-Based Network with Latent Geometric-Consistent Learning | Code | 2
Critique-out-Loud Reward Models | Code | 2
Low-light Image Enhancement via CLIP-Fourier Guided Wavelet Diffusion | Code | 2
LLM-PBE: Assessing Data Privacy in Large Language Models | Code | 2
Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them | Code | 2
PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain | Code | 2
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis | Code | 2
MAT: Mask-Aware Transformer for Large Hole Image Inpainting | Code | 2
MANIQA: Multi-dimension Attention Network for No-Reference Image Quality Assessment | Code | 2
A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases | Code | 2
Rec-R1: Bridging Generative Large Language Models and User-Centric Recommendation Systems via Reinforcement Learning | Code | 2
MultiZoo & MultiBench: A Standardized Toolkit for Multimodal Deep Learning | Code | 2