SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 7051–7100 of 661,570 papers

Title | Status | Hype
Improving Causal Reasoning in Large Language Models: A Survey | Code | 2
PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction | Code | 2
Literature Meets Data: A Synergistic Approach to Hypothesis Generation | Code | 2
PAPILLON: Privacy Preservation from Internet-based and Local Language Model Ensembles | Code | 2
xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories | Code | 2
DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Code | 2
Diffusion Transformer Policy | Code | 2
Mitigating Object Hallucination via Concentric Causal Attention | Code | 2
A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models | Code | 2
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution | Code | 2
Multi-IF: Benchmarking LLMs on Multi-Turn and Multilingual Instructions Following | Code | 2
Integrating Reinforcement Learning with Foundation Models for Autonomous Robotics: Methods and Perspectives | Code | 2
Pantograph: A Machine-to-Machine Interaction Interface for Advanced Theorem Proving, High Level Reasoning, and Data Extraction in Lean 4 | Code | 2
RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style | Code | 2
3D-GANTex: 3D Face Reconstruction with StyleGAN3-based Multi-View Images and 3DDFA based Mesh Generation | Code | 2
Analysing the Residual Stream of Language Models Under Knowledge Conflicts | Code | 2
Improve Vision Language Model Chain-of-thought Reasoning | Code | 2
Compute-Constrained Data Selection | Code | 2
CamI2V: Camera-Controlled Image-to-Video Diffusion Model | Code | 2
LLaVA-KD: A Framework of Distilling Multimodal Large Language Models | Code | 2
Reducing Hallucinations in Vision-Language Models via Latent Space Steering | Code | 2
Beyond Browsing: API-Based Web Agents | Code | 2
TIPS: Text-Image Pretraining with Spatial Awareness | Code | 2
RANSAC Back to SOTA: A Two-stage Consensus Filtering for Real-time 3D Registration | Code | 2
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering | Code | 2
LoRA-IR: Taming Low-Rank Experts for Efficient All-in-One Image Restoration | Code | 2
SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation | Code | 2
IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning | Code | 2
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | Code | 2
A Multimodal Vision Foundation Model for Clinical Dermatology | Code | 2
DM-Codec: Distilling Multimodal Representations for Speech Tokenization | Code | 2
SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning | Code | 2
Spatial-Mamba: Effective Visual State Space Models via Structure-Aware State Fusion | Code | 2
Dynamic Factor Allocation Leveraging Regime-Switching Signals | Code | 2
Comparing Differentiable and Dynamic Ray Tracing: Introducing the Multipath Lifetime Map | Code | 2
Combining Hough Transform and Deep Learning Approaches to Reconstruct ECG Signals From Printouts | Code | 2
Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning | Code | 2
HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation | Code | 2
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities | Code | 2
AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios | Code | 2
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning | Code | 2
A Systematic Study of Cross-Layer KV Sharing for Efficient LLM Inference | Code | 2
CybORG++: An Enhanced Gym for the Development of Autonomous Cyber Agents | Code | 2
How to Evaluate Reward Models for RLHF | Code | 2
REEF: Representation Encoding Fingerprints for Large Language Models | Code | 2
Artificial Kuramoto Oscillatory Neurons | Code | 2
On the Role of Attention Heads in Large Language Model Safety | Code | 2
SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction | Code | 2
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs | Code | 2
UniDrive: Towards Universal Driving Perception Across Camera Configurations | Code | 2
Page 142 of 13,232