SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers · 248,104 code links · 4,818 tasks

Papers

Showing 3001–3050 of 659,983 papers

Title | Status | Hype
A Comprehensive Survey of Small Language Models in the Era of Large Language Models: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness | Code | 3
Digitizing Touch with an Artificial Multimodal Fingertip | Code | 3
Degradation-Aware Residual-Conditioned Optimal Transport for Unified Image Restoration | Code | 3
FilterNet: Harnessing Frequency Filters for Time Series Forecasting | Code | 3
Rule Based Rewards for Language Model Safety | Code | 3
ZIM: Zero-Shot Image Matting for Anything | Code | 3
Face Anonymization Made Simple | Code | 3
GameGen-X: Interactive Open-world Game Video Generation | Code | 3
Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement | Code | 3
A Demonstration of Adaptive Collaboration of Large Language Models for Medical Decision-Making | Code | 3
SelfCodeAlign: Self-Alignment for Code Generation | Code | 3
XRDSLAM: A Flexible and Modular Framework for Deep Learning based SLAM | Code | 3
PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks | Code | 3
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents | Code | 3
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents | Code | 3
PF3plat: Pose-Free Feed-Forward 3D Gaussian Splatting | Code | 3
Data Generation for Hardware-Friendly Post-Training Quantization | Code | 3
Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework | Code | 3
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference | Code | 3
Modular Duality in Deep Learning | Code | 3
AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions | Code | 3
Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders | Code | 3
Centaur: a foundation model of human cognition | Code | 3
Improving Model Evaluation using SMART Filtering of Benchmark Datasets | Code | 3
OGBench: Benchmarking Offline Goal-Conditioned RL | Code | 3
Paint Bucket Colorization Using Anime Character Color Design Sheets | Code | 3
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Code | 3
ArxivDIGESTables: Synthesizing Scientific Literature into Tables using Language Models | Code | 3
Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances | Code | 3
A Joint Representation Using Continuous and Discrete Features for Cardiovascular Diseases Risk Prediction on Chest CT Scans | Code | 3
PDL: A Declarative Prompt Programming Language | Code | 3
Scaling up Masked Diffusion Models on Text | Code | 3
Large Spatial Model: End-to-end Unposed Images to Semantic 3D | Code | 3
3D-Adapter: Geometry-Consistent Multi-View Diffusion for High-Quality 3D Generation | Code | 3
SMITE: Segment Me In TimE | Code | 3
DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes | Code | 3
Scaling Diffusion Language Models via Adaptation from Autoregressive Models | Code | 3
LEADS: Lightweight Embedded Assisted Driving System | Code | 3
VoiceBench: Benchmarking LLM-Based Voice Assistants | Code | 3
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding | Code | 3
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss | Code | 3
MagicPIG: LSH Sampling for Efficient LLM Generation | Code | 3
Generalizing Motion Planners with Mixture of Experts for Autonomous Driving | Code | 3
Multi-Level Speaker Representation for Target Speaker Extraction | Code | 3
Pipeline Gradient-based Model Training on Analog In-memory Accelerators | Code | 3
A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends | Code | 3
Streaming Deep Reinforcement Learning Finally Works | Code | 3
DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation | Code | 3
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model | Code | 3
An Evolved Universal Transformer Memory | Code | 3
Page 61 of 13,200