SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 6601-6650 of 661,570 papers

Title | Status | Hype
SegVol: Universal and Interactive Volumetric Medical Image Segmentation | Code | 2
FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design | Code | 2
OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving | Code | 2
Adapter is All You Need for Tuning Visual Tasks | Code | 2
Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras | Code | 2
H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models | Code | 2
Achieving Cross Modal Generalization with Multimodal Unified Representation | Code | 2
M^4: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models | Code | 2
Language Models can Solve Computer Tasks | Code | 2
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving | Code | 2
Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation | Code | 2
Spike-driven Transformer | Code | 2
StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter | Code | 2
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Code | 2
Aligning and Prompting Everything All at Once for Universal Visual Perception | Code | 2
Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation | Code | 2
GauHuman: Articulated Gaussian Splatting from Monocular Human Videos | Code | 2
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving | Code | 2
Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models | Code | 2
Pgx: Hardware-Accelerated Parallel Game Simulators for Reinforcement Learning | Code | 2
AnimateZero: Video Diffusion Models are Zero-Shot Image Animators | Code | 2
Mind2Web: Towards a Generalist Agent for the Web | Code | 2
ClimateLearn: Benchmarking Machine Learning for Weather and Climate Modeling | Code | 2
When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment | Code | 2
UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models | Code | 2
UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models | Code | 2
Multi-Branch Auxiliary Fusion YOLO with Re-parameterization Heterogeneous Convolutional for accurate object detection | Code | 2
LMDrive: Closed-Loop End-to-End Driving with Large Language Models | Code | 2
CLIP in Medical Imaging: A Survey | Code | 2
Steering Llama 2 via Contrastive Activation Addition | Code | 2
PUG: Photorealistic and Semantically Controllable Synthetic Data for Representation Learning | Code | 2
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis | Code | 2
XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX | Code | 2
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models | Code | 2
Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference | Code | 2
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling | Code | 2
OpenRL: A Unified Reinforcement Learning Framework | Code | 2
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios | Code | 2
Improving the Stability and Efficiency of Diffusion Models for Content Consistent Super-Resolution | Code | 2
Graph Neural Networks for Tabular Data Learning: A Survey with Taxonomy and Directions | Code | 2
Grimoire is All You Need for Enhancing Large Language Models | Code | 2
ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video | Code | 2
PhilEO Bench: Evaluating Geo-Spatial Foundation Models | Code | 2
RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale | Code | 2
Deep Covariance Alignment for Domain Adaptive Remote Sensing Image Segmentation | Code | 2
EmoLLMs: A Series of Emotional Large Language Models and Annotation Tools for Comprehensive Affective Analysis | Code | 2
Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive | Code | 2
DiffMoog: a Differentiable Modular Synthesizer for Sound Matching | Code | 2
A Survey on Learning from Graphs with Heterophily: Recent Advances and Future Directions | Code | 2
Denoising Diffusion Probabilistic Models | Code | 2