SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

659,983 papers | 248,104 code links | 4,818 tasks

Papers

Showing 1351-1400 of 659,983 papers

Title | Status | Hype
DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement | Code | 4
Locally Typical Sampling | Code | 4
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment | Code | 4
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | Code | 4
EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction | Code | 4
AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset | Code | 4
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length | Code | 4
TIAViz: A Browser-based Visualization Tool for Computational Pathology Models | Code | 4
MegActor: Harness the Power of Raw Video for Vivid Portrait Animation | Code | 4
SLIM: Sparsified Late Interaction for Multi-Vector Retrieval with Inverted Indexes | Code | 4
Retrieval-Generation Synergy Augmented Large Language Models | Code | 4
OtterHD: A High-Resolution Multi-modality Model | Code | 4
Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning | Code | 4
Universal and Transferable Adversarial Attacks on Aligned Language Models | Code | 4
Generative Representational Instruction Tuning | Code | 4
Competition-Level Code Generation with AlphaCode | Code | 4
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model | Code | 4
Zero-1-to-3: Zero-shot One Image to 3D Object | Code | 4
Guiding a Diffusion Model with a Bad Version of Itself | Code | 4
ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding | Code | 4
FedML-HE: An Efficient Homomorphic-Encryption-Based Privacy-Preserving Federated Learning System | Code | 4
CrisperWhisper: Accurate Timestamps on Verbatim Speech Transcriptions | Code | 4
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token | Code | 4
Attention Mesh: High-fidelity Face Mesh Prediction in Real-time | Code | 4
LBM: Latent Bridge Matching for Fast Image-to-Image Translation | Code | 4
No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks | Code | 4
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis | Code | 4
Improving and generalizing flow-based generative models with minibatch optimal transport | Code | 4
DPFlow: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework | Code | 4
BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers | Code | 4
LLM-Enhanced Data Management | Code | 4
CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models | Code | 4
Sailor: Open Language Models for South-East Asia | Code | 4
Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation | Code | 4
Large Language Models for Data Annotation and Synthesis: A Survey | Code | 4
Vision GNN: An Image is Worth Graph of Nodes | Code | 4
PixelsDB: Serverless and NL-Aided Data Analytics with Flexible Service Levels and Prices | Code | 4
OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics | Code | 4
Convolutional Kolmogorov-Arnold Networks | Code | 4
FSID: Fully Synthetic Image Denoising via Procedural Scene Generation | Code | 4
StyleBooth: Image Style Editing with Multimodal Instruction | Code | 4
Inception-Based Crowd Counting -- Being Fast while Remaining Accurate | Code | 4
PFLlib: A Beginner-Friendly and Comprehensive Personalized Federated Learning Library and Benchmark | Code | 4
miniCTX: Neural Theorem Proving with (Long-)Contexts | Code | 4
How is ChatGPT's behavior changing over time? | Code | 4
Deep Patch Visual SLAM | Code | 4
Towards Automated Circuit Discovery for Mechanistic Interpretability | Code | 4
VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | Code | 4
TigerBot: An Open Multilingual Multitask LLM | Code | 4
PLAID: An Efficient Engine for Late Interaction Retrieval | Code | 4
Page 28 of 13,200