SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers · 248,326 code links · 4,818 tasks

Papers

Showing 8051–8100 of 661,570 papers

Title | Status | Hype
LLMs as Hackers: Autonomous Linux Privilege Escalation Attacks | Code | 2
ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area | Code | 2
Text2BIM: Generating Building Models Using a Large Language Model-based Multi-Agent Framework | Code | 2
FastCPH: Efficient Survival Analysis for Neural Networks | Code | 2
C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection | Code | 2
PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis | Code | 2
BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation | Code | 2
Scalable Autoregressive Image Generation with Mamba | Code | 2
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning | Code | 2
MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents | Code | 2
LLMs as Zero-shot Graph Learners: Alignment of GNN Representations with LLM Token Embeddings | Code | 2
Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation | Code | 2
Stochastic Parameter Decomposition | Code | 2
Enhancing Privacy in Federated Learning: Secure Aggregation for Real-World Healthcare Applications | Code | 2
Boosting Vision-Language Models for Histopathology Classification: Predict all at once | Code | 2
FunctionChat-Bench: Comprehensive Evaluation of Language Models' Generative Capabilities in Korean Tool-use Dialogs | Code | 2
Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression | Code | 2
Towards a Unified View of Preference Learning for Large Language Models: A Survey | Code | 2
UniDet3D: Multi-dataset Indoor 3D Object Detection | Code | 2
A Pair Programming Framework for Code Generation via Multi-Plan Exploration and Feedback-Driven Refinement | Code | 2
Assessing SPARQL capabilities of Large Language Models | Code | 2
DiffusionPen: Towards Controlling the Style of Handwritten Text Generation | Code | 2
ThermalGaussian: Thermal 3D Gaussian Splatting | Code | 2
What is the Relationship between Tensor Factorizations and Circuits (and How Can We Exploit it)? | Code | 2
Recent Trends of Multimodal Affective Computing: A Survey from NLP Perspective | Code | 2
EZIGen: Enhancing zero-shot personalized image generation with precise subject encoding and decoupled guidance | Code | 2
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis | Code | 2
Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models | Code | 2
Large Language Models are Strong Audio-Visual Speech Recognition Learners | Code | 2
HSIGene: A Foundation Model For Hyperspectral Image Generation | Code | 2
Small Language Models: Survey, Measurements, and Insights | Code | 2
Archon: An Architecture Search Framework for Inference-Time Techniques | Code | 2
Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks | Code | 2
PointSAM: Pointly-Supervised Segment Anything Model for Remote Sensing Images | Code | 2
LTNtorch: PyTorch Implementation of Logic Tensor Networks | Code | 2
Occupancy-Based Dual Contouring | Code | 2
Revisiting the Solution of Meta KDD Cup 2024: CRAG | Code | 2
Source-Free Domain Adaptation for YOLO Object Detection | Code | 2
Game4Loc: A UAV Geo-Localization Benchmark from Game Data | Code | 2
E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding | Code | 2
Resolving Multi-Condition Confusion for Finetuning-Free Personalized Image Generation | Code | 2
Rethinking the Power of Timestamps for Robust Time Series Forecasting: A Global-Local Fusion Perspective | Code | 2
Melody-Guided Music Generation | Code | 2
Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration | Code | 2
GSPR: Multimodal Place Recognition Using 3D Gaussian Splatting for Autonomous Driving | Code | 2
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control | Code | 2
WAFT: Warping-Alone Field Transforms for Optical Flow | Code | 2
Selective Aggregation for Low-Rank Adaptation in Federated Learning | Code | 2
StickyLand: Breaking the Linear Presentation of Computational Notebooks | Code | 2
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models | Code | 2