SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

474,278 papers | 248,326 code links | 4,818 tasks

Papers

Showing 17601–17650 of 474278 papers

Title | Status | Hype
Co-MTP: A Cooperative Trajectory Prediction Framework with Multi-Temporal Fusion for Autonomous Driving | Code | 1
Cross-domain Few-shot Object Detection with Multi-modal Textual Enrichment | Code | 1
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models | Code | 1
BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking | Code | 1
MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra | Code | 1
CipherFace: A Fully Homomorphic Encryption-Driven Framework for Secure Cloud-Based Facial Recognition | Code | 1
TimePFN: Effective Multivariate Time Series Forecasting with Synthetic Data | Code | 1
Linear Attention for Efficient Bidirectional Sequence Modeling | Code | 1
Mapping 1,000+ Language Models via the Log-Likelihood Vector | Code | 1
Understanding the Emergence of Multimodal Representation Alignment | Code | 1
Int2Int: a framework for mathematics with transformers | Code | 1
Weakly Supervised Video Scene Graph Generation via Natural Language Supervision | Code | 1
CondiQuant: Condition Number Based Low-Bit Quantization for Image Super-Resolution | Code | 1
TurboFuzzLLM: Turbocharging Mutation-based Fuzzing for Effectively Jailbreaking Large Language Models in Practice | Code | 1
CoT-ICL Lab: A Petri Dish for Studying Chain-of-Thought Learning from In-Context Demonstrations | Code | 1
Lung-DDPM: Semantic Layout-guided Diffusion Models for Thoracic CT Image Synthesis | Code | 1
KVLink: Accelerating Large Language Models via Efficient KV Cache Reuse | Code | 1
Scaling Sparse and Dense Retrieval in Decoder-Only LLMs | Code | 1
M3-AGIQA: Multimodal, Multi-Round, Multi-Aspect AI-Generated Image Quality Assessment | Code | 1
SiMHand: Mining Similar Hands for Large-Scale 3D Hand Pose Pre-training | Code | 1
Forgotten Polygons: Multimodal Large Language Models are Shape-Blind | Code | 1
Probe Pruning: Accelerating LLMs through Dynamic Pruning via Model-Probing | Code | 1
Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing | Code | 1
Fed-SB: A Silver Bullet for Extreme Communication Efficiency and Performance in (Private) Federated LoRA Fine-Tuning | Code | 1
Adversarial Prompt Evaluation: Systematic Benchmarking of Guardrails Against Prompt Input Attacks on LLMs | Code | 1
R-LoRA: Random Initialization of Multi-Head LoRA for Multi-Task Learning | Code | 1
TabMixer: advancing tabular data analysis with an enhanced MLP-mixer approach | Code | 1
Scale-Free Graph-Language Models | Code | 1
Leader-Follower Formation Tracking Control of Quadrotor UAVs Using Bearing Measurements | Code | 1
Self-Taught Agentic Long Context Understanding | Code | 1
ARS: Automatic Routing Solver with Large Language Models | Code | 1
Modality-Aware Neuron Pruning for Unlearning in Multimodal Large Language Models | Code | 1
FormalSpecCpp: A Dataset of C++ Formal Specifications created using LLMs | Code | 1
Monocular Depth Estimation and Segmentation for Transparent Object with Iterative Semantic and Geometric Fusion | Code | 1
LabTOP: A Unified Model for Lab Test Outcome Prediction on Electronic Health Records | Code | 1
How Far are LLMs from Being Our Digital Twins? A Benchmark for Persona-Based Behavior Chain Simulation | Code | 1
Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design | Code | 1
A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models | Code | 1
Plan-over-Graph: Towards Parallelable LLM Agent Schedule | Code | 1
EAGER-LLM: Enhancing Large Language Models as Recommenders through Exogenous Behavior-Semantic Integration | Code | 1
Adaptive Convolution for CNN-based Speech Enhancement Models | Code | 1
PEARL: Towards Permutation-Resilient LLMs | Code | 1
Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts | Code | 1
Towards Routing and Edge Computing in Satellite-Terrestrial Networks: A Column Generation Approach | Code | 1
InductionBench: LLMs Fail in the Simplest Complexity Class | Code | 1
FacaDiffy: Inpainting Unseen Facade Parts Using Diffusion Models | Code | 1
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information | Code | 1
STeCa: Step-level Trajectory Calibration for LLM Agent Learning | Code | 1
Generating π-Functional Molecules Using STGG+ with Active Learning | Code | 1
Dynamic Low-Rank Sparse Adaptation for Large Language Models | Code | 1