SOTAVerified

The Open Verification Layer for ML Research

Community benchmark tracking and reproducibility verification. Built for researchers and autonomous research agents.

661,570 papers | 248,326 code links | 4,818 tasks

Papers

Showing 10851-10900 of 661570 papers

Title | Status | Hype
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents | Code | 2
Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture | Code | 2
Iterative Methods for Vecchia-Laplace Approximations for Latent Gaussian Process Models | Code | 2
LLMs as Hackers: Autonomous Linux Privilege Escalation Attacks | Code | 2
BitNet: Scaling 1-bit Transformers for Large Language Models | Code | 2
GenEval: An Object-Focused Framework for Evaluating Text-to-Image Alignment | Code | 2
Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting | Code | 2
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models | Code | 2
AdaLomo: Low-memory Optimization with Adaptive Learning Rate | Code | 2
LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation | Code | 2
IDRNet: Intervention-Driven Relation Network for Semantic Segmentation | Code | 2
FATE-LLM: A Industrial Grade Federated Learning Framework for Large Language Models | Code | 2
HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending | Code | 2
On Generative Agents in Recommendation | Code | 2
Character-LLM: A Trainable Agent for Role-Playing | Code | 2
Few-Shot Learning Patterns in Financial Time-Series for Trend-Following Strategies | Code | 2
The Calysto Scheme Project | Code | 2
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling | Code | 2
An Expression Tree Decoding Strategy for Mathematical Equation Generation | Code | 2
Hawkeye: A PyTorch-based Library for Fine-Grained Image Recognition with Deep Learning | Code | 2
A Setwise Approach for Effective and Highly Efficient Zero-shot Ranking with Large Language Models | Code | 2
From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models | Code | 2
ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models | Code | 2
X-Pose: Detecting Any Keypoints | Code | 2
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm | Code | 2
GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models | Code | 2
Jailbreaking Black Box Large Language Models in Twenty Queries | Code | 2
Learning to Act from Actionless Videos through Dense Correspondences | Code | 2
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving | Code | 2
DeltaSpace: A Semantic-aligned Feature Space for Flexible Text-guided Image Editing | Code | 2
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models | Code | 2
OmniControl: Control Any Joint at Any Time for Human Motion Generation | Code | 2
Im4D: High-Fidelity and Real-Time Novel View Synthesis for Dynamic Scenes | Code | 2
Octopus: Embodied Vision-Language Programmer from Environmental Feedback | Code | 2
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity | Code | 2
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models | Code | 2
ProbTS: Benchmarking Point and Distributional Forecasting across Diverse Prediction Horizons | Code | 2
VeCLIP: Improving CLIP Training via Visual-enriched Captions | Code | 2
Mini-DALLE3: Interactive Text to Image by Prompting Large Language Models | Code | 2
LLark: A Multimodal Instruction-Following Language Model for Music | Code | 2
Large Language Models Are Zero-Shot Time Series Forecasters | Code | 2
DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model | Code | 2
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition | Code | 2
Making Large Language Models Perform Better in Knowledge Graph Completion | Code | 2
TopoMLP: A Simple yet Strong Pipeline for Driving Topology Reasoning | Code | 2
A Semantic Invariant Robust Watermark for Large Language Models | Code | 2
Lemur: Harmonizing Natural Language and Code for Language Agents | Code | 2
Uni3D: Exploring Unified 3D Representation at Scale | Code | 2
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning | Code | 2
Conformal Prediction for Deep Classifier via Label Ranking | Code | 2