SOTAVerified

Large Language Model

Papers

Showing 651-700 of 6097 papers

| Title | Status | Hype |
| --- | --- | --- |
| un^2CLIP: Improving CLIP's Visual Detail Capturing Ability via Inverting unCLIP | Code | 1 |
| VCapsBench: A Large-scale Fine-grained Benchmark for Video Caption Quality Evaluation | Code | 1 |
| SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents | Code | 1 |
| ChatCFD: an End-to-End CFD Agent with Domain-specific Structured Thinking | Code | 1 |
| REAL-Prover: Retrieval Augmented Lean Prover for Mathematical Reasoning | Code | 1 |
| GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution | Code | 1 |
| Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation | Code | 1 |
| CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models | Code | 1 |
| REARANK: Reasoning Re-ranking Agent via Reinforcement Learning | Code | 1 |
| Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion | Code | 1 |
| Unifying Multimodal Large Language Model Capabilities and Modalities via Model Merging | Code | 1 |
| NeuSym-RAG: Hybrid Neural Symbolic Retrieval with Multiview Structuring for PDF Question Answering | Code | 1 |
| Think or Not? Exploring Thinking Efficiency in Large Reasoning Models via an Information-Theoretic Lens | Code | 1 |
| UniTTS: An end-to-end TTS system without decoupling of acoustic and semantic information | Code | 1 |
| A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization | Code | 1 |
| ChemMLLM: Chemical Multimodal Large Language Model | Code | 1 |
| CRAKEN: Cybersecurity LLM Agent with Knowledge-Based Execution | Code | 1 |
| How Memory Management Impacts LLM Agents: An Empirical Study of Experience-Following Behavior | Code | 1 |
| PiFlow: Principle-aware Scientific Discovery with Multi-Agent Collaboration | Code | 1 |
| U-SAM: An audio language Model for Unified Speech, Audio, and Music Understanding | Code | 1 |
| BusterX: MLLM-Powered AI-Generated Video Forgery Detection and Explanation | Code | 1 |
| Tiny QA Benchmark++: Ultra-Lightweight, Synthetic Multilingual Dataset Generation & Smoke-Tests for Continuous LLM Evaluation | Code | 1 |
| Unifying Segment Anything in Microscopy with Multimodal Large Language Model | Code | 1 |
| ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts | Code | 1 |
| Measuring General Intelligence with Generated Games | Code | 1 |
| MELLM: Exploring LLM-Powered Micro-Expression Understanding Enhanced by Subtle Motion Perception | Code | 1 |
| CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory | Code | 1 |
| WirelessAgent: Large Language Model Agents for Intelligent Wireless Networks | Code | 1 |
| MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework | Code | 1 |
| UniBiomed: A Universal Foundation Model for Grounded Biomedical Image Interpretation | Code | 1 |
| PhenoAssistant: A Conversational Multi-Agent AI System for Automated Plant Phenotyping | Code | 1 |
| LEAM: A Prompt-only Large Language Model-enabled Antenna Modeling Method | Code | 1 |
| Expressing stigma and inappropriate responses prevents LLMs from safely replacing mental health providers | Code | 1 |
| Walk the Talk? Measuring the Faithfulness of Large Language Model Explanations | Code | 1 |
| Enhancing the Geometric Problem-Solving Ability of Multimodal LLMs via Symbolic-Neural Integration | Code | 1 |
| Retrieval-Augmented Generation with Conflicting Evidence | Code | 1 |
| SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding | Code | 1 |
| AnomalyR1: A GRPO-based End-to-end MLLM for Industrial Anomaly Detection | Code | 1 |
| HLS-Eval: A Benchmark and Framework for Evaluating LLMs on High-Level Synthesis Design Tasks | Code | 1 |
| Omni-Dish: Photorealistic and Faithful Image Generation and Editing for Arbitrary Chinese Dishes | Code | 1 |
| Fine-tuning a Large Language Model for Automating Computational Fluid Dynamics Simulations | Code | 1 |
| Model Utility Law: Evaluating LLMs beyond Performance through Mechanism Interpretable Metric | Code | 1 |
| Apt-Serve: Adaptive Request Scheduling on Hybrid Cache for Scalable LLM Inference Serving | Code | 1 |
| Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration | Code | 1 |
| Hessian of Perplexity for Large Language Models by PyTorch autograd (Open Source) | Code | 1 |
| Efficient Dynamic Clustering-Based Document Compression for Retrieval-Augmented-Generation | Code | 1 |
| Representation Bending for Large Language Model Safety | Code | 1 |
| Rethinking Key-Value Cache Compression Techniques for Large Language Model Serving | Code | 1 |
| Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages | Code | 1 |
| Imagine All The Relevance: Scenario-Profiled Indexing with Knowledge Expansion for Dense Retrieval | Code | 1 |

No leaderboard results yet.