SOTAVerified

Hallucination

Papers

Showing 201-225 of 1816 papers

Title | Status | Hype
GraphArena: Benchmarking Large Language Models on Graph Computational Problems | Code | 1
Grounded Chain-of-Thought for Multimodal Large Language Models | Code | 1
Phare: A Safety Probe for Large Language Models | Code | 1
GeoBenchX: Benchmarking LLMs for Multistep Geospatial Tasks | Code | 1
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity | Code | 1
BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models | Code | 1
Balanced Classification: A Unified Framework for Long-Tailed Object Detection | Code | 1
Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations | Code | 1
Benchmarking LLM Faithfulness in RAG with Evolving Leaderboards | Code | 1
BachGAN: High-Resolution Image Synthesis from Salient Object Layout | Code | 1
Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations | Code | 1
Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations | Code | 1
Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model | Code | 1
Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization | Code | 1
FlySearch: Exploring how vision-language models explore | Code | 1
Can LLMs be Good Graph Judge for Knowledge Graph Construction? | Code | 1
PAINT: Paying Attention to INformed Tokens to Mitigate Hallucination in Large Vision-Language Model | Code | 1
Generating Natural Language Proofs with Verifier-Guided Search | Code | 1
An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models | Code | 1
Harnessing GPT-4V(ision) for Insurance: A Preliminary Exploration | Code | 1
BIGPrior: Towards Decoupling Learned Prior Hallucination and Data Fidelity in Image Restoration | Code | 1
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality | Code | 1
Federated Recommendation via Hybrid Retrieval Augmented Generation | Code | 1
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning | Code | 1
AdaPlanner: Adaptive Planning from Feedback with Language Models | Code | 1
Page 9 of 73

No leaderboard results yet.