SOTAVerified

World Knowledge

Papers

Showing 226-250 of 818 papers

Title | Status | Hype
Knowledge Acquisition Disentanglement for Knowledge-based Visual Question Answering with Large Language Models | Code | 0
Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data | - | 0
LoFTI: Localization and Factuality Transfer to Indian Locales | Code | 0
VISA: Reasoning Video Object Segmentation via Large Language Models | Code | 3
Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities | Code | 1
VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving | - | 0
Language Representations Can be What Recommenders Need: Findings and Potentials | Code | 2
BAPO: Base-Anchored Preference Optimization for Overcoming Forgetting in Large Language Models Personalization | - | 0
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy | Code | 3
Scaling Synthetic Data Creation with 1,000,000,000 Personas | Code | 11
Mental Modeling of Reinforcement Learning Agents by Language Models | - | 0
LABOR-LLM: Language-Based Occupational Representations with Large Language Models | - | 0
Mitigating Hallucination in Fictional Character Role-Play | Code | 0
Exploring Factual Entailment with NLI: A News Media Study | - | 0
Evaluating the Ability of Large Language Models to Reason about Cardinal Directions | - | 0
On the Role of Long-tail Knowledge in Retrieval Augmented Large Language Models | - | 0
LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments | Code | 2
OCALM: Object-Centric Assessment with Language Models | - | 0
What Teaches Robots to Walk, Teaches Them to Trade too -- Regime Adaptive Execution using Informed Data and LLMs | - | 0
Locating and Extracting Relational Concepts in Large Language Models | Code | 0
WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia | - | 0
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning | Code | 0
Are Large Language Models True Healthcare Jacks-of-All-Trades? Benchmarking Across Health Professions Beyond Physician Exams | Code | 0
A Systematic Analysis of Large Language Models as Soft Reasoners: The Case of Syllogistic Inferences | Code | 0
RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models | Code | 2

No leaderboard results yet.