SOTAVerified

World Knowledge

Papers

Showing 251-300 of 818 papers

Title | Status | Hype
Bravo MaRDI: A Wikibase Powered Knowledge Graph on Mathematics | Code | 0
DynaBench: A benchmark dataset for learning dynamical systems from low-resolution data | Code | 0
My Teacher Thinks The World Is Flat! Interpreting Automatic Essay Scoring Mechanism | Code | 0
Fact-or-Fair: A Checklist for Behavioral Testing of AI Models on Fairness-Related Queries | Code | 0
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge | Code | 0
NLITrans at SemEval-2018 Task 12: Transfer of Semantic Knowledge for Argument Comprehension | Code | 0
DREAM: A Challenge Dataset and Models for Dialogue-Based Reading Comprehension | Code | 0
BottleHumor: Self-Informed Humor Explanation using the Information Bottleneck Principle | Code | 0
Morph Call: Probing Morphosyntactic Content of Multilingual Transformers | Code | 0
DORA The Explorer: Directed Outreaching Reinforcement Action-Selection | Code | 0
Self-Bootstrapped Visual-Language Model for Knowledge Selection and Question Answering | Code | 0
Multi-Preference Lambda-weighted Listwise DPO for Dynamic Preference Alignment | Code | 0
QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs | Code | 0
An Empirical Study on Few-shot Knowledge Probing for Pretrained Language Models | Code | 0
A Study of Implicit Ranking Unfairness in Large Language Models | Code | 0
Does Commonsense help in detecting Sarcasm? | Code | 0
Mitigating Hallucination in Fictional Character Role-Play | Code | 0
Mitigating Temporal Misalignment by Discarding Outdated Facts | Code | 0
MiRANews: Dataset and Benchmarks for Multi-Resource-Assisted News Summarization | Code | 0
Anchoring Path for Inductive Relation Prediction in Knowledge Graphs | Code | 0
MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations | Code | 0
Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection | Code | 0
BiasKG: Adversarial Knowledge Graphs to Induce Bias in Large Language Models | Code | 0
Memory-Modular Classification: Learning to Generalize with Memory Replacement | Code | 0
Advancing and Benchmarking Personalized Tool Invocation for LLMs | Code | 0
Massively Multilingual Language Models for Cross Lingual Fact Extraction from Low Resource Indian Languages | Code | 0
Mechanistic Understanding and Mitigation of Language Model Non-Factual Hallucinations | Code | 0
Logic Attention Based Neighborhood Aggregation for Inductive Knowledge Graph Embedding | Code | 0
LoRec: Large Language Model for Robust Sequential Recommendation against Poisoning Attacks | Code | 0
Locating and Extracting Relational Concepts in Large Language Models | Code | 0
LoFTI: Localization and Factuality Transfer to Indian Locales | Code | 0
LLM-based Agent Simulation for Maternal Health Interventions: Uncertainty Estimation and Decision-focused Evaluation | Code | 0
Localizing Active Objects from Egocentric Vision with Symbolic World Knowledge | Code | 0
Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges | Code | 0
CoRTEx: Contrastive Learning for Representing Terms via Explanations with Applications on Constructing Biomedical Knowledge Graphs | Code | 0
Benchmarking Multi-Image Understanding in Vision and Language Models: Perception, Knowledge, Reasoning, and Multi-Hop Reasoning | Code | 0
LLM4CD: Leveraging Large Language Models for Open-World Knowledge Augmented Cognitive Diagnosis | Code | 0
LLaVA-VSD: Large Language-and-Vision Assistant for Visual Spatial Description | Code | 0
LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model | Code | 0
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty | Code | 0
Modeling Semantic Plausibility by Injecting World Knowledge | Code | 0
Language models show human-like content effects on reasoning tasks | Code | 0
Large Language Models Need Consultants for Reasoning: Becoming an Expert in a Complex Human System Through Behavior Simulation | Code | 0
Contextual Knowledge Pursuit for Faithful Visual Synthesis | Code | 0
Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data | Code | 0
Log Probabilities Are a Reliable Estimate of Semantic Plausibility in Base and Instruction-Tuned Language Models | Code | 0
Knowledge Graph Completion with Mixed Geometry Tensor Factorization | Code | 0
Knowledge Boundary and Persona Dynamic Shape A Better Social Media Agent | Code | 0
Knowledge Generation -- Variational Bayes on Knowledge Graphs | Code | 0
Language Model Behavior: A Comprehensive Survey | Code | 0
Page 6 of 17
