| GRACE: A Granular Benchmark for Evaluating Model Calibration against Human Calibration | Feb 27, 2025 | Language Modeling | Unverified | 0 |
| Playing Pokémon Red via Deep Reinforcement Learning | Feb 27, 2025 | Deep Reinforcement Learning, Language Modeling | Code Available | 1 |
| KEDRec-LM: A Knowledge-distilled Explainable Drug Recommendation Large Language Model | Feb 27, 2025 | Drug Discovery, Knowledge Graphs | Unverified | 0 |
| From Retrieval to Generation: Comparing Different Approaches | Feb 27, 2025 | Language Modeling | Unverified | 0 |
| M-LLM Based Video Frame Selection for Efficient Video Understanding | Feb 27, 2025 | EgoSchema, Language Modeling | Unverified | 0 |
| Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training | Feb 27, 2025 | Language Modeling | Unverified | 0 |
| Do Sparse Autoencoders Generalize? A Case Study of Answerability | Feb 27, 2025 | Language Modeling | Unverified | 0 |
| Collaborative Stance Detection via Small-Large Language Model Consistency Verification | Feb 27, 2025 | Language Modeling | Code Available | 0 |
| ChatMol: A Versatile Molecule Designer Based on the Numerically Enhanced Large Language Model | Feb 27, 2025 | Bayesian Optimization, Drug Discovery | Unverified | 0 |
| SeisMoLLM: Advancing Seismic Monitoring via Cross-modal Transfer with Pre-trained Large Language Model | Feb 27, 2025 | Language Modeling | Code Available | 1 |
| Conformal Linguistic Calibration: Trading-off between Factuality and Specificity | Feb 26, 2025 | Language Modeling | Unverified | 0 |
| I Know What I Don't Know: Improving Model Cascades Through Confidence Tuning | Feb 26, 2025 | Decoder, Image Classification | Unverified | 0 |
| TestNUC: Enhancing Test-Time Computing Approaches through Neighboring Unlabeled Data Consistency | Feb 26, 2025 | Intent Classification | Code Available | 0 |
| Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision | Feb 26, 2025 | Audio Synthesis, Automatic Speech Recognition | Unverified | 0 |
| Pathology Report Generation and Multimodal Representation Learning for Cutaneous Melanocytic Lesions | Feb 26, 2025 | Cross-Modal Retrieval, Language Modeling | Unverified | 0 |
| On the Importance of Text Preprocessing for Multimodal Representation Learning and Pathology Report Generation | Feb 26, 2025 | Cross-Modal Retrieval, Hallucination | Unverified | 0 |
| AgentSociety Challenge: Designing LLM Agents for User Modeling and Recommendation on Web Platforms | Feb 26, 2025 | Language Modeling | Code Available | 2 |
| ANPMI: Assessing the True Comprehension Capabilities of LLMs for Multiple Choice Questions | Feb 26, 2025 | Language Modeling | Unverified | 0 |
| A City of Millions: Mapping Literary Social Networks At Scale | Feb 26, 2025 | Language Modeling | Code Available | 0 |
| The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training | Feb 26, 2025 | Language Modeling | Unverified | 0 |
| Improving Representation Learning of Complex Critical Care Data with ICU-BERT | Feb 26, 2025 | Feature Engineering, Language Modeling | Unverified | 0 |
| Kanana: Compute-efficient Bilingual Language Models | Feb 26, 2025 | Language Modeling | Unverified | 0 |
| Revealing Treatment Non-Adherence Bias in Clinical Machine Learning Using Large Language Models | Feb 26, 2025 | Causal Inference, Language Modeling | Unverified | 0 |
| Evaluating Gender Bias in German Machine Translation | Feb 26, 2025 | Language Modeling | Code Available | 0 |
| Faster, Cheaper, Better: Multi-Objective Hyperparameter Optimization for LLM and RAG Systems | Feb 25, 2025 | Bayesian Optimization, Hyperparameter Optimization | Unverified | 0 |