| Dissecting Human and LLM Preferences | Feb 17, 2024 | Language Modelling, Large Language Model | Code Available | 1 | 5 |
| Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections | Nov 17, 2023 | Language Modelling, Large Language Model | Code Available | 1 | 5 |
| CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models | Aug 19, 2024 | Diversity, Language Modeling | Code Available | 1 | 5 |
| ClusterLLM: Large Language Models as a Guide for Text Clustering | May 24, 2023 | Clustering, Language Modelling | Code Available | 1 | 5 |
| Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes | Oct 22, 2024 | GSM8K, Language Modeling | Code Available | 1 | 5 |
| CoLLM: A Large Language Model for Composed Image Retrieval | Mar 25, 2025 | Image Retrieval, Language Modeling | Code Available | 1 | 5 |
| CoLLMLight: Cooperative Large Language Model Agents for Network-Wide Traffic Signal Control | Mar 14, 2025 | Computational Efficiency, Language Modeling | Code Available | 1 | 5 |
| Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences | Jan 19, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| CloudEval-YAML: A Practical Benchmark for Cloud Configuration Generation | Nov 10, 2023 | Benchmarking, Cloud Computing | Code Available | 1 | 5 |
| Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources | Sep 18, 2024 | GPU, Language Modeling | Code Available | 1 | 5 |
| MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration | Nov 14, 2023 | Benchmarking, Language Modeling | Code Available | 1 | 5 |
| DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model | Mar 31, 2024 | Diversity, Language Modeling | Code Available | 1 | 5 |
| MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models | Feb 2, 2024 | Language Modelling, Large Language Model | Code Available | 1 | 5 |
| MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL | Jun 18, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| C-LLM: Learn to Check Chinese Spelling Errors Character by Character | Jun 24, 2024 | Chinese Spell Checking, Language Modeling | Code Available | 1 | 5 |
| M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis | Feb 17, 2025 | Aspect-Based Sentiment Analysis, Aspect-Based Sentiment Analysis (ABSA) | Code Available | 1 | 5 |
| Common Sense Enhanced Knowledge-based Recommendation with Large Language Model | Mar 27, 2024 | Common Sense Reasoning, Knowledge Graphs | Code Available | 1 | 5 |
| Empowering Large Language Model for Continual Video Question Answering with Collaborative Prompting | Oct 1, 2024 | Continual Learning, Language Modeling | Code Available | 1 | 5 |
| Empowering Many, Biasing a Few: Generalist Credit Scoring through Large Language Models | Oct 1, 2023 | Decision Making, Language Modelling | Code Available | 1 | 5 |
| Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach | May 30, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Can Large Language Models Understand Molecules? | Jan 5, 2024 | Drug Discovery, Language Modelling | Code Available | 1 | 5 |
| M^3GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation | May 25, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery | Jan 20, 2025 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| DesCo: Learning Object Recognition with Rich Language Descriptions | Jun 24, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| LUMA: A Benchmark Dataset for Learning from Uncertain and Multimodal Data | Jun 14, 2024 | Benchmarking, Decision Making | Code Available | 1 | 5 |