| CityBench: Evaluating the Capabilities of Large Language Models for Urban Tasks | Jun 20, 2024 | General Knowledge, Human Dynamics | Code Available | 1 | 5 |
| Detecting Hallucinations in Large Language Model Generation: A Token Probability Approach | May 30, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation | Mar 21, 2024 | Data Augmentation, Decision Making | Code Available | 1 | 5 |
| Multi-Modal Classifiers for Open-Vocabulary Object Detection | Jun 8, 2023 | Language Modelling, Large Language Model | Code Available | 1 | 5 |
| CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing | Feb 4, 2025 | Collaborative Inference, Language Modeling | Code Available | 1 | 5 |
| Citekit: A Modular Toolkit for Large Language Model Citation Generation | Aug 6, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Explaining Relationships Between Scientific Documents | Feb 2, 2020 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| CIPHER: Cybersecurity Intelligent Penetration-testing Helper for Ethical Researcher | Aug 21, 2024 | Language Modelling, Large Language Model | Code Available | 1 | 5 |
| Aligning LLM Agents by Learning Latent Preference from User Edits | Apr 23, 2024 | Descriptive, Language Modelling | Code Available | 1 | 5 |
| Democratizing Reasoning Ability: Tailored Learning from Large Language Model | Oct 20, 2023 | Instruction Following, Language Modeling | Code Available | 1 | 5 |
| Controllable Dialogue Simulation with In-Context Learning | Oct 9, 2022 | Data Augmentation, In-Context Learning | Code Available | 1 | 5 |
| DesCo: Learning Object Recognition with Rich Language Descriptions | Jun 24, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning | Oct 11, 2024 | Data Poisoning, Language Modeling | Code Available | 1 | 5 |
| DefenderBench: A Toolkit for Evaluating Language Agents in Cybersecurity Environments | May 31, 2025 | Large Language Model | Code Available | 1 | 5 |
| A Comprehensive Evaluation of Contemporary ML-Based Solvers for Combinatorial Optimization | May 22, 2025 | Combinatorial Optimization, Language Modeling | Code Available | 1 | 5 |
| ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation | Dec 20, 2023 | Language Modelling, Large Language Model | Code Available | 1 | 5 |
| GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes | May 25, 2023 | Computed Tomography (CT), Image Generation | Code Available | 1 | 5 |
| Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search | May 24, 2024 | Code Generation, Language Modelling | Code Available | 1 | 5 |
| PRD: Peer Rank and Discussion Improve Large Language Model based Evaluations | Jul 6, 2023 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| MF-LLM: Simulating Population Decision Dynamics via a Mean-Field Large Language Model Framework | Apr 30, 2025 | Decision Making, Language Modeling | Code Available | 1 | 5 |
| Working Memory Capacity of ChatGPT: An Empirical Study | Apr 30, 2023 | Benchmarking, Language Modeling | Code Available | 1 | 5 |
| Adapting Vision-Language Foundation Model for Next Generation Medical Ultrasound Image Analysis | Jun 10, 2025 | Domain Adaptation, Large Language Model | Code Available | 1 | 5 |
| GIST: Generating Image-Specific Text for Fine-grained Object Classification | Jul 21, 2023 | Classification, Fine-Grained Image Classification | Code Available | 1 | 5 |
| Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detection | Dec 15, 2022 | Deep Learning, Graph Learning | Code Available | 1 | 5 |
| ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences | Nov 10, 2023 | Dialogue Generation, Language Modeling | Code Available | 1 | 5 |