| DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling | Mar 2, 2024 | Language Modelling, Large Language Model | Code Available | 1 | 5 |
| Do Large Language Model Benchmarks Test Reliability? | Feb 5, 2025 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning | Oct 10, 2024 | Language Modelling, Large Language Model | Code Available | 1 | 5 |
| AutoProteinEngine: A Large Language Model Driven Agent Framework for Multimodal AutoML in Protein Engineering | Nov 7, 2024 | AutoML, Hyperparameter Optimization | Code Available | 1 | 5 |
| DRAMA-X: A Fine-grained Intent Prediction and Risk Reasoning Benchmark For Driving | Jun 21, 2025 | Autonomous Driving, Descriptive | Code Available | 1 | 5 |
| LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation | Sep 23, 2024 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| GPTailor: Large Language Model Pruning Through Layer Cutting and Stitching | Jun 25, 2025 | Language Modeling, Language Modelling | Code Available | 1 | 5 |
| Enabling LLM Knowledge Analysis via Extensive Materialization | Nov 7, 2024 | Knowledge Base Construction, Large Language Model | Code Available | 1 | 5 |
| LLM experiments with simulation: Large Language Model Multi-Agent System for Simulation Model Parametrization in Digital Twins | May 28, 2024 | Decision Making, Language Modeling | Code Available | 1 | 5 |
| Distillation Matters: Empowering Sequential Recommenders to Match the Performance of Large Language Model | May 1, 2024 | Knowledge Distillation, Language Modeling | Code Available | 1 | 5 |