| TestAgent: An Adaptive and Intelligent Expert for Human Assessment | Jun 3, 2025 | Large Language Model, Question Selection | Unverified | 0 |
| AutoJudger: An Agent-Driven Framework for Efficient Benchmarking of MLLMs | May 27, 2025 | Benchmarking, Question Selection | Code Available | 0 |
| QA-prompting: Improving Summarization with Large Language Models using Question-Answering | May 20, 2025 | In-Context Learning, Question Answering | Code Available | 0 |
| Decentralized Arena: Towards Democratic and Scalable Automatic Evaluation of Language Models | May 19, 2025 | Benchmarking, Chatbot | Code Available | 1 |
| Adaptive political surveys and GPT-4: Tackling the cold start problem with simulated user interactions | Mar 12, 2025 | Question Selection | Code Available | 0 |
| PSCon: Product Search Through Conversations | Feb 19, 2025 | Intent Detection, Keyword Extraction | Code Available | 0 |
| Active Task Disambiguation with LLMs | Feb 6, 2025 | Experimental Design, Question Selection | Code Available | 1 |
| Diffusion-Inspired Cold Start with Sufficient Prior in Computerized Adaptive Testing | Nov 19, 2024 | Question Selection | Code Available | 1 |
| An incremental preference elicitation-based approach to learning potentially non-monotonic preferences in multi-criteria sorting | Sep 4, 2024 | Active Learning, Question Selection | Unverified | 0 |
| Fast and Adaptive Questionnaires for Voting Advice Applications | Apr 2, 2024 | Decoder, Missing Values | Code Available | 0 |