| CMoralEval: A Moral Evaluation Benchmark for Chinese Large Language Models | Aug 19, 2024 | Diversity, Language Modeling | Code Available | 1 |
| MSDiagnosis: A Benchmark for Evaluating Large Language Models in Multi-Step Clinical Diagnosis | Aug 19, 2024 | Diagnostic, Language Modeling | Unverified | 0 |
| FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant | Aug 19, 2024 | Descriptive, Face Swapping | Code Available | 1 |
| A Comparison of Large Language Model and Human Performance on Random Number Generation Tasks | Aug 19, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Crossing New Frontiers: Knowledge-Augmented Large Language Model Prompting for Zero-Shot Text-Based De Novo Molecule Design | Aug 18, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis | Aug 18, 2024 | Aspect-Based Sentiment Analysis, Aspect-Based Sentiment Analysis (ABSA) | Unverified | 0 |
| HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model | Aug 18, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Grammatical Error Feedback: An Implicit Evaluation Approach | Aug 18, 2024 | Grammatical Error Correction, Language Modeling | Unverified | 0 |
| Moonshine: Distilling Game Content Generators into Steerable Generative Models | Aug 18, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| FASST: Fast LLM-based Simultaneous Speech Translation | Aug 18, 2024 | Language Modeling, Language Modelling | Unverified | 0 |