| A statistical model for aggregating judgments by incorporating peer predictions | Mar 14, 2017 | counterfactualMultiple-choice | —Unverified | 0 | 0 |
| Advanced Financial Reasoning at Scale: A Comprehensive Evaluation of Large Language Models on CFA Level III | Jun 29, 2025 | Model SelectionMultiple-choice | —Unverified | 0 | 0 |
| Hypothesis Testing for Quantifying LLM-Human Misalignment in Multiple Choice Settings | Jun 17, 2025 | Decision MakingLanguage Modeling | —Unverified | 0 | 0 |
| Identification of mental fatigue in language comprehension tasks based on EEG and deep learning | Apr 14, 2021 | ClassificationEEG | —Unverified | 0 | 0 |
| Treatment Effects with Multidimensional Unobserved Heterogeneity: Identification of the Marginal Treatment Effect | Sep 23, 2022 | Multiple-choice | —Unverified | 0 | 0 |
| Identifying Multiple Personalities in Large Language Models with External Evaluation | Feb 22, 2024 | Multiple-choice | —Unverified | 0 | 0 |
| How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering? | Jun 19, 2025 | Multiple-choiceQuestion Answering | —Unverified | 0 | 0 |
| How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites | Apr 25, 2024 | 4kLanguage Modeling | —Unverified | 0 | 0 |
| IIE-NLP-Eyas at SemEval-2021 Task 4: Enhancing PLM for ReCAM with Special Tokens, Re-Ranking, Siamese Encoders and Back Translation | Feb 25, 2021 | Multiple-choiceQuestion Answering | —Unverified | 0 | 0 |
| Confidence-Aware Learning Assistant | Feb 15, 2021 | Multiple-choice | —Unverified | 0 | 0 |