| Benchmarking and Improving Text-to-SQL Generation under Ambiguity | Oct 20, 2023 | Benchmarking, Diversity | Code Available | 0 |
| Is this sentence valid? An Arabic Dataset for Commonsense Validation | Aug 25, 2020 | Natural Language Understanding, Sentence | Code Available | 0 |
| Skeptical binary inferences in multi-label problems with sets of probabilities | May 2, 2022 | valid | Code Available | 0 |
| Are Red Roses Red? Evaluating Consistency of Question-Answering Models | Jul 1, 2019 | Question Answering, valid | Code Available | 0 |
| Adversarial Examples as an Input-Fault Tolerance Problem | Nov 30, 2018 | valid | Code Available | 0 |
| SLIDE: a surrogate fairness constraint to ensure fairness consistency | Feb 7, 2022 | Fairness, valid | Code Available | 0 |
| Temporal Knowledge Base Completion: New Algorithms and Evaluation Protocols | May 2, 2020 | Knowledge Base Completion, Knowledge Graph Completion | Code Available | 0 |
| Behavioral Augmentation of UML Class Diagrams: An Empirical Study of Large Language Models for Method Generation | Jun 1, 2025 | Model Selection, Prompt Engineering | Code Available | 0 |
| An Empirical Analysis of how Internet Access Influences Public Opinion towards Undocumented Immigrants and Unaccompanied Children | Sep 28, 2021 | Time Series Analysis, valid | Code Available | 0 |
| Universal Knowledge Graph Embeddings | Oct 23, 2023 | Entity Disambiguation, Graph Embedding | Code Available | 0 |