| Domain-Expanded ASTE: Rethinking Generalization in Aspect Sentiment Triplet Extraction | May 23, 2023 | Aspect-Based Sentiment Analysis, Aspect-Based Sentiment Analysis (ABSA) | Code Available | 0 |
| Benchmarking Machine Translation with Cultural Awareness | May 23, 2023 | Benchmarking, In-Context Learning | Code Available | 0 |
| Multilingual Large Language Models Are Not (Yet) Code-Switchers | May 23, 2023 | Benchmarking, Language Identification | Unverified | 0 |
| Robust Model-Based Optimization for Challenging Fitness Landscapes | May 23, 2023 | Benchmarking, model | Code Available | 0 |
| Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate | May 22, 2023 | Benchmarking, Math | Unverified | 0 |
| How Fragile is Relation Extraction under Entity Replacements? | May 22, 2023 | Benchmarking, Causal Inference | Code Available | 0 |
| A Benchmark on Extremely Weakly Supervised Text Classification: Reconcile Seed Matching and Prompting Approaches | May 22, 2023 | Benchmarking, Classification | Code Available | 0 |
| Value-at-Risk-Based Portfolio Insurance: Performance Evaluation and Benchmarking Against CPPI in a Markov-Modulated Regime-Switching Market | May 21, 2023 | Benchmarking, Financial Analysis | Unverified | 0 |
| Patterns of Convergence and Bound Constraint Violation in Differential Evolution on SBOX-COST Benchmarking Suite | May 20, 2023 | Benchmarking | Unverified | 0 |
| TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks | May 19, 2023 | Benchmarking | Unverified | 0 |