| Chronocept: Instilling a Sense of Time in Machines | May 12, 2025 | Fact Checking, RAG | Code Available | 1 |
| GEOM-Drugs Revisited: Toward More Chemically Accurate Benchmarks for 3D Molecule Generation | Apr 30, 2025 | 3D Molecule Generation, Benchmarking | Code Available | 1 |
| Classification under Nuisance Parameters and Generalized Label Shift in Likelihood-Free Inference | Feb 8, 2024 | Domain Adaptation, Uncertainty Quantification | Code Available | 1 |
| CoCoA-MT: A Dataset and Benchmark for Contrastive Controlled MT with Application to Formality | May 9, 2022 | Machine Translation, Sentence | Code Available | 1 |
| A Closer Look at Invalid Action Masking in Policy Gradient Algorithms | Jun 25, 2020 | Deep Reinforcement Learning, Real-Time Strategy Games | Code Available | 1 |
| Characterizing information loss in a chaotic double pendulum with the Information Bottleneck | Oct 25, 2022 | valid | Code Available | 1 |
| ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models | Jun 7, 2023 | valid | Code Available | 1 |
| A Chinese Multi-label Affective Computing Dataset Based on Social Media Network Users | Nov 13, 2024 | Marketing, valid | Code Available | 1 |
| Certified Deductive Reasoning with Language Models | Jun 6, 2023 | Logical Reasoning, valid | Code Available | 1 |
| ChatGPT-powered Conversational Drug Editing Using Retrieval and Domain Feedback | May 29, 2023 | Decision Making, Drug Discovery | Code Available | 1 |