| Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data | Mar 20, 2024 | Memorization | Code Available | 2 |
| A Decade's Battle on Dataset Bias: Are We There Yet? | Mar 13, 2024 | Memorization | Code Available | 2 |
| SciAssess: Benchmarking LLM Proficiency in Scientific Literature Analysis | Mar 4, 2024 | Benchmarking, Drug Discovery | Code Available | 2 |
| Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration | Nov 10, 2023 | Inference Attack, Membership Inference Attack | Code Available | 2 |
| LawBench: Benchmarking Legal Knowledge of Large Language Models | Sep 28, 2023 | Articles, Benchmarking | Code Available | 2 |
| SimplyRetrieve: A Private and Lightweight Retrieval-Centric Generative AI Tool | Aug 8, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| Drive Like a Human: Rethinking Autonomous Driving with Large Language Models | Jul 14, 2023 | Autonomous Driving, Common Sense Reasoning | Code Available | 2 |
| Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models | Jun 7, 2023 | Diversity, Image Generation | Code Available | 2 |
| Causal Reasoning and Large Language Models: Opening a New Frontier for Causality | Apr 28, 2023 | Causal Discovery, Common Sense Reasoning | Code Available | 2 |
| DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation | Nov 18, 2022 | Code Generation, Memorization | Code Available | 2 |