| EchoPrime: A Multi-Video View-Informed Vision-Language Model for Comprehensive Echocardiography Interpretation | Oct 13, 2024 | Contrastive Learning, Language Modeling | Unverified | 0 |
| Collu-Bench: A Benchmark for Predicting Language Model Hallucinations in Code | Oct 13, 2024 | Code Generation, Hallucination | Unverified | 0 |
| HARDMath: A Benchmark Dataset for Challenging Problems in Applied Mathematics | Oct 13, 2024 | Language Modeling | Code Available | 1 |
| LoRE: Logit-Ranked Retriever Ensemble for Enhancing Open-Domain Question Answering | Oct 13, 2024 | Answer Generation, Language Modeling | Unverified | 0 |
| Impeding LLM-assisted Cheating in Introductory Programming Assignments via Adversarial Perturbation | Oct 12, 2024 | Code Generation, Language Modeling | Unverified | 0 |
| COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement | Oct 12, 2024 | Code Generation, Computational Efficiency | Code Available | 0 |
| LINKED: Eliciting, Filtering and Integrating Knowledge in Large Language Model for Commonsense Reasoning | Oct 12, 2024 | Knowledge Graphs, Language Modeling | Code Available | 0 |
| Enterprise Benchmarks for Large Language Model Evaluation | Oct 11, 2024 | Benchmarking, Language Model Evaluation | Code Available | 0 |
| LLMD: A Large Language Model for Interpreting Longitudinal Medical Records | Oct 11, 2024 | Language Modeling | Unverified | 0 |
| nach0-pc: Multi-task Language Model with Molecular Point Cloud Encoder | Oct 11, 2024 | Drug Discovery, Language Modeling | Unverified | 0 |