| Adaptive Reasoning and Acting in Medical Language Agents | Oct 13, 2024 | Decision Making, Diagnostic | Unverified | 0 |
| EasyJudge: an Easy-to-use Tool for Comprehensive Response Evaluation of LLMs | Oct 13, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Collu-Bench: A Benchmark for Predicting Language Model Hallucinations in Code | Oct 13, 2024 | Code Generation, Hallucination | Unverified | 0 |
| LoRE: Logit-Ranked Retriever Ensemble for Enhancing Open-Domain Question Answering | Oct 13, 2024 | Answer Generation, Language Modeling | Unverified | 0 |
| COrAL: Order-Agnostic Language Modeling for Efficient Iterative Refinement | Oct 12, 2024 | Code Generation, Computational Efficiency | Code Available | 0 |
| Impeding LLM-assisted Cheating in Introductory Programming Assignments via Adversarial Perturbation | Oct 12, 2024 | Code Generation, Language Modeling | Unverified | 0 |
| LINKED: Eliciting, Filtering and Integrating Knowledge in Large Language Model for Commonsense Reasoning | Oct 12, 2024 | Knowledge Graphs, Language Modeling | Code Available | 0 |
| LLMD: A Large Language Model for Interpreting Longitudinal Medical Records | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Enterprise Benchmarks for Large Language Model Evaluation | Oct 11, 2024 | Benchmarking, Language Model Evaluation | Code Available | 0 |
| nach0-pc: Multi-task Language Model with Molecular Point Cloud Encoder | Oct 11, 2024 | Drug Discovery, Language Modeling | Unverified | 0 |
| The Same But Different: Structural Similarities and Differences in Multilingual Language Modeling | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| ACER: Automatic Language Model Context Extension via Retrieval | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Language-Model-Assisted Bi-Level Programming for Reward Learning from Internet Videos | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization | Oct 11, 2024 | GSM8K, Language Modeling | Code Available | 2 |
| Can a large language model be a gaslighter? | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Calibrated Cache Model for Few-Shot Vision-Language Model Adaptation | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Efficiently Scanning and Resampling Spatio-Temporal Tasks with Irregular Observations | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both | Oct 11, 2024 | Knowledge Distillation, Language Modeling | Unverified | 0 |
| Aerial Vision-and-Language Navigation via Semantic-Topo-Metric Representation Guided LLM Reasoning | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Emergent social conventions and collective bias in LLM populations | Oct 11, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Parameter-Efficient Fine-Tuning of State Space Models | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| ∀uto∃∨∧L: Autonomous Evaluation of LLMs for Truth Maintenance and Reasoning Tasks | Oct 11, 2024 | Benchmarking, Language Modeling | Unverified | 0 |
| SimpleStrat: Diversifying Language Model Generation with Stratification | Oct 11, 2024 | Diversity, Language Modeling | Unverified | 0 |
| Generation with Dynamic Vocabulary | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 0 |
| Baichuan-Omni Technical Report | Oct 11, 2024 | Language Modeling, Language Modelling | Code Available | 3 |