| Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks | Jan 11, 2025 | Code Generation; HumanEval | —Unverified | 0 |
| Large Language Models as Efficient Reward Function Searchers for Custom-Environment Multi-Objective Reinforcement Learning | Sep 4, 2024 | Long-Context Understanding; Multi-Objective Reinforcement Learning | —Unverified | 0 |
| XL^2Bench: A Benchmark for Extremely Long Context Understanding with Long-range Dependencies | Apr 8, 2024 | Long-Context Understanding; Reading Comprehension | —Unverified | 0 |
| Fine-Tuning Medical Language Models for Enhanced Long-Contextual Understanding and Domain Expertise | Jul 16, 2024 | Diagnostic; Long-Context Understanding | —Unverified | 0 |
| LIFT: Improving Long Context Understanding Through Long Input Fine-Tuning | Dec 18, 2024 | In-Context Learning; Long-Context Understanding | —Unverified | 0 |
| LIFT: Improving Long Context Understanding of Large Language Models through Long Input Fine-Tuning | Feb 20, 2025 | In-Context Learning; Long-Context Understanding | —Unverified | 0 |
| Facilitating Long Context Understanding via Supervised Chain-of-Thought Reasoning | Feb 18, 2025 | 2k; Long-Context Understanding | —Unverified | 0 |
| Enhancing Scientific Reproducibility Through Automated BioCompute Object Creation Using Retrieval-Augmented Generation from Publications | Sep 23, 2024 | Hallucination; Long-Context Understanding | —Unverified | 0 |
| ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities | Jul 19, 2024 | 4k; 8k | —Unverified | 0 |
| Towards Robust Evaluation of STEM Education: Leveraging MLLMs in Project-Based Learning | May 16, 2025 | Hallucination; Information Retrieval | —Unverified | 0 |
| Can LLMs Maintain Fundamental Abilities under KV Cache Compression? | Feb 4, 2025 | Arithmetic Reasoning; Code Generation | —Unverified | 0 |
| MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores | Apr 23, 2025 | Long-Context Understanding; token-classification | —Unverified | 0 |
| Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models | Feb 3, 2024 | Logical Reasoning; Long-Context Understanding | —Unverified | 0 |
| Beyond Needle(s) in the Embodied Haystack: Environment, Architecture, and Training Considerations for Long Context Reasoning | May 22, 2025 | Long-Context Understanding | —Unverified | 0 |
| PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding | Jun 18, 2025 | Long-Context Understanding | —Unverified | 0 |
| Repository Structure-Aware Training Makes SLMs Better Issue Resolver | Dec 26, 2024 | Long-Context Understanding | —Unverified | 0 |
| ATLAS: Learning to Optimally Memorize the Context at Test Time | May 29, 2025 | Common Sense Reasoning; Language Modeling | —Unverified | 0 |
| Retrieval Or Holistic Understanding? Dolce: Differentiate Our Long Context Evaluation Tasks | Sep 10, 2024 | Long-Context Understanding; Retrieval | —Unverified | 0 |
| Revisiting Parallel Context Windows: A Frustratingly Simple Alternative and Chain-of-Thought Deterioration | May 24, 2023 | Long-Context Understanding | —Unverified | 0 |
| A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis | Jul 24, 2023 | Code Generation; Denoising | —Unverified | 0 |
| What matters when building vision-language models? | May 3, 2024 | 1 Image, 2*2 Stitching; Image Retrieval | —Unverified | 0 |
| Anomaly Detection of Tabular Data Using LLMs | Jun 24, 2024 | Anomaly Detection; Long-Context Understanding | —Unverified | 0 |
| How Effective Is Self-Consistency for Long-Context Problems? | Nov 2, 2024 | Long-Context Understanding; Position | —Unverified | 0 |
| Token Weighting for Long-Range Language Modeling | Mar 12, 2025 | Language Modeling; Language Modelling | Code Available | 0 |
| MesaNet: Sequence Modeling by Locally Optimal Test-Time Training | Jun 5, 2025 | Language Modeling; Language Modelling | Code Available | 0 |