| TimeSoccer: An End-to-End Multimodal Large Language Model for Soccer Commentary Generation | Apr 24, 2025 | Caption Generation, Dense Video Captioning | Unverified | 0 |
| ParamΔ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost | Apr 23, 2025 | Instruction Following, Language Modeling | Unverified | 0 |
| Planning with Diffusion Models for Target-Oriented Dialogue Systems | Apr 23, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Monte Carlo Planning with Large Language Model for Text-Based Game Agents | Apr 23, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Improving Significant Wave Height Prediction Using Chronos Models | Apr 23, 2025 | Computational Efficiency, Language Modeling | Unverified | 0 |
| SplitReason: Learning To Offload Reasoning | Apr 23, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| In-Context Learning can distort the relationship between sequence likelihoods and biological fitness | Apr 23, 2025 | In-Context Learning, Language Modeling | Unverified | 0 |
| Target Concrete Score Matching: A Holistic Framework for Discrete Diffusion | Apr 23, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| FaceInsight: A Multimodal Large Language Model for Face Perception | Apr 22, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Do It For Me vs. Do It With Me: Investigating User Perceptions of Different Paradigms of Automation in Copilots for Feature-Rich Software | Apr 22, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| LLMs meet Federated Learning for Scalable and Secure IoT Management | Apr 22, 2025 | Computational Efficiency, Decision Making | Unverified | 0 |
| Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V3 | Apr 22, 2025 | Benchmarking, Language Modeling | Unverified | 0 |
| DATETIME: A new benchmark to measure LLM translation and reasoning capabilities | Apr 22, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Research on Cloud Platform Network Traffic Monitoring and Anomaly Detection System based on Large Language Models | Apr 22, 2025 | Anomaly Detection, Computational Efficiency | Unverified | 0 |
| What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns | Apr 22, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Large Language Model Empowered Privacy-Protected Framework for PHI Annotation in Clinical Notes | Apr 22, 2025 | De-identification, Language Modeling | Unverified | 0 |
| Enhancing TCR-Peptide Interaction Prediction with Pretrained Language Models and Molecular Representations | Apr 22, 2025 | Benchmarking, Few-Shot Learning | Unverified | 0 |
| RepliBench: Evaluating the Autonomous Replication Capabilities of Language Model Agents | Apr 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Kuwain 1.5B: An Arabic SLM via Language Injection | Apr 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Speculative Sampling via Exponential Races | Apr 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| LAPP: Large Language Model Feedback for Preference-Driven Reinforcement Learning | Apr 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models | Apr 21, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions | Apr 21, 2025 | Ethics, Language Modeling | Unverified | 0 |
| Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark | Apr 21, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| PROMPTEVALS: A Dataset of Assertions and Guardrails for Custom Production Large Language Model Pipelines | Apr 20, 2025 | Language Modeling, Language Modelling | Unverified | 0 |