| Efficient World Models with Context-Aware Tokenization | Jun 27, 2024 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Code Available | 2 |
| T-FREE: Subword Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings | Jun 27, 2024 | Cross-Lingual Transfer, Transfer Learning | Code Available | 2 |
| Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions | Jun 27, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| Chat AI: A Seamless Slurm-Native Solution for HPC-Based Services | Jun 27, 2024 | Scheduling | Code Available | 2 |
| DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance | Jun 26, 2024 | Image Generation | Code Available | 2 |
| ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models | Jun 26, 2024 | Classification | Code Available | 2 |
| A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems | Jun 26, 2024 | Audio Source Separation, Decoder | Code Available | 2 |
| CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs | Jun 26, 2024 | Chart Understanding | Code Available | 2 |
| KAGNNs: Kolmogorov-Arnold Networks meet Graph Learning | Jun 26, 2024 | Graph Classification, Graph Learning | Code Available | 2 |
| JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models | Jun 26, 2024 | LLM Jailbreak, Survey | Code Available | 2 |
| RetroGFN: Diverse and Feasible Retrosynthesis using GFlowNets | Jun 26, 2024 | Retrosynthesis, Single-step retrosynthesis | Code Available | 2 |
| LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context Inference | Jun 26, 2024 | multimodal interaction | Code Available | 2 |
| MathOdyssey: Benchmarking Mathematical Problem-Solving Skills in Large Language Models Using Odyssey Math Data | Jun 26, 2024 | Benchmarking, Math | Code Available | 2 |
| The Surprising Effectiveness of Multimodal Large Language Models for Video Moment Retrieval | Jun 26, 2024 | Action Localization, Moment Retrieval | Code Available | 2 |
| GenRL: Multimodal-foundation world models for generalization in embodied agents | Jun 26, 2024 | Benchmarking, Reinforcement Learning (RL) | Code Available | 2 |
| MatchTime: Towards Automatic Soccer Game Commentary Generation | Jun 26, 2024 | | Code Available | 2 |
| WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models | Jun 26, 2024 | Chatbot, Red Teaming | Code Available | 2 |
| Stable Diffusion Segmentation for Biomedical Images with Single-step Reverse Process | Jun 26, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration | Jun 26, 2024 | Contrastive Learning, Deblurring | Code Available | 2 |
| WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs | Jun 26, 2024 | | Code Available | 2 |
| A Closer Look into Mixture-of-Experts in Large Language Models | Jun 26, 2024 | Computational Efficiency, Diversity | Code Available | 2 |
| SynRS3D: A Synthetic Dataset for Global 3D Semantic Understanding from Monocular Remote Sensing Imagery | Jun 26, 2024 | Domain Adaptation, Earth Observation | Code Available | 2 |
| EmT: A Novel Transformer for Generalized Cross-subject EEG Emotion Recognition | Jun 26, 2024 | EEG, EEG Emotion Recognition | Code Available | 2 |
| EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation | Jun 26, 2024 | Action Anticipation, Action Recognition | Code Available | 2 |
| Dynamic Gaussian Marbles for Novel View Synthesis of Casual Monocular Videos | Jun 26, 2024 | Novel View Synthesis, Point Tracking | Code Available | 2 |