| Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance | Oct 17, 2024 | Offline RL, Re-Ranking | Code Available | 1 |
| Mitigating Hallucinations in Large Vision-Language Models via Summary-Guided Decoding | Oct 17, 2024 | Hallucination, Object Hallucination | Code Available | 1 |
| Interpreting Temporal Graph Neural Networks with Koopman Theory | Oct 17, 2024 | Dimensionality Reduction, Epidemiology | Code Available | 1 |
| EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning | Oct 17, 2024 | Representation Learning, Self-Supervised Learning | Code Available | 1 |
| ControlAgent: Automating Control System Design via Novel Integration of LLM Agents and Domain Expertise | Oct 17, 2024 | Specificity | Code Available | 1 |
| Unlocking the Capabilities of Masked Generative Models for Image Synthesis via Self-Guidance | Oct 17, 2024 | Diversity, Image Generation | Code Available | 1 |
| LESS: Label-Efficient and Single-Stage Referring 3D Segmentation | Oct 17, 2024 | cross-modal alignment, Instance Segmentation | Code Available | 1 |
| Starbucks: Improved Training for 2D Matryoshka Embeddings | Oct 17, 2024 | Language Modelling, text similarity | Code Available | 1 |
| Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers | Oct 17, 2024 | | Code Available | 1 |
| Reward-free World Models for Online Imitation Learning | Oct 17, 2024 | Imitation Learning, Q-Learning | Code Available | 1 |
| MIRAGE-Bench: Automatic Multilingual Benchmark Arena for Retrieval-Augmented Generation Systems | Oct 17, 2024 | Answer Generation, Language Modeling | Code Available | 1 |
| Hybrid bundle-adjusting 3D Gaussians for view consistent rendering with pose optimization | Oct 17, 2024 | Novel View Synthesis | Code Available | 1 |
| TCP-Diffusion: A Multi-modal Diffusion Model for Global Tropical Cyclone Precipitation Forecasting with Change Awareness | Oct 17, 2024 | Precipitation Forecasting | Code Available | 1 |
| RAMPA: Robotic Augmented Reality for Machine Programming by DemonstrAtion | Oct 17, 2024 | | Code Available | 1 |
| MobA: Multifaceted Memory-Enhanced Adaptive Planning for Efficient Mobile Task Automation | Oct 17, 2024 | Decision Making, Language Modeling | Code Available | 1 |
| SiamSeg: Self-Training with Contrastive Learning for Unsupervised Domain Adaptation Semantic Segmentation in Remote Sensing | Oct 17, 2024 | Contrastive Learning, Diversity | Code Available | 1 |
| FaithBench: A Diverse Hallucination Benchmark for Summarization by Modern LLMs | Oct 17, 2024 | Diversity, Hallucination | Code Available | 1 |
| PORTAL: Scalable Tabular Foundation Models via Content-Specific Tokenization | Oct 17, 2024 | Self-Supervised Learning | Code Available | 1 |
| DN-4DGS: Denoised Deformable Network with Temporal-Spatial Aggregation for Dynamic Scene Rendering | Oct 17, 2024 | 3DGS, NeRF | Code Available | 1 |
| FIRE: Fact-checking with Iterative Retrieval and Verification | Oct 17, 2024 | Claim Verification, Fact Checking | Code Available | 1 |
| Diffusing States and Matching Scores: A New Framework for Imitation Learning | Oct 17, 2024 | continuous-control, Continuous Control | Code Available | 1 |
| EP-SAM: Weakly Supervised Histopathology Segmentation via Enhanced Prompt with Segment Anything | Oct 17, 2024 | Diagnostic, GPU | Code Available | 1 |
| Benchmarking Transcriptomics Foundation Models for Perturbation Analysis : one PCA still rules them all | Oct 17, 2024 | All, Benchmarking | Code Available | 1 |
| Can MLLMs Understand the Deep Implication Behind Chinese Images? | Oct 17, 2024 | | Code Available | 1 |
| Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs | Oct 17, 2024 | Quantization | Code Available | 1 |
| Learning Graph Quantized Tokenizers | Oct 17, 2024 | Graph Learning, Quantization | Code Available | 1 |
| UniGS: Modeling Unitary 3D Gaussians for Novel View Synthesis from Sparse-view Images | Oct 17, 2024 | 3D Reconstruction, Decoder | Code Available | 1 |
| A Simulation System Towards Solving Societal-Scale Manipulation | Oct 17, 2024 | | Code Available | 1 |
| Preference Diffusion for Recommendation | Oct 17, 2024 | Recommendation Systems, Sequential Recommendation | Code Available | 1 |
| Looking Inward: Language Models Can Learn About Themselves by Introspection | Oct 17, 2024 | Out-of-Distribution Generalization | Code Available | 1 |
| Interpret and Control Dense Retrieval with Sparse Latent Features | Oct 17, 2024 | Retrieval | Code Available | 1 |
| Diffusion Curriculum: Synthetic-to-Real Generative Curriculum Learning via Image-Guided Diffusion | Oct 17, 2024 | Data Augmentation, Image Generation | Code Available | 1 |
| Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation | Oct 17, 2024 | Decision Making | Code Available | 1 |
| Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning | Oct 17, 2024 | Decision Making, Reinforcement Learning (RL) | Code Available | 1 |
| Interpreting and Analysing CLIP's Zero-Shot Image Classification via Mutual Knowledge | Oct 16, 2024 | Classification, image-classification | Code Available | 1 |
| CREAM: Consistency Regularized Self-Rewarding Language Models | Oct 16, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Rethinking Token Reduction for State Space Models | Oct 16, 2024 | Mamba, State Space Models | Code Available | 1 |
| FragNet: A Graph Neural Network for Molecular Property Prediction with Four Levels of Interpretability | Oct 16, 2024 | Drug Discovery, Graph Neural Network | Code Available | 1 |
| HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks | Oct 16, 2024 | Code Generation, HumanEval | Code Available | 1 |
| VividMed: Vision Language Model with Versatile Visual Grounding for Medicine | Oct 16, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks | Oct 16, 2024 | Math, parameter-efficient fine-tuning | Code Available | 1 |
| HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims | Oct 16, 2024 | Fact Checking, Language Modeling | Code Available | 1 |
| Counterfactual Generative Modeling with Variational Causal Inference | Oct 16, 2024 | Causal Inference, counterfactual | Code Available | 1 |
| Meta-Unlearning on Diffusion Models: Preventing Relearning Unlearned Concepts | Oct 16, 2024 | | Code Available | 1 |
| In-vivo high-resolution χ-separation at 7T | Oct 16, 2024 | | Code Available | 1 |
| Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models | Oct 16, 2024 | Denoising | Code Available | 1 |
| Expand and Compress: Exploring Tuning Principles for Continual Spatio-Temporal Graph Forecasting | Oct 16, 2024 | Graph Neural Network, Spatio-Temporal Forecasting | Code Available | 1 |
| Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning | Oct 16, 2024 | 8k | Code Available | 1 |
| Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models | Oct 16, 2024 | Computational Efficiency, Test-time Adaptation | Code Available | 1 |
| Revealing the Barriers of Language Agents in Planning | Oct 16, 2024 | | Code Available | 1 |