| BAGS: Blur Agnostic Gaussian Splatting through Multi-Scale Kernel Modeling | Mar 7, 2024 | Novel View Synthesis | Code Available | 2 |
| JAX-SPH: A Differentiable Smoothed Particle Hydrodynamics Framework | Mar 7, 2024 | Dataset Generation | Code Available | 2 |
| LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error | Mar 7, 2024 | Continual Learning, In-Context Learning | Code Available | 2 |
| Large Language Models are In-Context Molecule Learners | Mar 7, 2024 | Cross-Modal Retrieval, In-Context Learning | Code Available | 2 |
| AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors | Mar 7, 2024 | Facial Action Unit Detection, Transfer Learning | Code Available | 2 |
| Online Adaptation of Language Models with a Memory of Amortized Contexts | Mar 7, 2024 | Language Modelling, Meta-Learning | Code Available | 2 |
| Mastering Memory Tasks with World Models | Mar 7, 2024 | Model-based Reinforcement Learning, State Space Models | Code Available | 2 |
| QAQ: Quality Adaptive Quantization for LLM KV Cache | Mar 7, 2024 | Quantization, Question Answering | Code Available | 2 |
| Active Generalized Category Discovery | Mar 7, 2024 | Active Learning, imbalanced classification | Code Available | 2 |
| An Item is Worth a Prompt: Versatile Image Editing with Disentangled Control | Mar 7, 2024 | Descriptive | Code Available | 2 |
| Backtracing: Retrieving the Cause of the Query | Mar 6, 2024 | Information Retrieval, Language Modeling | Code Available | 2 |
| Extend Your Own Correspondences: Unsupervised Distant Point Cloud Registration by Progressive Distance Extension | Mar 6, 2024 | Point Cloud Registration | Code Available | 2 |
| Mamba4Rec: Towards Efficient Sequential Recommendation with Selective State Space Models | Mar 6, 2024 | Mamba, Recommendation Systems | Code Available | 2 |
| Task Attribute Distance for Few-Shot Learning: Theoretical Analysis and Applications | Mar 6, 2024 | Attribute, Data Augmentation | Code Available | 2 |
| MeaCap: Memory-Augmented Zero-shot Image Captioning | Mar 6, 2024 | Caption Generation, Image Captioning | Code Available | 2 |
| Learning to Decode Collaboratively with Multiple Language Models | Mar 6, 2024 | Instruction Following | Code Available | 2 |
| Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People | Mar 6, 2024 | | Code Available | 2 |
| MolNexTR: A Generalized Deep Learning Model for Molecular Image Recognition | Mar 6, 2024 | Data Augmentation, Deep Learning | Code Available | 2 |
| ShortGPT: Layers in Large Language Models are More Redundant Than You Expect | Mar 6, 2024 | Quantization | Code Available | 2 |
| VastTrack: Vast Category Visual Object Tracking | Mar 6, 2024 | Object, Object Tracking | Code Available | 2 |
| DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training | Mar 6, 2024 | Denoising, Diversity | Code Available | 2 |
| GPTopic: Dynamic and Interactive Topic Representations | Mar 6, 2024 | | Code Available | 2 |
| An L-BFGS-B approach for linear and nonlinear system identification under $\ell_1$ and group-Lasso regularization | Mar 6, 2024 | State Space Models, subspace methods | Code Available | 2 |
| NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging | Mar 6, 2024 | Denoising, Image Generation | Code Available | 2 |
| Diffusion-based Generative Prior for Low-Complexity MIMO Channel Estimation | Mar 6, 2024 | | Code Available | 2 |
| Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning | Mar 6, 2024 | Multimodal Reasoning, Question Answering | Code Available | 2 |
| MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer | Mar 5, 2024 | | Code Available | 2 |
| Towards Measuring and Modeling "Culture" in LLMs: A Survey | Mar 5, 2024 | Survey | Code Available | 2 |
| FinReport: Explainable Stock Earnings Forecasting via News Factor Analyzing Model | Mar 5, 2024 | Stock Market Prediction | Code Available | 2 |
| Interactive Continual Learning: Fast and Slow Thinking | Mar 5, 2024 | Continual Learning, Outlier Detection | Code Available | 2 |
| PPFlow: Target-aware Peptide Design with Torsional Flow Matching | Mar 5, 2024 | Drug Design, Drug Discovery | Code Available | 2 |
| Android in the Zoo: Chain-of-Action-Thought for GUI Agents | Mar 5, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Learning without Exact Guidance: Updating Large-scale High-resolution Land Cover Maps from Low-resolution Historical Labels | Mar 5, 2024 | Pseudo Label, Semantic Segmentation | Code Available | 2 |
| Semantic Human Mesh Reconstruction with Textures | Mar 5, 2024 | | Code Available | 2 |
| InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | Mar 5, 2024 | Benchmarking, Language Modeling | Code Available | 2 |
| ESM All-Atom: Multi-scale Protein Language Model for Unified Molecular Modeling | Mar 5, 2024 | All, Language Modeling | Code Available | 2 |
| What do we learn from inverting CLIP models? | Mar 5, 2024 | | Code Available | 2 |
| TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts | Mar 5, 2024 | Graph Attention, Graph Embedding | Code Available | 2 |
| PointCore: Efficient Unsupervised Point Cloud Anomaly Detector Using Local-Global Features | Mar 4, 2024 | Anomaly Detection, Autonomous Driving | Code Available | 2 |
| Multi-perspective Improvement of Knowledge Graph Completion with Large Language Models | Mar 4, 2024 | Knowledge Graph Completion, Knowledge Graphs | Code Available | 2 |
| Trainable Fractional Fourier Transform | Mar 4, 2024 | image-classification, Image Classification | Code Available | 2 |
| Large language models surpass human experts in predicting neuroscience results | Mar 4, 2024 | | Code Available | 2 |
| VTG-GPT: Tuning-Free Zero-Shot Video Temporal Grounding with GPT | Mar 4, 2024 | Image Captioning, Zero-shot Moment Retrieval | Code Available | 2 |
| Learning to Solve Job Shop Scheduling under Uncertainty | Mar 4, 2024 | Combinatorial Optimization, Deep Reinforcement Learning | Code Available | 2 |
| xT: Nested Tokenization for Larger Context in Large Images | Mar 4, 2024 | | Code Available | 2 |
| MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection | Mar 4, 2024 | GPU, Mamba | Code Available | 2 |
| Birbal: An efficient 7B instruct-model fine-tuned with curated datasets | Mar 4, 2024 | GPU | Code Available | 2 |
| Wukong: Towards a Scaling Law for Large-Scale Recommendation | Mar 4, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| PixIT: Joint Training of Speaker Diarization and Speech Separation from Real-world Multi-speaker Recordings | Mar 4, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| Making Pre-trained Language Models Great on Tabular Prediction | Mar 4, 2024 | Prediction | Code Available | 2 |