| FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder | Jan 18, 2024 | | Code Available | 2 |
| R-Judge: Benchmarking Safety Risk Awareness for LLM Agents | Jan 18, 2024 | Benchmarking | Code Available | 2 |
| A Simple Latent Diffusion Approach for Panoptic Segmentation and Mask Inpainting | Jan 18, 2024 | Instance Segmentation, Interactive Segmentation | Code Available | 2 |
| SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model | Jan 18, 2024 | Instruction Following, Language Modeling | Code Available | 2 |
| Enabling Efficient Equivariant Operations in the Fourier Basis via Gaunt Tensor Products | Jan 18, 2024 | | Code Available | 2 |
| Towards Language-Driven Video Inpainting via Multimodal Large Language Models | Jan 18, 2024 | Video Inpainting | Code Available | 2 |
| LangProp: A code optimization framework using Large Language Models applied to driving | Jan 18, 2024 | Autonomous Driving, Code Generation | Code Available | 2 |
| Spatial-Temporal Large Language Model for Traffic Prediction | Jan 18, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Adaptive Kalman-Informed Transformer | Jan 18, 2024 | Sensor Fusion | Code Available | 2 |
| Cooperative Edge Caching Based on Elastic Federated and Multi-Agent Deep Reinforcement Learning in Next-Generation Network | Jan 18, 2024 | Deep Reinforcement Learning, Federated Learning | Code Available | 2 |
| Objects With Lighting: A Real-World Dataset for Evaluating Reconstruction and Rendering for Object Relighting | Jan 17, 2024 | Inverse Rendering, Novel View Synthesis | Code Available | 2 |
| Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior | Jan 17, 2024 | 3D Generation, Text to 3D | Code Available | 2 |
| Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis | Jan 17, 2024 | Disentanglement, Image Generation | Code Available | 2 |
| Autonomous Catheterization with Open-source Simulator and Expert Trajectory | Jan 17, 2024 | | Code Available | 2 |
| RWKV-TS: Beyond Traditional Recurrent Neural Network for Time Series Tasks | Jan 17, 2024 | Computational Efficiency, Time Series | Code Available | 2 |
| Tri^2-plane: Thinking Head Avatar via Feature Pyramid | Jan 17, 2024 | | Code Available | 2 |
| Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | Jan 17, 2024 | GPU, Image Classification | Code Available | 2 |
| PPSURF: Combining Patches and Point Convolutions for Detailed Surface Reconstruction | Jan 16, 2024 | Surface Reconstruction | Code Available | 2 |
| Tuning Language Models by Proxy | Jan 16, 2024 | Domain Adaptation, Math | Code Available | 2 |
| Adversarial Supervision Makes Layout-to-Image Diffusion Models Thrive | Jan 16, 2024 | Domain Generalization, Image Generation | Code Available | 2 |
| DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment | Jan 16, 2024 | Disentanglement, Self-Supervised Learning | Code Available | 2 |
| WAVES: Benchmarking the Robustness of Image Watermarks | Jan 16, 2024 | Benchmarking | Code Available | 2 |
| UV-SAM: Adapting Segment Anything Model for Urban Village Identification | Jan 16, 2024 | image-classification, Image Classification | Code Available | 2 |
| Transcending the Limit of Local Window: Advanced Super-Resolution Transformer with Adaptive Token Dictionary | Jan 16, 2024 | Image Super-Resolution, Super-Resolution | Code Available | 2 |
| DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models (Exemplified as A Video Agent) | Jan 16, 2024 | Scheduling | Code Available | 2 |
| EmoLLMs: A Series of Emotional Large Language Models and Annotation Tools for Comprehensive Affective Analysis | Jan 16, 2024 | Instruction Following, regression | Code Available | 2 |
| MMToM-QA: Multimodal Theory of Mind Question Answering | Jan 16, 2024 | Question Answering, Theory of Mind Modeling | Code Available | 2 |
| OBSeg: Accurate and Fast Instance Segmentation Framework Using Segmentation Foundation Models with Oriented Bounding Box Prompts | Jan 16, 2024 | Amodal Instance Segmentation, Instance Segmentation | Code Available | 2 |
| Spatial-Semantic Collaborative Cropping for User Generated Content | Jan 16, 2024 | Image Cropping | Code Available | 2 |
| Efficient4D: Fast Dynamic 3D Object Generation from a Single-view Video | Jan 16, 2024 | Image Generation, Image to 3D | Code Available | 2 |
| Fixed Point Diffusion Models | Jan 16, 2024 | Denoising, Image Generation | Code Available | 2 |
| SciInstruct: a Self-Reflective Instruction Annotated Dataset for Training Scientific Language Models | Jan 15, 2024 | Math, Mathematical Reasoning | Code Available | 2 |
| E3x: E(3)-Equivariant Deep Learning Made Easy | Jan 15, 2024 | Deep Learning | Code Available | 2 |
| Authorship Obfuscation in Multilingual Machine-Generated Text Detection | Jan 15, 2024 | Adversarial Robustness, Benchmarking | Code Available | 2 |
| Improved Implicit Neural Representation with Fourier Reparameterized Training | Jan 15, 2024 | | Code Available | 2 |
| Integrate Any Omics: Towards genome-wide data integration for patient stratification | Jan 15, 2024 | Data Integration, Diversity | Code Available | 2 |
| Fine-Grained Prototypes Distillation for Few-Shot Object Detection | Jan 15, 2024 | Few-Shot Object Detection, Meta-Learning | Code Available | 2 |
| PMFSNet: Polarized Multi-scale Feature Self-attention Network For Lightweight Medical Image Segmentation | Jan 15, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| CoVO-MPC: Theoretical Analysis of Sampling-based MPC and Optimal Covariance Design | Jan 14, 2024 | Model-based Reinforcement Learning, Model Predictive Control | Code Available | 2 |
| PDE Generalization of In-Context Operator Networks: A Study on 1D Scalar Nonlinear Conservation Laws | Jan 14, 2024 | Operator learning | Code Available | 2 |
| EHRAgent: Code Empowers Large Language Models for Few-shot Complex Tabular Reasoning on Electronic Health Records | Jan 13, 2024 | Code Generation, Few-Shot Learning | Code Available | 2 |
| Graph Language Models | Jan 13, 2024 | Knowledge Graphs, Language Modeling | Code Available | 2 |
| Extending LLMs' Context Window with 100 Samples | Jan 13, 2024 | Position | Code Available | 2 |
| Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification | Jan 12, 2024 | Clustering, Person Re-Identification | Code Available | 2 |
| Prometheus-Vision: Vision-Language Model as a Judge for Fine-Grained Evaluation | Jan 12, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Expected Shapley-Like Scores of Boolean Functions: Complexity and Applications to Probabilistic Databases | Jan 12, 2024 | | Code Available | 2 |
| Vehicle: Bridging the Embedding Gap in the Verification of Neuro-Symbolic Programs | Jan 12, 2024 | | Code Available | 2 |
| Mission: Impossible Language Models | Jan 12, 2024 | | Code Available | 2 |
| Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery | Jan 12, 2024 | Object Recognition, Road Segmentation | Code Available | 2 |
| Motion2VecSets: 4D Latent Vector Set Diffusion for Non-rigid Shape Reconstruction and Tracking | Jan 12, 2024 | 4D reconstruction, Denoising | Code Available | 2 |