| MoEUT: Mixture-of-Experts Universal Transformers | May 25, 2024 | Language Modelling | Code Available | 2 |
| Underwater Image Enhancement by Diffusion Model with Customized CLIP-Classifier | May 25, 2024 | Image Enhancement, Image Generation | Code Available | 2 |
| Bigger, Regularized, Optimistic: scaling for compute and sample-efficient continuous control | May 25, 2024 | Continuous Control | Code Available | 2 |
| Analytic Federated Learning | May 25, 2024 | Federated Learning | Code Available | 2 |
| REACT: Real-time Efficiency and Accuracy Compromise for Tradeoffs in Scene Graph Generation | May 25, 2024 | Graph Generation, Object | Code Available | 2 |
| Accelerating Transformers with Spectrum-Preserving Token Merging | May 25, 2024 | Image Classification | Code Available | 2 |
| Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization | May 25, 2024 | Continuous Control | Code Available | 2 |
| Optimizing Large Language Models for OpenAPI Code Completion | May 24, 2024 | Code Completion, Code Generation | Code Available | 2 |
| PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis | May 24, 2024 | Art Analysis, Computational Efficiency | Code Available | 2 |
| iVideoGPT: Interactive VideoGPTs are Scalable World Models | May 24, 2024 | Decision Making, Model-based Reinforcement Learning | Code Available | 2 |
| Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving | May 24, 2024 | Autonomous Driving, Decision Making | Code Available | 2 |
| OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code | May 24, 2024 | | Code Available | 2 |
| Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models | May 24, 2024 | Common Sense Reasoning, Language Modelling | Code Available | 2 |
| Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement | May 24, 2024 | Hallucination, Image Comprehension | Code Available | 2 |
| Fast-PGM: Fast Probabilistic Graphical Model Learning and Inference | May 24, 2024 | | Code Available | 2 |
| Diffusion Actor-Critic with Entropy Regulator | May 24, 2024 | Decision Making, MuJoCo | Code Available | 2 |
| LM4LV: A Frozen Large Language Model for Low-level Vision Tasks | May 24, 2024 | Language Modelling | Code Available | 2 |
| ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models | May 24, 2024 | Visual Question Answering | Code Available | 2 |
| Sparse maximal update parameterization: A holistic approach to sparse training dynamics | May 24, 2024 | Language Modelling | Code Available | 2 |
| Medformer: A Multi-Granularity Patching Transformer for Medical Time-Series Classification | May 24, 2024 | EEG, Electrocardiography (ECG) | Code Available | 2 |
| Fieldscale: Locality-Aware Field-based Adaptive Rescaling for Thermal Infrared Image | May 24, 2024 | Image Quality Assessment | Code Available | 2 |
| Composed Image Retrieval for Remote Sensing | May 24, 2024 | Composed Image Retrieval (CoIR), Descriptive | Code Available | 2 |
| Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models | May 24, 2024 | Image Generation, Machine Unlearning | Code Available | 2 |
| What is a Goldilocks Face Verification Test Set? | May 24, 2024 | Face Recognition, Face Verification | Code Available | 2 |
| DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation | May 24, 2024 | 3D Reconstruction, Camera Calibration | Code Available | 2 |
| Before Generation, Align it! A Novel and Effective Strategy for Mitigating Hallucinations in Text-to-SQL Generation | May 24, 2024 | In-Context Learning, Text to SQL | Code Available | 2 |
| Out of Many, One: Designing and Scaffolding Proteins at the Scale of the Structural Universe with Genie 2 | May 24, 2024 | Data Augmentation, Diversity | Code Available | 2 |
| MambaVC: Learned Visual Compression with Selective State Spaces | May 24, 2024 | Long-range modeling, State Space Models | Code Available | 2 |
| Diffusion Bridge Implicit Models | May 24, 2024 | Denoising, Diversity | Code Available | 2 |
| Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models | May 24, 2024 | Atari Games, Mathematical Reasoning | Code Available | 2 |
| AnalogCoder: Analog Circuit Design via Training-Free Code Generation | May 23, 2024 | Code Generation | Code Available | 2 |
| AnomalyDINO: Boosting Patch-based Few-shot Anomaly Detection with DINOv2 | May 23, 2024 | Anomaly Detection, Anomaly Segmentation | Code Available | 2 |
| Agent Planning with World Knowledge Model | May 23, 2024 | model, World Knowledge | Code Available | 2 |
| EHRMamba: Towards Generalizable and Scalable Foundation Models for Electronic Health Records | May 23, 2024 | Mamba | Code Available | 2 |
| Extracting Prompts by Inverting LLM Outputs | May 23, 2024 | Language Modelling | Code Available | 2 |
| S-Eval: Towards Automated and Comprehensive Safety Evaluation for Large Language Models | May 23, 2024 | Benchmarking | Code Available | 2 |
| Efficient Visual State Space Model for Image Deblurring | May 23, 2024 | Deblurring, Image Deblurring | Code Available | 2 |
| RoGs: Large Scale Road Surface Reconstruction with Meshgrid Gaussian | May 23, 2024 | Autonomous Driving, Surface Reconstruction | Code Available | 2 |
| Advancing Spiking Neural Networks for Sequential Modeling with Central Pattern Generators | May 23, 2024 | Image Classification | Code Available | 2 |
| LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks | May 23, 2024 | Decision Making | Code Available | 2 |
| Fast-DDPM: Fast Denoising Diffusion Probabilistic Models for Medical Image-to-Image Generation | May 23, 2024 | Denoising, Image Denoising | Code Available | 2 |
| Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models | May 23, 2024 | Mixture-of-Experts, Visual Question Answering | Code Available | 2 |
| SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models | May 23, 2024 | Natural Language Understanding, Quantization | Code Available | 2 |
| RET-CLIP: A Retinal Image Foundation Model Pre-trained with Clinical Diagnostic Reports | May 23, 2024 | Diagnostic, Multi-Label Classification | Code Available | 2 |
| Flatten Anything: Unsupervised Neural Surface Parameterization | May 23, 2024 | | Code Available | 2 |
| Metric Flow Matching for Smooth Interpolations on the Data Manifold | May 23, 2024 | Trajectory Prediction | Code Available | 2 |
| Calibrated Self-Rewarding Vision Language Models | May 23, 2024 | Hallucination, Language Modelling | Code Available | 2 |
| Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition | May 23, 2024 | Automatic Speech Recognition (ASR) | Code Available | 2 |
| Mamba-R: Vision Mamba ALSO Needs Registers | May 23, 2024 | Mamba, Semantic Segmentation | Code Available | 2 |
| PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference | May 23, 2024 | | Code Available | 2 |