| CarDreamer: Open-Source Learning Platform for World Model based Autonomous Driving | May 15, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 3 |
| UnMarker: A Universal Attack on Defensive Image Watermarking | May 14, 2024 | DeepFake Detection, Denoising | Code Available | 3 |
| EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training | May 14, 2024 | Data Augmentation, Self-Supervised Learning | Code Available | 3 |
| Improving Transformers with Dynamically Composable Multi-Head Attention | May 14, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| MuMath-Code: Combining Tool-Use Large Language Models with Multi-perspective Data Augmentation for Mathematical Reasoning | May 13, 2024 | Data Augmentation, GSM8K | Code Available | 3 |
| Rethinking Histology Slide Digitization Workflows for Low-Resource Settings | May 13, 2024 | Deblurring, whole slide images | Code Available | 3 |
| MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels | May 13, 2024 | Information Retrieval, Retrieval | Code Available | 3 |
| Deep Learning-Based Object Pose Estimation: A Comprehensive Survey | May 13, 2024 | Deep Learning, Object | Code Available | 3 |
| BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps | May 12, 2024 | Computational Efficiency | Code Available | 3 |
| NGD-SLAM: Towards Real-Time Dynamic SLAM without GPU | May 12, 2024 | CPU, Deep Learning | Code Available | 3 |
| TKAN: Temporal Kolmogorov-Arnold Networks | May 12, 2024 | Kolmogorov-Arnold Networks, Management | Code Available | 3 |
| EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation | May 11, 2024 | Computational Efficiency, Decoder | Code Available | 3 |
| Koopman-Based Surrogate Modelling of Turbulent Rayleigh-Bénard Convection | May 10, 2024 | | Code Available | 3 |
| An Investigation of Incorporating Mamba for Speech Enhancement | May 10, 2024 | Mamba, Speech Enhancement | Code Available | 3 |
| A Survey of Large Language Models for Graphs | May 10, 2024 | Graph Learning, Link Prediction | Code Available | 3 |
| Are EEG-to-Text Models Working? | May 10, 2024 | Benchmarking, EEG | Code Available | 3 |
| Kolmogorov-Arnold Networks are Radial Basis Function Networks | May 10, 2024 | Kolmogorov-Arnold Networks | Code Available | 3 |
| Ditto: Quantization-aware Secure Inference of Transformers upon MPC | May 9, 2024 | Quantization | Code Available | 3 |
| MAD-ICP: It Is All About Matching Data -- Robust and Informed LiDAR Odometry | May 9, 2024 | All | Code Available | 3 |
| Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving | May 8, 2024 | Autonomous Driving, LIDAR Semantic Segmentation | Code Available | 3 |
| vAttention: Dynamic Memory Management for Serving LLMs without PagedAttention | May 7, 2024 | GPU, Management | Code Available | 3 |
| ACEGEN: Reinforcement learning of generative chemical agents for drug discovery | May 7, 2024 | Benchmarking, Decision Making | Code Available | 3 |
| FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes | May 7, 2024 | 3D Point Cloud Classification, 3D Semantic Segmentation | Code Available | 3 |
| Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer | May 7, 2024 | Image Generation, Super-Resolution | Code Available | 3 |
| AlphaMath Almost Zero: Process Supervision without Process | May 6, 2024 | Mathematical Reasoning, Math Word Problem Solving | Code Available | 3 |
| ImageInWords: Unlocking Hyper-Detailed Image Descriptions | May 5, 2024 | Image Generation, Specificity | Code Available | 3 |
| Vision-based 3D occupancy prediction in autonomous driving: a review and outlook | May 4, 2024 | Autonomous Driving, Prediction | Code Available | 3 |
| U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers | May 4, 2024 | Image Generation, Inductive Bias | Code Available | 3 |
| DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos | May 3, 2024 | Depth Estimation, Depth Prediction | Code Available | 3 |
| MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts | May 2, 2024 | Combinatorial Optimization, Mixture-of-Experts | Code Available | 3 |
| SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters | May 2, 2024 | Time Series, Time Series Forecasting | Code Available | 3 |
| MANTIS: Interleaved Multi-Image Instruction Tuning | May 2, 2024 | | Code Available | 3 |
| Spider: A Unified Framework for Context-dependent Concept Segmentation | May 2, 2024 | Transparent objects | Code Available | 3 |
| Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning | May 1, 2024 | ARC, GSM8K | Code Available | 3 |
| HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning | Apr 30, 2024 | parameter-efficient fine-tuning | Code Available | 3 |
| MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model | Apr 30, 2024 | Motion Generation, Motion Synthesis | Code Available | 3 |
| Lightplane: Highly-Scalable Components for Neural 3D Fields | Apr 30, 2024 | 3D Reconstruction | Code Available | 3 |
| RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing | Apr 30, 2024 | Computational Efficiency, Hallucination | Code Available | 3 |
| SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound | Apr 30, 2024 | Decoder, Language Modelling | Code Available | 3 |
| EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars | Apr 29, 2024 | | Code Available | 3 |
| Accelerating Production LLMs with Combined Token/Embedding Speculators | Apr 29, 2024 | | Code Available | 3 |
| Middle Architecture Criteria | Apr 27, 2024 | | Code Available | 3 |
| The Common Core Ontologies | Apr 27, 2024 | | Code Available | 3 |
| MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition | Apr 26, 2024 | Emotion Recognition, Multi-Label Learning | Code Available | 3 |
| REvoLd: Ultra-Large Library Screening with an Evolutionary Algorithm in Rosetta | Apr 26, 2024 | Drug Discovery | Code Available | 3 |
| MV-VTON: Multi-View Virtual Try-On with Diffusion Models | Apr 26, 2024 | Virtual Try-on | Code Available | 3 |
| Andes: Defining and Enhancing Quality-of-Experience in LLM-Based Text Streaming Services | Apr 25, 2024 | GPU | Code Available | 3 |
| Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey | Apr 25, 2024 | 4k, Image Super-Resolution | Code Available | 3 |
| COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations | Apr 25, 2024 | Contrastive Learning, Music Generation | Code Available | 3 |
| Evolve Cost-aware Acquisition Functions Using Large Language Models | Apr 25, 2024 | Bayesian Optimization, Decision Making | Code Available | 3 |