| DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability | Jun 27, 2024 | Speech Synthesis, text-to-speech | Code Available | 2 |
| RoboUniView: Visual-Language Model with Unified View Representation for Robotic Manipulation | Jun 27, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks | Jul 4, 2022 | | Code Available | 2 |
| Odd-One-Out: Anomaly Detection by Comparing with Neighbors | Jun 28, 2024 | 8k, Anomaly Detection | Code Available | 2 |
| E.T. the Exceptional Trajectories: Text-to-camera-trajectory generation with character awareness | Jul 1, 2024 | 3D Generation | Code Available | 2 |
| MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation | Jul 2, 2024 | In-Context Learning | Code Available | 2 |
| A Bounding Box is Worth One Token: Interleaving Layout and Text in a Large Language Model for Document Understanding | Jul 2, 2024 | document understanding, Key Information Extraction | Code Available | 2 |
| Centerline Boundary Dice Loss for Vascular Segmentation | Jul 1, 2024 | Segmentation | Code Available | 2 |
| Benchmarking Predictive Coding Networks -- Made Simple | Jul 1, 2024 | Benchmarking | Code Available | 2 |
| A Survey of Personalization: From RAG to Agent | Apr 14, 2025 | RAG, Retrieval | Code Available | 2 |
| Discovering symbolic expressions with parallelized tree search | Jul 5, 2024 | Equation Discovery, regression | Code Available | 2 |
| TongGu: Mastering Classical Chinese Understanding with Knowledge-Grounded Large Language Models | Jul 4, 2024 | RAG, Retrieval-augmented Generation | Code Available | 2 |
| See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition | Jul 7, 2024 | parameter-efficient fine-tuning | Code Available | 2 |
| RPN: Reconciled Polynomial Network Towards Unifying PGMs, Kernel SVMs, MLP and KAN | Jul 5, 2024 | | Code Available | 2 |
| Language Representations Can be What Recommenders Need: Findings and Potentials | Jul 7, 2024 | Collaborative Filtering, Contrastive Learning | Code Available | 2 |
| Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition | Jul 7, 2024 | Emotion Recognition, Multimodal Sentiment Analysis | Code Available | 2 |
| Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps | Jul 9, 2024 | Articles, Hallucination | Code Available | 2 |
| LuSNAR:A Lunar Segmentation, Navigation and Reconstruction Dataset based on Muti-sensor for Autonomous Exploration | Jul 9, 2024 | 3D Reconstruction, Autonomous Navigation | Code Available | 2 |
| MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos | Jul 11, 2024 | NeRF | Code Available | 2 |
| Adaptive Parametric Activation | Jul 11, 2024 | imbalanced classification, Instance Segmentation | Code Available | 2 |
| WayveScenes101: A Dataset and Benchmark for Novel View Synthesis in Autonomous Driving | Jul 11, 2024 | Autonomous Driving, Benchmarking | Code Available | 2 |
| AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization | Jul 11, 2024 | Contrastive Learning, Transfer Learning | Code Available | 2 |
| xLSTMTime : Long-term Time Series Forecasting With xLSTM | Jul 14, 2024 | Time Series, Time Series Forecasting | Code Available | 2 |
| Image Compression for Machine and Human Vision with Spatial-Frequency Adaptation | Jul 13, 2024 | Image Compression | Code Available | 2 |
| GOFA: A Generative One-For-All Model for Joint Graph Language Modeling | Jul 12, 2024 | All, Language Modeling | Code Available | 2 |
| TTSDS -- Text-to-Speech Distribution Score | Jul 17, 2024 | text-to-speech, Text to Speech | Code Available | 2 |
| UrbanWorld: An Urban World Model for 3D City Generation | Jul 16, 2024 | Decision Making, Language Modelling | Code Available | 2 |
| GV-Bench: Benchmarking Local Feature Matching for Geometric Verification of Long-term Loop Closure Detection | Jul 16, 2024 | Benchmarking, Loop Closure Detection | Code Available | 2 |
| A Comprehensive Survey of Mamba Architectures for Medical Image Analysis: Classification, Segmentation, Restoration and Beyond | Oct 3, 2024 | Mamba, Medical Image Analysis | Code Available | 2 |
| GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features | Jul 17, 2024 | Anomaly Detection, Self-Driving Cars | Code Available | 2 |
| Weak-to-Strong Reasoning | Jul 18, 2024 | GSM8K, Math | Code Available | 2 |
| PlacidDreamer: Advancing Harmony in Text-to-3D Generation | Jul 19, 2024 | 3D Generation, Text to 3D | Code Available | 2 |
| A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks | Jul 18, 2024 | | Code Available | 2 |
| Forecasting GPU Performance for Deep Learning Training and Inference | Jul 18, 2024 | Deep Learning, GPU | Code Available | 2 |
| Intelligent Artistic Typography: A Comprehensive Review of Artistic Text Design and Generation | Jul 20, 2024 | Text Generation | Code Available | 2 |
| MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation | Jul 21, 2024 | Diversity, Music Generation | Code Available | 2 |
| Decomposed Meta-Learning for Few-Shot Named Entity Recognition | Apr 12, 2022 | Entity Typing, Few-shot NER | Code Available | 2 |
| PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects | Jul 23, 2024 | Instance Segmentation, Object | Code Available | 2 |
| A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data | Jul 23, 2024 | Autonomous Driving, Autonomous Racing | Code Available | 2 |
| Perm: A Parametric Representation for Multi-Style 3D Hair Modeling | Jul 28, 2024 | Image Generation | Code Available | 2 |
| Tabular Data Augmentation for Machine Learning: Progress and Prospects of Embracing Generative AI | Jul 31, 2024 | | Code Available | 2 |
| MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction | Jul 31, 2024 | Autonomous Driving, Prediction | Code Available | 2 |
| Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention | Aug 1, 2024 | Image Generation | Code Available | 2 |
| Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach | Aug 2, 2024 | cross-modal alignment, Multiple Object Tracking | Code Available | 2 |
| radarODE: An ODE-Embedded Deep Learning Model for Contactless ECG Reconstruction from Millimeter-Wave Radar | Aug 3, 2024 | Decoder | Code Available | 2 |
| 500xCompressor: Generalized Prompt Compression for Large Language Models | Aug 6, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| VERINA: Benchmarking Verifiable Code Generation | May 29, 2025 | Benchmarking, Code Generation | Code Available | 2 |
| MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents | Aug 8, 2024 | | Code Available | 2 |
| wav2graph: A Framework for Supervised Learning Knowledge Graph from Speech | Aug 8, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| Causal Agent based on Large Language Model | Aug 13, 2024 | Language Modeling, Language Modelling | Code Available | 2 |