| https://arxiv.org/pdf/2409.07491 | Sep 13, 2024 | Brain Computer Interface, EEG | Code Available | 2 |
| Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection | Sep 13, 2024 | Mamba, Open Vocabulary Object Detection | Code Available | 2 |
| PiEEG-16 to Measure 16 EEG Channels with Raspberry Pi for Brain-Computer Interfaces and EEG devices | Sep 13, 2024 | Brain Computer Interface, EEG | Code Available | 2 |
| Integrating Neural Operators with Diffusion Models Improves Spectral Representation in Turbulence Modeling | Sep 13, 2024 | Computational Efficiency | Code Available | 2 |
| PHemoNet: A Multimodal Network for Physiological Signals | Sep 13, 2024 | Brain Computer Interface, EEG | Code Available | 2 |
| Ruri: Japanese General Text Embeddings | Sep 12, 2024 | Knowledge Distillation | Code Available | 2 |
| DSBench: How Far Are Data Science Agents to Becoming Data Science Experts? | Sep 12, 2024 | | Code Available | 2 |
| Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis | Sep 12, 2024 | Novel View Synthesis | Code Available | 2 |
| TSELM: Target Speaker Extraction using Discrete Tokens and Language Models | Sep 12, 2024 | Audio Generation, Target Speaker Extraction | Code Available | 2 |
| SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer | Sep 12, 2024 | Target Sound Extraction | Code Available | 2 |
| ProbTalk3D: Non-Deterministic Emotion Controllable Speech-Driven 3D Facial Animation Synthesis Using VQ-VAE | Sep 12, 2024 | | Code Available | 2 |
| EZIGen: Enhancing zero-shot personalized image generation with precise subject encoding and decoupled guidance | Sep 12, 2024 | Denoising, Image Generation | Code Available | 2 |
| What is the Relationship between Tensor Factorizations and Circuits (and How Can We Exploit it)? | Sep 12, 2024 | | Code Available | 2 |
| TextBoost: Towards One-Shot Personalization of Text-to-Image Models via Fine-tuning Text Encoder | Sep 12, 2024 | Diffusion Personalization, Disentanglement | Code Available | 2 |
| Deep Height Decoupling for Precise Vision-based 3D Occupancy Prediction | Sep 12, 2024 | 3D geometry | Code Available | 2 |
| Super Monotonic Alignment Search | Sep 12, 2024 | CPU, GPU | Code Available | 2 |
| Improving Text-guided Object Inpainting with Semantic Pre-inpainting | Sep 12, 2024 | Denoising, Object | Code Available | 2 |
| MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Sep 11, 2024 | Autonomous Driving, Feature Engineering | Code Available | 2 |
| Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models | Sep 11, 2024 | Denoising, Disentanglement | Code Available | 2 |
| 1M-Deepfakes Detection Challenge | Sep 11, 2024 | DeepFake Detection, Face Swapping | Code Available | 2 |
| Synthetic continued pretraining | Sep 11, 2024 | Data Augmentation, Language Modelling | Code Available | 2 |
| Recent Trends of Multimodal Affective Computing: A Survey from NLP Perspective | Sep 11, 2024 | Aspect-Based Sentiment Analysis, Emotion Recognition | Code Available | 2 |
| SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis | Sep 11, 2024 | Decoder, Speech Synthesis | Code Available | 2 |
| ThermalGaussian: Thermal 3D Gaussian Splatting | Sep 11, 2024 | 3DGS, NeRF | Code Available | 2 |
| HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning | Sep 11, 2024 | Large Language Model | Code Available | 2 |
| What is the Role of Small Models in the LLM Era: A Survey | Sep 10, 2024 | | Code Available | 2 |
| Towards Generalizable Scene Change Detection | Sep 10, 2024 | Change Detection, Scene Change Detection | Code Available | 2 |
| SaRA: High-Efficient Diffusion Model Fine-tuning with Progressive Sparse Low-Rank Adaptation | Sep 10, 2024 | Video Generation | Code Available | 2 |
| PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation | Sep 10, 2024 | | Code Available | 2 |
| DetailCLIP: Detail-Oriented CLIP for Fine-Grained Tasks | Sep 10, 2024 | Contrastive Learning, Image Reconstruction | Code Available | 2 |
| EyeCLIP: A visual-language foundation model for multi-modal ophthalmic image analysis | Sep 10, 2024 | Contrastive Learning, Cross-Modal Retrieval | Code Available | 2 |
| Learning Generative Interactive Environments By Trained Agent Exploration | Sep 10, 2024 | | Code Available | 2 |
| TransformerRanker: A Tool for Efficiently Finding the Best-Suited Language Models for Downstream Classification Tasks | Sep 9, 2024 | Classification, Language Modeling | Code Available | 2 |
| FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations | Sep 9, 2024 | Federated Learning, Privacy Preserving | Code Available | 2 |
| IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS | Sep 9, 2024 | Denoising, Speech Enhancement | Code Available | 2 |
| GASP: Gaussian Splatting for Physic-Based Simulations | Sep 9, 2024 | | Code Available | 2 |
| Revisiting the Solution of Meta KDD Cup 2024: CRAG | Sep 9, 2024 | RAG, Retrieval | Code Available | 2 |
| Assessing SPARQL capabilities of Large Language Models | Sep 9, 2024 | Benchmarking, Knowledge Graphs | Code Available | 2 |
| DiffusionPen: Towards Controlling the Style of Handwritten Text Generation | Sep 9, 2024 | Diversity, HTR | Code Available | 2 |
| PiEEG-16 to Measure 16 EEG Channels with Raspberry Pi for Brain-Computer Interfaces and EEG devices | Sep 8, 2024 | Brain Computer Interface, EEG | Code Available | 2 |
| A Survey on Mixup Augmentations and Beyond | Sep 8, 2024 | Image Classification, Self-Supervised Learning | Code Available | 2 |
| A Survey on Diffusion Models for Recommender Systems | Sep 8, 2024 | Data Augmentation, Recommendation Systems | Code Available | 2 |
| The first Cadenza challenges: using machine learning competitions to improve music for listeners with a hearing loss | Sep 8, 2024 | | Code Available | 2 |
| A Pair Programming Framework for Code Generation via Multi-Plan Exploration and Feedback-Driven Refinement | Sep 8, 2024 | Code Generation | Code Available | 2 |
| OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs | Sep 8, 2024 | Entity Linking, RAG | Code Available | 2 |
| Evaluating Neural Networks Architectures for Spring Reverb Modelling | Sep 8, 2024 | | Code Available | 2 |
| FedModule: A Modular Federated Learning Framework | Sep 7, 2024 | Federated Learning, Personalized Federated Learning | Code Available | 2 |
| A Comprehensive Survey on Evidential Deep Learning and Its Applications | Sep 7, 2024 | Autonomous Driving, Deep Learning | Code Available | 2 |
| forester: A Tree-Based AutoML Tool in R | Sep 7, 2024 | AutoML, Survival Analysis | Code Available | 2 |
| GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers | Sep 6, 2024 | 3DGS, 3D human pose and shape estimation | Code Available | 2 |