| Skinned Motion Retargeting with Dense Geometric Interaction Perception | Oct 28, 2024 | motion retargeting | Code Available | 2 |
| LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior | Oct 28, 2024 | Video Generation, Video Reconstruction | Code Available | 2 |
| BSD: a Bayesian framework for parametric models of neural spectra | Oct 28, 2024 | Bayesian Inference, EEG | Code Available | 2 |
| NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks | Oct 28, 2024 | Quantization | Code Available | 2 |
| Flaming-hot Initiation with Regular Execution Sampling for Large Language Models | Oct 28, 2024 | Diversity, Math | Code Available | 2 |
| Audio Deepfake Detection with Self-Supervised XLS-R and SLS Classifier | Oct 28, 2024 | Audio Deepfake Detection, Audio Generation | Code Available | 2 |
| Domain Adaptation with a Single Vision-Language Embedding | Oct 28, 2024 | Domain Adaptation, One-shot Unsupervised Domain Adaptation | Code Available | 2 |
| ODRL: A Benchmark for Off-Dynamics Reinforcement Learning | Oct 28, 2024 | Benchmarking, reinforcement-learning | Code Available | 2 |
| PaPaGei: Open Foundation Models for Optical Physiological Signals | Oct 27, 2024 | Contrastive Learning, Domain Generalization | Code Available | 2 |
| Accelerating Direct Preference Optimization with Prefix Sharing | Oct 27, 2024 | Computational Efficiency | Code Available | 2 |
| TabDiff: a Multi-Modal Diffusion Model for Tabular Data Generation | Oct 27, 2024 | Imputation, Tabular Data Generation | Code Available | 2 |
| GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation | Oct 27, 2024 | Image Generation, Text to Image Generation | Code Available | 2 |
| Wavelet-based Mamba with Fourier Adjustment for Low-light Image Enhancement | Oct 27, 2024 | Decoder, Image Enhancement | Code Available | 2 |
| Fast Best-of-N Decoding via Speculative Rejection | Oct 26, 2024 | | Code Available | 2 |
| emg2qwerty: A Large Dataset with Baselines for Touch Typing using Surface Electromyography | Oct 26, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| UniVST: A Unified Framework for Training-free Localized Video Style Transfer | Oct 26, 2024 | Style Transfer, Video Editing | Code Available | 2 |
| ResAD: A Simple Framework for Class Generalizable Anomaly Detection | Oct 26, 2024 | Anomaly Detection | Code Available | 2 |
| A Survey of Deep Graph Learning under Distribution Shifts: from Graph Out-of-Distribution Generalization to Adaptation | Oct 25, 2024 | Graph Learning, Out-of-Distribution Generalization | Code Available | 2 |
| Artificial Intelligence of Things: A Survey | Oct 25, 2024 | Survey | Code Available | 2 |
| MonoDGP: Monocular 3D Object Detection with Decoupled-Query and Geometry-Error Priors | Oct 25, 2024 | 3D Object Detection, Depth Estimation | Code Available | 2 |
| NeuroClips: Towards High-fidelity and Smooth fMRI-to-Video Reconstruction | Oct 25, 2024 | SSIM, Video Reconstruction | Code Available | 2 |
| OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization | Oct 25, 2024 | Imitation Learning | Code Available | 2 |
| CoqPilot, a plugin for LLM-based generation of proofs | Oct 25, 2024 | Benchmarking | Code Available | 2 |
| Double Difference Earthquake Location with Graph Neural Networks | Oct 25, 2024 | Graph Neural Network | Code Available | 2 |
| Model merging with SVD to tie the Knots | Oct 25, 2024 | model | Code Available | 2 |
| TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning | Oct 25, 2024 | EgoSchema, Hallucination | Code Available | 2 |
| Utilizing Image Transforms and Diffusion Models for Generative Modeling of Short and Long Time Series | Oct 25, 2024 | State Space Models, Time Series | Code Available | 2 |
| Moving Object Segmentation in Point Cloud Data using Hidden Markov Models | Oct 24, 2024 | Semantic Segmentation | Code Available | 2 |
| PixelGaussian: Generalizable 3D Gaussian Reconstruction from Arbitrary Views | Oct 24, 2024 | | Code Available | 2 |
| Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch | Oct 24, 2024 | Math, Mathematical Reasoning | Code Available | 2 |
| MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark | Oct 24, 2024 | | Code Available | 2 |
| Open6DOR: Benchmarking Open-instruction 6-DoF Object Rearrangement and A VLM-based Approach | Oct 24, 2024 | Benchmarking, Instruction Following | Code Available | 2 |
| Real-time 3D-aware Portrait Video Relighting | Oct 24, 2024 | NeRF | Code Available | 2 |
| Retrieval-Augmented Diffusion Models for Time Series Forecasting | Oct 24, 2024 | Denoising, Retrieval | Code Available | 2 |
| Probabilistic Language-Image Pre-Training | Oct 24, 2024 | | Code Available | 2 |
| LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search | Oct 24, 2024 | Clustering, GPU | Code Available | 2 |
| Context is Key: A Benchmark for Forecasting with Essential Textual Information | Oct 24, 2024 | Decision Making, Time Series | Code Available | 2 |
| Distill Visual Chart Reasoning Ability from LLMs to MLLMs | Oct 24, 2024 | Multimodal Reasoning, Visual Reasoning | Code Available | 2 |
| An Intelligent Agentic System for Complex Image Restoration Problems | Oct 23, 2024 | Image Restoration | Code Available | 2 |
| Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models | Oct 23, 2024 | Instruction Following, Language Modelling | Code Available | 2 |
| Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study | Oct 23, 2024 | | Code Available | 2 |
| LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering | Oct 23, 2024 | Chunking, Question Answering | Code Available | 2 |
| Rawsamble: Overlapping and Assembling Raw Nanopore Signals using a Hash-based Seeding Mechanism | Oct 23, 2024 | CPU | Code Available | 2 |
| MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models | Oct 23, 2024 | | Code Available | 2 |
| TabDPT: Scaling Tabular Foundation Models | Oct 23, 2024 | In-Context Learning, Self-Supervised Learning | Code Available | 2 |
| CARLA2Real: a tool for reducing the sim2real gap in CARLA simulator | Oct 23, 2024 | Autonomous Driving, Self-Driving Cars | Code Available | 2 |
| Improving Causal Reasoning in Large Language Models: A Survey | Oct 22, 2024 | Decision Making, Survey | Code Available | 2 |
| DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model | Oct 22, 2024 | Decoder, Instance Segmentation | Code Available | 2 |
| MiniPLM: Knowledge Distillation for Pre-Training Language Models | Oct 22, 2024 | Diversity, Knowledge Distillation | Code Available | 2 |
| PAPILLON: Privacy Preservation from Internet-based and Local Language Model Ensembles | Oct 22, 2024 | Language Modeling, Language Modelling | Code Available | 2 |