| pyrtklib: An open-source package for tightly coupled deep learning and GNSS integration for positioning in urban canyons | Sep 19, 2024 | Deep Learning | Code Available | 2 |
| HSIGene: A Foundation Model For Hyperspectral Image Generation | Sep 19, 2024 | Data Augmentation, Denoising | Code Available | 2 |
| Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization | Sep 19, 2024 | GPU, Language Modeling | Code Available | 2 |
| Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models | Sep 19, 2024 | Semantic Similarity, Semantic Textual Similarity | Code Available | 2 |
| GStex: Per-Primitive Texturing of 2D Gaussian Splatting for Decoupled Appearance and Geometry Modeling | Sep 19, 2024 | Novel View Synthesis | Code Available | 2 |
| Towards Interactive and Learnable Cooperative Driving Automation: a Large Language Model-Driven Decision-Making Framework | Sep 19, 2024 | Autonomous Vehicles, Decision Making | Code Available | 2 |
| AutoVerus: Automated Proof Generation for Rust Code | Sep 19, 2024 | Code Generation, Language Modeling | Code Available | 2 |
| PhysMamba: Efficient Remote Physiological Measurement with SlowFast Temporal Difference Mamba | Sep 18, 2024 | Mamba, State Space Models | Code Available | 2 |
| Recent Advances in OOD Detection: Problems and Approaches | Sep 18, 2024 | Out-of-Distribution Detection, Out of Distribution (OOD) Detection | Code Available | 2 |
| TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning | Sep 18, 2024 | Fact Verification, Question Answering | Code Available | 2 |
| RockTrack: A 3D Robust Multi-Camera-Ken Multi-Object Tracking Framework | Sep 18, 2024 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| Large Language Models are Strong Audio-Visual Speech Recognition Learners | Sep 18, 2024 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 2 |
| Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks | Sep 18, 2024 | 3DGS, Segmentation | Code Available | 2 |
| A Controlled Study on Long Context Extension and Generalization in LLMs | Sep 18, 2024 | In-Context Learning | Code Available | 2 |
| Vista3D: Unravel the 3D Darkside of a Single Image | Sep 18, 2024 | 3D Generation, Diversity | Code Available | 2 |
| All-in-one foundational models learning across quantum chemical levels | Sep 18, 2024 | All, Cloud Computing | Code Available | 2 |
| Guess What I Think: Streamlined EEG-to-Image Generation with Latent Diffusion Models | Sep 17, 2024 | Brain Computer Interface, EEG | Code Available | 2 |
| Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse | Sep 17, 2024 | In-Context Learning, RAG | Code Available | 2 |
| A mmWave Software-Defined Array Platform for Wireless Experimentation at 24-29.5 GHz | Sep 17, 2024 | | Code Available | 2 |
| Advances in APPFL: A Comprehensive and Extensible Federated Learning Framework | Sep 17, 2024 | Benchmarking, Federated Learning | Code Available | 2 |
| SkinMamba: A Precision Skin Lesion Segmentation Architecture with Cross-Scale Global State Modeling and Frequency Boundary Guidance | Sep 17, 2024 | Decoder, Lesion Segmentation | Code Available | 2 |
| Multi-Domain Data Aggregation for Axon and Myelin Segmentation in Histology Images | Sep 17, 2024 | Segmentation | Code Available | 2 |
| Multi-Document Grounded Multi-Turn Synthetic Dialog Generation | Sep 17, 2024 | | Code Available | 2 |
| Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models | Sep 17, 2024 | Information Retrieval, Retrieval | Code Available | 2 |
| SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction | Sep 17, 2024 | 3DGS, 4D reconstruction | Code Available | 2 |
| BAD: Bidirectional Auto-regressive Diffusion for Text-to-Motion Generation | Sep 17, 2024 | Human motion prediction, Motion Forecasting | Code Available | 2 |
| Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search | Sep 16, 2024 | Quantization | Code Available | 2 |
| Fit and Prune: Fast and Training-free Visual Token Pruning for Multi-modal Large Language Models | Sep 16, 2024 | | Code Available | 2 |
| On the Diagram of Thought | Sep 16, 2024 | | Code Available | 2 |
| WaveMixSR-V2: Enhancing Super-resolution with Higher Efficiency | Sep 16, 2024 | Image Super-Resolution, Super-Resolution | Code Available | 2 |
| beeFormer: Bridging the Gap Between Semantic and Interaction Similarity in Recommender Systems | Sep 16, 2024 | Collaborative Filtering, Recommendation Systems | Code Available | 2 |
| Mamba-ST: State Space Model for Efficient Style Transfer | Sep 16, 2024 | Mamba, Style Transfer | Code Available | 2 |
| RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval | Sep 16, 2024 | CPU, GPU | Code Available | 2 |
| OPUS: Occupancy Prediction Using a Sparse Set | Sep 14, 2024 | Autonomous Driving, Prediction | Code Available | 2 |
| One missing piece in Vision and Language: A Survey on Comics Understanding | Sep 14, 2024 | document understanding, image-classification | Code Available | 2 |
| Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology | Sep 14, 2024 | Inductive Bias, Prognosis | Code Available | 2 |
| Target Speaker ASR with Whisper | Sep 14, 2024 | Speech Separation | Code Available | 2 |
| SafeEar: Content Privacy-Preserving Audio Deepfake Detection | Sep 14, 2024 | Audio Deepfake Detection, DeepFake Detection | Code Available | 2 |
| Comparing Retrieval-Augmentation and Parameter-Efficient Fine-Tuning for Privacy-Preserving Personalization of Large Language Models | Sep 14, 2024 | parameter-efficient fine-tuning, Privacy Preserving | Code Available | 2 |
| Self-Prompting Polyp Segmentation in Colonoscopy using Hybrid Yolo-SAM 2 Model | Sep 14, 2024 | Medical Image Segmentation, Polyp Segmentation | Code Available | 2 |
| Agents in Software Engineering: Survey, Landscape, and Vision | Sep 13, 2024 | Survey | Code Available | 2 |
| Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions | Sep 13, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| PiEEG-16 to Measure 16 EEG Channels with Raspberry Pi for Brain-Computer Interfaces and EEG devices | Sep 13, 2024 | Brain Computer Interface, EEG | Code Available | 2 |
| https://arxiv.org/pdf/2409.07491 | Sep 13, 2024 | Brain Computer Interface, EEG | Code Available | 2 |
| Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection | Sep 13, 2024 | Mamba, Open Vocabulary Object Detection | Code Available | 2 |
| S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training | Sep 13, 2024 | Quantization | Code Available | 2 |
| Hierarchical Hypercomplex Network for Multimodal Emotion Recognition | Sep 13, 2024 | Emotion Recognition, Multimodal Emotion Recognition | Code Available | 2 |
| FLAMO: An Open-Source Library for Frequency-Domain Differentiable Audio Processing | Sep 13, 2024 | | Code Available | 2 |
| LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment | Sep 13, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| VAE Explainer: Supplement Learning Variational Autoencoders with Interactive Visualization | Sep 13, 2024 | Math | Code Available | 2 |