| On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning? | May 3, 2024 | Computational Efficiency, Prompt Learning | Code Available | 2 |
| FER-YOLO-Mamba: Facial Expression Detection and Classification Based on Selective State Space | May 3, 2024 | Facial Expression Recognition, Facial Expression Recognition (FER) | Code Available | 2 |
| Vibe-Eval: A hard evaluation suite for measuring progress of multimodal language models | May 3, 2024 | | Code Available | 2 |
| SCIMAP: A Python Toolkit for Integrated Spatial Analysis of Multiplexed Imaging Data | May 3, 2024 | | Code Available | 2 |
| Part-aware Shape Generation with Latent 3D Diffusion of Neural Voxel Fields | May 2, 2024 | Decoder | Code Available | 2 |
| FeNNol: an Efficient and Flexible Library for Building Force-field-enhanced Neural Network Potentials | May 2, 2024 | GPU | Code Available | 2 |
| Multi-Space Alignments Towards Universal LiDAR Segmentation | May 2, 2024 | Autonomous Driving, Diversity | Code Available | 2 |
| Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design | May 2, 2024 | Model Compression, Neural Network Compression | Code Available | 2 |
| EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion | May 2, 2024 | 3D Object Retrieval, Denoising | Code Available | 2 |
| SATO: Stable Text-to-Motion Framework | May 2, 2024 | | Code Available | 2 |
| A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law | May 2, 2024 | Diagnostic, Ethics | Code Available | 2 |
| Benchmarking Representations for Speech, Music, and Acoustic Events | May 2, 2024 | Audio Classification, Benchmarking | Code Available | 2 |
| LocInv: Localization-aware Inversion for Text-Guided Image Editing | May 2, 2024 | Denoising, text-guided-image-editing | Code Available | 2 |
| SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints | May 2, 2024 | Diversity, Drug Design | Code Available | 2 |
| SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising | May 2, 2024 | Computational Efficiency, Denoising | Code Available | 2 |
| MiniGPT-3D: Efficiently Aligning 3D Point Clouds with Large Language Models using 2D Priors | May 2, 2024 | 3D Object Captioning, 3D Object Classification | Code Available | 2 |
| Model Quantization and Hardware Acceleration for Vision Transformers: A Comprehensive Survey | May 1, 2024 | Quantization | Code Available | 2 |
| HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond | May 1, 2024 | Benchmarking, High-Level Synthesis | Code Available | 2 |
| Toward Unified Practices in Trajectory Prediction Research on Bird's-Eye-View Datasets | May 1, 2024 | Autonomous Vehicles, Motion Forecasting | Code Available | 2 |
| Adaptive Bidirectional Displacement for Semi-Supervised Medical Image Segmentation | May 1, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Spectrally Pruned Gaussian Fields with Neural Compensation | May 1, 2024 | | Code Available | 2 |
| ASAM: Boosting Segment Anything Model with Adversarial Tuning | May 1, 2024 | Image Segmentation, model | Code Available | 2 |
| WorkBench: a Benchmark Dataset for Agents in a Realistic Workplace Setting | May 1, 2024 | Scheduling | Code Available | 2 |
| GraCo: Granularity-Controllable Interactive Segmentation | May 1, 2024 | Interactive Segmentation, Segmentation | Code Available | 2 |
| Causal Evaluation of Language Models | May 1, 2024 | Causal Discovery, Causal Inference | Code Available | 2 |
| TFPred: Learning Discriminative Representations from Unlabeled Data for Few-Label Rotating Machinery Fault Diagnosis | May 1, 2024 | Fault Detection, Fault Diagnosis | Code Available | 2 |
| Training-free Graph Neural Networks and the Power of Labels as Features | Apr 30, 2024 | Node Classification | Code Available | 2 |
| LVOS: A Benchmark for Large-scale Long-term Video Object Segmentation | Apr 30, 2024 | Attribute, Semantic Segmentation | Code Available | 2 |
| Ultra Inertial Poser: Scalable Motion Capture and Tracking from Sparse Inertial Sensors and Ultra-Wideband Ranging | Apr 30, 2024 | Pose Estimation | Code Available | 2 |
| VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization | Apr 30, 2024 | Domain Adaptation, Domain Generalization | Code Available | 2 |
| Mixed Continuous and Categorical Flow Matching for 3D De Novo Molecule Generation | Apr 30, 2024 | 3D Molecule Generation | Code Available | 2 |
| MicroDreamer: Efficient 3D Generation in 20 Seconds by Score-based Iterative Reconstruction | Apr 30, 2024 | 3D Generation, 3D Reconstruction | Code Available | 2 |
| Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly | Apr 30, 2024 | Anomaly Detection | Code Available | 2 |
| CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation | Apr 30, 2024 | Mamba, State Space Models | Code Available | 2 |
| HLSTransform: Energy-Efficient Llama 2 Inference on FPGAs Via High Level Synthesis | Apr 29, 2024 | CPU, Edge-computing | Code Available | 2 |
| Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior | Apr 29, 2024 | Image Compression, Image Reconstruction | Code Available | 2 |
| 4D-DRESS: A 4D Dataset of Real-world Human Clothing with Semantic Annotations | Apr 29, 2024 | Human Parsing | Code Available | 2 |
| Benchmarking Benchmark Leakage in Large Language Models | Apr 29, 2024 | Benchmarking, Mathematical Reasoning | Code Available | 2 |
| TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation | Apr 29, 2024 | Denoising, Image Generation | Code Available | 2 |
| Unleashing the Power of Multi-Task Learning: A Comprehensive Survey Spanning Traditional, Deep, and Pretrained Foundation Model Eras | Apr 29, 2024 | Multi-Task Learning, Prognosis | Code Available | 2 |
| 3D Gaussian Splatting with Deferred Reflection | Apr 29, 2024 | Novel View Synthesis | Code Available | 2 |
| Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting | Apr 29, 2024 | | Code Available | 2 |
| Joint Signal Detection and Automatic Modulation Classification via Deep Learning | Apr 29, 2024 | Deep Learning | Code Available | 2 |
| Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations | Apr 29, 2024 | Retrieval, Text Retrieval | Code Available | 2 |
| How secure is AI-generated Code: A Large-Scale Comparison of Large Language Models | Apr 29, 2024 | Code Generation | Code Available | 2 |
| RSCaMa: Remote Sensing Image Change Captioning with State Space Model | Apr 29, 2024 | Decoder, Mamba | Code Available | 2 |
| PromptReps: Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval | Apr 29, 2024 | Document Ranking, Re-Ranking | Code Available | 2 |
| SIDBench: A Python Framework for Reliably Assessing Synthetic Image Detection Methods | Apr 29, 2024 | Benchmarking, Image Generation | Code Available | 2 |
| OpenStreetView-5M: The Many Roads to Global Visual Geolocation | Apr 29, 2024 | Photo geolocation estimation | Code Available | 2 |
| OAEI Machine Learning Dataset for Online Model Generation | Apr 29, 2024 | Graph Matching, model | Code Available | 2 |