| Towards Sim-to-Real Industrial Parts Classification with Synthetic Dataset | Apr 12, 2024 | Benchmarking | Code Available | 1 |
| Implicit Multi-Spectral Transformer: An Lightweight and Effective Visible to Infrared Image Translation Model | Apr 10, 2024 | Benchmarking, Image-to-Image Translation | Code Available | 1 |
| AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents | Apr 9, 2024 | Benchmarking | Code Available | 1 |
| PARIS3D: Reasoning-based 3D Part Segmentation Using Large Multimodal Model | Apr 4, 2024 | 3D Part Segmentation, Benchmarking | Code Available | 1 |
| Outlier-Efficient Hopfield Layers for Large Transformer-Based Models | Apr 4, 2024 | Benchmarking, Quantization | Code Available | 1 |
| Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT | Apr 3, 2024 | Benchmarking, General Knowledge | Code Available | 1 |
| Atom-Level Optical Chemical Structure Recognition with Limited Supervision | Apr 2, 2024 | Benchmarking | Code Available | 1 |
| PREGO: online mistake detection in PRocedural EGOcentric videos | Apr 2, 2024 | Action Recognition, Benchmarking | Code Available | 1 |
| Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions | Mar 29, 2024 | Action Detection, Benchmarking | Code Available | 1 |
| Benchmarking Counterfactual Image Generation | Mar 29, 2024 | Benchmarking, Conditional Image Generation | Code Available | 1 |
| Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM | Mar 28, 2024 | Benchmarking | Code Available | 1 |
| RankMamba: Benchmarking Mamba's Document Ranking Performance in the Era of Transformers | Mar 27, 2024 | Benchmarking, Document Ranking | Code Available | 1 |
| ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object | Mar 27, 2024 | Benchmarking | Code Available | 1 |
| Towards Image Ambient Lighting Normalization | Mar 27, 2024 | Benchmarking, Image Restoration | Code Available | 1 |
| Benchmarking Object Detectors with COCO: A New Path Forward | Mar 27, 2024 | Benchmarking, Object | Code Available | 1 |
| ArabicaQA: A Comprehensive Dataset for Arabic Question Answering | Mar 26, 2024 | Benchmarking, Machine Reading Comprehension | Code Available | 1 |
| CodeS: Natural Language to Code Repository via Multi-Layer Sketch | Mar 25, 2024 | Benchmarking | Code Available | 1 |
| Addressing the generalization of 3D registration methods with a featureless baseline and an unbiased benchmark | Mar 23, 2024 | Benchmarking, Image to Point Cloud Registration | Code Available | 1 |
| DomainLab: A modular Python package for domain generalization in deep learning | Mar 21, 2024 | Benchmarking, Domain Generalization | Code Available | 1 |
| Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations | Mar 21, 2024 | Benchmarking, Memorization | Code Available | 1 |
| RoDLA: Benchmarking the Robustness of Document Layout Analysis Models | Mar 21, 2024 | Benchmarking, Document Layout Analysis | Code Available | 1 |
| Can 3D Vision-Language Models Truly Understand Natural Language? | Mar 21, 2024 | Benchmarking, Diversity | Code Available | 1 |
| Practical End-to-End Optical Music Recognition for Pianoform Music | Mar 20, 2024 | Benchmarking | Code Available | 1 |
| MELTing point: Mobile Evaluation of Language Transformers | Mar 19, 2024 | Benchmarking, Quantization | Code Available | 1 |
| ERASE: Benchmarking Feature Selection Methods for Deep Recommender Systems | Mar 19, 2024 | Benchmarking, feature selection | Code Available | 1 |