| VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning | Mar 19, 2024 | Benchmarking, Image Captioning | Code Available | 2 |
| Real-IAD: A Real-World Multi-View Dataset for Benchmarking Versatile Industrial Anomaly Detection | Mar 19, 2024 | Anomaly Detection, Benchmarking | Code Available | 3 |
| MELTing point: Mobile Evaluation of Language Transformers | Mar 19, 2024 | Benchmarking, Quantization | Code Available | 1 |
| Benchmarking Badminton Action Recognition with a New Fine-Grained Dataset | Mar 19, 2024 | Action Recognition, Benchmarking | Unverified | 0 |
| AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework | Mar 19, 2024 | Benchmarking, Financial Analysis | Code Available | 3 |
| ERASE: Benchmarking Feature Selection Methods for Deep Recommender Systems | Mar 19, 2024 | Benchmarking, feature selection | Code Available | 1 |
| Embarrassingly Simple Scribble Supervision for 3D Medical Segmentation | Mar 19, 2024 | Benchmarking, Segmentation | Unverified | 0 |
| OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety | Mar 18, 2024 | Benchmarking, Mathematical Reasoning | Unverified | 0 |
| NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens | Mar 18, 2024 | Benchmarking, Question Answering | Code Available | 1 |
| Align and Distill: Unifying and Improving Domain Adaptive Object Detection | Mar 18, 2024 | Benchmarking, object-detection | Code Available | 1 |