| LLMeBench: A Flexible Framework for Accelerating LLMs Benchmarking | Aug 9, 2023 | Benchmarking, Few-Shot Learning | Code Available | 1 |
| Benchmarking LLM powered Chatbots: Methods and Metrics | Aug 8, 2023 | Benchmarking, Chatbot | Unverified | 0 |
| Application-Oriented Benchmarking of Quantum Generative Learning Using QUARK | Aug 8, 2023 | Benchmarking, GPU | Code Available | 1 |
| RECipe: Does a Multi-Modal Recipe Knowledge Graph Fit a Multi-Purpose Recommendation System? | Aug 8, 2023 | Benchmarking, Collaborative Filtering | Unverified | 0 |
| XFlow: Benchmarking Flow Behaviors over Graphs | Aug 7, 2023 | Benchmarking | Code Available | 1 |
| Microvasculature Segmentation in Human BioMolecular Atlas Program (HuBMAP) | Aug 6, 2023 | Benchmarking, Image Segmentation | Unverified | 0 |
| Precise Benchmarking of Explainable AI Attribution Methods | Aug 6, 2023 | Benchmarking, image-classification | Code Available | 0 |
| ChatGPT for GTFS: Benchmarking LLMs on GTFS Understanding and Retrieval | Aug 4, 2023 | Benchmarking, Information Retrieval | Code Available | 0 |
| RobustMQ: Benchmarking Robustness of Quantized Models | Aug 4, 2023 | Adversarial Robustness, Benchmarking | Unverified | 0 |
| A Survey of Spanish Clinical Language Models | Aug 4, 2023 | Benchmarking, Survey | Unverified | 0 |
| Benchmarking Adaptative Variational Quantum Algorithms on QUBO Instances | Aug 3, 2023 | Benchmarking | Unverified | 0 |
| qgym: A Gym for Training and Benchmarking RL-Based Quantum Compilation | Aug 1, 2023 | Benchmarking, OpenAI Gym | Code Available | 1 |
| Differential Privacy for Adaptive Weight Aggregation in Federated Tumor Segmentation | Aug 1, 2023 | Benchmarking, Brain Tumor Segmentation | Unverified | 0 |
| Benchmarking Ultra-High-Definition Image Reflection Removal | Aug 1, 2023 | Benchmarking, Image Restoration | Code Available | 0 |
| Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks | Aug 1, 2023 | Benchmarking | Unverified | 0 |
| CLAMS: A Cluster Ambiguity Measure for Estimating Perceptual Variability in Visual Clustering | Aug 1, 2023 | Benchmarking, Clustering | Unverified | 0 |
| VG-SSL: Benchmarking Self-supervised Representation Learning Approaches for Visual Geo-localization | Jul 31, 2023 | Autonomous Navigation, Autonomous Vehicles | Code Available | 1 |
| Deep Learning and Computer Vision for Glaucoma Detection: A Review | Jul 31, 2023 | Benchmarking, Deep Learning | Unverified | 0 |
| Benchmarking and Analyzing Robust Point Cloud Recognition: Bag of Tricks for Defending Adversarial Examples | Jul 31, 2023 | Adversarial Robustness, Benchmarking | Code Available | 1 |
| TMPNN: High-Order Polynomial Regression Based on Taylor Map Factorization | Jul 30, 2023 | Benchmarking, Multi-target regression | Code Available | 0 |
| SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension | Jul 30, 2023 | Benchmarking, Multiple-choice | Code Available | 2 |
| Rethinking Uncertainly Missing and Ambiguous Visual Modality in Multi-Modal Entity Alignment | Jul 30, 2023 | Benchmarking, Entity Alignment | Code Available | 1 |
| Benchmarking Offline Reinforcement Learning on Real-Robot Hardware | Jul 28, 2023 | Benchmarking, reinforcement-learning | Code Available | 1 |
| Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System | Jul 28, 2023 | Anomaly Detection, Autonomous Driving | Code Available | 0 |
| IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer | Jul 27, 2023 | Benchmarking, Image Manipulation | Code Available | 2 |