| LLMeBench: A Flexible Framework for Accelerating LLMs Benchmarking | Aug 9, 2023 | Benchmarking, Few-Shot Learning | Code Available | 1 |
| Benchmarking LLM powered Chatbots: Methods and Metrics | Aug 8, 2023 | Benchmarking, Chatbot | Unverified | 0 |
| Application-Oriented Benchmarking of Quantum Generative Learning Using QUARK | Aug 8, 2023 | Benchmarking, GPU | Code Available | 1 |
| RECipe: Does a Multi-Modal Recipe Knowledge Graph Fit a Multi-Purpose Recommendation System? | Aug 8, 2023 | Benchmarking, Collaborative Filtering | Unverified | 0 |
| XFlow: Benchmarking Flow Behaviors over Graphs | Aug 7, 2023 | Benchmarking | Code Available | 1 |
| Microvasculature Segmentation in Human BioMolecular Atlas Program (HuBMAP) | Aug 6, 2023 | Benchmarking, Image Segmentation | Unverified | 0 |
| Precise Benchmarking of Explainable AI Attribution Methods | Aug 6, 2023 | Benchmarking, image-classification | Code Available | 0 |
| ChatGPT for GTFS: Benchmarking LLMs on GTFS Understanding and Retrieval | Aug 4, 2023 | Benchmarking, Information Retrieval | Code Available | 0 |
| RobustMQ: Benchmarking Robustness of Quantized Models | Aug 4, 2023 | Adversarial Robustness, Benchmarking | Unverified | 0 |
| A Survey of Spanish Clinical Language Models | Aug 4, 2023 | Benchmarking, Survey | Unverified | 0 |
| Benchmarking Adaptative Variational Quantum Algorithms on QUBO Instances | Aug 3, 2023 | Benchmarking | Unverified | 0 |
| qgym: A Gym for Training and Benchmarking RL-Based Quantum Compilation | Aug 1, 2023 | Benchmarking, OpenAI Gym | Code Available | 1 |
| Differential Privacy for Adaptive Weight Aggregation in Federated Tumor Segmentation | Aug 1, 2023 | Benchmarking, Brain Tumor Segmentation | Unverified | 0 |
| Benchmarking Ultra-High-Definition Image Reflection Removal | Aug 1, 2023 | Benchmarking, Image Restoration | Code Available | 0 |
| Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks | Aug 1, 2023 | Benchmarking | Unverified | 0 |
| CLAMS: A Cluster Ambiguity Measure for Estimating Perceptual Variability in Visual Clustering | Aug 1, 2023 | Benchmarking, Clustering | Unverified | 0 |
| VG-SSL: Benchmarking Self-supervised Representation Learning Approaches for Visual Geo-localization | Jul 31, 2023 | Autonomous Navigation, Autonomous Vehicles | Code Available | 1 |
| Deep Learning and Computer Vision for Glaucoma Detection: A Review | Jul 31, 2023 | Benchmarking, Deep Learning | Unverified | 0 |
| Benchmarking and Analyzing Robust Point Cloud Recognition: Bag of Tricks for Defending Adversarial Examples | Jul 31, 2023 | Adversarial Robustness, Benchmarking | Code Available | 1 |
| TMPNN: High-Order Polynomial Regression Based on Taylor Map Factorization | Jul 30, 2023 | Benchmarking, Multi-target regression | Code Available | 0 |
| SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension | Jul 30, 2023 | Benchmarking, Multiple-choice | Code Available | 2 |
| Rethinking Uncertainly Missing and Ambiguous Visual Modality in Multi-Modal Entity Alignment | Jul 30, 2023 | Benchmarking, Entity Alignment | Code Available | 1 |
| Benchmarking Offline Reinforcement Learning on Real-Robot Hardware | Jul 28, 2023 | Benchmarking, reinforcement-learning | Code Available | 1 |
| Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System | Jul 28, 2023 | Anomaly Detection, Autonomous Driving | Code Available | 0 |
| IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer | Jul 27, 2023 | Benchmarking, Image Manipulation | Code Available | 2 |
| Benchmarking Performance of Deep Learning Model for Material Segmentation on Two HPC Systems | Jul 27, 2023 | Benchmarking, GPU | Unverified | 0 |
| Quantitative Metrics for Benchmarking Human-Aware Robot Navigation | Jul 26, 2023 | Benchmarking, Robot Navigation | Code Available | 0 |
| YOLOBench: Benchmarking Efficient Object Detectors on Embedded Systems | Jul 26, 2023 | Benchmarking, CPU | Code Available | 0 |
| Fluorescent Neuronal Cells v2: Multi-Task, Multi-Format Annotations for Deep Learning in Microscopy | Jul 26, 2023 | Benchmarking, object-detection | Unverified | 0 |
| Foundational Models Defining a New Era in Vision: A Survey and Outlook | Jul 25, 2023 | Benchmarking | Code Available | 2 |
| Towards Long-Term predictions of Turbulence using Neural Operators | Jul 25, 2023 | Benchmarking | Unverified | 0 |
| When Multi-Task Learning Meets Partial Supervision: A Computer Vision Review | Jul 25, 2023 | Benchmarking, Multi-Task Learning | Code Available | 0 |
| UPREVE: An End-to-End Causal Discovery Benchmarking System | Jul 25, 2023 | Benchmarking, Causal Discovery | Unverified | 0 |
| Implementing and Benchmarking the Locally Competitive Algorithm on the Loihi 2 Neuromorphic Processor | Jul 25, 2023 | Benchmarking, CPU | Unverified | 0 |
| Benchmarking and Analyzing Generative Data for Visual Recognition | Jul 25, 2023 | Benchmarking, Retrieval | Unverified | 0 |
| Towards an AI Accountability Policy | Jul 25, 2023 | Benchmarking, Fairness | Unverified | 0 |
| The Impact of Genomic Variation on Function (IGVF) Consortium | Jul 24, 2023 | Benchmarking | Unverified | 0 |
| Remote Bio-Sensing: Open Source Benchmark Framework for Fair Evaluation of rPPG | Jul 24, 2023 | Benchmarking | Code Available | 2 |
| PLANTAIN: Diffusion-inspired Pose Score Minimization for Fast and Accurate Molecular Docking | Jul 22, 2023 | Benchmarking, Molecular Docking | Code Available | 1 |
| Selecting the motion ground truth for loose-fitting wearables: benchmarking optical MoCap methods | Jul 21, 2023 | Benchmarking | Code Available | 0 |
| JoinGym: An Efficient Query Optimization Environment for Reinforcement Learning | Jul 21, 2023 | Benchmarking, Combinatorial Optimization | Code Available | 1 |
| Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory | Jul 20, 2023 | Benchmarking, Decision Making | Code Available | 1 |
| The Extractive-Abstractive Axis: Measuring Content "Borrowing" in Generative Language Models | Jul 20, 2023 | Benchmarking | Unverified | 0 |
| SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models | Jul 20, 2023 | Benchmarking, Language Modeling | Code Available | 1 |
| Benchmarking Potential Based Rewards for Learning Humanoid Locomotion | Jul 19, 2023 | Benchmarking, Reinforcement Learning (RL) | Code Available | 2 |
| On the Real-Time Semantic Segmentation of Aphid Clusters in the Wild | Jul 17, 2023 | Benchmarking, Real-Time Semantic Segmentation | Unverified | 0 |
| Efficient Prediction of Peptide Self-assembly through Sequential and Graphical Encoding | Jul 17, 2023 | Benchmarking, Deep Learning | Code Available | 1 |
| Examining the Effects of Degree Distribution and Homophily in Graph Learning Models | Jul 17, 2023 | Benchmarking, Graph Clustering | Code Available | 1 |
| Towards Heterogeneous Long-tailed Learning: Benchmarking, Metrics, and Toolbox | Jul 17, 2023 | Benchmarking | Code Available | 1 |
| Approaches for benchmarking single-cell gene regulatory network inference methods | Jul 17, 2023 | Benchmarking | Unverified | 0 |