| Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces | May 23, 2023 | Benchmarking | Code Available | 1 |
| Towards Benchmarking and Assessing Visual Naturalness of Physical World Adversarial Attacks | May 22, 2023 | Adversarial Attack, Autonomous Driving | Code Available | 1 |
| Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method | May 22, 2023 | Benchmarking, Hallucination | Code Available | 1 |
| X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models | May 18, 2023 | Benchmarking, Image Generation | Code Available | 1 |
| PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering | May 17, 2023 | Benchmarking, Diagnostic | Code Available | 1 |
| An Empirical Study on Google Research Football Multi-agent Scenarios | May 16, 2023 | Benchmarking, Multi-agent Reinforcement Learning | Code Available | 1 |
| A Platform for the Biomedical Application of Large Language Models | May 10, 2023 | Benchmarking, Privacy Preserving | Code Available | 1 |
| InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation | May 10, 2023 | Benchmarking, Image Captioning | Code Available | 1 |
| Benchmarking large language models for biomedical natural language processing applications and recommendations | May 10, 2023 | Benchmarking, Document Classification | Code Available | 1 |
| DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects | May 9, 2023 | Benchmarking, Decision Making | Code Available | 1 |
| Working Memory Capacity of ChatGPT: An Empirical Study | Apr 30, 2023 | Benchmarking, Language Modeling | Code Available | 1 |
| Event-Free Moving Object Segmentation from Moving Ego Vehicle | Apr 28, 2023 | Autonomous Driving, Benchmarking | Code Available | 1 |
| IMUPoser: Full-Body Pose Estimation using IMUs in Phones, Watches, and Earbuds | Apr 25, 2023 | Benchmarking, Pose Estimation | Code Available | 1 |
| MF-NeRF: Memory Efficient NeRF with Mixed-Feature Hash Table | Apr 25, 2023 | Benchmarking, GPU | Code Available | 1 |
| RGB-D Indiscernible Object Counting in Underwater Scenes | Apr 23, 2023 | Benchmarking, Depth Estimation | Code Available | 1 |
| Benchmarking Low-Shot Robustness to Natural Distribution Shifts | Apr 21, 2023 | Benchmarking | Code Available | 1 |
| SCoDA: Domain Adaptive Shape Completion for Real Scans | Apr 20, 2023 | Benchmarking, Domain Adaptation | Code Available | 1 |
| Graph Neural Network-Based Anomaly Detection for River Network Systems | Apr 19, 2023 | Anomaly Detection, Benchmarking | Code Available | 1 |
| Benchmarking Actor-Critic Deep Reinforcement Learning Algorithms for Robotics Control with Action Constraints | Apr 18, 2023 | Benchmarking, Deep Reinforcement Learning | Code Available | 1 |
| A Comparison of Image Denoising Methods | Apr 18, 2023 | Benchmarking, Denoising | Code Available | 1 |
| NeuroBench: A Framework for Benchmarking Neuromorphic Computing Algorithms and Systems | Apr 10, 2023 | Benchmarking | Code Available | 1 |
| Interpretable statistical representations of neural population dynamics and geometry | Apr 6, 2023 | Benchmarking, Decision Making | Code Available | 1 |
| MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding | Apr 5, 2023 | Benchmarking, MS-SSIM | Code Available | 1 |
| SLPerf: a Unified Framework for Benchmarking Split Learning | Apr 4, 2023 | Benchmarking, Diversity | Code Available | 1 |
| Spam-T5: Benchmarking Large Language Models for Few-Shot Email Spam Detection | Apr 3, 2023 | Benchmarking, Sentence | Code Available | 1 |