| Constellation Dataset: Benchmarking High-Altitude Object Detection for an Urban Intersection | Apr 25, 2024 | Benchmarking, object-detection | Code Available | 1 |
| SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension | Apr 25, 2024 | Benchmarking, Multiple-choice | Code Available | 3 |
| Benchmarking Mobile Device Control Agents across Diverse Configurations | Apr 25, 2024 | Benchmarking, Imitation Learning | Unverified | 0 |
| ApisTox: a new benchmark dataset for the classification of small molecules toxicity on honey bees | Apr 24, 2024 | Benchmarking, Molecular Property Prediction | Code Available | 0 |
| SynthEval: A Framework for Detailed Utility and Privacy Evaluation of Tabular Synthetic Data | Apr 24, 2024 | Benchmarking, Fairness | Code Available | 1 |
| ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction | Apr 24, 2024 | Attribute, Attribute Value Extraction | Code Available | 1 |
| DPO: A Differential and Pointwise Control Approach to Reinforcement Learning | Apr 24, 2024 | Benchmarking, reinforcement-learning | Unverified | 0 |
| Empirical Analysis of the Dynamic Binary Value Problem with IOHprofiler | Apr 24, 2024 | Benchmarking | Unverified | 0 |
| Importance of Disjoint Sampling in Conventional and Transformer Models for Hyperspectral Image Classification | Apr 23, 2024 | Benchmarking, Hyperspectral Image Classification | Code Available | 0 |
| Open Datasets for Satellite Radio Resource Control | Apr 22, 2024 | Benchmarking, Decision Making | Unverified | 0 |
| Benchmarking Advanced Text Anonymisation Methods: A Comparative Study on Novel and Traditional Approaches | Apr 22, 2024 | Benchmarking, Diversity | Unverified | 0 |
| Experimental Validation of Ultrasound Beamforming with End-to-End Deep Learning for Single Plane Wave Imaging | Apr 22, 2024 | Benchmarking | Code Available | 1 |
| The Adversarial AI-Art: Understanding, Generation, Detection, and Benchmarking | Apr 22, 2024 | Benchmarking, Misinformation | Unverified | 0 |
| A User-Centric Multi-Intent Benchmark for Evaluating Large Language Models | Apr 22, 2024 | Benchmarking, World Knowledge | Code Available | 1 |
| EnzChemRED, a rich enzyme chemistry relation extraction dataset | Apr 22, 2024 | Benchmarking, named-entity-recognition | Unverified | 0 |
| TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos | Apr 22, 2024 | Benchmarking, Multi-Object Tracking | Unverified | 0 |
| TAVGBench: Benchmarking Text to Audible-Video Generation | Apr 22, 2024 | Benchmarking, Contrastive Learning | Code Available | 1 |
| In-situ process monitoring and adaptive quality enhancement in laser additive manufacturing: a critical review | Apr 21, 2024 | Benchmarking, Decision Making | Unverified | 0 |
| Authentic Emotion Mapping: Benchmarking Facial Expressions in Real News | Apr 21, 2024 | Benchmarking, Emotion Recognition | Code Available | 0 |
| Bridging the Gap Between Theory and Practice: Benchmarking Transfer Evolutionary Optimization | Apr 20, 2024 | Benchmarking | Unverified | 0 |
| DeepFake-O-Meter v2.0: An Open Platform for DeepFake Detection | Apr 19, 2024 | Benchmarking, DeepFake Detection | Code Available | 3 |
| Integrated Sensing and Communication enabled Multiple Base Stations Cooperative UAV Detection | Apr 19, 2024 | Benchmarking, Integrated sensing and communication | Unverified | 0 |
| STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases | Apr 19, 2024 | Benchmarking, Retrieval | Code Available | 3 |
| Look Before You Decide: Prompting Active Deduction of MLLMs for Assumptive Reasoning | Apr 19, 2024 | Benchmarking, counterfactual | Unverified | 0 |
| REXEL: An End-to-end Model for Document-Level Relation Extraction and Entity Linking | Apr 19, 2024 | Benchmarking, coreference-resolution | Code Available | 1 |