| WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking | Nov 14, 2024 | Benchmarking, Drug Discovery | Unverified | 0 |
| A survey of probabilistic generative frameworks for molecular simulations | Nov 14, 2024 | Benchmarking, Denoising | Code Available | 0 |
| Caravan MultiMet: Extending Caravan with Multiple Weather Nowcasts and Forecasts | Nov 14, 2024 | Benchmarking | Code Available | 3 |
| BEARD: Benchmarking the Adversarial Robustness for Dataset Distillation | Nov 14, 2024 | Adversarial Attack, Adversarial Robustness | Code Available | 0 |
| Anomaly Detection in Large-Scale Cloud Systems: An Industry Case and Dataset | Nov 13, 2024 | Anomaly Detection, Benchmarking | Code Available | 0 |
| A Survey on Vision Autoregressive Model | Nov 13, 2024 | 3D Generation, Benchmarking | Unverified | 0 |
| HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere | Nov 13, 2024 | Benchmarking, Dataset Generation | Unverified | 0 |
| FM-TS: Flow Matching for Time Series Generation | Nov 12, 2024 | Benchmarking, Imputation | Code Available | 1 |
| Evaluating the Generation of Spatial Relations in Text and Image Generative Models | Nov 12, 2024 | Benchmarking, Image Generation | Unverified | 0 |
| Retrieval or Global Context Understanding? On Many-Shot In-Context Learning for Long-Context Evaluation | Nov 11, 2024 | 16k, Benchmarking | Code Available | 0 |
| BuckTales: A multi-UAV dataset for multi-object tracking and re-identification of wild antelopes | Nov 11, 2024 | Benchmarking, Multi-Object Tracking | Unverified | 0 |
| General Geospatial Inference with a Population Dynamics Foundation Model | Nov 11, 2024 | Benchmarking, Graph Neural Network | Code Available | 3 |
| Benchmarking LLMs' Judgments with No Gold Standard | Nov 11, 2024 | Benchmarking, Machine Translation | Code Available | 0 |
| Arctique: An artificial histopathological dataset unifying realism and controllability for uncertainty quantification | Nov 11, 2024 | Benchmarking, Image Segmentation | Code Available | 1 |
| MolMiner: Towards Controllable, 3D-Aware, Fragment-Based Molecular Design | Nov 10, 2024 | 3D geometry, Benchmarking | Unverified | 0 |
| Low Dynamic Range for RIS-aided Bistatic Integrated Sensing and Communication | Nov 9, 2024 | Benchmarking, Integrated sensing and communication | Unverified | 0 |
| Benchmarking 3D multi-coil NC-PDNet MRI reconstruction | Nov 8, 2024 | 3D Reconstruction, Benchmarking | Unverified | 0 |
| FactLens: Benchmarking Fine-Grained Fact Verification | Nov 8, 2024 | Benchmarking, Fact Verification | Unverified | 0 |
| Open-set object detection: towards unified problem formulation and benchmarking | Nov 8, 2024 | Autonomous Driving, Benchmarking | Unverified | 0 |
| Benchmarking Distributional Alignment of Large Language Models | Nov 8, 2024 | Benchmarking | Code Available | 0 |
| A Retrospective on the Robot Air Hockey Challenge: Benchmarking Robust, Reliable, and Safe Learning Techniques for Real-world Robotics | Nov 8, 2024 | Benchmarking | Unverified | 0 |
| ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding | Nov 7, 2024 | Benchmarking, Multiple-choice | Unverified | 0 |
| Performance-Guided LLM Knowledge Distillation for Efficient Text Classification at Scale | Nov 7, 2024 | Active Learning, Benchmarking | Unverified | 0 |
| Deep Learning Models for UAV-Assisted Bridge Inspection: A YOLO Benchmark Analysis | Nov 7, 2024 | Benchmarking, Model Selection | Unverified | 0 |
| HandCraft: Anatomically Correct Restoration of Malformed Hands in Diffusion Generated Images | Nov 7, 2024 | Anatomy, Benchmarking | Unverified | 0 |