| CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset | Oct 1, 2024 | Benchmarking, Contrastive Learning | Unverified | 0 |
| Exploring QUIC Dynamics: A Large-Scale Dataset for Encrypted Traffic Analysis | Sep 30, 2024 | Benchmarking, Intrusion Detection | Code Available | 1 |
| ImmersePro: End-to-End Stereo Video Synthesis Via Implicit Disparity Learning | Sep 30, 2024 | Benchmarking, Disparity Estimation | Code Available | 0 |
| Benchmarking Adaptive Intelligence and Computer Vision on Human-Robot Collaboration | Sep 30, 2024 | Benchmarking, Intent Detection | Unverified | 0 |
| Q-Bench-Video: Benchmarking the Video Quality Understanding of LMMs | Sep 30, 2024 | Benchmarking, Multiple-choice | Unverified | 0 |
| Match Stereo Videos via Bidirectional Alignment | Sep 30, 2024 | Benchmarking, Stereo Matching | Unverified | 0 |
| Beyond Prompts: Dynamic Conversational Benchmarking of Large Language Models | Sep 30, 2024 | Benchmarking, Continual Learning | Code Available | 2 |
| GenTel-Safe: A Unified Benchmark and Shielding Framework for Defending Against Prompt Injection Attacks | Sep 29, 2024 | Benchmarking | Unverified | 0 |
| Tracking Everything in Robotic-Assisted Surgery | Sep 29, 2024 | Benchmarking | Unverified | 0 |
| A Survey on Graph Neural Networks for Remaining Useful Life Prediction: Methodologies, Evaluation and Future Trends | Sep 29, 2024 | Benchmarking, graph construction | Code Available | 2 |
| AstroMLab 2: AstroLLaMA-2-70B Model and Benchmarking Specialised LLMs for Astronomy | Sep 29, 2024 | Astronomy, Benchmarking | Unverified | 0 |
| Constrained Reinforcement Learning for Safe Heat Pump Control | Sep 29, 2024 | Benchmarking, reinforcement-learning | Code Available | 0 |
| SciDoc2Diagrammer-MAF: Towards Generation of Scientific Diagrams from Documents guided by Multi-Aspect Feedback Refinement | Sep 28, 2024 | Benchmarking, Code Generation | Unverified | 0 |
| EarthquakeNPP: Benchmark Datasets for Earthquake Forecasting with Neural Point Processes | Sep 27, 2024 | Benchmarking, Dataset Generation | Unverified | 0 |
| bnRep: A repository of Bayesian networks from the academic literature | Sep 27, 2024 | Benchmarking | Unverified | 0 |
| CLLMate: A Multimodal Benchmark for Weather and Climate Events Forecasting | Sep 27, 2024 | Articles, Benchmarking | Unverified | 0 |
| MCUBench: A Benchmark of Tiny Object Detectors on MCUs | Sep 27, 2024 | Benchmarking, Model Selection | Unverified | 0 |
| Data Analysis in the Era of Generative AI | Sep 27, 2024 | Benchmarking | Unverified | 0 |
| Constructing Confidence Intervals for 'the' Generalization Error -- a Comprehensive Benchmark Study | Sep 27, 2024 | Benchmarking, tabular-regression | Code Available | 0 |
| ARLBench: Flexible and Efficient Benchmarking for Hyperparameter Optimization in Reinforcement Learning | Sep 27, 2024 | AutoML, Benchmarking | Code Available | 1 |
| The Elephant in the Room: Towards A Reliable Time-Series Anomaly Detection Benchmark | Sep 26, 2024 | Anomaly Detection, Benchmarking | Code Available | 3 |
| Conformal Prediction: A Theoretical Note and Benchmarking Transductive Node Classification in Graphs | Sep 26, 2024 | Benchmarking, Conformal Prediction | Code Available | 0 |
| MALPOLON: A Framework for Deep Species Distribution Modeling | Sep 26, 2024 | Benchmarking, GPU | Code Available | 1 |
| Omnibenchmark (alpha) for continuous and open benchmarking in bioinformatics | Sep 25, 2024 | Benchmarking | Unverified | 0 |
| Proof of Thought : Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning | Sep 25, 2024 | Benchmarking, Formal Logic | Unverified | 0 |