| SoK: Systematization and Benchmarking of Deepfake Detectors in a Unified Framework | Jan 9, 2024 | Benchmarking, DeepFake Detection | Unverified | 0 |
| Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset | Jan 9, 2024 | Benchmarking, image-classification | Unverified | 0 |
| MST: Adaptive Multi-Scale Tokens Guided Interactive Segmentation | Jan 9, 2024 | Benchmarking, Interactive Segmentation | Code Available | 0 |
| TransportationGames: Benchmarking Transportation Knowledge of (Multimodal) Large Language Models | Jan 9, 2024 | Benchmarking | Unverified | 0 |
| Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning | Jan 8, 2024 | Benchmarking, CoLA | Unverified | 0 |
| Attention versus Contrastive Learning of Tabular Data -- A Data-centric Benchmarking | Jan 8, 2024 | Benchmarking, Contrastive Learning | Unverified | 0 |
| Global Prediction of COVID-19 Variant Emergence Using Dynamics-Informed Graph Neural Networks | Jan 7, 2024 | Benchmarking, Graph Neural Network | Code Available | 0 |
| Segment Anything Model for Medical Image Segmentation: Current Applications and Future Directions | Jan 7, 2024 | Benchmarking, Image Segmentation | Code Available | 5 |
| NovelGym: A Flexible Ecosystem for Hybrid Planning and Learning Agents Designed for Open Worlds | Jan 7, 2024 | Autonomous Vehicles, Benchmarking | Unverified | 0 |
| CAVIAR: Co-simulation of 6G Communications, 3D Scenarios and AI for Digital Twins | Jan 6, 2024 | Autonomous Vehicles, Benchmarking | Code Available | 1 |
| Using Multi-Temporal Sentinel-1 and Sentinel-2 data for water bodies mapping | Jan 5, 2024 | Benchmarking | Unverified | 0 |
| German Text Embedding Clustering Benchmark | Jan 5, 2024 | Benchmarking, Clustering | Code Available | 1 |
| Benchmarking PathCLIP for Pathology Image Analysis | Jan 5, 2024 | Benchmarking, Decision Making | Unverified | 0 |
| Enhancing 3D-Air Signature by Pen Tip Tail Trajectory Awareness: Dataset and Featuring by Novel Spatio-temporal CNN | Jan 5, 2024 | Benchmarking | Code Available | 0 |
| Nodule detection and generation on chest X-rays: NODE21 Challenge | Jan 4, 2024 | Benchmarking | Unverified | 0 |
| AstroLLaMA-Chat: Scaling AstroLLaMA with Conversational and Diverse Datasets | Jan 3, 2024 | Astronomy, Benchmarking | Unverified | 0 |
| Benchmarking Audio Visual Segmentation for Long-Untrimmed Videos | Jan 1, 2024 | Benchmarking | Unverified | 0 |
| Hyperbolic Anomaly Detection | Jan 1, 2024 | Anomaly Detection, Benchmarking | Unverified | 0 |
| AM-RADIO: Agglomerative Vision Foundation Model Reduce All Domains Into One | Jan 1, 2024 | All, Benchmarking | Unverified | 0 |
| FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning | Jan 1, 2024 | Benchmarking, Federated Learning | Unverified | 0 |
| A Call to Reflect on Evaluation Practices for Age Estimation: Comparative Analysis of the State-of-the-Art and a Unified Benchmark | Jan 1, 2024 | Age Estimation, Benchmarking | Code Available | 2 |
| Sheared Backpropagation for Fine-tuning Foundation Models | Jan 1, 2024 | Benchmarking | Unverified | 0 |
| FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures | Jan 1, 2024 | Benchmarking, Instance Segmentation | Unverified | 0 |
| SEED-Bench: Benchmarking Multimodal Large Language Models | Jan 1, 2024 | Benchmarking, Image Generation | Code Available | 3 |
| FinDABench: Benchmarking Financial Data Analysis Ability of Large Language Models | Jan 1, 2024 | Benchmarking | Code Available | 1 |
| Temporal Validity Change Prediction | Jan 1, 2024 | Benchmarking, Prediction | Unverified | 0 |
| Benchmarking Large Language Models on Controllable Generation under Diversified Instructions | Jan 1, 2024 | Benchmarking, Instruction Following | Code Available | 1 |
| Pushing Boundaries: Exploring Zero Shot Object Classification with Large Multimodal Models | Dec 30, 2023 | Benchmarking, image-classification | Unverified | 0 |
| Benchmarking Hebbian learning rules for associative memory | Dec 30, 2023 | Benchmarking | Unverified | 0 |
| Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA | Dec 29, 2023 | Anatomy, Benchmarking | Code Available | 1 |
| TSPP: A Unified Benchmarking Tool for Time-series Forecasting | Dec 28, 2023 | Benchmarking, Feature Engineering | Code Available | 0 |
| FALCON: Feature-Label Constrained Graph Net Collapse for Memory Efficient GNNs | Dec 27, 2023 | Benchmarking, GPU | Code Available | 0 |
| Knowledge Enhanced Conditional Imputation for Healthcare Time-series | Dec 27, 2023 | Benchmarking, Imputation | Code Available | 0 |
| Combining SNNs with Filtering for Efficient Neural Decoding in Implantable Brain-Machine Interfaces | Dec 26, 2023 | Benchmarking, Decoder | Unverified | 0 |
| RDF-star2Vec: RDF-star Graph Embeddings for Data Mining | Dec 25, 2023 | Benchmarking, Graph Embedding | Code Available | 0 |
| APTv2: Benchmarking Animal Pose Estimation and Tracking with a Large-scale Dataset and Beyond | Dec 25, 2023 | Animal Pose Estimation, Benchmarking | Code Available | 1 |
| Data needs and challenges for quantum dot devices automation | Dec 21, 2023 | Benchmarking | Unverified | 0 |
| Benchmarking Evolutionary Community Detection Algorithms in Dynamic Networks | Dec 21, 2023 | Benchmarking, Community Detection | Unverified | 0 |
| Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models | Dec 21, 2023 | Benchmarking | Code Available | 1 |
| Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming | Dec 21, 2023 | Benchmarking, reinforcement-learning | Unverified | 0 |
| ARBiBench: Benchmarking Adversarial Robustness of Binarized Neural Networks | Dec 21, 2023 | Adversarial Robustness, Benchmarking | Unverified | 0 |
| RetailSynth: Synthetic Data Generation for Retail AI Systems Evaluation | Dec 21, 2023 | Benchmarking, Product Recommendation | Code Available | 1 |
| AN ELIXIR FOR BLOCKCHAIN SCALABILITY WITH CHANNEL BASED CLUSTERED SHARDING | Dec 20, 2023 | Benchmarking | Unverified | 0 |
| Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation | Dec 20, 2023 | Benchmarking | Unverified | 0 |
| Review and experimental benchmarking of machine learning algorithms for efficient optimization of cold atom experiments | Dec 20, 2023 | Benchmarking | Unverified | 0 |
| Comparing Machine Learning Algorithms by Union-Free Generic Depth | Dec 20, 2023 | Benchmarking | Code Available | 0 |
| Benchmarking and Analyzing In-context Learning, Fine-tuning and Supervised Learning for Biomedical Knowledge Curation: a focused study on chemical entities of biological interest | Dec 20, 2023 | Benchmarking, In-Context Learning | Unverified | 0 |
| Perception Test 2023: A Summary of the First Challenge And Outcome | Dec 20, 2023 | Benchmarking, Grounded Video Question Answering | Unverified | 0 |
| FiFAR: A Fraud Detection Dataset for Learning to Defer | Dec 20, 2023 | Benchmarking, Decision Making | Code Available | 1 |
| Scaling Compute Is Not All You Need for Adversarial Robustness | Dec 20, 2023 | Adversarial Robustness, All | Code Available | 0 |