| Are SNNs Truly Energy-efficient? - A Hardware Perspective | Sep 6, 2023 | Benchmarking | Unverified | 0 |
| AGIBench: A Multi-granularity, Multimodal, Human-referenced, Auto-scoring Benchmark for Large Language Models | Sep 5, 2023 | Benchmarking, Zero-Shot Learning | Unverified | 0 |
| A skeletonization algorithm for gradient-based optimization | Sep 5, 2023 | Benchmarking, Deep Learning | Code Available | 1 |
| A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking | Sep 5, 2023 | Benchmarking, Knowledge Distillation | Unverified | 0 |
| Transfer Learning between Motor Imagery Datasets using Deep Learning -- Validation of Framework and Comparison of Datasets | Sep 4, 2023 | Benchmarking, Motor Imagery | Code Available | 0 |
| Benchmarking Large Language Models in Retrieval-Augmented Generation | Sep 4, 2023 | Benchmarking, counterfactual | Code Available | 2 |
| Hybrid data driven/thermal simulation model for comfort assessment | Sep 4, 2023 | Benchmarking | Unverified | 0 |
| Benchmarking Autoregressive Conditional Diffusion Models for Turbulent Flow Simulation | Sep 4, 2023 | Benchmarking | Code Available | 1 |
| Orientation-Independent Chinese Text Recognition in Scene Images | Sep 3, 2023 | Benchmarking, Image Reconstruction | Code Available | 2 |
| FOR-instance: a UAV laser scanning benchmark dataset for semantic and instance segmentation of individual trees | Sep 3, 2023 | Benchmarking, Instance Segmentation | Unverified | 0 |
| Holistic Dynamic Frequency Transformer for Image Fusion and Exposure Correction | Sep 3, 2023 | Benchmarking, Exposure Correction | Unverified | 0 |
| NeMig -- A Bilingual News Collection and Knowledge Graph about Migration | Sep 1, 2023 | Articles, Benchmarking | Code Available | 0 |
| FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning | Sep 1, 2023 | Benchmarking, Federated Learning | Unverified | 0 |
| Can humans help BERT gain "confidence"? | Aug 31, 2023 | Benchmarking, EEG | Unverified | 0 |
| Developing a Scalable Benchmark for Assessing Large Language Models in Knowledge Graph Engineering | Aug 31, 2023 | Benchmarking, Dataset Generation | Code Available | 1 |
| Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO | Aug 30, 2023 | Benchmarking, Reinforcement Learning (RL) | Unverified | 0 |
| Benchmarking Multilabel Topic Classification in the Kyrgyz Language | Aug 30, 2023 | Benchmarking, Classification | Code Available | 0 |
| Benchmarking the Generation of Fact Checking Explanations | Aug 29, 2023 | Abstractive Text Summarization, Articles | Code Available | 1 |
| Towards quantitative precision for ECG analysis: Leveraging state space models, self-supervision and patient metadata | Aug 29, 2023 | Benchmarking, Diagnostic | Code Available | 1 |
| Matbench Discovery -- A framework to evaluate machine learning crystal stability predictions | Aug 28, 2023 | Benchmarking, Formation Energy | Code Available | 3 |
| Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads | Aug 28, 2023 | Benchmarking, Self-Supervised Learning | Unverified | 0 |
| MLLM-DataEngine: An Iterative Refinement Approach for MLLM | Aug 25, 2023 | Benchmarking | Code Available | 1 |
| Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models | Aug 24, 2023 | Action Localization, Benchmarking | Unverified | 0 |
| Beyond Document Page Classification: Design, Datasets, and Challenges | Aug 24, 2023 | Benchmarking, Classification | Code Available | 0 |
| Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations | Aug 23, 2023 | Benchmarking, Decoder | Code Available | 2 |
| Benchmarking Causal Study to Interpret Large Language Models for Source Code | Aug 23, 2023 | Benchmarking, Causal Inference | Unverified | 0 |
| Finding the Perfect Fit: Applying Regression Models to ClimateBench v1.0 | Aug 23, 2023 | Benchmarking, regression | Code Available | 0 |
| LLMRec: Benchmarking Large Language Models on Recommendation Task | Aug 23, 2023 | Benchmarking, Explanation Generation | Code Available | 1 |
| Efficient Benchmarking of Language Models | Aug 22, 2023 | Benchmarking, GPU | Unverified | 0 |
| Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection | Aug 22, 2023 | Benchmarking, Out-of-Distribution Detection | Code Available | 0 |
| Benchmarking Domain Adaptation for Chemical Processes on the Tennessee Eastman Process | Aug 22, 2023 | Benchmarking, Domain Adaptation | Code Available | 0 |
| Beyond MD17: the reactive xxMD dataset | Aug 22, 2023 | Benchmarking, Computational chemistry | Code Available | 0 |
| Measuring the Effect of Causal Disentanglement on the Adversarial Robustness of Neural Network Models | Aug 21, 2023 | Adversarial Robustness, Benchmarking | Unverified | 0 |
| UGSL: A Unified Framework for Benchmarking Graph Structure Learning | Aug 21, 2023 | Benchmarking, Graph structure learning | Unverified | 0 |
| VI-Net: Boosting Category-level 6D Object Pose Estimation via Learning Decoupled Rotations on the Spherical Representations | Aug 19, 2023 | 6D Pose Estimation using RGB, Benchmarking | Code Available | 1 |
| Neurological Prognostication of Post-Cardiac-Arrest Coma Patients Using EEG Data: A Dynamic Survival Analysis Framework with Competing Risks | Aug 17, 2023 | Benchmarking, EEG | Code Available | 0 |
| Benchmarking Neural Network Generalization for Grammar Induction | Aug 16, 2023 | Benchmarking | Code Available | 1 |
| Benchmarking Adversarial Robustness of Compressed Deep Learning Models | Aug 16, 2023 | Adversarial Robustness, Benchmarking | Unverified | 0 |
| IoT Data Trust Evaluation via Machine Learning | Aug 15, 2023 | Benchmarking, Time Series | Code Available | 0 |
| Deep Neural Operator Driven Real Time Inference for Nuclear Systems to Enable Digital Twin Solutions | Aug 15, 2023 | Benchmarking, Computational Efficiency | Unverified | 0 |
| A Survey on Model Compression for Large Language Models | Aug 15, 2023 | Benchmarking, Knowledge Distillation | Unverified | 0 |
| Benchmarking Scalable Epistemic Uncertainty Quantification in Organ Segmentation | Aug 15, 2023 | Benchmarking, Medical Image Analysis | Code Available | 0 |
| Benchmarking Generated Poses: How Rational is Structure-based Drug Design with Generative Models? | Aug 14, 2023 | Benchmarking, Drug Design | Code Available | 1 |
| BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents | Aug 11, 2023 | Benchmarking, Decision Making | Code Available | 2 |
| Does AI for science need another ImageNet Or totally different benchmarks? A case study of machine learning force fields | Aug 11, 2023 | Benchmarking | Unverified | 0 |
| DIG In: Evaluating Disparities in Image Generations with Indicators for Geographic Diversity | Aug 11, 2023 | Benchmarking, Diversity | Code Available | 1 |
| A Comparative Visual Analytics Framework for Evaluating Evolutionary Processes in Multi-objective Optimization | Aug 10, 2023 | Benchmarking, Decision Making | Code Available | 1 |
| Spintronics for image recognition: performance benchmarking via ultrafast data-driven simulations | Aug 10, 2023 | Benchmarking, Classification | Unverified | 0 |
| Benchmarking Algorithmic Bias in Face Recognition: An Experimental Approach Using Synthetic Faces and Human Evaluation | Aug 10, 2023 | Attribute, Benchmarking | Unverified | 0 |
| Enhancing Architecture Frameworks by Including Modern Stakeholders and their Views/Viewpoints | Aug 9, 2023 | Benchmarking | Unverified | 0 |