| Adaptive Experimentation at Scale: A Computational Framework for Flexible Batches | Mar 21, 2023 | Benchmarking, Thompson Sampling | Unverified | 0 |
| DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4 | Mar 20, 2023 | Benchmarking, De-identification | Code Available | 1 |
| A Multi-Task Deep Learning Approach for Sensor-based Human Activity Recognition and Segmentation | Mar 20, 2023 | Activity Recognition, Benchmarking | Unverified | 0 |
| Benchmarking Robustness of 3D Object Detection to Common Corruptions in Autonomous Driving | Mar 20, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 0 |
| Revisiting Realistic Test-Time Training: Sequential Inference and Adaptation by Anchored Clustering Regularized Self-Training | Mar 20, 2023 | Benchmarking, Clustering | Code Available | 1 |
| COVID-19 event extraction from Twitter via extractive question answering with continuous prompts | Mar 19, 2023 | Benchmarking, Event Extraction | Code Available | 1 |
| CCTV-Gun: Benchmarking Handgun Detection in CCTV Images | Mar 19, 2023 | Benchmarking, object-detection | Code Available | 1 |
| NoisyHate: Mining Online Human-Written Perturbations for Realistic Robustness Benchmarking of Content Moderation Models | Mar 18, 2023 | Adversarial Attack, Benchmarking | Unverified | 0 |
| DeAR: Debiasing Vision-Language Models with Additive Residuals | Mar 18, 2023 | Attribute, Benchmarking | Unverified | 0 |
| Highly Accurate Quantum Chemical Property Prediction with Uni-Mol+ | Mar 16, 2023 | Benchmarking, Graph Regression | Code Available | 3 |
| From MNIST to ImageNet and Back: Benchmarking Continual Curriculum Learning | Mar 16, 2023 | Benchmarking, Continual Learning | Code Available | 0 |
| Joint Multi-Scale Tone Mapping and Denoising for HDR Image Enhancement | Mar 16, 2023 | Benchmarking, Demosaicking | Code Available | 0 |
| ShabbyPages: A Reproducible Document Denoising and Binarization Dataset | Mar 16, 2023 | Benchmarking, Binarization | Unverified | 0 |
| DACOS-A Manually Annotated Dataset of Code Smells | Mar 15, 2023 | Benchmarking | Unverified | 0 |
| TransNetR: Transformer-based Residual Network for Polyp Segmentation with Multi-Center Out-of-Distribution Testing | Mar 13, 2023 | Benchmarking, Decoder | Code Available | 1 |
| Aux-Drop: Handling Haphazard Inputs in Online Learning Using Auxiliary Dropouts | Mar 9, 2023 | Benchmarking | Code Available | 0 |
| BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset | Mar 9, 2023 | Benchmarking, Deep Learning | Code Available | 0 |
| Towards Self-adaptive Mutation in Evolutionary Multi-Objective Algorithms | Mar 8, 2023 | Benchmarking, Evolutionary Algorithms | Unverified | 0 |
| Using Affine Combinations of BBOB Problems for Performance Assessment | Mar 8, 2023 | Benchmarking | Unverified | 0 |
| Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test | Mar 8, 2023 | Benchmarking, Time Series | Code Available | 0 |
| Continuous Function Structured in Multilayer Perceptron for Global Optimization | Mar 7, 2023 | Benchmarking, global-optimization | Unverified | 0 |
| Leveraging Pre-trained AudioLDM for Sound Generation: A Benchmark Study | Mar 7, 2023 | Audio Generation, Benchmarking | Unverified | 0 |
| OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception | Mar 7, 2023 | Autonomous Driving, Benchmarking | Code Available | 2 |
| Continuous-Time Gaussian Process Motion-Compensation for Event-vision Pattern Tracking with Distance Fields | Mar 5, 2023 | Benchmarking, Motion Compensation | Unverified | 0 |
| Extended Agriculture-Vision: An Extension of a Large Aerial Image Dataset for Agricultural Pattern Analysis | Mar 4, 2023 | Benchmarking, Contrastive Learning | Code Available | 2 |
| FluidLab: A Differentiable Environment for Benchmarking Complex Fluid Manipulation | Mar 4, 2023 | Benchmarking, GPU | Code Available | 2 |
| Benchmarking framework for machine learning classification from fNIRS data | Mar 3, 2023 | Benchmarking, Brain Computer Interface | Code Available | 0 |
| Benchmarking White Blood Cell Classification Under Domain Shift | Mar 3, 2023 | Benchmarking, Classification | Code Available | 0 |
| Data-Efficient Training of CNNs and Transformers with Coresets: A Stability Perspective | Mar 3, 2023 | Benchmarking, Image Classification | Code Available | 0 |
| POPGym: Benchmarking Partially Observable Reinforcement Learning | Mar 3, 2023 | Benchmarking, GPU | Code Available | 2 |
| Structure-Based Experimental Datasets for Benchmarking Protein Simulation Force Fields | Mar 2, 2023 | Benchmarking | Unverified | 0 |
| Learning to Adapt to Online Streams with Distribution Shifts | Mar 2, 2023 | Benchmarking, Meta-Learning | Unverified | 0 |
| Benchmarking Self-Supervised Contrastive Learning Methods for Image-Based Plant Phenotyping | Mar 1, 2023 | Benchmarking, Contrastive Learning | Code Available | 0 |
| A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking | Feb 28, 2023 | Adversarial Robustness, Benchmarking | Unverified | 0 |
| Benchmarking Deepart Detection | Feb 28, 2023 | Benchmarking, DeepFake Detection | Unverified | 0 |
| Predicting the Performance of a Computing System with Deep Networks | Feb 27, 2023 | Benchmarking | Unverified | 0 |
| Benchmarking of Cancelable Biometrics for Deep Templates | Feb 26, 2023 | Benchmarking, Binarization | Unverified | 0 |
| STA: Self-controlled Text Augmentation for Improving Text Classifications | Feb 24, 2023 | Benchmarking, Text Augmentation | Code Available | 0 |
| Dynamic Benchmarking of Masked Language Models on Temporal Concept Drift with Multiple Views | Feb 23, 2023 | Benchmarking | Unverified | 0 |
| What Can We Learn From The Selective Prediction And Uncertainty Estimation Performance Of 523 Imagenet Classifiers | Feb 23, 2023 | Benchmarking, Out-of-Distribution Detection | Code Available | 1 |
| Revisiting the Gumbel-Softmax in MADDPG | Feb 23, 2023 | Benchmarking, Multi-agent Reinforcement Learning | Code Available | 1 |
| A framework for benchmarking class-out-of-distribution detection and its application to ImageNet | Feb 23, 2023 | Benchmarking, Knowledge Distillation | Code Available | 1 |
| Dermatological Diagnosis Explainability Benchmark for Convolutional Neural Networks | Feb 23, 2023 | Benchmarking, Medical Diagnosis | Code Available | 0 |
| MultiRobustBench: Benchmarking Robustness Against Multiple Attacks | Feb 21, 2023 | Benchmarking | Unverified | 0 |
| An Efficient Two-stage Gradient Boosting Framework for Short-term Traffic State Estimation | Feb 21, 2023 | Benchmarking, State Estimation | Code Available | 0 |
| Time to Embrace Natural Language Processing (NLP)-based Digital Pathology: Benchmarking NLP- and Convolutional Neural Network-based Deep Learning Pipelines | Feb 21, 2023 | Benchmarking, whole slide images | Unverified | 0 |
| Determinants of Performance in European ATM -- How to Analyze a Diverse Industry | Feb 20, 2023 | Benchmarking, Management | Unverified | 0 |
| Arena-Rosnav 2.0: A Development and Benchmarking Platform for Robot Navigation in Highly Dynamic Environments | Feb 20, 2023 | Benchmarking, Robot Navigation | Code Available | 0 |
| Fuzzy Knowledge Distillation from High-Order TSK to Low-Order TSK | Feb 16, 2023 | Benchmarking, Knowledge Distillation | Unverified | 0 |
| Towards Fair Machine Learning Software: Understanding and Addressing Model Bias Through Counterfactual Thinking | Feb 16, 2023 | Benchmarking, counterfactual | Unverified | 0 |