| EmProx: Neural Network Performance Estimation For Neural Architecture Search | Jun 13, 2022 | Benchmarking, Decoder | Code Available | 0 |
| BEHAVIOR in Habitat 2.0: Simulator-Independent Logical Task Description for Benchmarking Embodied AI Agents | Jun 13, 2022 | Benchmarking | Unverified | 0 |
| Data-Driven Denoising of Stationary Accelerometer Signals | Jun 13, 2022 | Benchmarking, Denoising | Code Available | 1 |
| CodeS: Towards Code Model Generalization Under Distribution Shift | Jun 11, 2022 | Benchmarking, Code Classification | Code Available | 0 |
| SAIBench: Benchmarking AI for Science | Jun 11, 2022 | Benchmarking, Friction | Unverified | 0 |
| Challenges and Opportunities in Offline Reinforcement Learning from Visual Observations | Jun 9, 2022 | Benchmarking, continuous-control | Code Available | 2 |
| SwinCheX: Multi-label classification on chest X-ray images with transformers | Jun 9, 2022 | Benchmarking, Multi-Label Classification | Code Available | 1 |
| Functional Code Building Genetic Programming | Jun 9, 2022 | Benchmarking, Program Synthesis | Unverified | 0 |
| Do We Need Another Explainable AI Method? Toward Unifying Post-hoc XAI Evaluation Methods into an Interactive and Multi-dimensional Benchmark | Jun 8, 2022 | Benchmarking, Explainable Artificial Intelligence (XAI) | Code Available | 1 |
| Benchmarking Bayesian neural networks and evaluation metrics for regression tasks | Jun 8, 2022 | Benchmarking, Open-Ended Question Answering | Unverified | 0 |
| FedHPO-B: A Benchmark Suite for Federated Hyperparameter Optimization | Jun 8, 2022 | Benchmarking, Federated Learning | Unverified | 0 |
| Scaling laws in global corporations as a benchmarking approach to assess environmental performance | Jun 7, 2022 | Benchmarking, Open-Ended Question Answering | Unverified | 0 |
| Revisiting Realistic Test-Time Training: Sequential Inference and Adaptation by Anchored Clustering | Jun 6, 2022 | Benchmarking, Clustering | Code Available | 1 |
| MorisienMT: A Dataset for Mauritian Creole Machine Translation | Jun 6, 2022 | Benchmarking, Machine Translation | Unverified | 0 |
| Which models are innately best at uncertainty estimation? | Jun 5, 2022 | Benchmarking, Out-of-Distribution Detection | Unverified | 0 |
| Revisiting the "Video" in Video-Language Understanding | Jun 3, 2022 | Benchmarking, Question Answering | Code Available | 1 |
| Fast Benchmarking of Accuracy vs. Training Time with Cyclic Learning Rates | Jun 2, 2022 | Benchmarking | Code Available | 0 |
| Evaluation of Three Welsh Language POS Taggers | Jun 1, 2022 | Benchmarking, POS | Unverified | 0 |
| Deep One-Class Hate Speech Detection Model | Jun 1, 2022 | Benchmarking, Binary Classification | Unverified | 0 |
| Introducing RezoJDM16k: a French KnowledgeGraph DataSet for Link Prediction | Jun 1, 2022 | 16k, Benchmarking | Unverified | 0 |
| Benchmarking Language Models for Cyberbullying Identification and Classification from Social-media Texts | Jun 1, 2022 | Benchmarking, Binary Classification | Unverified | 0 |
| Low-resource Neural Machine Translation: Benchmarking State-of-the-art Transformer for Wolof<->French | Jun 1, 2022 | Benchmarking, Low Resource Neural Machine Translation | Unverified | 0 |
| A Japanese Dataset for Subjective and Objective Sentiment Polarity Classification in Micro Blog Domain | Jun 1, 2022 | Benchmarking, Emotion Recognition | Code Available | 1 |
| Jojajovai: A Parallel Guarani-Spanish Corpus for MT Benchmarking | Jun 1, 2022 | Benchmarking, Sentence | Code Available | 1 |
| MTLens: Machine Translation Output Debugging | Jun 1, 2022 | Benchmarking, Machine Translation | Unverified | 0 |