| Temporal Validity Change Prediction | Jan 1, 2024 | Benchmarking, Prediction | Unverified | 0 |
| Benchmarking Large Language Models on Controllable Generation under Diversified Instructions | Jan 1, 2024 | Benchmarking, Instruction Following | Code Available | 1 |
| Pushing Boundaries: Exploring Zero Shot Object Classification with Large Multimodal Models | Dec 30, 2023 | Benchmarking, image-classification | Unverified | 0 |
| Benchmarking Hebbian learning rules for associative memory | Dec 30, 2023 | Benchmarking | Unverified | 0 |
| Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRA | Dec 29, 2023 | Anatomy, Benchmarking | Code Available | 1 |
| TSPP: A Unified Benchmarking Tool for Time-series Forecasting | Dec 28, 2023 | Benchmarking, Feature Engineering | Code Available | 0 |
| FALCON: Feature-Label Constrained Graph Net Collapse for Memory Efficient GNNs | Dec 27, 2023 | Benchmarking, GPU | Code Available | 0 |
| Knowledge Enhanced Conditional Imputation for Healthcare Time-series | Dec 27, 2023 | Benchmarking, Imputation | Code Available | 0 |
| Combining SNNs with Filtering for Efficient Neural Decoding in Implantable Brain-Machine Interfaces | Dec 26, 2023 | Benchmarking, Decoder | Unverified | 0 |
| RDF-star2Vec: RDF-star Graph Embeddings for Data Mining | Dec 25, 2023 | Benchmarking, Graph Embedding | Code Available | 0 |
| APTv2: Benchmarking Animal Pose Estimation and Tracking with a Large-scale Dataset and Beyond | Dec 25, 2023 | Animal Pose Estimation, Benchmarking | Code Available | 1 |
| Data needs and challenges for quantum dot devices automation | Dec 21, 2023 | Benchmarking | Unverified | 0 |
| Benchmarking Evolutionary Community Detection Algorithms in Dynamic Networks | Dec 21, 2023 | Benchmarking, Community Detection | Unverified | 0 |
| Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models | Dec 21, 2023 | Benchmarking | Code Available | 1 |
| Incorporating Human Flexibility through Reward Preferences in Human-AI Teaming | Dec 21, 2023 | Benchmarking, reinforcement-learning | Unverified | 0 |
| ARBiBench: Benchmarking Adversarial Robustness of Binarized Neural Networks | Dec 21, 2023 | Adversarial Robustness, Benchmarking | Unverified | 0 |
| RetailSynth: Synthetic Data Generation for Retail AI Systems Evaluation | Dec 21, 2023 | Benchmarking, Product Recommendation | Code Available | 1 |
| AN ELIXIR FOR BLOCKCHAIN SCALABILITY WITH CHANNEL BASED CLUSTERED SHARDING | Dec 20, 2023 | Benchmarking | Unverified | 0 |
| Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation | Dec 20, 2023 | Benchmarking | Unverified | 0 |
| Review and experimental benchmarking of machine learning algorithms for efficient optimization of cold atom experiments | Dec 20, 2023 | Benchmarking | Unverified | 0 |
| Comparing Machine Learning Algorithms by Union-Free Generic Depth | Dec 20, 2023 | Benchmarking | Code Available | 0 |
| Benchmarking and Analyzing In-context Learning, Fine-tuning and Supervised Learning for Biomedical Knowledge Curation: a focused study on chemical entities of biological interest | Dec 20, 2023 | Benchmarking, In-Context Learning | Unverified | 0 |
| Perception Test 2023: A Summary of the First Challenge And Outcome | Dec 20, 2023 | Benchmarking, Grounded Video Question Answering | Unverified | 0 |
| FiFAR: A Fraud Detection Dataset for Learning to Defer | Dec 20, 2023 | Benchmarking, Decision Making | Code Available | 1 |
| Scaling Compute Is Not All You Need for Adversarial Robustness | Dec 20, 2023 | Adversarial Robustness, All | Code Available | 0 |