| BdSLW60: A Word-Level Bangla Sign Language Dataset | Feb 13, 2024 | Benchmarking, Gesture Recognition | Code Available | 0 |
| Impact of spatial transformations on landscape features of CEC2022 basic benchmark problems | Feb 12, 2024 | Benchmarking | Unverified | 0 |
| Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT | Feb 12, 2024 | Benchmarking, Chunking | Unverified | 0 |
| EvoGPT-f: An Evolutionary GPT Framework for Benchmarking Formal Math Languages | Feb 12, 2024 | Automated Theorem Proving, Benchmarking | Unverified | 0 |
| Can Tree Based Approaches Surpass Deep Learning in Anomaly Detection? A Benchmarking Study | Feb 11, 2024 | Anomaly Detection, Benchmarking | Code Available | 0 |
| Estimating the Effect of Crosstalk Error on Circuit Fidelity Using Noisy Intermediate-Scale Quantum Devices | Feb 10, 2024 | Benchmarking | Unverified | 0 |
| ProtIR: Iterative Refinement between Retrievers and Predictors for Protein Function Annotation | Feb 10, 2024 | Benchmarking, Language Modeling | Unverified | 0 |
| Improving 2D-3D Dense Correspondences with Diffusion Models for 6D Object Pose Estimation | Feb 9, 2024 | 6D Pose Estimation using RGB, Benchmarking | Unverified | 0 |
| A Functional Analysis Approach to Symbolic Regression | Feb 9, 2024 | Benchmarking, regression | Unverified | 0 |
| LLaVA-Docent: Instruction Tuning with Multimodal Large Language Model to Support Art Appreciation Education | Feb 9, 2024 | Benchmarking, Chatbot | Unverified | 0 |
| Efficient Expression Neutrality Estimation with Application to Face Recognition Utility Prediction | Feb 8, 2024 | Benchmarking, Face Image Quality | Unverified | 0 |
| Transparent and Scrutable Recommendations Using Natural Language User Profiles | Feb 8, 2024 | Benchmarking, Descriptive | Code Available | 0 |
| Benchmarking Large Language Models on Communicative Medical Coaching: a Novel System and Dataset | Feb 8, 2024 | Benchmarking | Code Available | 0 |
| Towards Biologically Plausible and Private Gene Expression Data Generation | Feb 7, 2024 | Benchmarking | Code Available | 0 |
| BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception | Feb 7, 2024 | Benchmarking | Code Available | 0 |
| AttackNet: Enhancing Biometric Security via Tailored Convolutional Neural Network Architectures for Liveness Detection | Feb 6, 2024 | Benchmarking | Code Available | 0 |
| Are Machines Better at Complex Reasoning? Unveiling Human-Machine Inference Gaps in Entailment Verification | Feb 6, 2024 | Benchmarking, Multiple-choice | Unverified | 0 |
| Quantitative Metrics for Benchmarking Medical Image Harmonization | Feb 6, 2024 | Anatomy, Benchmarking | Unverified | 0 |
| PowerGraph: A power grid benchmark dataset for graph neural networks | Feb 5, 2024 | Articles, Benchmarking | Unverified | 0 |
| Architecture Analysis and Benchmarking of 3D U-shaped Deep Learning Models for Thoracic Anatomical Segmentation | Feb 5, 2024 | Benchmarking, Image Segmentation | Code Available | 0 |
| Vi(E)va LLM! A Conceptual Stack for Evaluating and Interpreting Generative AI-based Visualizations | Feb 3, 2024 | Benchmarking | Code Available | 0 |
| Probing Critical Learning Dynamics of PLMs for Hate Speech Detection | Feb 3, 2024 | Benchmarking, Hate Speech Detection | Code Available | 0 |
| Can LLMs perform structured graph reasoning? | Feb 2, 2024 | Benchmarking, Navigate | Code Available | 0 |
| Variational Quantum Circuits Enhanced Generative Adversarial Network | Feb 2, 2024 | Benchmarking, Generative Adversarial Network | Unverified | 0 |
| Benchmarking Spiking Neural Network Learning Methods with Varying Locality | Feb 1, 2024 | Benchmarking | Unverified | 0 |
| Coherent Feed Forward Quantum Neural Network | Feb 1, 2024 | Benchmarking, Diagnostic | Unverified | 0 |
| MRAnnotator: multi-Anatomy and many-Sequence MRI segmentation of 44 structures | Feb 1, 2024 | Anatomy, Benchmarking | Unverified | 0 |
| Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data | Jan 31, 2024 | Benchmarking, Change Detection | Code Available | 0 |
| Benchmarking Sensitivity of Continual Graph Learning for Skeleton-Based Action Recognition | Jan 31, 2024 | Action Recognition, Benchmarking | Unverified | 0 |
| ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks | Jan 29, 2024 | Benchmarking, Cross-Lingual Transfer | Code Available | 0 |
| Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA | Jan 29, 2024 | Benchmarking, Image Comprehension | Unverified | 0 |
| PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models | Jan 28, 2024 | Benchmarking, Code Generation | Code Available | 0 |
| Benchmarking with MIMIC-IV, an irregular, spare clinical time series dataset | Jan 27, 2024 | Benchmarking, Time Series | Unverified | 0 |
| SAM-based instance segmentation models for the automation of structural damage detection | Jan 27, 2024 | Benchmarking, Instance Segmentation | Unverified | 0 |
| Biological Valuation Map of Flanders: A Sentinel-2 Imagery Analysis | Jan 26, 2024 | Benchmarking, Semantic Segmentation | Unverified | 0 |
| Benchmarking Large Language Models in Complex Question Answering Attribution using Knowledge Graphs | Jan 26, 2024 | Benchmarking, Knowledge Graphs | Unverified | 0 |
| Automated legal reasoning with discretion to act using s(LAW) | Jan 25, 2024 | Benchmarking, Legal Reasoning | Unverified | 0 |
| TriSAM: Tri-Plane SAM for zero-shot cortical blood vessel segmentation in VEM images | Jan 25, 2024 | Benchmarking, Segmentation | Unverified | 0 |
| Large Malaysian Language Model Based on Mistral for Enhanced Local Language Understanding | Jan 24, 2024 | Benchmarking, Language Modeling | Unverified | 0 |
| Benchmarking the Fairness of Image Upsampling Methods | Jan 24, 2024 | Benchmarking, Diversity | Code Available | 0 |
| LLpowershap: Logistic Loss-based Automated Shapley Values Feature Selection Method | Jan 23, 2024 | Benchmarking, Fairness | Code Available | 0 |
| Deep Neural Network Benchmarks for Selective Classification | Jan 23, 2024 | Benchmarking, Classification | Code Available | 0 |
| What the Weight?! A Unified Framework for Zero-Shot Knowledge Composition | Jan 23, 2024 | Benchmarking | Code Available | 0 |
| Subgroup analysis methods for time-to-event outcomes in heterogeneous randomized controlled trials | Jan 22, 2024 | Benchmarking, Synthetic Data Generation | Code Available | 0 |
| Data-Driven Target Localization: Benchmarking Gradient Descent Using the Cramer-Rao Bound | Jan 20, 2024 | Benchmarking | Unverified | 0 |
| Data Augmentation for Traffic Classification | Jan 19, 2024 | Benchmarking, Classification | Unverified | 0 |
| Harnessing Orthogonality to Train Low-Rank Neural Networks | Jan 16, 2024 | Benchmarking | Code Available | 0 |
| NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription | Jan 16, 2024 | Automatic Speech Recognition, Benchmarking | Unverified | 0 |
| OpenDPD: An Open-Source End-to-End Learning & Benchmarking Framework for Wideband Power Amplifier Modeling and Digital Pre-Distortion | Jan 16, 2024 | Benchmarking | Unverified | 0 |
| Large Language Models are Null-Shot Learners | Jan 16, 2024 | Arithmetic Reasoning, Benchmarking | Unverified | 0 |