| A Functional Analysis Approach to Symbolic Regression | Feb 9, 2024 | Benchmarking, regression | Unverified | 0 |
| Transparent and Scrutable Recommendations Using Natural Language User Profiles | Feb 8, 2024 | Benchmarking, Descriptive | Code Available | 0 |
| Efficient Expression Neutrality Estimation with Application to Face Recognition Utility Prediction | Feb 8, 2024 | Benchmarking, Face Image Quality | Unverified | 0 |
| Benchmarking Large Language Models on Communicative Medical Coaching: a Novel System and Dataset | Feb 8, 2024 | Benchmarking | Code Available | 0 |
| SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Feb 8, 2024 | Benchmarking, Diversity | Code Available | 7 |
| Improved off-policy training of diffusion samplers | Feb 7, 2024 | Benchmarking | Code Available | 1 |
| BRI3L: A Brightness Illusion Image Dataset for Identification and Localization of Regions of Illusory Perception | Feb 7, 2024 | Benchmarking | Code Available | 0 |
| InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior | Feb 7, 2024 | Benchmarking, Decoder | Code Available | 2 |
| Towards Biologically Plausible and Private Gene Expression Data Generation | Feb 7, 2024 | Benchmarking | Code Available | 0 |
| LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology | Feb 6, 2024 | All, Benchmarking | Code Available | 2 |
| LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K | Feb 6, 2024 | 16k, Benchmarking | Code Available | 2 |
| Quantitative Metrics for Benchmarking Medical Image Harmonization | Feb 6, 2024 | Anatomy, Benchmarking | Unverified | 0 |
| Are Machines Better at Complex Reasoning? Unveiling Human-Machine Inference Gaps in Entailment Verification | Feb 6, 2024 | Benchmarking, Multiple-choice | Unverified | 0 |
| AttackNet: Enhancing Biometric Security via Tailored Convolutional Neural Network Architectures for Liveness Detection | Feb 6, 2024 | Benchmarking | Code Available | 0 |
| Architecture Analysis and Benchmarking of 3D U-shaped Deep Learning Models for Thoracic Anatomical Segmentation | Feb 5, 2024 | Benchmarking, Image Segmentation | Code Available | 0 |
| PowerGraph: A power grid benchmark dataset for graph neural networks | Feb 5, 2024 | Articles, Benchmarking | Unverified | 0 |
| JOBSKAPE: A Framework for Generating Synthetic Job Postings to Enhance Skill Matching | Feb 5, 2024 | Benchmarking, Sentence | Code Available | 1 |
| Vi(E)va LLM! A Conceptual Stack for Evaluating and Interpreting Generative AI-based Visualizations | Feb 3, 2024 | Benchmarking | Code Available | 0 |
| EffiBench: Benchmarking the Efficiency of Automatically Generated Code | Feb 3, 2024 | Benchmarking, Code Completion | Code Available | 2 |
| Probing Critical Learning Dynamics of PLMs for Hate Speech Detection | Feb 3, 2024 | Benchmarking, Hate Speech Detection | Code Available | 0 |
| GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning | Feb 3, 2024 | Benchmarking, DeepFake Detection | Code Available | 1 |
| Can LLMs perform structured graph reasoning? | Feb 2, 2024 | Benchmarking, Navigate | Code Available | 0 |
| Variational Quantum Circuits Enhanced Generative Adversarial Network | Feb 2, 2024 | Benchmarking, Generative Adversarial Network | Unverified | 0 |
| Benchmarking Spiking Neural Network Learning Methods with Varying Locality | Feb 1, 2024 | Benchmarking | Unverified | 0 |
| MRAnnotator: multi-Anatomy and many-Sequence MRI segmentation of 44 structures | Feb 1, 2024 | Anatomy, Benchmarking | Unverified | 0 |