| Benchmarking Multi-Domain Active Learning on Image Classification | Dec 1, 2023 | Active Learning, All | Unverified | 0 |
| Benchmarking and Enhancing Disentanglement in Concept-Residual Models | Nov 30, 2023 | Benchmarking, Disentanglement | Unverified | 0 |
| LucidDreaming: Controllable Object-Centric 3D Generation | Nov 30, 2023 | 3D Generation, Benchmarking | Unverified | 0 |
| A Video is Worth 10,000 Words: Training and Benchmarking with Diverse Captions for Better Long Video Retrieval | Nov 30, 2023 | Benchmarking, Retrieval | Unverified | 0 |
| Event-based Continuous Color Video Decompression from Single Frames | Nov 30, 2023 | Benchmarking | Unverified | 0 |
| Enhancing Ligand Pose Sampling for Molecular Docking | Nov 30, 2023 | Benchmarking, Molecular Docking | Code Available | 1 |
| Towards Assessing and Benchmarking Risk-Return Tradeoff of Off-Policy Evaluation | Nov 30, 2023 | Benchmarking, counterfactual | Code Available | 1 |
| Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning Algorithms | Nov 30, 2023 | Benchmarking, OpenAI Gym | Code Available | 1 |
| Z_2 × Z_2 Equivariant Quantum Neural Networks: Benchmarking against Classical Neural Networks | Nov 30, 2023 | Benchmarking, Binary Classification | Code Available | 0 |
| Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction | Nov 30, 2023 | Benchmarking, regression | Unverified | 0 |
| AlignBench: Benchmarking Chinese Alignment of Large Language Models | Nov 30, 2023 | Benchmarking | Code Available | 2 |
| TaskBench: Benchmarking Large Language Models for Task Automation | Nov 30, 2023 | Benchmarking, Parameter Prediction | Code Available | 6 |
| TransOpt: Transformer-based Representation Learning for Optimization Problem Classification | Nov 29, 2023 | Benchmarking, Classification | Unverified | 0 |
| Mixed-Precision Quantization for Federated Learning on Resource-Constrained Heterogeneous Devices | Nov 29, 2023 | Benchmarking, Federated Learning | Unverified | 0 |
| ROBBIE: Robust Bias Evaluation of Large Generative Language Models | Nov 29, 2023 | Benchmarking, Fairness | Unverified | 0 |
| Biomedical knowledge graph-optimized prompt generation for large language models | Nov 29, 2023 | Benchmarking, Knowledge Graphs | Code Available | 2 |
| SAIBench: A Structural Interpretation of AI for Science Through Benchmarks | Nov 29, 2023 | Benchmarking, Computational Efficiency | Unverified | 0 |
| Enhancing Post-Hoc Explanation Benchmark Reliability for Image Classification | Nov 29, 2023 | Benchmarking, Decision Making | Unverified | 0 |
| Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs | Nov 29, 2023 | Benchmarking | Code Available | 1 |
| SEED-Bench-2: Benchmarking Multimodal Large Language Models | Nov 28, 2023 | Benchmarking, Image Generation | Code Available | 2 |
| UniIR: Training and Benchmarking Universal Multimodal Information Retrievers | Nov 28, 2023 | Benchmarking, Information Retrieval | Unverified | 0 |
| PAWS-VMK: A Unified Approach To Semi-Supervised Learning And Out-of-Distribution Detection | Nov 28, 2023 | Benchmarking, image-classification | Unverified | 0 |
| Riemannian Self-Attention Mechanism for SPD Networks | Nov 28, 2023 | Benchmarking, Riemannian optimization | Unverified | 0 |
| FakeWatch ElectionShield: A Benchmarking Framework to Detect Fake News for Credible US Elections | Nov 27, 2023 | Articles, Benchmarking | Unverified | 0 |
| Comprehensive Benchmarking of Entropy and Margin Based Scoring Metrics for Data Selection | Nov 27, 2023 | Active Learning, Benchmarking | Unverified | 0 |
| Experimental Analysis of Large-scale Learnable Vector Storage Compression | Nov 27, 2023 | Benchmarking | Code Available | 0 |
| Lightly Weighted Automatic Audio Parameter Extraction for the Quality Assessment of Consensus Auditory-Perceptual Evaluation of Voice | Nov 27, 2023 | Benchmarking | Unverified | 0 |
| Syn3DWound: A Synthetic Dataset for 3D Wound Bed Analysis | Nov 27, 2023 | Benchmarking, Diagnostic | Unverified | 0 |
| Benchmarking Large Language Model Volatility | Nov 26, 2023 | Benchmarking, Decision Making | Unverified | 0 |
| UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation | Nov 26, 2023 | Benchmarking, Hallucination | Code Available | 1 |
| ASI: Accuracy-Stability Index for Evaluating Deep Learning Models | Nov 26, 2023 | Benchmarking, Deep Learning | Unverified | 0 |
| An Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification | Nov 24, 2023 | Benchmarking, image-classification | Unverified | 0 |
| Benchmarking Robustness of Text-Image Composed Retrieval | Nov 24, 2023 | Attribute, Benchmarking | Code Available | 1 |
| Large Language Models as Automated Aligners for benchmarking Vision-Language Models | Nov 24, 2023 | Benchmarking, World Knowledge | Unverified | 0 |
| Dialogue Quality and Emotion Annotations for Customer Support Conversations | Nov 23, 2023 | Benchmarking, Diversity | Code Available | 0 |
| Creating and Leveraging a Synthetic Dataset of Cloud Optical Thickness Measures for Cloud Detection in MSI | Nov 23, 2023 | Benchmarking, Cloud Detection | Code Available | 0 |
| Automated 3D Tumor Segmentation using Temporal Cubic PatchGAN (TCuP-GAN) | Nov 23, 2023 | Benchmarking, Brain Tumor Segmentation | Unverified | 0 |
| Learning Dynamic Selection and Pricing of Out-of-Home Deliveries | Nov 23, 2023 | Benchmarking, Decision Making | Code Available | 0 |
| Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning | Nov 22, 2023 | Benchmarking, Drug Discovery | Unverified | 0 |
| PG-Video-LLaVA: Pixel Grounding Large Video-Language Models | Nov 22, 2023 | Benchmarking, Phrase Grounding | Code Available | 2 |
| A projected nonlinear state-space model for forecasting time series signals | Nov 22, 2023 | Benchmarking, Computational Efficiency | Code Available | 0 |
| Deep State-Space Model for Predicting Cryptocurrency Price | Nov 21, 2023 | Benchmarking, Uncertainty Quantification | Unverified | 0 |
| IMGTB: A Framework for Machine-Generated Text Detection Benchmarking | Nov 21, 2023 | Benchmarking, Text Detection | Code Available | 1 |
| Benchmarking bias: Expanding clinical AI model card to incorporate bias reporting of social and non-social factors | Nov 21, 2023 | Benchmarking | Unverified | 0 |
| BEND: Benchmarking DNA Language Models on biologically meaningful tasks | Nov 21, 2023 | Benchmarking, Language Modeling | Code Available | 1 |
| Towards a more inductive world for drug repurposing approaches | Nov 21, 2023 | Benchmarking, Prediction | Code Available | 1 |
| Demonstrating Almost Linear Time Complexity of Bus Admittance Matrix-Based Distribution Network Power Flow: An Empirical Approach | Nov 20, 2023 | Benchmarking | Unverified | 0 |
| LogLead -- Fast and Integrated Log Loader, Enhancer, and Anomaly Detector | Nov 20, 2023 | Anomaly Detection, Benchmarking | Code Available | 1 |
| Holistic Inverse Rendering of Complex Facade via Aerial 3D Scanning | Nov 20, 2023 | Benchmarking, Inverse Rendering | Unverified | 0 |
| Segment Together: A Versatile Paradigm for Semi-Supervised Medical Image Segmentation | Nov 20, 2023 | Benchmarking, Image Segmentation | Unverified | 0 |