| Comparing Hyper-optimized Machine Learning Models for Predicting Efficiency Degradation in Organic Solar Cells | Mar 29, 2024 | Benchmarking | Unverified | 0 |
| IndiBias: A Benchmark Dataset to Measure Social Biases in Language Models for Indian Context | Mar 29, 2024 | Benchmarking, Sentence | Code Available | 0 |
| Are Large Language Models Good at Utility Judgments? | Mar 28, 2024 | Answer Generation, Benchmarking | Code Available | 0 |
| Benchmarking Image Transformers for Prostate Cancer Detection from Ultrasound Data | Mar 27, 2024 | Benchmarking, Cancer Classification | Unverified | 0 |
| GPTs and Language Barrier: A Cross-Lingual Legal QA Examination | Mar 26, 2024 | Articles, Benchmarking | Unverified | 0 |
| Benchmarking Video Frame Interpolation | Mar 25, 2024 | Benchmarking, Computational Efficiency | Unverified | 0 |
| NSINA: A News Corpus for Sinhala | Mar 25, 2024 | Articles, Benchmarking | Code Available | 0 |
| DISL: Fueling Research with A Large Dataset of Solidity Smart Contracts | Mar 25, 2024 | Benchmarking | Unverified | 0 |
| On the Fragility of Active Learners for Text Classification | Mar 23, 2024 | Active Learning, Benchmarking | Code Available | 0 |
| TrustSQL: Benchmarking Text-to-SQL Reliability with Penalty-Based Scoring | Mar 23, 2024 | Benchmarking, Text to SQL | Code Available | 0 |
| Unifying Large Language Model and Deep Reinforcement Learning for Human-in-Loop Interactive Socially-aware Navigation | Mar 22, 2024 | Benchmarking, Deep Reinforcement Learning | Unverified | 0 |
| Transactive Local Energy Markets Enable Community-Level Resource Coordination Using Individual Rewards | Mar 22, 2024 | Benchmarking, energy management | Unverified | 0 |
| Subjective Quality Assessment of Compressed Tone-Mapped High Dynamic Range Videos | Mar 22, 2024 | Benchmarking, Tone Mapping | Unverified | 0 |
| Broadening the Scope of Neural Network Potentials through Direct Inclusion of Additional Molecular Attributes | Mar 22, 2024 | Benchmarking | Unverified | 0 |
| ChatGPT Alternative Solutions: Large Language Models Survey | Mar 21, 2024 | Benchmarking, Chatbot | Unverified | 0 |
| Embarrassingly Simple Scribble Supervision for 3D Medical Segmentation | Mar 19, 2024 | Benchmarking, Segmentation | Unverified | 0 |
| MARTA: a model for the automatic phonemic grouping of the parkinsonian speech | Mar 19, 2024 | Benchmarking, Classification | Code Available | 0 |
| Benchmarking Badminton Action Recognition with a New Fine-Grained Dataset | Mar 19, 2024 | Action Recognition, Benchmarking | Unverified | 0 |
| Leveraging Spatial and Semantic Feature Extraction for Skin Cancer Diagnosis with Capsule Networks and Graph Neural Networks | Mar 18, 2024 | Benchmarking, Classification | Unverified | 0 |
| A Sober Look at the Robustness of CLIPs to Spurious Features | Mar 18, 2024 | Benchmarking | Unverified | 0 |
| Benchmarking the Robustness of UAV Tracking Against Common Corruptions | Mar 18, 2024 | Benchmarking | Code Available | 0 |
| OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety | Mar 18, 2024 | Benchmarking, Mathematical Reasoning | Unverified | 0 |
| Granular Change Accuracy: A More Accurate Performance Metric for Dialogue State Tracking | Mar 17, 2024 | Benchmarking, Dialogue State Tracking | Unverified | 0 |
| FlowMind: Automatic Workflow Generation with LLMs | Mar 17, 2024 | Benchmarking, Question Answering | Unverified | 0 |
| Depression Detection on Social Media with Large Language Models | Mar 16, 2024 | Benchmarking, Depression Detection | Unverified | 0 |
| Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-adaptive Attacks | Mar 15, 2024 | Adversarial Attack, Adversarial Robustness | Unverified | 0 |
| Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study | Mar 15, 2024 | Benchmarking | Code Available | 0 |
| SpokeN-100: A Cross-Lingual Benchmarking Dataset for The Classification of Spoken Numbers in Different Languages | Mar 14, 2024 | Benchmarking, Dimensionality Reduction | Code Available | 0 |
| Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors | Mar 14, 2024 | Benchmarking, Domain Adaptation | Code Available | 0 |
| Semi-Supervised Learning for Anomaly Traffic Detection via Bidirectional Normalizing Flows | Mar 13, 2024 | Anomaly Detection, Benchmarking | Code Available | 0 |
| An Approach to Evaluate Modeling Adequacy for Small-Signal Stability Analysis of IBR-related SSOs in Multimachine Systems | Mar 12, 2024 | Benchmarking | Unverified | 0 |
| A tutorial on multi-view autoencoders using the multi-view-AE library | Mar 12, 2024 | Benchmarking | Unverified | 0 |
| IndicSTR12: A Dataset for Indic Scene Text Recognition | Mar 12, 2024 | Benchmarking, Scene Text Recognition | Unverified | 0 |
| (N,K)-Puzzle: A Cost-Efficient Testbed for Benchmarking Reinforcement Learning Algorithms in Generative Language Model | Mar 11, 2024 | Benchmarking, Language Modeling | Unverified | 0 |
| Class Imbalance in Object Detection: An Experimental Diagnosis and Study of Mitigation Strategies | Mar 11, 2024 | Benchmarking, Data Augmentation | Code Available | 0 |
| A Holistic Framework Towards Vision-based Traffic Signal Control with Microscopic Simulation | Mar 11, 2024 | Benchmarking, Traffic Signal Control | Unverified | 0 |
| Multi-GPU-Enabled Hybrid Quantum-Classical Workflow in Quantum-HPC Middleware: Applications in Quantum Simulations | Mar 9, 2024 | Benchmarking, CPU | Code Available | 0 |
| Exploring the Adversarial Frontier: Quantifying Robustness via Adversarial Hypervolume | Mar 8, 2024 | Adversarial Robustness, Benchmarking | Unverified | 0 |
| Synth4bench: a framework for generating synthetic genomics data for the evaluation of tumor-only somatic variant calling algorithms | Mar 8, 2024 | Benchmarking, Synthetic Data Generation | Code Available | 0 |
| Benchmarking Large Language Models for Molecule Prediction Tasks | Mar 8, 2024 | Benchmarking, Prediction | Code Available | 0 |
| Improvements & Evaluations on the MLCommons CloudMask Benchmark | Mar 7, 2024 | Benchmarking | Code Available | 0 |
| NLPre: a revised approach towards language-centric benchmarking of Natural Language Preprocessing systems | Mar 7, 2024 | Benchmarking, Dependency Parsing | Unverified | 0 |
| Benchmarking News Recommendation in the Era of Green AI | Mar 7, 2024 | Benchmarking, GPU | Unverified | 0 |
| Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI | Mar 7, 2024 | Benchmarking | Code Available | 0 |
| Comparison Performance of Spectrogram and Scalogram as Input of Acoustic Recognition Task | Mar 6, 2024 | Benchmarking | Code Available | 0 |
| BAIT: Benchmarking (Embedding) Architectures for Interactive Theorem-Proving | Mar 6, 2024 | Automated Theorem Proving, Benchmarking | Unverified | 0 |
| Three Revisits to Node-Level Graph Anomaly Detection: Outliers, Message Passing and Hyperbolic Neural Networks | Mar 6, 2024 | Anomaly Detection, Benchmarking | Code Available | 0 |
| Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem | Mar 6, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Video | Mar 6, 2024 | Benchmarking, Crowd Counting | Unverified | 0 |
| Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation | Mar 5, 2024 | Benchmarking, In-Context Learning | Unverified | 0 |