| Benchmarking Generative Models on Computational Thinking Tests in Elementary Visual Programming | Jun 14, 2024 | Benchmarking, General Knowledge | Unverified | 0 |
| Learning from Natural Language Explanations for Generalizable Entity Matching | Jun 13, 2024 | Binary Classification, Domain Generalization | Unverified | 0 |
| RAD: A Comprehensive Dataset for Benchmarking the Robustness of Image Anomaly Detection | Jun 11, 2024 | Anomaly Detection, Benchmarking | Code Available | 1 |
| DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation | Jun 9, 2024 | Common Sense Reasoning, Denoising | Code Available | 1 |
| F-LMM: Grounding Frozen Large Multimodal Models | Jun 9, 2024 | General Knowledge, Instruction Following | Code Available | 2 |
| Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using LLM Optimizers | Jun 7, 2024 | General Knowledge, Question Generation | Unverified | 0 |
| HYDRA: Model Factorization Framework for Black-Box LLM Personalization | Jun 5, 2024 | General Knowledge | Code Available | 1 |
| ContextFlow++: Generalist-Specialist Flow-based Generative Models with Mixed-Variable Context Encoding | Jun 2, 2024 | Anomaly Detection, Density Estimation | Code Available | 0 |
| Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks | Jun 1, 2024 | General Knowledge, Hippocampus | Code Available | 1 |
| CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection | May 26, 2024 | General Knowledge | Code Available | 1 |
| SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge | May 15, 2024 | General Knowledge, Knowledge Graphs | Unverified | 0 |
| Health Index Estimation Through Integration of General Knowledge with Unsupervised Learning | May 8, 2024 | General Knowledge | Code Available | 1 |
| MoST: Multi-modality Scene Tokenization for Motion Prediction | Apr 30, 2024 | General Knowledge, motion prediction | Unverified | 0 |
| Towards Generalizable Agents in Text-Based Educational Environments: A Study of Integrating RL with LLMs | Apr 29, 2024 | Diagnostic, General Knowledge | Unverified | 0 |
| Enhancing Action Recognition from Low-Quality Skeleton Data via Part-Level Knowledge Distillation | Apr 28, 2024 | Action Recognition, General Knowledge | Unverified | 0 |
| Evaluating Consistency and Reasoning Capabilities of Large Language Models | Apr 25, 2024 | General Knowledge, Text Generation | Unverified | 0 |
| Learning Electromagnetic Metamaterial Physics With ChatGPT | Apr 23, 2024 | General Knowledge | Unverified | 0 |
| When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering | Apr 19, 2024 | General Knowledge | Unverified | 0 |
| Pretraining and Updates of Domain-Specific LLM: A Case Study in the Japanese Business Domain | Apr 12, 2024 | Continual Pretraining, General Knowledge | Unverified | 0 |
| Knowledge graphs for empirical concept retrieval | Apr 10, 2024 | General Knowledge, Knowledge Graphs | Code Available | 0 |
| Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge | Apr 8, 2024 | General Knowledge, Safety Alignment | Code Available | 0 |
| BEAR: A Unified Framework for Evaluating Relational Knowledge in Causal and Masked Language Models | Apr 5, 2024 | Factual probe, General Knowledge | Code Available | 1 |
| Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT | Apr 3, 2024 | Benchmarking, General Knowledge | Code Available | 1 |
| Prompt Learning via Meta-Regularization | Apr 1, 2024 | Domain Generalization, General Knowledge | Code Available | 1 |
| Juru: Legal Brazilian Large Language Model from Reputable Sources | Mar 26, 2024 | General Knowledge, Language Modeling | Unverified | 0 |
| Are LLMs Good Cryptic Crossword Solvers? | Mar 15, 2024 | General Knowledge | Unverified | 0 |
| CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model | Mar 13, 2024 | General Knowledge, Instruction Following | Code Available | 2 |
| DiPrompT: Disentangled Prompt Tuning for Multiple Latent Domain Generalization in Federated Learning | Mar 11, 2024 | Domain Generalization, Federated Learning | Unverified | 0 |
| See Through Their Minds: Learning Transferable Neural Representation from Cross-Subject fMRI | Mar 11, 2024 | Brain Decoding, General Knowledge | Code Available | 1 |
| Deep Prompt Multi-task Network for Abuse Language Detection | Mar 8, 2024 | Abusive Language, General Knowledge | Unverified | 0 |
| MedSafetyBench: Evaluating and Improving the Medical Safety of Large Language Models | Mar 6, 2024 | Ethics, General Knowledge | Code Available | 1 |
| K-Link: Knowledge-Link Graph from LLMs for Enhanced Representation Learning in Multivariate Time-Series Data | Mar 6, 2024 | General Knowledge, graph construction | Unverified | 0 |
| Pruning neural network models for gene regulatory dynamics using data and domain knowledge | Mar 5, 2024 | General Knowledge, Network Pruning | Code Available | 0 |
| Beyond Specialization: Assessing the Capabilities of MLLMs in Age and Gender Estimation | Mar 4, 2024 | Age And Gender Classification, Age and Gender Estimation | Code Available | 3 |
| Can LLM Generate Culturally Relevant Commonsense QA Data? Case Study in Indonesian and Sundanese | Feb 27, 2024 | General Knowledge, Question Answering | Code Available | 1 |
| Bootstrapping Cognitive Agents with a Large Language Model | Feb 25, 2024 | General Knowledge, Language Modeling | Unverified | 0 |
| OMGEval: An Open Multilingual Generative Evaluation Benchmark for Large Language Models | Feb 21, 2024 | General Knowledge, Logical Reasoning | Code Available | 1 |
| Inductive Graph Alignment Prompt: Bridging the Gap between Graph Pre-training and Inductive Fine-tuning From Spectral Perspective | Feb 21, 2024 | General Knowledge, Graph Classification | Unverified | 0 |
| CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge | Feb 12, 2024 | General Knowledge, Multiple-choice | Code Available | 2 |
| Pre-training and Diagnosing Knowledge Base Completion Models | Jan 27, 2024 | General Knowledge, Knowledge Base Completion | Code Available | 1 |
| GALA: Generating Animatable Layered Assets from a Single Scan | Jan 23, 2024 | 3D geometry, General Knowledge | Unverified | 0 |
| INCPrompt: Task-Aware incremental Prompting for Rehearsal-Free Class-incremental Learning | Jan 22, 2024 | class-incremental learning, Class Incremental Learning | Unverified | 0 |
| The Unreasonable Effectiveness of Easy Training Data for Hard Tasks | Jan 12, 2024 | General Knowledge, In-Context Learning | Code Available | 1 |
| Generic Knowledge Boosted Pre-training For Remote Sensing Images | Jan 9, 2024 | Change Detection, Deep Learning | Code Available | 1 |
| Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation | Jan 1, 2024 | General Knowledge, Navigate | Code Available | 2 |
| KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling | Jan 1, 2024 | General Knowledge, Knowledge Distillation | Unverified | 0 |
| MMA: Multi-Modal Adapter for Vision-Language Models | Jan 1, 2024 | Domain Generalization, General Knowledge | Code Available | 2 |
| GeoGalactica: A Scientific Large Language Model in Geoscience | Dec 31, 2023 | Document Classification, General Knowledge | Code Available | 1 |
| Time Travelling Pixels: Bitemporal Features Integration with Foundation Model for Remote Sensing Image Change Detection | Dec 23, 2023 | Change Detection, General Knowledge | Code Available | 1 |
| VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation | Dec 22, 2023 | Conditional Image Generation, General Knowledge | Code Available | 1 |