| Automated detection of gibbon calls from passive acoustic monitoring data using convolutional neural networks in the "torch for R" ecosystem | Jul 13, 2024 | Benchmarking, Deep Learning | Unverified | 0 |
| OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling | Jul 13, 2024 | Benchmarking, Math | Code Available | 1 |
| NativQA: Multilingual Culturally-Aligned Natural Query for LLMs | Jul 13, 2024 | Benchmarking, Question Answering | Unverified | 0 |
| Retrospective for the Dynamic Sensorium Competition for predicting large-scale mouse primary visual cortex activity from videos | Jul 12, 2024 | Benchmarking, Pupil Dilation | Code Available | 1 |
| Deep Attention Driven Reinforcement Learning (DAD-RL) for Autonomous Decision-Making in Dynamic Environment | Jul 12, 2024 | Benchmarking, Decision Making | Code Available | 0 |
| Benchmarking Language Model Creativity: A Case Study on Code Generation | Jul 12, 2024 | Benchmarking, Code Generation | Code Available | 1 |
| A Comprehensive Survey on Retrieval Methods in Recommender Systems | Jul 11, 2024 | Benchmarking, Recommendation Systems | Unverified | 0 |
| Evaluating Nuanced Bias in Large Language Model Free Response Answers | Jul 11, 2024 | Benchmarking, Language Modeling | Unverified | 0 |
| WayveScenes101: A Dataset and Benchmark for Novel View Synthesis in Autonomous Driving | Jul 11, 2024 | Autonomous Driving, Benchmarking | Code Available | 2 |
| PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines | Jul 11, 2024 | Benchmarking, Prediction | Code Available | 1 |