| Towards Efficient and Scalable Training of Differentially Private Deep Learning | Jun 25, 2024 | Benchmarking, Deep Learning | Code Available | 0 |
| Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation | Jun 25, 2024 | Action Detection, Benchmarking | Code Available | 0 |
| Measuring and Benchmarking Large Language Models' Capabilities to Generate Persuasive Language | Jun 25, 2024 | Benchmarking | Unverified | 0 |
| NerfBaselines: Consistent and Reproducible Evaluation of Novel View Synthesis Methods | Jun 25, 2024 | 3DGS, Benchmarking | Unverified | 0 |
| A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems | Jun 25, 2024 | Benchmarking, Collaborative Filtering | Code Available | 0 |
| MatText: Do Language Models Need More than Text & Scale for Materials Modeling? | Jun 25, 2024 | Benchmarking | Code Available | 1 |
| VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation | Jun 25, 2024 | ARC, Benchmarking | Code Available | 0 |
| Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA | Jun 25, 2024 | Benchmarking, Long-Context Understanding | Code Available | 2 |
| Brittle Minds, Fixable Activations: Understanding Belief Representations in Language Models | Jun 25, 2024 | Benchmarking | Unverified | 0 |
| MedBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Chinese Medical Large Language Models | Jun 24, 2024 | Benchmarking | Unverified | 0 |