| Quanda: An Interpretability Toolkit for Training Data Attribution Evaluation and Beyond | Oct 9, 2024 | Benchmarking | Code Available | 2 |
| FedGraph: A Research Library and Benchmark for Federated Graph Learning | Oct 8, 2024 | Benchmarking, Federated Learning | Code Available | 2 |
| MIBench: A Comprehensive Framework for Benchmarking Model Inversion Attack and Defense | Oct 7, 2024 | Adversarial Robustness, Benchmarking | Code Available | 2 |
| dattri: A Library for Efficient Data Attribution | Oct 6, 2024 | Benchmarking | Code Available | 2 |
| AutoPenBench: Benchmarking Generative Agents for Penetration Testing | Oct 4, 2024 | Benchmarking | Code Available | 2 |
| Beyond Prompts: Dynamic Conversational Benchmarking of Large Language Models | Sep 30, 2024 | Benchmarking, Continual Learning | Code Available | 2 |
| A Survey on Graph Neural Networks for Remaining Useful Life Prediction: Methodologies, Evaluation and Future Trends | Sep 29, 2024 | Benchmarking, graph construction | Code Available | 2 |
| Small Language Models: Survey, Measurements, and Insights | Sep 24, 2024 | Benchmarking, Decoder | Code Available | 2 |
| GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization | Sep 24, 2024 | 3D geometry, 3DGS | Code Available | 2 |
| A Survey on Multimodal Benchmarks: In the Era of Large AI Models | Sep 21, 2024 | Benchmarking, Survey | Code Available | 2 |