| Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs | Sep 27, 2023 | Form, Navigate | Code Available | 1 |
| Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning | Sep 19, 2024 | Form, Instruction Following | Code Available | 1 |
| Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models | Jul 5, 2024 | Adversarial Attack, Automatic Speech Recognition | Code Available | 1 |
| Deep Visual Template-Free Form Parsing | Sep 5, 2019 | Form | Code Available | 1 |
| DocQueryNet: Value Retrieval with Arbitrary Queries for Form-like Documents | Oct 1, 2022 | document understanding, Form | Code Available | 1 |
| Do Deep Neural Network Solutions Form a Star Domain? | Mar 12, 2024 | Form | Code Available | 1 |
| E2E-LOAD: End-to-End Long-form Online Action Detection | Jun 13, 2023 | Action Detection, Form | Code Available | 1 |
| EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding | Aug 17, 2023 | Diagnostic, EgoSchema | Code Available | 1 |
| Towards Emotion Analysis in Short-form Videos: A Large-Scale Dataset and Baseline | Nov 29, 2023 | audio-visual learning, Form | Code Available | 1 |
| Deep Learning for ECG Analysis: Benchmarks and Insights from PTB-XL | Apr 28, 2020 | All, Benchmarking | Code Available | 1 |