| Lexicon-Level Contrastive Visual-Grounding Improves Language Modeling | Mar 21, 2024 | Grounded language learningLanguage Acquisition | CodeCode Available | 1 |
| BiLD: Bi-directional Logits Difference Loss for Large Language Model Distillation | Jun 19, 2024 | Knowledge DistillationLanguage Modeling | CodeCode Available | 1 |
| SpeechPrompt: An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks | Mar 31, 2022 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Bilinear MLPs enable weight-based mechanistic interpretability | Oct 10, 2024 | image-classificationImage Classification | CodeCode Available | 1 |
| "Yes, My LoRD." Guiding Language Model Extraction with Locality Reinforced Distillation | Sep 4, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| AudioBERT: Audio Knowledge Augmented Language Model | Sep 12, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Adaptive Attention Span in Transformers | May 19, 2019 | 8kLanguage Modeling | CodeCode Available | 1 |
| ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation | Dec 23, 2021 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| EscapeBench: Pushing Language Models to Think Outside the Box | Dec 18, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 1 |
| Entity Tracking in Language Models | May 3, 2023 | Language ModelingLanguage Modelling | CodeCode Available | 1 |