| Fauno: The Italian Large Language Model that will leave you senza parole! | Jun 26, 2023 | GPU, Language Modeling | Code Available | 1 |
| DesCo: Learning Object Recognition with Rich Language Descriptions | Jun 24, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Bring Your Own Data! Self-Supervised Evaluation for Large Language Models | Jun 23, 2023 | Chatbot, Language Modeling | Code Available | 1 |
| Implementing contextual biasing in GPU decoder for online ASR | Jun 23, 2023 | CPU, Decoder | Code Available | 1 |
| Generative Multimodal Entity Linking | Jun 22, 2023 | Entity Linking, In-Context Learning | Code Available | 1 |
| OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue | Jun 21, 2023 | Instruction Following, Language Modeling | Code Available | 1 |
| Mass-Producing Failures of Multimodal Systems with Language Models | Jun 21, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| NoRefER: a Referenceless Quality Metric for Automatic Speech Recognition via Semi-Supervised Language Model Fine-Tuning with Contrastive Learning | Jun 21, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| A Reference-less Quality Metric for Automatic Speech Recognition via Contrastive-Learning of a Multi-Language Model with Self-Supervision | Jun 21, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Sparse Modular Activation for Efficient Sequence Modeling | Jun 19, 2023 | Chunking, Language Modeling | Code Available | 1 |
| LLMVA-GEBC: Large Language Model with Video Adapter for Generic Event Boundary Captioning | Jun 17, 2023 | Boundary Captioning, Language Modeling | Code Available | 1 |
| Just One Byte (per gradient): A Note on Low-Bandwidth Decentralized Language Model Finetuning Using Shared Randomness | Jun 16, 2023 | Distributed Optimization, Language Modeling | Code Available | 1 |
| Conformal Language Modeling | Jun 16, 2023 | Conformal Prediction, Language Modeling | Code Available | 1 |
| FALL-E: A Foley Sound Synthesis Model and Strategies | Jun 16, 2023 | Diversity, Language Modeling | Code Available | 1 |
| Pushing the Limits of Unsupervised Unit Discovery for SSL Speech Representation | Jun 15, 2023 | Automatic Speech Recognition, Clustering | Code Available | 1 |
| ChessGPT: Bridging Policy Learning and Language Modeling | Jun 15, 2023 | Decision Making, Language Modeling | Code Available | 1 |
| Generate to Understand for Representation | Jun 14, 2023 | Contrastive Learning, GPU | Code Available | 1 |
| World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models | Jun 14, 2023 | Grounded Open Vocabulary Acquisition, Language Modeling | Code Available | 1 |
| Tokenization with Factorized Subword Encoding | Jun 13, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Gradient Ascent Post-training Enhances Language Model Generalization | Jun 12, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Waffling around for Performance: Visual Classification with Random Words and Broad Concepts | Jun 12, 2023 | Classification, Language Modeling | Code Available | 1 |
| GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model | Jun 11, 2023 | General Knowledge, Knowledge Distillation | Code Available | 1 |
| Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method | Jun 11, 2023 | Knowledge Distillation, Language Modeling | Code Available | 1 |
| QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search | Jun 11, 2023 | Domain Adaptation, Language Modeling | Code Available | 1 |
| Large Language Models Are Semi-Parametric Reinforcement Learning Agents | Jun 9, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| 14 Examples of How LLMs Can Transform Materials Science and Chemistry: A Reflection on a Large Language Model Hackathon | Jun 9, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Aladdin: Zero-Shot Hallucination of Stylized 3D Assets from Abstract Scene Descriptions | Jun 9, 2023 | Hallucination, Language Modeling | Code Available | 1 |
| Hexatagging: Projective Dependency Parsing as Tagging | Jun 8, 2023 | Computational Efficiency, Dependency Parsing | Code Available | 1 |
| Privately generating tabular data using language models | Jun 7, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images! | Jun 6, 2023 | counterfactual, Data Augmentation | Code Available | 1 |
| On the Difference of BERT-style and CLIP-style Text Encoders | Jun 6, 2023 | Image Generation, Language Modeling | Code Available | 1 |
| LLMZip: Lossless Text Compression using Large Language Models | Jun 6, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Sequential Monte Carlo Steering of Large Language Models using Probabilistic Programs | Jun 5, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| AutoScrum: Automating Project Planning Using Large Language Models | Jun 5, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| COMET: Learning Cardinality Constrained Mixture of Experts with Trees and Local Search | Jun 5, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Improving Conversational Recommendation Systems via Counterfactual Data Simulation | Jun 5, 2023 | Conversational Recommendation, counterfactual | Code Available | 1 |
| Log Parsing: How Far Can ChatGPT Go? | Jun 2, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Enhancing the Protein Tertiary Structure Prediction by Multiple Sequence Alignment Generation | Jun 2, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Preference-grounded Token-level Guidance for Language Model Fine-tuning | Jun 1, 2023 | Imitation Learning, Language Modeling | Code Available | 1 |
| Training-free Neural Architecture Search for RNNs and Transformers | Jun 1, 2023 | image-classification, Image Classification | Code Available | 1 |
| Vocabulary-free Image Classification | Jun 1, 2023 | Classification, image-classification | Code Available | 1 |
| Faster Causal Attention Over Large Sequences Through Sparse Flash Attention | Jun 1, 2023 | 16k, 8k | Code Available | 1 |
| IDAS: Intent Discovery with Abstractive Summarization | May 31, 2023 | Abstractive Text Summarization, Descriptive | Code Available | 1 |
| Red Teaming Language Model Detectors with Language Models | May 31, 2023 | Adversarial Robustness, Language Modeling | Code Available | 1 |
| Structure-Aware Language Model Pretraining Improves Dense Retrieval on Structured Data | May 31, 2023 | Code Search, Language Modeling | Code Available | 1 |
| Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models | May 30, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| Likelihood-Based Diffusion Language Models | May 30, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images | May 30, 2023 | counterfactual, Language Modeling | Code Available | 1 |
| Test-Time Training on Nearest Neighbors for Large Language Models | May 29, 2023 | Language Modeling, Language Modelling | Code Available | 1 |
| PaLI-X: On Scaling up a Multilingual Vision and Language Model | May 29, 2023 | Chart Question Answering, document understanding | Code Available | 1 |