| Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models | Feb 27, 2024 | Marketing, Video Generation | Code Available | 4 |
| LLM Inference Unveiled: Survey and Roofline Model Insights | Feb 26, 2024 | Knowledge Distillation, Language Modelling | Code Available | 4 |
| RepoAgent: An LLM-Powered Open-Source Framework for Repository-level Code Documentation Generation | Feb 26, 2024 | Code Documentation Generation, Code Generation | Code Available | 4 |
| MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT | Feb 26, 2024 | | Code Available | 4 |
| Chain-of-Discussion: A Multi-Model Framework for Complex Evidence-Based Question Answering | Feb 26, 2024 | Evidence Selection, Open-Ended Question Answering | Code Available | 4 |
| Neural Operators with Localized Integral and Differential Kernels | Feb 26, 2024 | Operator learning | Code Available | 4 |
| Debug like a Human: A Large Language Model Debugger via Verifying Runtime Execution Step-by-step | Feb 25, 2024 | Code Generation, HumanEval | Code Available | 4 |
| Knowledge Fusion of Chat LLMs: A Preliminary Technical Report | Feb 25, 2024 | | Code Available | 4 |
| AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning | Feb 23, 2024 | | Code Available | 4 |
| AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System | Feb 23, 2024 | AI Agent | Code Available | 4 |
| Self-Supervised Pre-Training for Table Structure Recognition Transformer | Feb 23, 2024 | Representation Learning | Code Available | 4 |
| Cameras as Rays: Pose Estimation via Ray Diffusion | Feb 22, 2024 | 3D Reconstruction, Camera Pose Estimation | Code Available | 4 |
| 2D Matryoshka Sentence Embeddings | Feb 22, 2024 | RAG, Representation Learning | Code Available | 4 |
| TinyLLaVA: A Framework of Small-scale Large Multimodal Models | Feb 22, 2024 | Visual Question Answering | Code Available | 4 |
| Large Language Models for Data Annotation and Synthesis: A Survey | Feb 21, 2024 | Survey | Code Available | 4 |
| Benchmarking Retrieval-Augmented Generation for Medicine | Feb 20, 2024 | Benchmarking, Information Retrieval | Code Available | 4 |
| Neural Network Diffusion | Feb 20, 2024 | Decoder | Code Available | 4 |
| FinBen: A Holistic Financial Benchmark for Large Language Models | Feb 20, 2024 | Question Answering, RAG | Code Available | 4 |
| Aria Everyday Activities Dataset | Feb 20, 2024 | | Code Available | 4 |
| AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling | Feb 19, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs | Feb 19, 2024 | Knowledge Distillation | Code Available | 4 |
| GIM: Learning Generalizable Image Matcher From Internet Videos | Feb 16, 2024 | 3D Reconstruction, Camera Pose Estimation | Code Available | 4 |
| In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss | Feb 16, 2024 | RAG | Code Available | 4 |
| Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation | Feb 16, 2024 | Cardiac Segmentation, Decoder | Code Available | 4 |
| BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation | Feb 16, 2024 | Knowledge Distillation, Quantization | Code Available | 4 |
| PointMamba: A Simple State Space Model for Point Cloud Analysis | Feb 16, 2024 | GPU, Mamba | Code Available | 4 |
| LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models | Feb 16, 2024 | | Code Available | 4 |
| Generative Representational Instruction Tuning | Feb 15, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| TIAViz: A Browser-based Visualization Tool for Computational Pathology Models | Feb 15, 2024 | whole slide images | Code Available | 4 |
| OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | Feb 15, 2024 | Arithmetic Reasoning, GSM8K | Code Available | 4 |
| OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM | Feb 14, 2024 | Medical Visual Question Answering, Question Answering | Code Available | 4 |
| DoRA: Weight-Decomposed Low-Rank Adaptation | Feb 14, 2024 | parameter-efficient fine-tuning | Code Available | 4 |
| G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering | Feb 12, 2024 | Common Sense Reasoning, Graph Classification | Code Available | 4 |
| Dólares or Dollars? Unraveling the Bilingual Prowess of Financial LLMs Between Spanish and English | Feb 12, 2024 | | Code Available | 4 |
| Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models | Feb 12, 2024 | Hallucination, Object Localization | Code Available | 4 |
| Semi-Mamba-UNet: Pixel-Level Contrastive and Pixel-Level Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation | Feb 11, 2024 | Cardiac Segmentation, Contrastive Learning | Code Available | 4 |
| ScreenAgent: A Vision Language Model-driven Computer Control Agent | Feb 9, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA | Feb 9, 2024 | Event Detection, Hate Speech Detection | Code Available | 4 |
| InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning | Feb 9, 2024 | Data Augmentation, GSM8K | Code Available | 4 |
| InkSight: Offline-to-Online Handwriting Conversion by Learning to Read and Write | Feb 8, 2024 | Derendering | Code Available | 4 |
| MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis | Feb 8, 2024 | Attribute, Conditional Text-to-Image Synthesis | Code Available | 4 |
| Spirit LM: Interleaved Spoken and Written Language Model | Feb 8, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| You Only Need One Color Space: An Efficient Network for Low-light Image Enhancement | Feb 8, 2024 | Image Enhancement, Low-light Image Deblurring and Enhancement | Code Available | 4 |
| AlphaFold Meets Flow Matching for Generating Protein Ensembles | Feb 7, 2024 | Diversity | Code Available | 4 |
| JAX-Fluids 2.0: Towards HPC for Differentiable CFD of Compressible Two-phase Flows | Feb 7, 2024 | GPU | Code Available | 4 |
| Amortized Planning with Large-Scale Transformers: A Case Study on Chess | Feb 7, 2024 | Memorization | Code Available | 4 |
| Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation | Feb 7, 2024 | Cardiac Segmentation, Computational Efficiency | Code Available | 4 |
| QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks | Feb 6, 2024 | Quantization | Code Available | 4 |
| LESS: Selecting Influential Data for Targeted Instruction Tuning | Feb 6, 2024 | | Code Available | 4 |
| HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal | Feb 6, 2024 | Red Teaming | Code Available | 4 |