| A Survey on Multimodal Large Language Models for Autonomous Driving | Nov 21, 2023 | Autonomous Driving | Code Available | 2 |
| Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation | Nov 20, 2023 | 3D Human Pose Estimation, Pose Estimation | Code Available | 2 |
| Sparse4D v3: Advancing End-to-End 3D Detection and Tracking | Nov 20, 2023 | Autonomous Driving, Denoising | Code Available | 2 |
| System 2 Attention (is something you might need too) | Nov 20, 2023 | Math | Code Available | 2 |
| Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents | Nov 20, 2023 | | Code Available | 2 |
| GPQA: A Graduate-Level Google-Proof Q&A Benchmark | Nov 20, 2023 | Multiple-choice | Code Available | 2 |
| Fast Inner-Product Algorithms and Architectures for Deep Neural Network Accelerators | Nov 20, 2023 | | Code Available | 2 |
| Meta Prompting for AI Systems | Nov 20, 2023 | Data Interaction, GSM8K | Code Available | 2 |
| LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching | Nov 19, 2023 | 3D Generation, Text to 3D | Code Available | 2 |
| Open-Vocabulary Camouflaged Object Segmentation | Nov 19, 2023 | Camouflaged Object Segmentation, Image Segmentation | Code Available | 2 |
| Tactics2D: A Highly Modular and Extensible Simulator for Driving Decision-making | Nov 18, 2023 | Autonomous Driving, Decision Making | Code Available | 2 |
| An Embodied Generalist Agent in 3D World | Nov 18, 2023 | 3D dense captioning, 3D Question Answering (3D-QA) | Code Available | 2 |
| FlashOcc: Fast and Memory-Efficient Occupancy Prediction via Channel-to-Height Plugin | Nov 18, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| SUQL: Conversational Search over Structured and Unstructured Data with Large Language Models | Nov 16, 2023 | Conversational Search, In-Context Learning | Code Available | 2 |
| The Chosen One: Consistent Characters in Text-to-Image Diffusion Models | Nov 16, 2023 | Consistent Character Generation, Image Generation | Code Available | 2 |
| ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code | Nov 16, 2023 | Code Generation, Navigate | Code Available | 2 |
| Stella Nera: Achieving 161 TOp/s/W with Multiplier-free DNN Acceleration based on Approximate Matrix Multiplication | Nov 16, 2023 | Quantization | Code Available | 2 |
| JaxMARL: Multi-Agent RL Environments and Algorithms in JAX | Nov 16, 2023 | CPU, GPU | Code Available | 2 |
| MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning | Nov 16, 2023 | MedQA, MMLU | Code Available | 2 |
| HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs | Nov 16, 2023 | Domain Adaptation, Language Modeling | Code Available | 2 |
| GEO: Generative Engine Optimization | Nov 16, 2023 | | Code Available | 2 |
| ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems | Nov 16, 2023 | RAG, Retrieval | Code Available | 2 |
| Exponentially Faster Language Modelling | Nov 15, 2023 | Benchmarking, CPU | Code Available | 2 |
| FastBlend: a Powerful Model-Free Toolkit Making Video Stylization Easier | Nov 15, 2023 | Computational Efficiency, Patch Matching | Code Available | 2 |
| MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning | Nov 15, 2023 | Chart Understanding | Code Available | 2 |
| Striped Attention: Faster Ring Attention for Causal Transformers | Nov 15, 2023 | | Code Available | 2 |
| Correlation-Guided Query-Dependency Calibration for Video Temporal Grounding | Nov 15, 2023 | Highlight Detection, Moment Retrieval | Code Available | 2 |
| Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation | Nov 15, 2023 | Quantization, Recommendation Systems | Code Available | 2 |
| Never Lost in the Middle: Mastering Long-Context Question Answering with Position-Agnostic Decompositional Training | Nov 15, 2023 | Passage Retrieval, Position | Code Available | 2 |
| GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer | Nov 14, 2023 | named-entity-recognition, Named Entity Recognition | Code Available | 2 |
| Mustango: Toward Controllable Text-to-Music Generation | Nov 14, 2023 | Data Augmentation, Denoising | Code Available | 2 |
| REST: Retrieval-Based Speculative Decoding | Nov 14, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| MeLo: Low-rank Adaptation is Better than Fine-tuning for Medical Image Diagnosis | Nov 14, 2023 | | Code Available | 2 |
| Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation | Nov 14, 2023 | Object, Video Editing | Code Available | 2 |
| Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding | Nov 14, 2023 | Image-based Generative Performance Benchmarking, Language Modeling | Code Available | 2 |
| Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster | Nov 14, 2023 | GPU, Position | Code Available | 2 |
| Learning to Filter Context for Retrieval-Augmented Generation | Nov 14, 2023 | Extractive Question-Answering, Fact Verification | Code Available | 2 |
| To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning | Nov 13, 2023 | Instruction Following, MM-Vet | Code Available | 2 |
| Neural General Circulation Models for Weather and Climate | Nov 13, 2023 | Physical Simulations, Weather Forecasting | Code Available | 2 |
| SpectralGPT: Spectral Remote Sensing Foundation Model | Nov 13, 2023 | Change Detection, model | Code Available | 2 |
| Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models | Nov 12, 2023 | | Code Available | 2 |
| LayoutPrompter: Awaken the Design Ability of Large Language Models | Nov 11, 2023 | In-Context Learning, Layout Generation | Code Available | 2 |
| Tamil-Llama: A New Tamil Language Model Based on Llama 2 | Nov 10, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores | Nov 10, 2023 | | Code Available | 2 |
| FourierGNN: Rethinking Multivariate Time Series Forecasting from a Pure Graph Perspective | Nov 10, 2023 | Graph Neural Network, Multivariate Time Series Forecasting | Code Available | 2 |
| Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration | Nov 10, 2023 | Inference Attack, Membership Inference Attack | Code Available | 2 |
| Frequency-domain MLPs are More Effective Learners in Time Series Forecasting | Nov 10, 2023 | Time Series, Time Series Forecasting | Code Available | 2 |
| High-dimensional mixed-categorical Gaussian processes with application to multidisciplinary design optimization for a green aircraft | Nov 10, 2023 | Bayesian Optimization, Cantilever Beam | Code Available | 2 |
| Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model | Nov 10, 2023 | Diversity, NeRF | Code Available | 2 |
| EVORA: Deep Evidential Traversability Learning for Risk-Aware Off-Road Autonomy | Nov 10, 2023 | Uncertainty Quantification | Code Available | 2 |