| Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders | Jul 19, 2024 | | Code Available | 3 |
| Scikit-fingerprints: easy and efficient computation of molecular fingerprints in Python | Jul 18, 2024 | Molecular Property Prediction, Property Prediction | Code Available | 3 |
| NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models | Jul 17, 2024 | Instruction Following, Vision and Language Navigation | Code Available | 3 |
| E5-V: Universal Embeddings with Multimodal Large Language Models | Jul 17, 2024 | | Code Available | 3 |
| AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases | Jul 17, 2024 | Autonomous Driving, Backdoor Attack | Code Available | 3 |
| Relation DETR: Exploring Explicit Position Relation Prior for Object Detection | Jul 16, 2024 | 2D Object Detection, object-detection | Code Available | 3 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | Decoder, Object | Code Available | 3 |
| TCFormer: Visual Recognition via Token Clustering Transformer | Jul 16, 2024 | Clustering, image-classification | Code Available | 3 |
| Scaling Diffusion Transformers to 16 Billion Parameters | Jul 16, 2024 | Attribute, Conditional Image Generation | Code Available | 3 |
| The Oscars of AI Theater: A Survey on Role-Playing with Language Models | Jul 16, 2024 | Survey | Code Available | 3 |
| OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Jul 15, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition | Jul 15, 2024 | Automated Theorem Proving | Code Available | 3 |
| Evaluating Large Language Models with fmeval | Jul 15, 2024 | Question Answering | Code Available | 3 |
| An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases | Jul 15, 2024 | Attribute, counterfactual | Code Available | 3 |
| Fast Matrix Multiplications for Lookup Table-Quantized LLMs | Jul 15, 2024 | Quantization | Code Available | 3 |
| Learning Dynamics of LLM Finetuning | Jul 15, 2024 | Hallucination | Code Available | 3 |
| Restoring Images in Adverse Weather Conditions via Histogram Transformer | Jul 14, 2024 | Image Restoration | Code Available | 3 |
| Human-like Episodic Memory for Infinite Context LLMs | Jul 12, 2024 | Computational Efficiency, Event Segmentation | Code Available | 3 |
| A Unified Anomaly Synthesis Strategy with Gradient Ascent for Industrial Anomaly Detection and Localization | Jul 12, 2024 | Anomaly Detection, Defect Detection | Code Available | 3 |
| LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models | Jul 12, 2024 | Image Enhancement, Low-Light Image Enhancement | Code Available | 3 |
| Single-Image Shadow Removal Using Deep Learning: A Comprehensive Survey | Jul 11, 2024 | Deep Learning, Image Restoration | Code Available | 3 |
| A Comprehensive Survey on Human Video Generation: Challenges, Methods, and Insights | Jul 11, 2024 | Motion Generation, Survey | Code Available | 3 |
| Unifying 3D Representation and Control of Diverse Robots with a Single Camera | Jul 11, 2024 | | Code Available | 3 |
| WildGaussians: 3D Gaussian Splatting in the Wild | Jul 11, 2024 | 3DGS, 3D Scene Reconstruction | Code Available | 3 |
| Video Diffusion Alignment via Reward Gradients | Jul 11, 2024 | | Code Available | 3 |
| OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion | Jul 10, 2024 | Object Detection, Zero-Shot Object Detection | Code Available | 3 |
| Inference Performance Optimization for Large Language Models on CPUs | Jul 10, 2024 | CPU, GPU | Code Available | 3 |
| Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation | Jul 10, 2024 | 3D human pose and shape estimation | Code Available | 3 |
| EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Jul 10, 2024 | GPU, Quantization | Code Available | 3 |
| BiGym: A Demo-Driven Mobile Bi-Manual Manipulation Benchmark | Jul 10, 2024 | Imitation Learning | Code Available | 3 |
| Vision-and-Language Navigation Today and Tomorrow: A Survey in the Era of Foundation Models | Jul 9, 2024 | Vision and Language Navigation | Code Available | 3 |
| Robust Neural Information Retrieval: An Adversarial and Out-of-distribution Perspective | Jul 9, 2024 | Information Retrieval, Retrieval | Code Available | 3 |
| Chat-Edit-3D: Interactive 3D Scene Editing via Text Prompts | Jul 9, 2024 | 3D Object Editing, 3D Reconstruction | Code Available | 3 |
| Scaling Retrieval-Based Language Models with a Trillion-Token Datastore | Jul 9, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Revisiting, Benchmarking and Understanding Unsupervised Graph Domain Adaptation | Jul 9, 2024 | Benchmarking, Domain Adaptation | Code Available | 3 |
| A Survey on LoRA of Large Language Models | Jul 8, 2024 | Federated Learning, parameter-efficient fine-tuning | Code Available | 3 |
| WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks | Jul 7, 2024 | Arithmetic Reasoning | Code Available | 3 |
| Unified Approach for Hedging Impermanent Loss of Liquidity Provision | Jul 6, 2024 | | Code Available | 3 |
| LoRA-GA: Low-Rank Adaptation with Gradient Approximation | Jul 6, 2024 | GSM8K, parameter-efficient fine-tuning | Code Available | 3 |
| LaRa: Efficient Large-Baseline Radiance Fields | Jul 5, 2024 | 3D Reconstruction, Novel View Synthesis | Code Available | 3 |
| CountGD: Multi-Modal Open-World Counting | Jul 5, 2024 | Object Counting, Open-vocabulary object counting | Code Available | 3 |
| Better by Default: Strong Pre-Tuned MLPs and Boosted Trees on Tabular Data | Jul 5, 2024 | Classification, regression | Code Available | 3 |
| YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation | Jul 5, 2024 | Drum Transcription, Drum Transcription in Music (DTM) | Code Available | 3 |
| Simplifying Deep Temporal Difference Learning | Jul 5, 2024 | Q-Learning, Reinforcement Learning (RL) | Code Available | 3 |
| OneRestore: A Universal Restoration Framework for Composite Degradation | Jul 5, 2024 | Image Dehazing, Image Restoration | Code Available | 3 |
| On the Workflows and Smells of Leaderboard Operations (LBOps): An Exploratory Study of Foundation Model Leaderboards | Jul 4, 2024 | Code Completion | Code Available | 3 |
| A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models | Jul 2, 2024 | Navigate | Code Available | 3 |
| Consistency Flow Matching: Defining Straight Flows with Velocity Consistency | Jul 2, 2024 | Image Generation | Code Available | 3 |
| What We Talk About When We Talk About LMs: Implicit Paradigm Shifts and the Ship of Language Models | Jul 2, 2024 | | Code Available | 3 |
| TokenPacker: Efficient Visual Projector for Multimodal LLM | Jul 2, 2024 | Language Modelling, Large Language Model | Code Available | 3 |