| DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search | Aug 15, 2024 | Automated Theorem Proving, Language Modeling | Code Available | 4 |
| Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities | Aug 14, 2024 | Continual Learning, Few-Shot Learning | Code Available | 4 |
| SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning | Aug 14, 2024 | CPU, Motion Planning | Code Available | 4 |
| Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents | Aug 13, 2024 | Decision Making | Code Available | 4 |
| Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers | Aug 12, 2024 | GSM8K, Math | Code Available | 4 |
| Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation | Aug 8, 2024 | Chunking, Fact Checking | Code Available | 4 |
| MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization | Aug 5, 2024 | | Code Available | 4 |
| miniCTX: Neural Theorem Proving with (Long-)Contexts | Aug 5, 2024 | Automated Theorem Proving | Code Available | 4 |
| RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation | Aug 5, 2024 | | Code Available | 4 |
| ParkingE2E: Camera-based End-to-end Parking Network, from Images to Planning | Aug 4, 2024 | Decoder, Imitation Learning | Code Available | 4 |
| Deep Patch Visual SLAM | Aug 3, 2024 | GPU, Visual Odometry | Code Available | 4 |
| GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS | Aug 2, 2024 | GPU, Navigate | Code Available | 4 |
| CitationMap: A Python Tool to Identify and Visualize Your Google Scholar Citations Around the World | Aug 2, 2024 | Citation Visualization, Data Visualization | Code Available | 4 |
| Medical SAM 2: Segment medical images as video via Segment Anything Model 2 | Aug 1, 2024 | Image Segmentation, Interactive Segmentation | Code Available | 4 |
| Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions | Aug 1, 2024 | Medical Question Answering, MedQA | Code Available | 4 |
| Expressive Whole-Body 3D Gaussian Avatar | Jul 31, 2024 | 3DGS, Diversity | Code Available | 4 |
| The Llama 3 Herd of Models | Jul 31, 2024 | answerability prediction, Language Modeling | Code Available | 4 |
| Generation of Training Data from HD Maps in the Lanelet2 Framework | Jul 24, 2024 | | Code Available | 4 |
| LAMBDA: A Large Model Based Data Agent | Jul 24, 2024 | model | Code Available | 4 |
| Multi-label Cluster Discrimination for Visual Representation Learning | Jul 24, 2024 | Contrastive Learning, Image-text Retrieval | Code Available | 4 |
| LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover | Jul 24, 2024 | Automated Theorem Proving, Math | Code Available | 4 |
| Stable-Hair: Real-World Hair Transfer via Diffusion Model | Jul 19, 2024 | Triplet | Code Available | 4 |
| NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals | Jul 18, 2024 | Experimental Design, GPU | Code Available | 4 |
| Scaling Granite Code Models to 128K Context | Jul 18, 2024 | 2k, 4k | Code Available | 4 |
| Goldfish: Vision-Language Understanding of Arbitrarily Long Videos | Jul 17, 2024 | Retrieval, Video Understanding | Code Available | 4 |
| Distilling Tiny and Ultra-fast Deep Neural Networks for Autonomous Navigation on Nano-UAVs | Jul 17, 2024 | Autonomous Navigation, Collision Avoidance | Code Available | 4 |
| Halu-J: Critique-Based Hallucination Judge | Jul 17, 2024 | Evidence Selection, Hallucination | Code Available | 4 |
| Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference | Jul 16, 2024 | | Code Available | 4 |
| When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments | Jul 15, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Deep-TEMPEST: Using Deep Learning to Eavesdrop on HDMI from its Unintended Electromagnetic Emanations | Jul 12, 2024 | | Code Available | 4 |
| SEED-Story: Multimodal Long Story Generation with Large Language Model | Jul 11, 2024 | Image Generation, Language Modeling | Code Available | 4 |
| MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine | Jul 11, 2024 | Contrastive Learning, Language Modelling | Code Available | 4 |
| OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training | Jul 10, 2024 | | Code Available | 4 |
| The GeometricKernels Package: Heat and Matérn Kernels for Geometric Learning on Manifolds, Meshes, and Graphs | Jul 10, 2024 | Gaussian Processes, Uncertainty Quantification | Code Available | 4 |
| A Survey of Attacks on Large Vision-Language Models: Resources, Advances, and Future Trends | Jul 10, 2024 | Data Poisoning | Code Available | 4 |
| A Survey on Deep Stereo Matching in the Twenties | Jul 10, 2024 | Stereo Matching, Survey | Code Available | 4 |
| Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence | Jul 9, 2024 | Retrieval-augmented Generation | Code Available | 4 |
| MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions | Jul 8, 2024 | Video Alignment, Video Generation | Code Available | 4 |
| Wavelet Convolutions for Large Receptive Fields | Jul 8, 2024 | 2D Object Detection, 2D Semantic Segmentation | Code Available | 4 |
| ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation | Jul 8, 2024 | multimodal generation, Text Generation | Code Available | 4 |
| MUSE: Machine Unlearning Six-Way Evaluation for Language Models | Jul 8, 2024 | Articles, Machine Unlearning | Code Available | 4 |
| TALENT: A Tabular Analytics and Learning Toolbox | Jul 4, 2024 | | Code Available | 4 |
| Modern Neighborhood Components Analysis: A Deep Tabular Baseline Two Decades Later | Jul 3, 2024 | | Code Available | 4 |
| MIGC++: Advanced Multi-Instance Generation Controller for Image Synthesis | Jul 2, 2024 | Attribute, Image Generation | Code Available | 4 |
| Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models | Jul 2, 2024 | Mixture-of-Experts, parameter-efficient fine-tuning | Code Available | 4 |
| Tiny-PULP-Dronets: Squeezing Neural Networks for Faster and Lighter Inference on Multi-Tasking Autonomous Nano-Drones | Jul 2, 2024 | Autonomous Navigation | Code Available | 4 |
| FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds | Jul 1, 2024 | Audio Generation, Video Alignment | Code Available | 4 |
| fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence | Jul 1, 2024 | GPU, Point cloud reconstruction | Code Available | 4 |
| A Closer Look at Deep Learning Methods on Tabular Datasets | Jul 1, 2024 | Attribute, Deep Learning | Code Available | 4 |
| Kolmogorov-Arnold Convolutions: Design Principles and Empirical Studies | Jul 1, 2024 | image-classification, Image Classification | Code Available | 4 |