| AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection | Oct 29, 2023 | Anomaly Detection, Prompt Learning | Code Available | 2 |
| Foundational Models in Medical Imaging: A Comprehensive Survey and Future Vision | Oct 28, 2023 | | Code Available | 2 |
| Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions | Oct 28, 2023 | State Space Models | Code Available | 2 |
| FP8-LM: Training FP8 Large Language Models | Oct 27, 2023 | GPU | Code Available | 2 |
| ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image | Oct 27, 2023 | Diversity, NeRF | Code Available | 2 |
| Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time | Oct 26, 2023 | In-Context Learning | Code Available | 2 |
| MimicGen: A Data Generation System for Scalable Robot Learning using Human Demonstrations | Oct 26, 2023 | Imitation Learning | Code Available | 2 |
| JudgeLM: Fine-tuned Large Language Models are Scalable Judges | Oct 26, 2023 | | Code Available | 2 |
| MotionAGFormer: Enhancing 3D Human Pose Estimation with a Transformer-GCNFormer Network | Oct 25, 2023 | 3D Human Pose Estimation, Classification | Code Available | 2 |
| QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models | Oct 25, 2023 | GPU, Mixture-of-Experts | Code Available | 2 |
| Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis | Oct 25, 2023 | Text Spotting | Code Available | 2 |
| The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI | Oct 25, 2023 | | Code Available | 2 |
| LLM-FP4: 4-Bit Floating-Point Quantized Transformers | Oct 25, 2023 | Common Sense Reasoning, Quantization | Code Available | 2 |
| Detecting Pretraining Data from Large Language Models | Oct 25, 2023 | Machine Unlearning | Code Available | 2 |
| CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images | Oct 25, 2023 | Transfer Learning | Code Available | 2 |
| PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization | Oct 25, 2023 | Navigate | Code Available | 2 |
| TD-MPC2: Scalable, Robust World Models for Continuous Control | Oct 25, 2023 | continuous-control, Continuous Control | Code Available | 2 |
| Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution | Oct 25, 2023 | Denoising, Language Modeling | Code Available | 2 |
| Neural Potential Field for Obstacle-Aware Local Motion Planning | Oct 25, 2023 | Model Predictive Control, Motion Planning | Code Available | 2 |
| PERF: Panoramic Neural Radiance Field from a Single Panorama | Oct 25, 2023 | NeRF, Novel View Synthesis | Code Available | 2 |
| Pre-training Music Classification Models via Music Source Separation | Oct 24, 2023 | Classification, Genre classification | Code Available | 2 |
| Dynamic Convolutional Neural Networks as Efficient Pre-trained Audio Models | Oct 24, 2023 | Audio Classification, Audio Tagging | Code Available | 2 |
| Mixture of Tokens: Continuous MoE through Cross-Example Aggregation | Oct 24, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| A Survey on Detection of LLMs-Generated Content | Oct 24, 2023 | Survey | Code Available | 2 |
| BianQue: Balancing the Questioning and Suggestion Ability of Health LLMs with Multi-turn Health Conversations Polished by ChatGPT | Oct 24, 2023 | | Code Available | 2 |
| Woodpecker: Hallucination Correction for Multimodal Large Language Models | Oct 24, 2023 | Hallucination | Code Available | 2 |
| Representation Learning with Large Language Models for Recommendation | Oct 24, 2023 | Recommendation Systems, Representation Learning | Code Available | 2 |
| Brainchop: Next Generation Web-Based Neuroimaging Application | Oct 24, 2023 | | Code Available | 2 |
| Breaking of brightness consistency in optical flow with a lightweight CNN network | Oct 24, 2023 | CPU, Optical Flow Estimation | Code Available | 2 |
| RoboDepth: Robust Out-of-Distribution Depth Estimation under Corruptions | Oct 23, 2023 | Autonomous Driving, Depth Estimation | Code Available | 2 |
| A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions | Oct 23, 2023 | Binary Classification, LLM-generated Text Detection | Code Available | 2 |
| RD-VIO: Robust Visual-Inertial Odometry for Mobile Augmented Reality in Dynamic Environments | Oct 23, 2023 | | Code Available | 2 |
| Matryoshka Diffusion Models | Oct 23, 2023 | Image Generation, Zero-shot Generalization | Code Available | 2 |
| SAM-Med3D: Towards General-purpose Segmentation Models for Volumetric Medical Images | Oct 23, 2023 | 3D Architecture, Image Segmentation | Code Available | 2 |
| DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning | Oct 23, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models | Oct 23, 2023 | Diagnostic, Hallucination | Code Available | 2 |
| BatteryML:An Open-source platform for Machine Learning on Battery Degradation | Oct 23, 2023 | | Code Available | 2 |
| Is Weakly-supervised Action Segmentation Ready For Human-Robot Interaction? No, Let's Improve It With Action-union Learning | Oct 22, 2023 | Action Recognition, Action Segmentation | Code Available | 2 |
| Vision Language Models in Autonomous Driving: A Survey and Outlook | Oct 22, 2023 | Autonomous Driving, Decision Making | Code Available | 2 |
| A Pytorch Reproduction of Masked Generative Image Transformer | Oct 22, 2023 | Image Generation | Code Available | 2 |
| PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain | Oct 22, 2023 | Dialogue Generation, Dialogue Understanding | Code Available | 2 |
| RL-X: A Deep Reinforcement Learning Library (not only) for RoboCup | Oct 20, 2023 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 2 |
| Improving Molecular Properties Prediction Through Latent Space Fusion | Oct 20, 2023 | Molecular Property Prediction, Prediction | Code Available | 2 |
| Formalizing and Benchmarking Prompt Injection Attacks and Defenses | Oct 19, 2023 | Benchmarking | Code Available | 2 |
| GraphGPT: Graph Instruction Tuning for Large Language Models | Oct 19, 2023 | Data Augmentation, Graph Learning | Code Available | 2 |
| Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Oct 19, 2023 | Action Recognition, Image-text Retrieval | Code Available | 2 |
| HumanTOMATO: Text-aligned Whole-body Motion Generation | Oct 19, 2023 | Motion Generation, Motion Synthesis | Code Available | 2 |
| SRAI: Towards Standardization of Geospatial AI | Oct 19, 2023 | | Code Available | 2 |
| Position Interpolation Improves ALiBi Extrapolation | Oct 18, 2023 | Language Modelling, Position | Code Available | 2 |
| Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture | Oct 18, 2023 | 4k, image-classification | Code Available | 2 |