| GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers | Sep 6, 2024 | 3DGS, 3D human pose and shape estimation | Code Available | 2 |
| Programming Refusal with Conditional Activation Steering | Sep 6, 2024 | | Code Available | 2 |
| UniDet3D: Multi-dataset Indoor 3D Object Detection | Sep 6, 2024 | 3D Object Detection, Object | Code Available | 2 |
| Self-Harmonized Chain of Thought | Sep 6, 2024 | | Code Available | 2 |
| PlantSeg: A Large-Scale In-the-wild Dataset for Plant Disease Segmentation | Sep 6, 2024 | Benchmarking, image-classification | Code Available | 2 |
| Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding | Sep 5, 2024 | Question Answering, Scene Understanding | Code Available | 2 |
| DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation | Sep 5, 2024 | | Code Available | 2 |
| The AdEMAMix Optimizer: Better, Faster, Older | Sep 5, 2024 | image-classification, Image Classification | Code Available | 2 |
| Tissue Concepts: supervised foundation models in computational pathology | Sep 5, 2024 | Diagnostic, Multi-Task Learning | Code Available | 2 |
| RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images | Sep 5, 2024 | | Code Available | 2 |
| Plane2Depth: Hierarchical Adaptive Plane Guidance for Monocular Depth Estimation | Sep 4, 2024 | Depth Estimation, Depth Prediction | Code Available | 2 |
| StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models | Sep 4, 2024 | Denoising, Image Generation | Code Available | 2 |
| Multi-modal Situated Reasoning in 3D Scenes | Sep 4, 2024 | 3D Question Answering (3D-QA) | Code Available | 2 |
| Hybrid-Segmentor: A Hybrid Approach to Automated Fine-Grained Crack Segmentation in Civil Infrastructure | Sep 4, 2024 | Crack Segmentation, Decoder | Code Available | 2 |
| CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Sep 4, 2024 | GSM8K, Math | Code Available | 2 |
| Language Model Powered Digital Biology with BRAD | Sep 4, 2024 | Chatbot, Code Generation | Code Available | 2 |
| LibMOON: A Gradient-based MultiObjective OptimizatioN Library in PyTorch | Sep 4, 2024 | Evolutionary Algorithms, Fairness | Code Available | 2 |
| Human-VDM: Learning Single-Image 3D Human Gaussian Splatting from Video Diffusion Models | Sep 4, 2024 | Lifelike 3D Human Generation | Code Available | 2 |
| Towards a Unified View of Preference Learning for Large Language Models: A Survey | Sep 4, 2024 | | Code Available | 2 |
| MobileUNETR: A Lightweight End-To-End Hybrid Vision Transformer For Efficient Medical Image Segmentation | Sep 4, 2024 | Image Segmentation, Lesion Segmentation | Code Available | 2 |
| AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction | Sep 3, 2024 | Relation, Relation Extraction | Code Available | 2 |
| Speech Foundation Model Ensembles for the Controlled Singing Voice Deepfake Detection (CtrSVDD) Challenge 2024 | Sep 3, 2024 | DeepFake Detection, Face Swapping | Code Available | 2 |
| LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata | Sep 3, 2024 | Large Language Model | Code Available | 2 |
| Boosting Vision-Language Models for Histopathology Classification: Predict all at once | Sep 3, 2024 | All, zero-shot-classification | Code Available | 2 |
| Unveiling Deep Shadows: A Survey and Benchmark on Image and Video Shadow Detection, Removal, and Generation in the Deep Learning Era | Sep 3, 2024 | Scene Understanding, Shadow Detection | Code Available | 2 |
| Large Language Models for Anomaly and Out-of-Distribution Detection: A Survey | Sep 3, 2024 | Out-of-Distribution Detection | Code Available | 2 |
| Explicit Differentiable Slicing and Global Deformation for Cardiac Mesh Reconstruction | Sep 3, 2024 | Anatomy | Code Available | 2 |
| AllWeatherNet:Unified Image Enhancement for Autonomous Driving under Adverse Weather and Lowlight-conditions | Sep 3, 2024 | Autonomous Driving, Deep Attention | Code Available | 2 |
| Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation | Sep 2, 2024 | GPU | Code Available | 2 |
| EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance | Sep 2, 2024 | AudioCaps, Audio captioning | Code Available | 2 |
| Task-Specific Directions: Definition, Exploration, and Utilization in Parameter Efficient Fine-Tuning | Sep 2, 2024 | parameter-efficient fine-tuning | Code Available | 2 |
| Highly Accurate Real-space Electron Densities with Neural Networks | Sep 2, 2024 | | Code Available | 2 |
| MLOmics: Cancer Multi-Omics Database for Machine Learning | Sep 2, 2024 | | Code Available | 2 |
| Enhancing Privacy in Federated Learning: Secure Aggregation for Real-World Healthcare Applications | Sep 2, 2024 | CPU, Federated Learning | Code Available | 2 |
| Target-Driven Distillation: Consistency Distillation with Target Timestep Selection and Decoupled Guidance | Sep 2, 2024 | | Code Available | 2 |
| Enhancing Sample Efficiency and Exploration in Reinforcement Learning through the Integration of Diffusion Models and Proximal Policy Optimization | Sep 2, 2024 | Diversity, Offline RL | Code Available | 2 |
| Enhancing Vectorized Map Perception with Historical Rasterized Maps | Sep 1, 2024 | Autonomous Driving | Code Available | 2 |
| Sample-Efficient Diffusion for Text-To-Speech Synthesis | Sep 1, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Enhancing Remote Sensing Vision-Language Models for Zero-Shot Scene Classification | Sep 1, 2024 | Scene Classification, Transductive Zero-Shot Classification | Code Available | 2 |
| JaxLife: An Open-Ended Agentic Simulator | Sep 1, 2024 | Artificial Life | Code Available | 2 |
| SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation | Sep 1, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression | Sep 1, 2024 | Autonomous Driving | Code Available | 2 |
| BWT construction and search at the terabase scale | Sep 1, 2024 | Data Compression | Code Available | 2 |
| A Hybrid Transformer-Mamba Network for Single Image Deraining | Aug 31, 2024 | Mamba, Rain Removal | Code Available | 2 |
| BreachSeek: A Multi-Agent Automated Penetration Tester | Aug 31, 2024 | | Code Available | 2 |
| LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models | Aug 31, 2024 | 8k, GPU | Code Available | 2 |
| An Enhanced Batch Query Architecture in Real-time Recommendation | Aug 31, 2024 | Incremental Learning, Recommendation Systems | Code Available | 2 |
| Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes | Aug 30, 2024 | Deep Learning, Image Segmentation | Code Available | 2 |
| LAR-IQA: A Lightweight, Accurate, and Robust No-Reference Image Quality Assessment Model | Aug 30, 2024 | Image Quality Assessment, Kolmogorov-Arnold Networks | Code Available | 2 |
| Self-supervised Anomaly Detection Pretraining Enhances Long-tail ECG Diagnosis | Aug 30, 2024 | Anomaly Detection, Diagnostic | Code Available | 2 |