| ResumeAtlas: Revisiting Resume Classification with Large-Scale Datasets and Large Language Models | Jun 26, 2024 | Classification | Code Available | 2 |
| LumberChunker: Long-Form Narrative Document Segmentation | Jun 25, 2024 | Chunking, Form | Code Available | 2 |
| Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection | Jun 25, 2024 | Audio Deepfake Detection, Synthetic Speech Detection | Code Available | 2 |
| Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models | Jun 25, 2024 | Diversity, Math | Code Available | 2 |
| MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning | Jun 25, 2024 | Object, Object Recognition | Code Available | 2 |
| European Space Agency Benchmark for Anomaly Detection in Satellite Telemetry | Jun 25, 2024 | Anomaly Detection, Time Series | Code Available | 2 |
| Revitalizing Convolutional Network for Image Restoration | Jun 25, 2024 | Deblurring, Image Deblurring | Code Available | 2 |
| Efficient, Multimodal, and Derivative-Free Bayesian Inference With Fisher-Rao Gradient Flows | Jun 25, 2024 | Bayesian Inference | Code Available | 2 |
| SUM: Saliency Unification through Mamba for Visual Attention Modeling | Jun 25, 2024 | Mamba, Marketing | Code Available | 2 |
| Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers | Jun 25, 2024 | Image Generation, Model Compression | Code Available | 2 |
| Joint Admission Control and Resource Allocation of Virtual Network Embedding via Hierarchical Deep Reinforcement Learning | Jun 25, 2024 | Combinatorial Optimization, Graph Neural Network | Code Available | 2 |
| The Balanced-Pairwise-Affinities Feature Transform | Jun 25, 2024 | Few-Shot Image Classification, Image Clustering | Code Available | 2 |
| Dual-Space Knowledge Distillation for Large Language Models | Jun 25, 2024 | Instruction Following, Knowledge Distillation | Code Available | 2 |
| FedBiOT: LLM Local Fine-tuning in Federated Learning without Full Model | Jun 25, 2024 | Federated Learning | Code Available | 2 |
| Disentangled Motion Modeling for Video Frame Interpolation | Jun 25, 2024 | Optical Flow Estimation, Video Frame Interpolation | Code Available | 2 |
| Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP | Jun 25, 2024 | cross-modal alignment, Image Classification | Code Available | 2 |
| DiffusionPDE: Generative PDE-Solving Under Partial Observation | Jun 25, 2024 | | Code Available | 2 |
| Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA | Jun 25, 2024 | Benchmarking, Long-Context Understanding | Code Available | 2 |
| Finding Transformer Circuits with Edge Pruning | Jun 24, 2024 | In-Context Learning, Language Modelling | Code Available | 2 |
| One Thousand and One Pairs: A "novel" challenge for long-context language models | Jun 24, 2024 | Retrieval, Sentence | Code Available | 2 |
| Are Vision xLSTM Embedded UNet More Reliable in Medical 3D Image Segmentation? | Jun 24, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| OmAgent: A Multi-modal Agent Framework for Complex Video Understanding with Task Divide-and-Conquer | Jun 24, 2024 | AI Agent, Large Language Model | Code Available | 2 |
| LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments | Jun 24, 2024 | World Knowledge | Code Available | 2 |
| FaceScore: Benchmarking and Enhancing Face Quality in Human Generation | Jun 24, 2024 | Benchmarking, Denoising | Code Available | 2 |
| From Perfect to Noisy World Simulation: Customizable Embodied Multi-modal Perturbations for SLAM Robustness Benchmarking | Jun 24, 2024 | Benchmarking, NeRF | Code Available | 2 |
| Character-Adapter: Prompt-Guided Region Control for High-Fidelity Character Customization | Jun 24, 2024 | Consistent Character Generation, Image Generation | Code Available | 2 |
| Alpha^2: Discovering Logical Formulaic Alphas using Deep Reinforcement Learning | Jun 24, 2024 | Deep Reinforcement Learning | Code Available | 2 |
| OlympicArena Medal Ranks: Who Is the Most Intelligent AI So Far? | Jun 24, 2024 | | Code Available | 2 |
| GC4NC: A Benchmark Framework for Graph Condensation on Node Classification with New Insights | Jun 24, 2024 | Denoising, Neural Architecture Search | Code Available | 2 |
| DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation | Jun 24, 2024 | Benchmarking, Image Generation | Code Available | 2 |
| FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models | Jun 24, 2024 | Video Generation | Code Available | 2 |
| SegNet4D: Efficient Instance-Aware 4D Semantic Segmentation for LiDAR Point Cloud | Jun 24, 2024 | Autonomous Driving, Autonomous Navigation | Code Available | 2 |
| Revisiting Referring Expression Comprehension Evaluation in the Era of Large Multimodal Models | Jun 24, 2024 | Referring Expression, Referring Expression Comprehension | Code Available | 2 |
| CausalFormer: An Interpretable Transformer for Temporal Causal Discovery | Jun 24, 2024 | Causal Discovery, Time Series | Code Available | 2 |
| DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation | Jun 23, 2024 | 3D Lane Detection, Autonomous Driving | Code Available | 2 |
| Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking | Jun 23, 2024 | Benchmarking | Code Available | 2 |
| LGS: A Light-weight 4D Gaussian Splatting for Efficient Surgical Scene Reconstruction | Jun 23, 2024 | | Code Available | 2 |
| Efficient Evolutionary Search Over Chemical Space with Large Language Models | Jun 23, 2024 | Drug Design, Evolutionary Algorithms | Code Available | 2 |
| PointDreamer: Zero-shot 3D Textured Mesh Reconstruction from Colored Point Cloud | Jun 22, 2024 | Image Inpainting | Code Available | 2 |
| Soft Masked Mamba Diffusion Model for CT to MRI Conversion | Jun 22, 2024 | Computed Tomography (CT), Image Generation | Code Available | 2 |
| Ladder: A Model-Agnostic Framework Boosting LLM-based Machine Translation to the Next Level | Jun 22, 2024 | Machine Translation, Translation | Code Available | 2 |
| Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs | Jun 22, 2024 | Hallucination, Uncertainty Quantification | Code Available | 2 |
| EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting | Jun 22, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| What Matters in Transformers? Not All Attention is Needed | Jun 22, 2024 | All, MMLU | Code Available | 2 |
| RouteFinder: Towards Foundation Models for Vehicle Routing Problems | Jun 21, 2024 | Attribute, Multi-Task Learning | Code Available | 2 |
| DExter: Learning and Controlling Performance Expression with Diffusion Models | Jun 21, 2024 | Music Performance Rendering | Code Available | 2 |
| SelfReg-UNet: Self-Regularized UNet for Medical Image Segmentation | Jun 21, 2024 | Decoder, Image Segmentation | Code Available | 2 |
| MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression | Jun 21, 2024 | GPU, Language Modeling | Code Available | 2 |
| Is A Picture Worth A Thousand Words? Delving Into Spatial Reasoning for Vision Language Models | Jun 21, 2024 | Spatial Reasoning | Code Available | 2 |
| GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation | Jun 21, 2024 | 3D Generation, GPU | Code Available | 2 |