| Malla: Demystifying Real-world Large Language Model Integrated Malicious Services | Jan 6, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| CogGPT: Unleashing the Power of Cognitive Dynamics on Large Language Models | Jan 6, 2024 | | Code Available | 2 |
| Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss | Jan 5, 2024 | Knowledge Distillation | Code Available | 2 |
| UnetTSF: A Better Performance Linear Complexity Time Series Prediction Model | Jan 5, 2024 | Time Series, Time Series Prediction | Code Available | 2 |
| MOODv2: Masked Image Modeling for Out-of-Distribution Detection | Jan 5, 2024 | Out-of-Distribution Detection, Out of Distribution (OOD) Detection | Code Available | 2 |
| Credence: Augmenting Datacenter Switch Buffer Sharing with ML Predictions | Jan 5, 2024 | | Code Available | 2 |
| AST-T5: Structure-Aware Pretraining for Code Generation and Understanding | Jan 5, 2024 | Code Generation, Decoder | Code Available | 2 |
| Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Jan 5, 2024 | Arithmetic Reasoning, Code Generation | Code Available | 2 |
| PeFoMed: Parameter Efficient Fine-tuning of Multimodal Large Language Models for Medical Imaging | Jan 5, 2024 | Medical Report Generation, Medical Visual Question Answering | Code Available | 2 |
| Reading Between the Frames: Multi-Modal Depression Detection in Videos from Non-Verbal Cues | Jan 5, 2024 | Depression Detection | Code Available | 2 |
| Learning to Prompt with Text Only Supervision for Vision-Language Models | Jan 4, 2024 | Prompt Engineering | Code Available | 2 |
| Oceanship: A Large-Scale Dataset for Underwater Audio Target Recognition | Jan 4, 2024 | Attribute, Audio Classification | Code Available | 2 |
| ChangeCLIP: Remote sensing change detection with multimodal vision-language representation learning | Jan 4, 2024 | Change Detection, Decoder | Code Available | 2 |
| Advanced Unstructured Data Processing for ESG Reports: A Methodology for Structured Transformation and Enhanced Analysis | Jan 4, 2024 | | Code Available | 2 |
| ODIN: A Single Model for 2D and 3D Segmentation | Jan 4, 2024 | 3D Instance Segmentation, 3D Semantic Segmentation | Code Available | 2 |
| ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning | Jan 4, 2024 | Data Visualization, Decision Making | Code Available | 2 |
| Starling: An I/O-Efficient Disk-Resident Graph Index Framework for High-Dimensional Vector Similarity Search on Data Segment | Jan 4, 2024 | | Code Available | 2 |
| Data-Centric Foundation Models in Computational Healthcare: A Survey | Jan 4, 2024 | Ethics, Survey | Code Available | 2 |
| ClassWise-SAM-Adapter: Parameter Efficient Fine-tuning Adapts Segment Anything to SAR Domain for Semantic Segmentation | Jan 4, 2024 | Decoder, parameter-efficient fine-tuning | Code Available | 2 |
| TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection | Jan 4, 2024 | Highlight Detection, Moment Retrieval | Code Available | 2 |
| Graph Neural Networks for Tabular Data Learning: A Survey with Taxonomy and Directions | Jan 4, 2024 | Representation Learning, Survey | Code Available | 2 |
| CRA-PCN: Point Cloud Completion with Intra- and Inter-level Cross-Resolution Transformers | Jan 3, 2024 | Point Cloud Completion | Code Available | 2 |
| ODTrack: Online Dense Temporal Token Learning for Visual Tracking | Jan 3, 2024 | Semi-Supervised Video Object Segmentation, Video Object Tracking | Code Available | 2 |
| CoMoSVC: Consistency Model-based Singing Voice Conversion | Jan 3, 2024 | GPU, model | Code Available | 2 |
| A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity | Jan 3, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| aMUSEd: An Open MUSE Reproduction | Jan 3, 2024 | Image Generation, Text to Image Generation | Code Available | 2 |
| Context-Guided Spatio-Temporal Video Grounding | Jan 3, 2024 | Object, Spatio-Temporal Video Grounding | Code Available | 2 |
| Prototypical Information Bottlenecking and Disentangling for Multimodal Cancer Survival Prediction | Jan 3, 2024 | Disentanglement, Survival Prediction | Code Available | 2 |
| Poisoning Attacks against Recommender Systems: A Survey | Jan 3, 2024 | Recommendation Systems, Survey | Code Available | 2 |
| AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI | Jan 3, 2024 | Video Alignment, Video Generation | Code Available | 2 |
| STAF: 3D Human Mesh Recovery from Video with Spatio-Temporal Alignment Fusion | Jan 3, 2024 | 3D Human Pose Estimation, Human Mesh Recovery | Code Available | 2 |
| Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation | Jan 2, 2024 | Audio Generation, cross-modal alignment | Code Available | 2 |
| Room impulse response reconstruction with physics-informed deep learning | Jan 2, 2024 | Deep Learning | Code Available | 2 |
| ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text | Jan 2, 2024 | Colorization, Sketch Colorization | Code Available | 2 |
| An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction | Jan 2, 2024 | Decoder, Joint Entity and Relation Extraction | Code Available | 2 |
| Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models | Jan 2, 2024 | Autonomous Driving | Code Available | 2 |
| Unsupervised Continual Anomaly Detection with Contrastively-learned Prompt | Jan 2, 2024 | Anomaly Detection, Anomaly Segmentation | Code Available | 2 |
| D3still: Decoupled Differential Distillation for Asymmetric Image Retrieval | Jan 1, 2024 | Image Retrieval, Retrieval | Code Available | 2 |
| The More You See in 2D, the More You Perceive in 3D | Jan 1, 2024 | 3D Reconstruction, Image to 3D | Code Available | 2 |
| Scaled Decoupled Distillation | Jan 1, 2024 | Knowledge Distillation | Code Available | 2 |
| Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly | Jan 1, 2024 | Anomaly Detection | Code Available | 2 |
| MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding | Jan 1, 2024 | Autonomous Driving, Instruction Following | Code Available | 2 |
| Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation | Jan 1, 2024 | Descriptive, Object | Code Available | 2 |
| MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation | Jan 1, 2024 | Segmentation, Video Segmentation | Code Available | 2 |
| LiSA: LiDAR Localization with Semantic Awareness | Jan 1, 2024 | Knowledge Distillation, Semantic Segmentation | Code Available | 2 |
| Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts | Jan 1, 2024 | Image Segmentation, Interactive Segmentation | Code Available | 2 |
| ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention | Jan 1, 2024 | Blocking | Code Available | 2 |
| DifFlow3D: Toward Robust Uncertainty-Aware Scene Flow Estimation with Iterative Diffusion-Based Refinement | Jan 1, 2024 | Diversity, Scene Flow Estimation | Code Available | 2 |
| When StyleGAN Meets Stable Diffusion: a W+ Adapter for Personalized Image Generation | Jan 1, 2024 | Attribute, Disentanglement | Code Available | 2 |
| DiffLoc: Diffusion Model for Outdoor LiDAR Localization | Jan 1, 2024 | Denoising, model | Code Available | 2 |