| Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification | Jan 12, 2024 | Clustering, Person Re-Identification | Code Available | 2 |
| Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data | Jan 12, 2024 | | Code Available | 2 |
| Seg-metrics: a Python package to compute segmentation metrics | Jan 12, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction | Jan 12, 2024 | Bandwidth Extension, CPU | Code Available | 2 |
| Seeing the roads through the trees: A benchmark for modeling spatial dependencies with aerial imagery | Jan 12, 2024 | Object Recognition, Road Segmentation | Code Available | 2 |
| Surgical-DINO: Adapter Learning of Foundation Models for Depth Estimation in Endoscopic Surgery | Jan 11, 2024 | 3D Reconstruction, Depth Estimation | Code Available | 2 |
| On the representation and methodology for wide and short range head pose estimation | Jan 11, 2024 | Articles, Head Pose Estimation | Code Available | 2 |
| PartSTAD: 2D-to-3D Part Segmentation Task Adaptation | Jan 11, 2024 | 3D Part Segmentation, Foreground Segmentation | Code Available | 2 |
| Learn From Zoom: Decoupled Supervised Contrastive Learning For WCE Image Classification | Jan 11, 2024 | Contrastive Learning, image-classification | Code Available | 2 |
| Cheetah: Bridging the Gap Between Machine Learning and Particle Accelerator Physics with High-Speed, Differentiable Simulations | Jan 11, 2024 | Bayesian Optimisation | Code Available | 2 |
| LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected? | Jan 11, 2024 | Binary text classification | Code Available | 2 |
| Transformers are Multi-State RNNs | Jan 11, 2024 | Decoder | Code Available | 2 |
| UAVD4L: A Large-Scale Dataset for UAV 6-DoF Localization | Jan 11, 2024 | Synthetic Data Generation, Visual Localization | Code Available | 2 |
| Transforming Image Super-Resolution: A ConvFormer-based Efficient Approach | Jan 11, 2024 | Image Super-Resolution, Super-Resolution | Code Available | 2 |
| HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition | Jan 11, 2024 | Contrastive Learning, Dynamic Facial Expression Recognition | Code Available | 2 |
| End-to-end Learnable Clustering for Intent Learning in Recommendation | Jan 11, 2024 | Clustering, Contrastive Learning | Code Available | 2 |
| ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video | Jan 10, 2024 | Video Summarization | Code Available | 2 |
| Graph-of-Thought: Utilizing Large Language Models to Solve Complex and Dynamic Business Problems | Jan 10, 2024 | Decision Making | Code Available | 2 |
| Rethinking Test-time Likelihood: The Likelihood Path Principle and Its Application to OOD Detection | Jan 10, 2024 | Out of Distribution (OOD) Detection | Code Available | 2 |
| Singer Identity Representation Learning using Self-Supervised Techniques | Jan 10, 2024 | Domain Generalization, Representation Learning | Code Available | 2 |
| Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training | Jan 10, 2024 | | Code Available | 2 |
| InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks | Jan 10, 2024 | Benchmarking | Code Available | 2 |
| Real-time and Continuous Turn-taking Prediction Using Voice Activity Projection | Jan 10, 2024 | CPU | Code Available | 2 |
| MTAD: Tools and Benchmarks for Multivariate Time Series Anomaly Detection | Jan 10, 2024 | Anomaly Detection, Time Series | Code Available | 2 |
| DebugBench: Evaluating Debugging Capability of Large Language Models | Jan 9, 2024 | Code Generation | Code Available | 2 |
| RadarCam-Depth: Radar-Camera Fusion for Depth Estimation with Learned Metric Scale | Jan 9, 2024 | Depth Estimation, Depth Prediction | Code Available | 2 |
| PhilEO Bench: Evaluating Geo-Spatial Foundation Models | Jan 9, 2024 | Density Estimation, Earth Observation | Code Available | 2 |
| Low-resource finetuning of foundation models beats state-of-the-art in histopathology | Jan 9, 2024 | GPU, Self-Supervised Learning | Code Available | 2 |
| TechGPT-2.0: A large language model project to solve the task of knowledge graph construction | Jan 9, 2024 | graph construction, Language Modeling | Code Available | 2 |
| Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation | Jan 9, 2024 | Novel View Synthesis | Code Available | 2 |
| U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation | Jan 9, 2024 | Cell Segmentation, Image Segmentation | Code Available | 2 |
| Deep Covariance Alignment for Domain Adaptive Remote Sensing Image Segmentation | Jan 9, 2024 | Image Segmentation, Segmentation | Code Available | 2 |
| LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection | Jan 9, 2024 | Anomaly Detection | Code Available | 2 |
| Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding | Jan 9, 2024 | Fact Verification, In-Context Learning | Code Available | 2 |
| SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems | Jan 8, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| Low-light Image Enhancement via CLIP-Fourier Guided Wavelet Diffusion | Jan 8, 2024 | Image Enhancement, Low-Light Image Enhancement | Code Available | 2 |
| WidthFormer: Toward Efficient Transformer-based BEV View Transformation | Jan 8, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| LLM4PLC: Harnessing Large Language Models for Verifiable Programming of PLCs in Industrial Control Systems | Jan 8, 2024 | Code Generation, Prompt Engineering | Code Available | 2 |
| scDiffusion: conditional generation of high-quality single-cell data using diffusion model | Jan 8, 2024 | | Code Available | 2 |
| Attack-Resilient Image Watermarking Using Stable Diffusion | Jan 8, 2024 | Denoising | Code Available | 2 |
| A Survey on 3D Gaussian Splatting | Jan 8, 2024 | 3D Reconstruction, Survey | Code Available | 2 |
| RoboFusion: Towards Robust Multi-Modal 3D Object Detection via SAM | Jan 8, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| MARG: Multi-Agent Review Generation for Scientific Papers | Jan 8, 2024 | Review Generation, Specificity | Code Available | 2 |
| MS-DETR: Efficient DETR Training with Mixed Supervision | Jan 8, 2024 | Decoder, Object | Code Available | 2 |
| Multi-Modal Representation Learning for Molecular Property Prediction: Sequence, Graph, Geometry | Jan 7, 2024 | Data Augmentation, Drug Discovery | Code Available | 2 |
| Grimoire is All You Need for Enhancing Large Language Models | Jan 7, 2024 | All, In-Context Learning | Code Available | 2 |
| Agent AI: Surveying the Horizons of Multimodal Interaction | Jan 7, 2024 | multimodal interaction | Code Available | 2 |
| Towards Effective Multiple-in-One Image Restoration: A Sequential and Prompt Learning Strategy | Jan 7, 2024 | Image Restoration, Prompt Learning | Code Available | 2 |
| InFoBench: Evaluating Instruction Following Ability in Large Language Models | Jan 7, 2024 | Instruction Following | Code Available | 2 |
| Malla: Demystifying Real-world Large Language Model Integrated Malicious Services | Jan 6, 2024 | Language Modeling, Language Modelling | Code Available | 2 |