| Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training | Jul 31, 2023 | Organ Segmentation, Representation Learning | Code Available | 2 |
| An Unforgeable Publicly Verifiable Watermark for Large Language Models | Jul 30, 2023 | Computational Efficiency | Code Available | 2 |
| UnIVAL: Unified Model for Image, Video, Audio and Language Tasks | Jul 30, 2023 | Out-of-Distribution Generalization | Code Available | 2 |
| SEED-Bench: Benchmarking Multimodal LLMs with Generative Comprehension | Jul 30, 2023 | Benchmarking, Multiple-choice | Code Available | 2 |
| Implicit Neural Representation in Medical Imaging: A Comparative Survey | Jul 30, 2023 | Domain Adaptation, Image Reconstruction | Code Available | 2 |
| XMem++: Production-level Video Segmentation From Few Annotated Frames | Jul 29, 2023 | Segmentation, Semantic Segmentation | Code Available | 2 |
| MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking | Jul 28, 2023 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 2 |
| Equivariance and partial observations in Koopman operator theory for partial differential equations | Jul 28, 2023 | | Code Available | 2 |
| Scaling Data Generation in Vision-and-Language Navigation | Jul 28, 2023 | Imitation Learning, Vision and Language Navigation | Code Available | 2 |
| TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts | Jul 28, 2023 | Long-range modeling, Mixture-of-Experts | Code Available | 2 |
| RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control | Jul 28, 2023 | Object, Question Answering | Code Available | 2 |
| Widespread Flaws in Offline Evaluation of Recommender Systems | Jul 27, 2023 | Recommendation Systems | Code Available | 2 |
| PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking | Jul 27, 2023 | Diversity, Point Tracking | Code Available | 2 |
| The Effect of Third Party Implementations on Reproducibility | Jul 27, 2023 | Recommendation Systems | Code Available | 2 |
| IML-ViT: Benchmarking Image Manipulation Localization by Vision Transformer | Jul 27, 2023 | Benchmarking, Image Manipulation | Code Available | 2 |
| Solving Data Quality Problems with Desbordante: a Demo | Jul 27, 2023 | Anomaly Detection, Descriptive | Code Available | 2 |
| Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation | Jul 27, 2023 | 3D geometry, Few-Shot Learning | Code Available | 2 |
| Med-Flamingo: a Multimodal Medical Few-shot Learner | Jul 27, 2023 | Medical Visual Question Answering, Question Answering | Code Available | 2 |
| The RoboDepth Challenge: Methods and Advancements Towards Robust Depth Estimation | Jul 27, 2023 | Depth Estimation, Image Restoration | Code Available | 2 |
| MARS: An Instance-aware, Modular and Realistic Simulator for Autonomous Driving | Jul 27, 2023 | Autonomous Driving, NeRF | Code Available | 2 |
| TransNormerLLM: A Faster and Better Large Language Model with Improved TransNormer | Jul 27, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| Generative AI for Medical Imaging: extending the MONAI Framework | Jul 27, 2023 | Anomaly Detection, Denoising | Code Available | 2 |
| NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection | Jul 27, 2023 | 3D geometry, 3D Object Detection | Code Available | 2 |
| Three Bricks to Consolidate Watermarks for Large Language Models | Jul 26, 2023 | valid | Code Available | 2 |
| Hypergraph Isomorphism Computation | Jul 26, 2023 | Community Detection, Graph Classification | Code Available | 2 |
| trajdata: A Unified Interface to Multiple Human Trajectory Datasets | Jul 26, 2023 | Autonomous Vehicles, Motion Forecasting | Code Available | 2 |
| Tracking Anything in High Quality | Jul 26, 2023 | Object, Object Tracking | Code Available | 2 |
| TabR: Tabular Deep Learning Meets Nearest Neighbors in 2023 | Jul 26, 2023 | Deep Learning, Retrieval | Code Available | 2 |
| WavJourney: Compositional Audio Creation with Large Language Models | Jul 26, 2023 | Audio Generation | Code Available | 2 |
| LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition | Jul 25, 2023 | In-Context Learning | Code Available | 2 |
| QuIP: 2-Bit Quantization of Large Language Models With Guarantees | Jul 25, 2023 | Quantization | Code Available | 2 |
| Zshot: An Open-source Framework for Zero-Shot Named Entity Recognition and Relation Extraction | Jul 25, 2023 | named-entity-recognition, Named Entity Recognition | Code Available | 2 |
| FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios | Jul 25, 2023 | Code Generation, Fact Checking | Code Available | 2 |
| Foundational Models Defining a New Era in Vision: A Survey and Outlook | Jul 25, 2023 | Benchmarking | Code Available | 2 |
| TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition | Jul 24, 2023 | Image-Guided Composition, Text-to-Image Generation | Code Available | 2 |
| Aligning Large Language Models with Human: A Survey | Jul 24, 2023 | Survey | Code Available | 2 |
| Getting pwn'd by AI: Penetration Testing with Large Language Models | Jul 24, 2023 | Ethics, Task Planning | Code Available | 2 |
| COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts | Jul 24, 2023 | Autonomous Driving, Object | Code Available | 2 |
| A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models | Jul 24, 2023 | Image Generation, Image-text matching | Code Available | 2 |
| Remote Bio-Sensing: Open Source Benchmark Framework for Fair Evaluation of rPPG | Jul 24, 2023 | Benchmarking | Code Available | 2 |
| A Simple and Model-Free Path Filtering Algorithm for Smoothing and Accuracy | Jul 23, 2023 | Autonomous Driving, Denoising | Code Available | 2 |
| Pyramid Semantic Graph-based Global Point Cloud Registration with Low Overlap | Jul 22, 2023 | Point Cloud Registration, Pose Estimation | Code Available | 2 |
| PINNsFormer: A Transformer-Based Framework For Physics-Informed Neural Networks | Jul 21, 2023 | | Code Available | 2 |
| Predict, Refine, Synthesize: Self-Guiding Diffusion Models for Probabilistic Time Series Forecasting | Jul 21, 2023 | Imputation, Probabilistic Time Series Forecasting | Code Available | 2 |
| Subject-Diffusion:Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning | Jul 21, 2023 | Diffusion Personalization, Diffusion Personalization Tuning Free | Code Available | 2 |
| BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion | Jul 20, 2023 | Conditional Text-to-Image Synthesis, Denoising | Code Available | 2 |
| CNOS: A Strong Baseline for CAD-based Novel Object Segmentation | Jul 20, 2023 | Object, Semantic Segmentation | Code Available | 2 |
| BlendFace: Re-designing Identity Encoders for Face-Swapping | Jul 20, 2023 | Attribute, Disentanglement | Code Available | 2 |
| FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets | Jul 20, 2023 | Instruction Following, Language Model Evaluation | Code Available | 2 |
| DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering | Jul 19, 2023 | Camera Calibration, Novel View Synthesis | Code Available | 2 |