| WizMap: Scalable Interactive Visualization for Exploring Large Machine Learning Embeddings | Jun 15, 2023 | Navigate | Code Available | 2 |
| LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models | Jun 15, 2023 | Hallucination, Image Captioning | Code Available | 2 |
| QuadSwarm: A Modular Multi-Quadrotor Simulator for Deep Reinforcement Learning with Direct Thrust Control | Jun 15, 2023 | CPU, Deep Reinforcement Learning | Code Available | 2 |
| CMMLU: Measuring massive multitask language understanding in Chinese | Jun 15, 2023 | Large Language Model | Code Available | 2 |
| PINNacle: A Comprehensive Benchmark of Physics-Informed Neural Networks for Solving PDEs | Jun 15, 2023 | Benchmarking | Code Available | 2 |
| Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis | Jun 15, 2023 | Image Generation, Preference Mapping | Code Available | 2 |
| Datasets and Benchmarks for Offline Safe Reinforcement Learning | Jun 15, 2023 | Autonomous Driving, Benchmarking | Code Available | 2 |
| SSCBench: A Large-Scale 3D Semantic Scene Completion Benchmark for Autonomous Driving | Jun 15, 2023 | 3D Semantic Scene Completion, 3D Semantic Scene Completion from a single 2D image | Code Available | 2 |
| Segment Any Point Cloud Sequences by Distilling Vision Foundation Models | Jun 15, 2023 | Representation Learning, Transfer Learning | Code Available | 2 |
| 2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection | Jun 15, 2023 | Anomaly Detection, Anomaly Localization | Code Available | 2 |
| DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data | Jun 15, 2023 | | Code Available | 2 |
| Fast Training of Diffusion Models with Masked Transformers | Jun 15, 2023 | Decoder, Denoising | Code Available | 2 |
| LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting | Jun 14, 2023 | Traffic Prediction | Code Available | 2 |
| TSMixer: Lightweight MLP-Mixer Model for Multivariate Time Series Forecasting | Jun 14, 2023 | Multivariate Time Series Forecasting, Representation Learning | Code Available | 2 |
| TryOnDiffusion: A Tale of Two UNets | Jun 14, 2023 | Virtual Try-on | Code Available | 2 |
| NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification | Jun 14, 2023 | Graph structure learning, image-classification | Code Available | 2 |
| MiniLLM: Knowledge Distillation of Large Language Models | Jun 14, 2023 | Instruction Following, Knowledge Distillation | Code Available | 2 |
| Hidden Biases of End-to-End Driving Models | Jun 13, 2023 | Autonomous Driving, Bench2Drive | Code Available | 2 |
| Parting with Misconceptions about Learning-based Vehicle Motion Planning | Jun 13, 2023 | Misconceptions, Motion Planning | Code Available | 2 |
| One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning | Jun 13, 2023 | All, Domain Generalization | Code Available | 2 |
| XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models | Jun 13, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| Efficient 3D Semantic Segmentation with Superpoint Transformer | Jun 13, 2023 | 3D Semantic Segmentation, GPU | Code Available | 2 |
| Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models | Jun 13, 2023 | Catalytic activity prediction, Chemical-Disease Interaction Extraction | Code Available | 2 |
| Controlling Text-to-Image Diffusion by Orthogonal Finetuning | Jun 12, 2023 | | Code Available | 2 |
| Scalable 3D Captioning with Pretrained Models | Jun 12, 2023 | Descriptive, Image Captioning | Code Available | 2 |
| Valley: Video Assistant with Large Language model Enhanced abilitY | Jun 12, 2023 | Action Recognition, Instruction Following | Code Available | 2 |
| The Devil is in the Details: On the Pitfalls of Event Extraction Evaluation | Jun 12, 2023 | Event Argument Extraction, Event Detection | Code Available | 2 |
| Unlocking Feature Visualization for Deeper Networks with MAgnitude Constrained Optimization | Jun 11, 2023 | | Code Available | 2 |
| Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception | Jun 10, 2023 | 3D Object Detection, Benchmarking | Code Available | 2 |
| TensorNet: Cartesian Tensor Representations for Efficient Learning of Molecular Potentials | Jun 10, 2023 | Formation Energy | Code Available | 2 |
| Mind2Web: Towards a Generalist Agent for the Web | Jun 9, 2023 | | Code Available | 2 |
| DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds | Jun 9, 2023 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| SegViTv2: Exploring Efficient and Continual Semantic Segmentation with Plain Vision Transformers | Jun 9, 2023 | Continual Learning, Continual Semantic Segmentation | Code Available | 2 |
| FasterViT: Fast Vision Transformers with Hierarchical Attention | Jun 9, 2023 | Image Classification, object-detection | Code Available | 2 |
| Prodigy: An Expeditiously Adaptive Parameter-Free Learner | Jun 9, 2023 | | Code Available | 2 |
| InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding | Jun 8, 2023 | Decoder, Multi-Task Learning | Code Available | 2 |
| ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases | Jun 8, 2023 | | Code Available | 2 |
| Matting Anything | Jun 8, 2023 | Image Matting, Referring Image Matting | Code Available | 2 |
| PIXIU: A Large Language Model, Instruction Data and Evaluation Benchmark for Finance | Jun 8, 2023 | Conversational Question Answering, Language Modeling | Code Available | 2 |
| Prompt Injection attack against LLM-integrated Applications | Jun 8, 2023 | | Code Available | 2 |
| PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization | Jun 8, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| StreetSurf: Extending Multi-view Implicit Surface Reconstruction to Street Views | Jun 8, 2023 | Autonomous Driving, GPU | Code Available | 2 |
| Does Image Anonymization Impact Computer Vision Training? | Jun 8, 2023 | Face Anonymization, Instance Segmentation | Code Available | 2 |
| RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit | Jun 8, 2023 | Answer Generation, Fact Checking | Code Available | 2 |
| K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization | Jun 8, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| ReliableSwap: Boosting General Face Swapping Via Reliable Supervision | Jun 8, 2023 | Face Reenactment, Face Swapping | Code Available | 2 |
| UCTB: An Urban Computing Tool Box for Building Spatiotemporal Prediction Services | Jun 7, 2023 | Diversity | Code Available | 2 |
| On the Reliability of Watermarks for Large Language Models | Jun 7, 2023 | | Code Available | 2 |
| Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models | Jun 7, 2023 | Diversity, Image Generation | Code Available | 2 |
| ModuleFormer: Modularity Emerges from Mixture-of-Experts | Jun 7, 2023 | Language Modelling, Lightweight Deployment | Code Available | 2 |