| The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants | Aug 31, 2023 | Belebele, Cross-Lingual Transfer | Code Available | 2 |
| Beyond Self-Attention: Deformable Large Kernel Attention for Medical Image Segmentation | Aug 31, 2023 | Image Segmentation, Medical Image Segmentation | Code Available | 2 |
| Accurate Computation of Quantum Excited States with Neural Networks | Aug 31, 2023 | Variational Monte Carlo | Code Available | 2 |
| InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion | Aug 31, 2023 | 3D Human Dynamics, Human Dynamics | Code Available | 2 |
| PointLLM: Empowering Large Language Models to Understand Point Clouds | Aug 31, 2023 | 3D Object Captioning, 3D Object Classification | Code Available | 2 |
| SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models | Aug 31, 2023 | Decoder, Language Modeling | Code Available | 2 |
| The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages | Aug 31, 2023 | Data Augmentation, Text Generation | Code Available | 2 |
| MVDream: Multi-view Diffusion for 3D Generation | Aug 31, 2023 | 3D Generation, Prompt Learning | Code Available | 2 |
| PivotNet: Vectorized Pivot Learning for End-to-end HD Map Construction | Aug 31, 2023 | Autonomous Driving | Code Available | 2 |
| GREC: Generalized Referring Expression Comprehension | Aug 30, 2023 | Generalized Referring Expression Comprehension, Referring Expression | Code Available | 2 |
| DTrOCR: Decoder-only Transformer for Optical Character Recognition | Aug 30, 2023 | Decoder, Handwritten Text Recognition | Code Available | 2 |
| LLaSM: Large Language and Speech Model | Aug 30, 2023 | Instruction Following, Language Modeling | Code Available | 2 |
| Nemo: First Glimpse of a New Rule Engine | Aug 30, 2023 | Knowledge Graphs | Code Available | 2 |
| WeatherBench 2: A benchmark for the next generation of data-driven global weather models | Aug 29, 2023 | Weather Forecasting | Code Available | 2 |
| AutoDroid: LLM-powered Task Automation in Android | Aug 29, 2023 | Language Modelling | Code Available | 2 |
| When Do Program-of-Thoughts Work for Reasoning? | Aug 29, 2023 | Code Generation, Mathematical Reasoning | Code Available | 2 |
| CAGRA: Highly Parallel Graph Construction and Approximate Nearest Neighbor Search for GPUs | Aug 29, 2023 | CPU, GPU | Code Available | 2 |
| Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation | Aug 29, 2023 | Prompt Engineering, Text to SQL | Code Available | 2 |
| Fast Feedforward Networks | Aug 28, 2023 | Mixture-of-Experts | Code Available | 2 |
| Graph Meets LLMs: Towards Large Graph Models | Aug 28, 2023 | | Code Available | 2 |
| DISC-MedLLM: Bridging General Large Language Models and Real-World Medical Consultation | Aug 28, 2023 | Knowledge Graphs | Code Available | 2 |
| High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net | Aug 27, 2023 | Document Shadow Removal, Image Shadow Removal | Code Available | 2 |
| The DiffuseStyleGesture+ entry to the GENEA Challenge 2023 | Aug 26, 2023 | | Code Available | 2 |
| Residual Denoising Diffusion Models | Aug 25, 2023 | Denoising, Diversity | Code Available | 2 |
| Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs | Aug 25, 2023 | | Code Available | 2 |
| OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models | Aug 25, 2023 | Common Sense Reasoning, Computational Efficiency | Code Available | 2 |
| DARWIN Series: Domain Specific Large Language Models for Natural Science | Aug 25, 2023 | Knowledge Graphs | Code Available | 2 |
| The GENEA Challenge 2023: A large scale evaluation of gesture generation models in monadic and dyadic settings | Aug 24, 2023 | Gesture Generation | Code Available | 2 |
| FastSurfer-HypVINN: Automated sub-segmentation of the hypothalamus and adjacent structures on high-resolutional brain MRI | Aug 24, 2023 | GPU, Segmentation | Code Available | 2 |
| Scenimefy: Learning to Craft Anime Scene via Semi-Supervised Image-to-Image Translation | Aug 24, 2023 | Image-to-Image Translation | Code Available | 2 |
| NeO 360: Neural Fields for Sparse View Synthesis of Outdoor Scenes | Aug 24, 2023 | Generalizable Novel View Synthesis, Novel View Synthesis | Code Available | 2 |
| Dense Text-to-Image Generation with Attention Modulation | Aug 24, 2023 | Image Generation, Text to Image Generation | Code Available | 2 |
| BridgeData V2: A Dataset for Robot Learning at Scale | Aug 24, 2023 | Imitation Learning, Multi-Task Learning | Code Available | 2 |
| Motion In-Betweening with Phase Manifolds | Aug 24, 2023 | Mixture-of-Experts, motion in-betweening | Code Available | 2 |
| WavMark: Watermarking for Audio Generation | Aug 24, 2023 | Audio Generation | Code Available | 2 |
| StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction | Aug 24, 2023 | Autonomous Driving | Code Available | 2 |
| Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations | Aug 23, 2023 | Benchmarking, Decoder | Code Available | 2 |
| From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning | Aug 23, 2023 | Instruction Following | Code Available | 2 |
| Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion | Aug 23, 2023 | Segmentation, Semantic Segmentation | Code Available | 2 |
| Knowledge Graph Prompting for Multi-Document Question Answering | Aug 22, 2023 | graph construction, Open-Domain Question Answering | Code Available | 2 |
| IT3D: Improved Text-to-3D Generation with Explicit View Synthesis | Aug 22, 2023 | 3D Generation, Text to 3D | Code Available | 2 |
| G3Reg: Pyramid Graph-based Global Registration using Gaussian Ellipsoid Model | Aug 22, 2023 | | Code Available | 2 |
| SONAR: Sentence-Level Multimodal and Language-Agnostic Representations | Aug 22, 2023 | Decoder, Machine Translation | Code Available | 2 |
| VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection | Aug 22, 2023 | Anomaly Detection, Binary Classification | Code Available | 2 |
| SeamlessM4T: Massively Multilingual & Multimodal Machine Translation | Aug 22, 2023 | Automatic Speech Recognition, Machine Translation | Code Available | 2 |
| TOPIC: A Parallel Association Paradigm for Multi-Object Tracking under Complex Motions and Diverse Scenes | Aug 22, 2023 | Multi-Object Tracking, Object Tracking | Code Available | 2 |
| ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes | Aug 22, 2023 | 3D Semantic Segmentation, Novel View Synthesis | Code Available | 2 |
| Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning | Aug 22, 2023 | Caption Generation, Large Language Model | Code Available | 2 |
| Giraffe: Adventures in Expanding Context Lengths in LLMs | Aug 21, 2023 | 16k, 4k | Code Available | 2 |
| Turning a CLIP Model into a Scene Text Spotter | Aug 21, 2023 | object-detection, Object Detection | Code Available | 2 |