| Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler | Aug 23, 2024 | | Code Available | 2 |
| DeTPP: Leveraging Object Detection for Robust Long-Horizon Event Prediction | Aug 23, 2024 | Diversity, Point Processes | Code Available | 2 |
| ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM | Aug 22, 2024 | Misinformation | Code Available | 2 |
| UMERegRobust -- Universal Manifold Embedding Compatible Features for Robust Point Cloud Registration | Aug 22, 2024 | Point Cloud Registration | Code Available | 2 |
| Scalable Autoregressive Image Generation with Mamba | Aug 22, 2024 | Image Generation, Mamba | Code Available | 2 |
| MuMA-ToM: Multi-modal Multi-Agent Theory of Mind | Aug 22, 2024 | | Code Available | 2 |
| Towards Evaluating and Building Versatile Large Language Models for Medicine | Aug 22, 2024 | Multiple-choice, named-entity-recognition | Code Available | 2 |
| Personality Alignment of Large Language Models | Aug 21, 2024 | Personality Alignment | Code Available | 2 |
| KAN4TSF: Are KAN and KAN-based models Effective for Time Series Forecasting? | Aug 21, 2024 | Mixture-of-Experts, Time Series | Code Available | 2 |
| VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment | Aug 21, 2024 | Video Alignment, Video Editing | Code Available | 2 |
| Critique-out-Loud Reward Models | Aug 21, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation | Aug 21, 2024 | Fault Diagnosis, Management | Code Available | 2 |
| Pano2Room: Novel View Synthesis from a Single Indoor Panorama | Aug 21, 2024 | Novel View Synthesis | Code Available | 2 |
| HMT-UNet: A hybird Mamba-Transformer Vision UNet for Medical Image Segmentation | Aug 21, 2024 | Image Segmentation, Mamba | Code Available | 2 |
| biorecap: an R package for summarizing bioRxiv preprints with a local LLM | Aug 21, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| RaNDT SLAM: Radar SLAM Based on Intensity-Augmented Normal Distributions Transform | Aug 21, 2024 | Indoor Localization, Simultaneous Localization and Mapping | Code Available | 2 |
| UNetMamba: An Efficient UNet-Like Mamba for Semantic Segmentation of High-Resolution Remote Sensing Images | Aug 21, 2024 | Mamba, Segmentation | Code Available | 2 |
| Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks | Aug 20, 2024 | Multi-agent Reinforcement Learning, Multi-Task Learning | Code Available | 2 |
| MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning | Aug 20, 2024 | Denoising, Image Generation | Code Available | 2 |
| PerturBench: Benchmarking Machine Learning Models for Cellular Perturbation Analysis | Aug 20, 2024 | Benchmarking | Code Available | 2 |
| ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks | Aug 20, 2024 | | Code Available | 2 |
| Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search | Aug 20, 2024 | Decision Making, Dialogue Generation | Code Available | 2 |
| BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model | Aug 20, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| deepmriprep: Voxel-based Morphometry (VBM) Preprocessing via Deep Neural Networks | Aug 20, 2024 | GPU, Image Registration | Code Available | 2 |
| MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding | Aug 20, 2024 | | Code Available | 2 |
| GSLoc: Efficient Camera Pose Refinement via 3D Gaussian Splatting | Aug 20, 2024 | 3DGS, NeRF | Code Available | 2 |
| DEGAS: Detailed Expressions on Full-Body Gaussian Avatars | Aug 20, 2024 | 3DGS, Neural Rendering | Code Available | 2 |
| FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | Aug 20, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| PartGS: Learning Part-aware 3D Representations by Fusing 2D Gaussians and Superquadrics | Aug 20, 2024 | 3D Reconstruction | Code Available | 2 |
| PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting | Aug 20, 2024 | Multivariate Time Series Forecasting, Temporal Sequences | Code Available | 2 |
| LegalBench-RAG: A Benchmark for Retrieval-Augmented Generation in the Legal Domain | Aug 19, 2024 | RAG, Retrieval | Code Available | 2 |
| TraDiffusion: Trajectory-Based Training-Free Image Generation | Aug 19, 2024 | Image Generation | Code Available | 2 |
| C2P-CLIP: Injecting Category Common Prompt in CLIP to Enhance Generalization in Deepfake Detection | Aug 19, 2024 | DeepFake Detection, Face Swapping | Code Available | 2 |
| PA-LLaVA: A Large Language-Vision Assistant for Human Pathology Image Understanding | Aug 18, 2024 | Language Modelling, Question Answering | Code Available | 2 |
| SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama | Aug 18, 2024 | Script Generation, Video Captioning | Code Available | 2 |
| Selective Prompt Anchoring for Code Generation | Aug 17, 2024 | Code Generation | Code Available | 2 |
| TC-RAG: Turing-Complete RAG's Case study on Medical LLM Systems | Aug 17, 2024 | RAG, Retrieval | Code Available | 2 |
| Gaussian in the Dark: Real-Time View Synthesis From Inconsistent Dark Images Using Gaussian Splatting | Aug 17, 2024 | | Code Available | 2 |
| Segment Anything with Multiple Modalities | Aug 17, 2024 | Segmentation, Sensor Fusion | Code Available | 2 |
| An Open-Source American Sign Language Fingerspell Recognition and Semantic Pose Retrieval Interface | Aug 17, 2024 | Pose Estimation, Pose Retrieval | Code Available | 2 |
| Accelerating Giant Impact Simulations with Machine Learning | Aug 16, 2024 | | Code Available | 2 |
| A Survey on Benchmarks of Multimodal Large Language Models | Aug 16, 2024 | Question Answering, Survey | Code Available | 2 |
| MIA-Tuner: Adapting Large Language Models as Pre-training Text Detector | Aug 16, 2024 | Inference Attack, Membership Inference Attack | Code Available | 2 |
| OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction | Aug 16, 2024 | Prediction, Traffic Prediction | Code Available | 2 |
| RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search | Aug 16, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis | Aug 16, 2024 | Contrastive Learning, Diagnostic | Code Available | 2 |
| PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders | Aug 16, 2024 | 3D Object Classification, 3D Point Cloud Classification | Code Available | 2 |
| xGen-MM (BLIP-3): A Family of Open Large Multimodal Models | Aug 16, 2024 | In-Context Learning | Code Available | 2 |
| EasyRec: Simple yet Effective Language Models for Recommendation | Aug 16, 2024 | Collaborative Filtering, Contrastive Learning | Code Available | 2 |
| Efficient Autoregressive Audio Modeling via Next-Scale Prediction | Aug 16, 2024 | Audio Generation, FAD | Code Available | 2 |