| BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks and Defenses on Large Language Models | Aug 23, 2024 | Data Poisoning, text-classification | Code Available | 3 |
| Frequency-aware Feature Fusion for Dense Image Prediction | Aug 23, 2024 | Prediction | Code Available | 3 |
| BoostTrack++: using tracklet information to detect more objects in multiple object tracking | Aug 23, 2024 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 3 |
| Controllable Text Generation for Large Language Models: A Survey | Aug 22, 2024 | Attribute, Prompt Engineering | Code Available | 3 |
| Exploring the Feasibility of Automated Data Standardization using Large Language Models for Seamless Positioning | Aug 22, 2024 | Data Integration | Code Available | 3 |
| GSFusion: Online RGB-D Mapping Where Gaussian Splatting Meets TSDF Fusion | Aug 22, 2024 | Computational Efficiency | Code Available | 3 |
| Recent Advances on Machine Learning for Computational Fluid Dynamics: A Survey | Aug 22, 2024 | scientific discovery, Symbolic Regression | Code Available | 3 |
| RAGLAB: A Modular and Research-Oriented Unified Framework for Retrieval-Augmented Generation | Aug 21, 2024 | RAG, Retrieval | Code Available | 3 |
| GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting | Aug 21, 2024 | Representation Learning | Code Available | 3 |
| A Survey of Embodied Learning for Object-Centric Robotic Manipulation | Aug 21, 2024 | Imitation Learning, Object | Code Available | 3 |
| Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model | Aug 20, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| A Short Review and Evaluation of SAM2's Performance in 3D CT Image Segmentation | Aug 20, 2024 | Image Segmentation, Medical Image Segmentation | Code Available | 3 |
| AnyGraph: Graph Foundation Model in the Wild | Aug 20, 2024 | Graph Learning, Mixture-of-Experts | Code Available | 3 |
| Revisiting VerilogEval: A Year of Improvements in Large-Language Models for Hardware Code Generation | Aug 20, 2024 | Code Completion, Code Generation | Code Available | 3 |
| Accelerating Goal-Conditioned RL Algorithms and Research | Aug 20, 2024 | GPU, reinforcement-learning | Code Available | 3 |
| NeuFlow v2: High-Efficiency Optical Flow Estimation on Edge Devices | Aug 19, 2024 | Optical Flow Estimation | Code Available | 3 |
| Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation | Aug 19, 2024 | Image Generation, Video Generation | Code Available | 3 |
| LoopSplat: Loop Closure by Registering 3D Gaussian Splats | Aug 19, 2024 | 3DGS, Point Cloud Registration | Code Available | 3 |
| SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models | Aug 19, 2024 | image-classification, Image Classification | Code Available | 3 |
| ALS-HAR: Harnessing Wearable Ambient Light Sensors to Enhance IMU-based Human Activity Recogntion | Aug 18, 2024 | Activity Recognition, Human Activity Recognition | Code Available | 3 |
| The First Competition on Resource-Limited Infrared Small Target Detection Challenge: Methods and Results | Aug 18, 2024 | | Code Available | 3 |
| Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Aug 17, 2024 | Novel Concepts, Object | Code Available | 3 |
| Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling | Aug 16, 2024 | Retrieval | Code Available | 3 |
| RadioDiff: An Effective Generative Diffusion Model for Sampling-Free Dynamic Radio Map Construction | Aug 16, 2024 | | Code Available | 3 |
| ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models | Aug 16, 2024 | GPU, Model Compression | Code Available | 3 |
| SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation | Aug 16, 2024 | Image Segmentation, Marine Animal Segmentation | Code Available | 3 |
| Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models | Aug 15, 2024 | | Code Available | 3 |
| Graph Retrieval-Augmented Generation: A Survey | Aug 15, 2024 | Hallucination, RAG | Code Available | 3 |
| 5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks | Aug 15, 2024 | image-classification, Image Classification | Code Available | 3 |
| FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering | Aug 15, 2024 | Computational Efficiency, Scheduling | Code Available | 3 |
| Accelerating High-Fidelity Waveform Generation via Adversarial Flow Matching Optimization | Aug 15, 2024 | Speech Synthesis | Code Available | 3 |
| Aquila2 Technical Report | Aug 14, 2024 | Management | Code Available | 3 |
| Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Aug 14, 2024 | 3D Object Detection, 3D Object Tracking | Code Available | 3 |
| PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation | Aug 14, 2024 | Speech Synthesis, text-to-speech | Code Available | 3 |
| OpenResearcher: Unleashing AI for Accelerated Scientific Research | Aug 13, 2024 | RAG, Retrieval | Code Available | 3 |
| Imagen 3 | Aug 13, 2024 | | Code Available | 3 |
| BMX: Entropy-weighted Similarity and Semantic-enhanced Lexical Search | Aug 13, 2024 | Information Retrieval, Retrieval | Code Available | 3 |
| FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework | Aug 12, 2024 | Object Tracking, Optical Flow Estimation | Code Available | 3 |
| SkillMimic: Learning Basketball Interaction Skills from Demonstrations | Aug 12, 2024 | Diversity, Human-Object Interaction Detection | Code Available | 3 |
| UniPortrait: A Unified Framework for Identity-Preserving Single- and Multi-Human Image Personalization | Aug 12, 2024 | Layout Generation | Code Available | 3 |
| Music2Latent: Consistency Autoencoders for Latent Audio Compression | Aug 12, 2024 | Audio Compression, Information Retrieval | Code Available | 3 |
| Mambular: A Sequential Model for Tabular Deep Learning | Aug 12, 2024 | Deep Learning, Mamba | Code Available | 3 |
| VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents | Aug 12, 2024 | | Code Available | 3 |
| LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Aug 10, 2024 | GPU, Language Modelling | Code Available | 3 |
| MooER: LLM-based Speech Recognition and Translation Models from Moore Threads | Aug 9, 2024 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 3 |
| Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 | Aug 9, 2024 | All | Code Available | 3 |
| BoFire: Bayesian Optimization Framework Intended for Real Experiments | Aug 9, 2024 | Bayesian Optimization | Code Available | 3 |
| Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation | Aug 9, 2024 | object-detection, Object Detection | Code Available | 3 |
| ECG-FM: An Open Electrocardiogram Foundation Model | Aug 9, 2024 | Contrastive Learning, Diagnostic | Code Available | 3 |
| UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling | Aug 9, 2024 | GPU, Language Modeling | Code Available | 3 |