| DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement | May 14, 2023 | CPU, Speech Enhancement | Code Available | 4 |
| Locally Typical Sampling | Feb 1, 2022 | Abstractive Text Summarization, Story Generation | Code Available | 4 |
| Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment | Jan 16, 2025 | Causal Inference, counterfactual | Code Available | 4 |
| Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | Mar 12, 2025 | Denoising, Language Modeling | Code Available | 4 |
| EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction | May 29, 2022 | Autonomous Driving, CPU | Code Available | 4 |
| AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset | Apr 23, 2025 | Math, Mathematical Reasoning | Code Available | 4 |
| Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length | Apr 12, 2024 | State Space Models | Code Available | 4 |
| TIAViz: A Browser-based Visualization Tool for Computational Pathology Models | Feb 15, 2024 | whole slide images | Code Available | 4 |
| MegActor: Harness the Power of Raw Video for Vivid Portrait Animation | May 31, 2024 | Portrait Animation, Style Transfer | Code Available | 4 |
| SLIM: Sparsified Late Interaction for Multi-Vector Retrieval with Inverted Indexes | Feb 13, 2023 | Information Retrieval, Retrieval | Code Available | 4 |
| Retrieval-Generation Synergy Augmented Large Language Models | Oct 8, 2023 | Question Answering, Retrieval | Code Available | 4 |
| OtterHD: A High-Resolution Multi-modality Model | Nov 7, 2023 | model, Visual Question Answering | Code Available | 4 |
| Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning | May 11, 2022 | Few-Shot Text Classification, In-Context Learning | Code Available | 4 |
| Universal and Transferable Adversarial Attacks on Aligned Language Models | Jul 27, 2023 | Adversarial Attack, Ingenuity | Code Available | 4 |
| Generative Representational Instruction Tuning | Feb 15, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Competition-Level Code Generation with AlphaCode | Feb 8, 2022 | Code Generation | Code Available | 4 |
| G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model | Dec 18, 2023 | Language Modeling, Language Modelling | Code Available | 4 |
| Zero-1-to-3: Zero-shot One Image to 3D Object | Mar 20, 2023 | 3D Reconstruction, Image to 3D | Code Available | 4 |
| Guiding a Diffusion Model with a Bad Version of Itself | Jun 4, 2024 | Image Generation | Code Available | 4 |
| ShapeLLM-Omni: A Native Multimodal LLM for 3D Generation and Understanding | Jun 2, 2025 | 3D Generation, Large Language Model | Code Available | 4 |
| FedML-HE: An Efficient Homomorphic-Encryption-Based Privacy-Preserving Federated Learning System | Mar 20, 2023 | Federated Learning, Privacy Preserving | Code Available | 4 |
| CrisperWhisper: Accurate Timestamps on Verbatim Speech Transcriptions | Aug 29, 2024 | Dynamic Time Warping, speech-recognition | Code Available | 4 |
| LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token | Jan 7, 2025 | GPU, Visual Question Answering (VQA) | Code Available | 4 |
| Attention Mesh: High-fidelity Face Mesh Prediction in Real-time | Jun 19, 2020 | Vocal Bursts Intensity Prediction | Code Available | 4 |
| LBM: Latent Bridge Matching for Fast Image-to-Image Translation | Mar 10, 2025 | Depth Estimation, Image Relighting | Code Available | 4 |
| No Language is an Island: Unifying Chinese and English in Financial Large Language Models, Instruction Data, and Benchmarks | Mar 10, 2024 | Financial Analysis | Code Available | 4 |
| Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis | Aug 18, 2023 | Dynamic Reconstruction, Novel View Synthesis | Code Available | 4 |
| Improving and generalizing flow-based generative models with minibatch optimal transport | Feb 1, 2023 | | Code Available | 4 |
| DPFlow: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework | Mar 19, 2025 | 8k, Action Recognition | Code Available | 4 |
| BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers | Mar 31, 2022 | 3D Object Detection, Autonomous Driving | Code Available | 4 |
| LLM-Enhanced Data Management | Feb 4, 2024 | Hallucination, Management | Code Available | 4 |
| CoBa: Convergence Balancer for Multitask Finetuning of Large Language Models | Oct 9, 2024 | Multi-Task Learning | Code Available | 4 |
| Sailor: Open Language Models for South-East Asia | Apr 4, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation | Mar 29, 2022 | Binary Classification, Segmentation | Code Available | 4 |
| Large Language Models for Data Annotation and Synthesis: A Survey | Feb 21, 2024 | Survey | Code Available | 4 |
| Vision GNN: An Image is Worth Graph of Nodes | Jun 1, 2022 | Image Classification, Object Detection | Code Available | 4 |
| PixelsDB: Serverless and NL-Aided Data Analytics with Flexible Service Levels and Prices | May 30, 2024 | Scheduling | Code Available | 4 |
| OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics | Jun 14, 2025 | Benchmarking | Code Available | 4 |
| Convolutional Kolmogorov-Arnold Networks | Jun 19, 2024 | Kolmogorov-Arnold Networks | Code Available | 4 |
| FSID: Fully Synthetic Image Denoising via Procedural Scene Generation | Dec 7, 2022 | Denoising, Image Denoising | Code Available | 4 |
| StyleBooth: Image Style Editing with Multimodal Instruction | Apr 18, 2024 | | Code Available | 4 |
| Inception-Based Crowd Counting -- Being Fast while Remaining Accurate | Oct 18, 2022 | Crowd Counting | Code Available | 4 |
| PFLlib: A Beginner-Friendly and Comprehensive Personalized Federated Learning Library and Benchmark | Dec 8, 2023 | Federated Learning, Personalized Federated Learning | Code Available | 4 |
| miniCTX: Neural Theorem Proving with (Long-)Contexts | Aug 5, 2024 | Automated Theorem Proving | Code Available | 4 |
| How is ChatGPT's behavior changing over time? | Jul 18, 2023 | Code Generation, Language Modelling | Code Available | 4 |
| Deep Patch Visual SLAM | Aug 3, 2024 | GPU, Visual Odometry | Code Available | 4 |
| Towards Automated Circuit Discovery for Mechanistic Interpretability | Apr 28, 2023 | | Code Available | 4 |
| VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning | May 17, 2025 | 2D Object Detection, Object Counting | Code Available | 4 |
| TigerBot: An Open Multilingual Multitask LLM | Dec 14, 2023 | | Code Available | 4 |
| PLAID: An Efficient Engine for Late Interaction Retrieval | May 19, 2022 | CPU, GPU | Code Available | 4 |