16k

Papers


Showing 1–50 of 146 papers

Title	Date	Tasks	Status	Hype	Score
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence	Jun 17, 2024	16k, Language Modeling	Code Available	9	5
Global Structure-from-Motion Revisited	Jul 29, 2024	16k	Code Available	7	5
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness	May 27, 2022	16k, 4k	Code Available	6	5
Code Llama: Open Foundation Models for Code	Aug 24, 2023	16k, Code Generation	Code Available	6	5
Learning to (Learn at Test Time): RNNs with Expressive Hidden States	Jul 5, 2024	16k, 8k	Code Available	5	5
Long-form factuality in large language models	Mar 27, 2024	16k, Form	Code Available	4	5
SnapKV: LLM Knows What You are Looking for Before Generation	Apr 22, 2024	16k, GPU	Code Available	3	5
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding	Aug 28, 2023	16k, Code Completion	Code Available	3	5
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation	Oct 4, 2024	16k, Code Generation	Code Available	3	5
M+: Extending MemoryLLM with Scalable Long-Term Memory	Feb 1, 2025	16k, GPU	Code Available	3	5
Investigating Efficiently Extending Transformers for Long Input Summarization	Aug 8, 2022	16k, Long-range modeling	Code Available	3	5
LinFusion: 1 GPU, 1 Minute, 16K Image	Sep 3, 2024	16k, Causal Inference	Code Available	3	5
FlashDMoE: Fast Distributed MoE in a Single Kernel	Jun 5, 2025	16k, CPU	Code Available	3	5
Benchmarking Large Language Models on CFLUE -- A Chinese Financial Language Understanding Evaluation Dataset	May 17, 2024	16k, Benchmarking	Code Available	3	5
1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data	Aug 7, 2024	16k, 2k	Code Available	3	5
Training-Free Long-Context Scaling of Large Language Models	Feb 27, 2024	16k	Code Available	3	5
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding	Jun 29, 2023	16k, Image Captioning	Code Available	2	5
Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key	Jan 16, 2025	16k, Hallucination	Code Available	2	5
LV-Eval: A Balanced Long-Context Benchmark with 5 Length Levels Up to 256K	Feb 6, 2024	16k, Benchmarking	Code Available	2	5
Giraffe: Adventures in Expanding Context Lengths in LLMs	Aug 21, 2023	16k, 4k	Code Available	2	5
Training Long-Context LLMs Efficiently via Chunk-wise Optimization	May 22, 2025	16k, GPU	Code Available	2	5
UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based Mobile GUI Agents	May 27, 2025	16k	Code Available	2	5
LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs	Sep 3, 2024	16k, Benchmarking	Code Available	1	5
COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval	Oct 24, 2020	16k, Retrieval	Code Available	1	5
Complex Temporal Question Answering on Knowledge Graphs	Sep 18, 2021	16k, Entity Embeddings	Code Available	1	5
Faster Causal Attention Over Large Sequences Through Sparse Flash Attention	Jun 1, 2023	16k, 8k	Code Available	1	5
SpaceJAM: a Lightweight and Regularization-free Method for Fast Joint Alignment of Images	Jul 16, 2024	16k	Code Available	1	5
The Expressive Leaky Memory Neuron: an Efficient and Expressive Phenomenological Neuron Model Can Solve Long-Horizon Tasks	Jun 14, 2023	16k, Classification	Code Available	1	5
LLaSA: A Multimodal LLM for Human Activity Analysis Through Wearable and Smartphone Sensors	Jun 20, 2024	16k, Instruction Following	Code Available	1	5
DeepDarts: Modeling Keypoints as Objects for Automatic Scorekeeping in Darts using a Single Camera	May 20, 2021	16k, Data Augmentation	Code Available	1	5
Neural Fourier Modelling: A Highly Compact Approach to Time-Series Analysis	Oct 7, 2024	16k, Anomaly Detection	Code Available	1	5
MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention	May 24, 2025	16k, 4k	Code Available	1	5
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention	Mar 2, 2024	16k, CPU	Code Available	1	5
Home Electricity Data Generator (HEDGE): An open-access tool for the generation of electric vehicle, residential demand, and PV generation profiles	Oct 2, 2023	16k	Code Available	1	5
Classifying the classifier: dissecting the weight space of neural networks	Feb 13, 2020	16k	Code Available	1	5
Scaling Laws of RoPE-based Extrapolation	Oct 8, 2023	16k	Code Available	1	5
CIRCLe: Color Invariant Representation Learning for Unbiased Classification of Skin Lesions	Aug 29, 2022	16k, Fairness	Code Available	1	5
Detecting and Preventing Hallucinations in Large Vision Language Models	Aug 11, 2023	16k, Hallucination	Code Available	1	5
Long Range Arena: A Benchmark for Efficient Transformers	Nov 8, 2020	16k, Benchmarking	Code Available	1	5
MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale	Nov 30, 2021	16k, Image Classification	Code Available	1	5
An In-Depth Exploration of Person Re-Identification and Gait Recognition in Cloth-Changing Conditions	Jan 1, 2023	16k, Gait Recognition	Code Available	1	5
Hierarchical Nearest Neighbor Graph Embedding for Efficient Dimensionality Reduction	Mar 24, 2022	16k, Data Augmentation	Code Available	1	5
Hydragen: High-Throughput LLM Inference with Shared Prefixes	Feb 7, 2024	16k, Chatbot	Code Available	1	5
Factored Verification: Detecting and Reducing Hallucination in Summaries of Academic Papers	Oct 16, 2023	16k, Hallucination	Code Available	1	5
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs	Feb 4, 2025	16k, Descriptive	Code Available	1	5
MorphoCluster: Efficient Annotation of Plankton images by Clustering	May 4, 2020	16k, Clustering	Code Available	1	5
Denial-of-Service Poisoning Attacks against Large Language Models	Oct 14, 2024	16k, Speech-to-Text	Code Available	1	5
BNLP: Natural language processing toolkit for Bengali language	Jan 31, 2021	16k, NER	Code Available	1	5
Fighting the COVID-19 Infodemic: Modeling the Perspective of Journalists, Fact-Checkers, Social Media Platforms, Policy Makers, and the Society	Apr 30, 2020	16k	Code Available	1	5
Analyzing the Effectiveness of Large Language Models on Text-to-SQL Synthesis	Jan 22, 2024	16k, Program Synthesis	Code Available	1	5


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Suprime2	1'"	1	—	Unverified