SOTAVerified

Representation Learning

Representation Learning is a process in machine learning where algorithms extract meaningful patterns from raw data to create representations that are easier to understand and process. These representations can be designed for interpretability, reveal hidden features, or be used for transfer learning. They are valuable across many fundamental machine learning tasks like image classification and retrieval.

Deep neural networks can be considered representation learning models: they encode the input and project it into a different subspace, and the resulting representation is then usually passed to a simple model, such as a linear classifier, to solve the task at hand.
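
As a rough illustration of this pipeline, the sketch below freezes a pretrained encoder and trains a linear classifier on its representations (the standard "linear probe" setup). It assumes PyTorch and torchvision are available; the ResNet-18 backbone, feature dimension, class count, and hyperparameters are illustrative choices, not a prescribed recipe.

```python
# Minimal linear-probe sketch: a frozen pretrained encoder supplies
# representations, and only a linear classifier on top is trained.
import torch
import torch.nn as nn
from torchvision import models

# Pretrained backbone used purely as a representation extractor.
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Identity()          # drop the original classification head
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False          # freeze the learned representation

num_classes = 10                     # placeholder for the downstream task
probe = nn.Linear(512, num_classes)  # ResNet-18 features are 512-dimensional
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    with torch.no_grad():
        features = backbone(images)  # (batch, 512) representations
    logits = probe(features)
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```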

Representation learning can be divided into:

  • Supervised representation learning: learning representations on task A using annotated data, then reusing them to solve task B.
  • Unsupervised representation learning: learning representations on a task using unlabelled data. These representations are then used to address downstream tasks, reducing the need for annotated data when learning new tasks. Powerful models like GPT and BERT leverage unsupervised representation learning to tackle language tasks (a minimal sketch follows this list).
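
As a hedged sketch of reusing unsupervised representations for a downstream task: sentence embeddings from a pretrained BERT feed a simple logistic-regression classifier. It assumes the Hugging Face transformers library and scikit-learn; the bert-base-uncased checkpoint, mean pooling, and the toy sentiment data are illustrative assumptions.

```python
# Reuse pretrained (unsupervised) text representations for a downstream task.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()

def embed(sentences):
    """Mean-pool the last hidden states into one vector per sentence."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state      # (batch, seq, 768)
    mask = batch["attention_mask"].unsqueeze(-1)         # ignore padding tokens
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Tiny illustrative downstream task: the labelled examples are placeholders.
texts = ["great movie", "terrible plot", "loved it", "boring and slow"]
labels = [1, 0, 1, 0]
clf = LogisticRegression(max_iter=1000).fit(embed(texts), labels)
print(clf.predict(embed(["what a fantastic film"])))
```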

More recently, self-supervised learning (SSL) has become one of the main drivers of unsupervised representation learning in fields like computer vision and NLP.
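
One popular SSL recipe is contrastive learning. Below is a minimal sketch of a simplified SimCLR-style NT-Xent loss (assuming PyTorch; the batch size, projection dimension, and temperature are illustrative): projections of two augmented views of the same input are pulled together, while the other samples in the batch act as negatives.

```python
# Simplified SimCLR-style NT-Xent contrastive loss.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """Contrastive loss over two batches of projections z1, z2 of shape (N, D)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                 # (2N, D) stacked views
    sim = z @ z.t() / temperature                  # cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool)
    sim.masked_fill_(mask, float("-inf"))          # exclude self-similarity
    # The positive for sample i is its other view, located n positions away.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Usage: z1 and z2 would be projections of two augmentations of the same batch.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
print(nt_xent(z1, z2).item())
```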


(Image credit: Visualizing and Understanding Convolutional Networks)

Papers


Title | Status | Hype
VMamba: Visual State Space Model | Code | 7
Full Scaling Automation for Sustainable Development of Green Data Centers | Code | 7
ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech | Code | 6
Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments | Code | 5
Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese | Code | 5
Point Transformer V3: Simpler Faster Stronger | Code | 5
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs | Code | 5
A Time Series is Worth 64 Words: Long-term Forecasting with Transformers | Code | 5
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | Code | 5
Masked Completion via Structured Diffusion with White-Box Transformers | Code | 5
Self-Supervised Pre-Training for Table Structure Recognition Transformer | Code | 4
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything | Code | 4
Resources for Brewing BEIR: Reproducible Reference Models and an Official Leaderboard | Code | 4
ControlVAE: Tuning, Analytical Properties, and Performance Analysis | Code | 4
Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology | Code | 4
Lightweight Pixel Difference Networks for Efficient Visual Representation Learning | Code | 4
A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions | Code | 4
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining | Code | 4
2D Matryoshka Sentence Embeddings | Code | 4
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Code | 4
Sundial: A Family of Highly Capable Time Series Foundation Models | Code | 4
SVFR: A Unified Framework for Generalized Video Face Restoration | Code | 4
LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation | Code | 4
Multi-label Cluster Discrimination for Visual Representation Learning | Code | 4
ROLAND: Graph Learning Framework for Dynamic Graphs | Code | 3
Robust and Efficient Medical Imaging with Self-Supervision | Code | 3
Common Sense Reasoning for Deepfake Detection | Code | 3
Probabilistic Forecasting with Temporal Convolutional Neural Network | Code | 3
Addressing Representation Collapse in Vector Quantized Models with One Linear Layer | Code | 3
ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders | Code | 3
OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain | Code | 3
Point Transformer V3: Simpler, Faster, Stronger | Code | 3
Questioning Representational Optimism in Deep Learning: The Fractured Entangled Representation Hypothesis | Code | 3
SGFormer: Single-Layer Graph Transformers with Approximation-Free Linear Complexity | Code | 3
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss | Code | 3
Multi-Modality Representation Learning for Antibody-Antigen Interactions Prediction | Code | 3
Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Language Models | Code | 3
HEST-1k: A Dataset for Spatial Transcriptomics and Histology Image Analysis | Code | 3
Momentum Contrast for Unsupervised Visual Representation Learning | Code | 3
MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization | Code | 3
GaussianOcc: Fully Self-supervised and Efficient 3D Occupancy Estimation with Gaussian Splatting | Code | 3
A Survey on Self-Supervised Learning for Non-Sequential Tabular Data | Code | 3
Foundation Models for Music: A Survey | Code | 3
GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding | Code | 3
ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding | Code | 3
Evaluating representation learning on the protein structure universe | Code | 3
EEGPT: Pretrained Transformer for Universal and Reliable Representation of EEG Signals | Code | 3
Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis | Code | 3
Elucidating the Design Space of Multimodal Protein Language Models | Code | 3
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs | Code | 3

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SciNCL | Avg. | 81.8 | | Unverified
2 | SPECTER | Avg. | 80 | | Unverified
3 | Citeomatic | Avg. | 76 | | Unverified
4 | Sci-DeCLUTR | Avg. | 66.6 | | Unverified
5 | SciBERT | Avg. | 59.6 | | Unverified
6 | BioBERT | Avg. | 58.8 | | Unverified
7 | CiteBERT | Avg. | 58.8 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | top_model_weights_with_3d_2 | 1:1 Accuracy | 0.75 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Resnet 18 | Accuracy (%) | 97.05 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Morphological Network | Accuracy | 97.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Max Margin Contrastive | Silhouette Score | 0.56 | | Unverified