Unsupervised Multi-object Segmentation Using Attention and Soft-argmax
2022-05-26
Bruno Sauvalle, Arnaud de La Fortelle
Code
- github.com/BrunoSauvalle/AST (official, PyTorch)
Abstract
We introduce a new architecture for unsupervised object-centric representation learning and multi-object detection and segmentation, which uses a translation-equivariant attention mechanism to predict the coordinates of the objects present in the scene and to associate a feature vector with each object. A transformer encoder handles occlusions and redundant detections, and a convolutional autoencoder is in charge of background reconstruction. We show that this architecture significantly outperforms the state of the art on complex synthetic benchmarks.
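To illustrate the soft-argmax named in the title, here is a minimal NumPy sketch of how a 2D score map can be turned into a differentiable coordinate estimate. It is not taken from the authors' repository: the function name `soft_argmax_2d`, the `temperature` parameter, and the single-map setup are assumptions made for illustration only.

```python
import numpy as np


def soft_argmax_2d(scores, temperature=1.0):
    """Expected (x, y) position under a softmax over a 2D score map.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    h, w = scores.shape
    flat = scores.reshape(-1) / temperature
    flat = flat - flat.max()          # subtract the max for numerical stability
    probs = np.exp(flat)
    probs /= probs.sum()              # softmax over all pixels
    probs = probs.reshape(h, w)

    ys, xs = np.mgrid[0:h, 0:w]       # pixel row and column indices
    x = float((probs * xs).sum())     # expected column
    y = float((probs * ys).sum())     # expected row
    return x, y


# A sharply peaked map: the estimate sits on the peak.
scores = np.zeros((8, 8))
scores[2, 5] = 50.0
x, y = soft_argmax_2d(scores)

# Shifting the map by one pixel shifts the estimate by one pixel,
# which is the translation-equivariance property in this toy setting.
x_shifted, y_shifted = soft_argmax_2d(np.roll(scores, shift=(1, 1), axis=(0, 1)))
```

Because the output is a weighted average rather than a hard `argmax`, gradients can flow back to the score map, which is what makes this kind of coordinate prediction trainable end to end.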
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| ObjectsRoom | AST | ARI-FG | 0.87 | — | Unverified |
| ShapeStacks | AST | ARI-FG | 0.82 | — | Unverified |
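ARI-FG in the table is commonly understood as the adjusted Rand index computed on foreground pixels only. The sketch below shows one way to compute it with NumPy, assuming integer instance masks in which the ground-truth background carries a known label. The function name `ari_fg` and the `background_id` parameter are illustrative assumptions, and the exact evaluation protocol behind the claimed scores may differ.

```python
import numpy as np


def ari_fg(true_ids, pred_ids, background_id=0):
    """Adjusted Rand index restricted to ground-truth foreground pixels.

    Hypothetical helper; assumes `background_id` marks background in `true_ids`.
    """
    true_ids = np.asarray(true_ids).ravel()
    pred_ids = np.asarray(pred_ids).ravel()

    fg = true_ids != background_id            # keep foreground pixels only
    t, p = true_ids[fg], pred_ids[fg]
    n = t.size
    if n < 2:
        raise ValueError("need at least two foreground pixels")

    # Contingency table between true and predicted segments.
    _, t_idx = np.unique(t, return_inverse=True)
    _, p_idx = np.unique(p, return_inverse=True)
    table = np.zeros((t_idx.max() + 1, p_idx.max() + 1), dtype=np.int64)
    np.add.at(table, (t_idx, p_idx), 1)

    def pairs(x):
        return x * (x - 1) / 2.0              # number of unordered pairs

    index = pairs(table).sum()
    sum_rows = pairs(table.sum(axis=1)).sum()
    sum_cols = pairs(table.sum(axis=0)).sum()
    expected = sum_rows * sum_cols / pairs(n)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:                 # degenerate case: a single segment on both sides
        return 1.0
    return float((index - expected) / (max_index - expected))


truth = [[0, 1, 1], [0, 2, 2]]
perfect = [[5, 7, 7], [5, 9, 9]]              # same grouping, different label values
merged = [[5, 7, 7], [5, 7, 7]]               # both objects merged into one segment
score_perfect = ari_fg(truth, perfect)        # 1.0
score_merged = ari_fg(truth, merged)          # 0.0
```

The score depends only on how pixels are grouped, not on the label values, so a prediction that recovers every object scores 1.0 regardless of how its segments are numbered.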