Multimodal Text and Image Classification
Classification that uses both an image and its accompanying text as input.
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Early Fusion (BERT + InceptionV3) | Accuracy (%) | 92.5 | — | Unverified |
| 2 | Late Fusion (BERT + InceptionV3) | Accuracy (%) | 84.59 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Convolutional image feature extraction and dense concatenating | Accuracy (%) | 88 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Two Branch Network (Text - BERT + Image - NTS-Net) | Accuracy (%) | 96.81 | — | Unverified |
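
The first results table lists an early-fusion and a late-fusion model. The sketch below illustrates the general difference between the two strategies and is not taken from any of the listed papers. It assumes the text and image encoders (for example BERT and InceptionV3) have already produced fixed-length feature vectors. Here those vectors and all classifier weights are random placeholders, and every function name is hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, features):
    """Apply one linear layer; `weights` holds one row per class."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]


def early_fusion(text_feat, image_feat, weights, bias):
    """Concatenate both feature vectors, then run a single classifier."""
    fused = list(text_feat) + list(image_feat)
    return softmax(linear(weights, bias, fused))


def late_fusion(text_feat, image_feat, text_head, image_head, text_weight=0.5):
    """Classify each modality separately, then average the probabilities."""
    p_text = softmax(linear(*text_head, text_feat))
    p_image = softmax(linear(*image_head, image_feat))
    return [
        text_weight * pt + (1.0 - text_weight) * pi
        for pt, pi in zip(p_text, p_image)
    ]


def random_head(rng, n_classes, n_inputs):
    """Build placeholder (weights, bias) for an untrained classifier head."""
    weights = [
        [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        for _ in range(n_classes)
    ]
    return weights, [0.0] * n_classes


rng = random.Random(0)
n_classes = 3
# Placeholder embeddings; real encoders would produce much longer vectors.
text_feat = [rng.uniform(-1.0, 1.0) for _ in range(4)]
image_feat = [rng.uniform(-1.0, 1.0) for _ in range(6)]

early_probs = early_fusion(
    text_feat, image_feat, *random_head(rng, n_classes, 4 + 6)
)
late_probs = late_fusion(
    text_feat,
    image_feat,
    random_head(rng, n_classes, 4),
    random_head(rng, n_classes, 6),
)
```

In early fusion a single classifier sees both modalities at once and can learn interactions between them. In late fusion each modality is classified separately and only the predictions are combined. The averaging weight `text_weight` is an illustrative choice. The listed models may combine their branches differently.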