Connecting Multi-modal Contrastive Representations

Last modified 2023-12-17 01:43:26

Tags

#Deep Learning

First published 2023-12-16 23:23:05

Table of Contents

1.Motivation:
2.Challenges:
3.Contribution:
4.Network
5.Experiment
6.Reflections

1.Motivation:

Current Multi-modal Contrastive Representation (MCR) learning relies on massive amounts of high-quality paired data, which limits its extension to more modalities.

2.Challenges:

Embeddings in MCR spaces cannot comprehensively reflect all the semantic information of the input.
MCR spaces exhibit a modality gap: the embeddings of different modalities lie in two completely separate regions of each MCR space.
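
To make the modality gap concrete, here is a minimal sketch of one common way to quantify it: the Euclidean distance between the centroids of the L2-normalized embeddings of each modality. This is not the paper's code. The helper names (`modality_gap`, `toy_embeddings`) and the 4-dimensional synthetic "image" and "text" embeddings are assumptions made purely for illustration; real MCR embeddings would come from a pretrained encoder pair.

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit length, as contrastive models do before comparing embeddings."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def centroid(vectors):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def modality_gap(emb_a, emb_b):
    """Distance between the centroids of two sets of unit-normalized embeddings."""
    center_a = centroid([l2_normalize(v) for v in emb_a])
    center_b = centroid([l2_normalize(v) for v in emb_b])
    return math.dist(center_a, center_b)


def toy_embeddings(center, n, noise, rng):
    """Hypothetical stand-in for encoder outputs: Gaussian noise around a fixed center."""
    return [[c + rng.gauss(0.0, noise) for c in center] for _ in range(n)]


rng = random.Random(0)
# Two "modalities" clustered in different regions of the same space (synthetic data).
image_emb = toy_embeddings([1.0, 0.0, 0.0, 0.0], 200, 0.1, rng)
text_emb = toy_embeddings([0.0, 1.0, 0.0, 0.0], 200, 0.1, rng)
# A second sample from the image region, used as a same-modality baseline.
same_emb = toy_embeddings([1.0, 0.0, 0.0, 0.0], 200, 0.1, rng)

gap_cross = modality_gap(image_emb, text_emb)
gap_same = modality_gap(image_emb, same_emb)
print(f"cross-modality gap: {gap_cross:.3f}")
print(f"same-modality gap:  {gap_same:.3f}")
```

In this toy setup the cross-modality distance is much larger than the same-modality baseline, which is the kind of separation the modality gap refers to.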