Computer Vision Toolbox Model for OpenAI CLIP Network

The Contrastive Language-Image Pre-Training (CLIP) network is a vision-language model that can be used for joint image-text classification.

MathWorks Computer Vision Toolbox Team

44 downloads

2026/1/26

The CLIP network uses contrastive learning to encode images and text into a shared feature space for joint classification. An image and a text description that are highly similar lie close together in this feature space and have a high CLIP score. This enables both image search from input text and text search from an input image.
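
The idea of scoring image-text pairs in a shared feature space can be illustrated with a minimal sketch. It is not the toolbox's API: the embedding vectors below are made-up toy values rather than real CLIP outputs, the names `cosine_similarity` and `rank_captions` are hypothetical, and cosine similarity is assumed here as a stand-in for the similarity score.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_captions(image_vec, caption_vecs):
    """Return (score, caption) pairs sorted from most to least similar."""
    scored = [
        (cosine_similarity(image_vec, vec), caption)
        for caption, vec in caption_vecs.items()
    ]
    return sorted(scored, reverse=True)


# Toy 3-dimensional embeddings (assumed values, for illustration only).
# A real encoder would produce much higher-dimensional vectors.
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "a photo of a dog": [0.8, 0.2, 0.1],
    "a photo of a car": [0.1, 0.9, 0.3],
}

ranking = rank_captions(image_embedding, text_embeddings)
best_score, best_caption = ranking[0]
print(best_caption)
```

Ranking captions for one image corresponds to text search from an input image; swapping the roles (one text vector scored against many image vectors) gives image search from input text.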

MATLAB Release Compatibility

Compatible with R2026a

Platform Compatibility

Windows
macOS (Apple silicon)
macOS (Intel)
Linux
