Multimodal Models

Covers CLIP, DALL-E, GPT-4V, and other models that work with both text and images.

Part of Generative AI & Foundation Models on neo-ai.
