Multimodal Models


Generative Adversarial Networks (GANs, 2014)

GANs are a type of generative model that trains two neural networks against each other:

A generator network that generates fake data
A discriminator network that tries to distinguish between real and fake data

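For images, the generator network takes a random noise vector and produces an image. The discriminator network takes an image and estimates whether it is real (drawn from the training data) or fake (produced by the generator). The two are trained together: the discriminator learns to tell real from fake, while the generator learns to produce images that the discriminator accepts as real.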
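
The sketch below illustrates the two roles with a single forward pass. It is a minimal illustration and not a working GAN: both networks are single linear layers with a sigmoid, nothing is trained, and the "real" image is random stand-in data. The names `NOISE_DIM`, `IMAGE_PIXELS`, `G_WEIGHTS`, and `D_WEIGHTS` are assumptions made for this example, and the generator loss shown is the non-saturating variant, which is one common choice.

```python
import math
import random

random.seed(0)

NOISE_DIM = 4       # length of the random noise vector (assumed for illustration)
IMAGE_PIXELS = 16   # a flattened 4x4 grayscale "image" (assumed for illustration)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical single-layer weights; real GANs use deep networks.
G_WEIGHTS = [[random.gauss(0.0, 0.1) for _ in range(NOISE_DIM)]
             for _ in range(IMAGE_PIXELS)]
D_WEIGHTS = [random.gauss(0.0, 0.1) for _ in range(IMAGE_PIXELS)]


def generator(noise):
    """Map a noise vector to a fake image with pixel values in (0, 1)."""
    return [sigmoid(sum(w * z for w, z in zip(row, noise))) for row in G_WEIGHTS]


def discriminator(image):
    """Return the estimated probability that the image is real."""
    return sigmoid(sum(w * p for w, p in zip(D_WEIGHTS, image)))


def discriminator_loss(real_image, fake_image):
    """Low when real images score near 1 and fake images score near 0."""
    return -(math.log(discriminator(real_image))
             + math.log(1.0 - discriminator(fake_image)))


def generator_loss(fake_image):
    """Non-saturating loss: low when the discriminator is fooled."""
    return -math.log(discriminator(fake_image))


noise = [random.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]
fake_image = generator(noise)
real_image = [random.random() for _ in range(IMAGE_PIXELS)]  # stand-in for training data

d_loss = discriminator_loss(real_image, fake_image)
g_loss = generator_loss(fake_image)
print(f"D(fake) = {discriminator(fake_image):.3f}, "
      f"d_loss = {d_loss:.3f}, g_loss = {g_loss:.3f}")
```

In actual training, the two losses would be minimized in alternation by gradient descent, each update adjusting only the corresponding network's weights.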

Image: Goodfellow et al. 2020


Introduction to Multimodal AI

Text-to-Image Models

Generative Adversarial Networks (GANs, 2014)

CLIP (Contrastive Language-Image Pre-training, Radford et al., 2021)

Take a moment to think about this

Before we consider generative models, let's discuss what else we can do with CLIP

DALL-E (OpenAI, 2021)

Other Generation Models

Sora

Oasis

Beyond Generation

Vision Transformers

What do we use it for?

Creative Applications

Suno

MusicFX

Ethical Considerations

Is It a Tool or a Fake Artist?

Lab: AI Art Exhibition and Critique

Updates