This week in multimodal AI art (23/Apr - 29/Apr)

Follow on Twitter, come hang on Discord, consider supporting on Patreon. Try MindsEye, our image generation GUI

Text-to-Image synthesizers:

- CLIP-GEN code released (GitHub, Paper)

by HFAiLab
HFAiLab released the code for CLIP-GEN, a text-to-image synthesizer that can train on images only (no image-text pairs needed) by relying on CLIP's existing ability to relate images and text. The model uses a VQGAN to generate the images. I will get it to run locally and report back on my Twitter.


- DALL-E Mega early training checkpoint (Hugging Face Spaces)

by Boris Dayma
We reported two weeks ago that DALL-E Mega (the DALL-E replication) had started training. Boris Dayma has now hooked up a checkpoint from 15% of the way through training to Hugging Face Spaces, so you can get a glimpse of it (and it already looks amazing so early in training!)

- StyleGAN XL 1024px models released (GitHub, Colab requires adaptation)

by Autonomous Vision
The Autonomous Vision StyleGAN team released two big (1024px x 1024px) pre-trained models for StyleGAN XL. That is a rare feat, as models for multimodal AI art usually top out at 512px. Instructions on how to adapt the Colab to use them are available in a Twitter thread.

- StyleGAN-Human + CLIP (Colab)

by Diego Porres
We reported last week on StyleGAN-Human. Diego Porres has now added CLIP guidance to it.
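For readers new to the term, here is a minimal sketch of the idea behind CLIP guidance: nudge a generator's latent so that the embedding of the generated image moves closer to the embedding of the text prompt. Everything in it is an illustrative assumption, not Diego Porres's notebook or the CLIP API. The vectors are made up, the "image encoder" is simply the identity (the latent stands in for the image embedding), and the function names are hypothetical. In a real setup the gradient would be backpropagated through CLIP's image encoder and the StyleGAN generator.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_gradient(z, target):
    """Analytic gradient of cosine_similarity(z, target) with respect to z."""
    norm_z = math.sqrt(sum(x * x for x in z))
    norm_t = math.sqrt(sum(y * y for y in target))
    sim = cosine_similarity(z, target)
    return [
        t / (norm_z * norm_t) - sim * x / (norm_z ** 2)
        for x, t in zip(z, target)
    ]


def guide_latent(latent, text_embedding, steps=200, lr=0.5):
    """Gradient ascent on similarity; returns the final latent and the score history.

    Assumption: the latent doubles as the image embedding (identity encoder).
    """
    z = list(latent)
    history = [cosine_similarity(z, text_embedding)]
    for _ in range(steps):
        grad = cosine_gradient(z, text_embedding)
        z = [x + lr * g for x, g in zip(z, grad)]
        history.append(cosine_similarity(z, text_embedding))
    return z, history


# Made-up stand-ins for a CLIP text embedding and a generator latent.
text_embedding = [0.2, -0.5, 0.8, 0.1]
start_latent = [1.0, 1.0, -0.5, 0.3]

guided_latent, history = guide_latent(start_latent, text_embedding)
print(f"similarity before: {history[0]:.3f}, after: {history[-1]:.3f}")
```

The loop is the whole trick: score the current output against the prompt, take a small step in the direction that raises the score, and repeat.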

New CLIP and CLIP-like models:

- OpenCLIP ViT-B/16 (GitHub)

by mlfoundations
Four weeks ago we reported on the OpenCLIP ViT-B/32 trained on LAION-400M. The team has now released a second model with the larger and higher-quality ViT-B/16 vision transformer, and its performance still rivals OpenAI's CLIP. I am not aware of any notebook that uses it yet, but I will soon add all OpenCLIP models as options on MindsEye.

- Flamingo Visual Language Model (Paper and Blog post only)

by DeepMind
DeepMind published a paper and a blog post about Flamingo, a visual language model that needs only a few examples to learn new things it hasn't seen before. You can also 'correct' it if it gets things wrong. The code wasn't released, but it offers a glimpse of "teaching" AI art models new concepts with just a few examples, which is really exciting.

Learning AI Art:

- AIAIArt course (GitHub, Discord)

AIAIArt is a free and open source AI art course by John Whitaker. There are synchronous classes on Twitch on the next few Saturdays at 4 PM UTC. All previous classes remain available as recordings and Google Colabs via the GitHub link. This Saturday (April 30th) they will have the Diffusion class!