A big week: OpenAI released the biggest CLIP model from the original paper. There's also good news in the StyleGAN world, as well as a new Colab that stitches together two other Colabs: Latent Diffusion + Disco Diffusion v5.2
* StyleGAN Human*
* StyleGAN XL + CLIP - Katherine's modification
* Centipede Diffusion Colab Notebook
* MindsEye now can pilot Latent Diffusion
* code released
OpenAI released the ViT-L/14@336px CLIP model. This is the biggest model mentioned in the original CLIP paper, released 1 year and 4 months ago. It is super VRAM heavy (it doesn't run on the free Colab tier nor on most Colab Pro GPUs). The community is still tuning settings to work well with the model. Despite being the biggest in the paper, it is still smaller than the unreleased CLIP ViT-H/16 used to train DALL·E 2.
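The new checkpoint is already exposed through OpenAI's clip package under the name "ViT-L/14@336px". A minimal sketch of loading it and scoring an image against a prompt (the image filename here is just a placeholder):

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"

# "ViT-L/14@336px" is the newly released, largest checkpoint in the clip package.
model, preprocess = clip.load("ViT-L/14@336px", device=device)

# Standard CLIP usage: embed an image and a prompt, compare them.
image = preprocess(Image.open("mecha_favela.png")).unsqueeze(0).to(device)  # placeholder file
text = clip.tokenize(["a mecha robot in a favela"]).to(device)
with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    similarity = torch.cosine_similarity(image_features, text_features)
```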
Without further optimization of settings specifically for this CLIP ViT-L/14@336px, it gives interesting results when hooked to Diffusion
— multimodal ai art (@multimodalart) April 22, 2022
"a mecha robot in a favela"
CLIP ViT-L/14@336px / CLIP ViT-B/16 + 32 pic.twitter.com/gxKe5NX98H
This probably shows that this huge model (25GB of VRAM, no free Colab GPU can fit it) requires careful parameter tuning and settings to make it shine. @RiversHaveWings is exploring combining it with Deep Image Prior, and @devdef added it to their Disco Diffusion fork https://t.co/MTvY8yApmt
— multimodal ai art (@multimodalart) April 22, 2022
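For context on the "ViT-L/14@336px / ViT-B/16 + 32" comparison above: guided-diffusion notebooks like Disco Diffusion typically ensemble several CLIP models, embedding the current image and the prompt with each model and summing a spherical-distance loss. A hedged sketch of that pattern (the exact loss and resizing details vary per notebook; the spherical distance here follows the formulation commonly used in these Colabs):

```python
import torch
import torch.nn.functional as F
import clip

device = "cuda"
# The three models compared in the tweet; the 336px model dominates VRAM use.
model_names = ["ViT-L/14@336px", "ViT-B/16", "ViT-B/32"]
perceptors = [clip.load(name, device=device)[0].eval() for name in model_names]

def spherical_dist_loss(x, y):
    # Squared geodesic distance on the unit sphere, a common CLIP-guidance loss.
    x, y = F.normalize(x, dim=-1), F.normalize(y, dim=-1)
    return (x - y).norm(dim=-1).div(2).arcsin().pow(2).mul(4)

def multi_clip_loss(image_batch, prompt):
    # image_batch: float tensor in [0, 1], shape (N, 3, H, W).
    # CLIP's mean/std input normalization is omitted here for brevity.
    tokens = clip.tokenize([prompt]).to(device)
    loss = 0.0
    for perceptor in perceptors:
        size = perceptor.visual.input_resolution  # 336 or 224
        x = F.interpolate(image_batch, size=(size, size), mode="bilinear",
                          align_corners=False).to(perceptor.dtype)
        img_emb = perceptor.encode_image(x)
        txt_emb = perceptor.encode_text(tokens)
        loss = loss + spherical_dist_loss(img_emb, txt_emb).mean()
    return loss
```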
StyleGAN-Human: A Data-Centric Odyssey of Human Generation
— AK (@ak92501) April 22, 2022
project page: https://t.co/mNy0zaN3L8
github: https://t.co/LCelCAIpXo pic.twitter.com/G3DtIh3gZs
"the gateway between dreams, trending on ArtStation" StyleGAN XL + CLIP pic.twitter.com/Aetc8upA47
— Rivers Have Wings (@RiversHaveWings) April 20, 2022
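Katherine's notebook adds its own tricks on top of StyleGAN-XL, but the core recipe is the usual "GAN + CLIP" loop: optimize the generator's latent so the rendered image's CLIP embedding matches the prompt. A hedged sketch of that loop, where the generator loader is a hypothetical stand-in and class conditioning is simplified away:

```python
import torch
import torch.nn.functional as F
import clip

device = "cuda"
G = load_pretrained_stylegan_xl().to(device)  # hypothetical loader, not the notebook's code
perceptor, _ = clip.load("ViT-B/16", device=device)

prompt = "the gateway between dreams, trending on ArtStation"
with torch.no_grad():
    target = F.normalize(
        perceptor.encode_text(clip.tokenize([prompt]).to(device)), dim=-1)

z = torch.randn(1, G.z_dim, device=device, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.05)

for step in range(300):
    img = (G(z, None) + 1) / 2  # StyleGAN outputs roughly [-1, 1]; conditioning simplified
    x = F.interpolate(img, size=(224, 224), mode="bilinear",
                      align_corners=False).to(perceptor.dtype)
    emb = F.normalize(perceptor.encode_image(x), dim=-1)
    # Squared distance between unit vectors == 2 - 2 * cosine similarity.
    loss = (emb - target).norm(dim=-1).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```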
Whoa, this is promising! Centipede Diffusion combines our Latent Diffusion Colab and Disco Diffusion by @gandamu_ml, @Somnai_dreams and @zippy731 in a clever way (basically Disco is used as an upscaler for Latent!) https://t.co/K8BNHezh3d
— multimodal ai art (@multimodalart) April 20, 2022
Will try it and report back soon! https://t.co/UmaMnuaEsx pic.twitter.com/teaYiqGD9v
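To make the "Disco as an upscaler for Latent" idea concrete, here is conceptual pseudocode only; both helper functions are hypothetical stand-ins for the two notebooks, not Centipede Diffusion's actual code:

```python
# Conceptual sketch: generate a base image with Latent Diffusion, then run a
# Disco-style CLIP-guided diffusion pass on top of it as a refiner/upscaler.
def centipede(prompt):
    base = latent_diffusion_sample(prompt)  # hypothetical: fast, lower-resolution base image
    # Starting Disco from the base image while skipping the early (noisiest)
    # timesteps makes it refine and upscale rather than start from pure noise.
    return disco_diffusion(prompt, init_image=base, skip_fraction=0.5)  # hypothetical helper
```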
MindsEye, our GUI for text-to-image, now has a new model to pilot: Latent Diffusion. Now, from the same UI and without touching a line of code (not even to tweak parameters), you can control VQGAN+CLIP, CLIP-Guided Diffusion and now Latent Diffusion. Check MindsEye out!
I've just added Latent Diffusion to MindsEye beta! 🧠👁️🎨
— multimodal ai art (@multimodalart) April 20, 2022
It's the GLID-3 XL implementation by Jack000, which goes beyond CompVis' on a few fronts: it allows for init images and negative prompts, has a filtered model, CLIP Guidance and more!
Check it out @ https://t.co/7sxIw0pKWC pic.twitter.com/kY61k0zee4
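Negative prompts are worth a note: in classifier-free guidance terms, they typically work by swapping the negative prompt's embedding into the branch normally reserved for the empty prompt. A generic sketch of that mechanism (the model signature here is an assumption for illustration, not Jack000's actual API):

```python
import torch

def guided_eps(model, x_t, t, pos_ctx, neg_ctx, guidance_scale=5.0):
    # Two denoising predictions: one conditioned on the positive prompt,
    # one on the negative prompt (replacing the usual unconditional branch).
    eps_pos = model(x_t, t, context=pos_ctx)
    eps_neg = model(x_t, t, context=neg_ctx)
    # Push the denoising direction away from the negative prompt,
    # toward the positive one.
    return eps_neg + guidance_scale * (eps_pos - eps_neg)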
AIArt is a free and open-source AI art course by John Whitaker. There are synchronous classes on Twitch for the next few Saturdays at 4 PM UTC. All previous classes stay recorded and available as Google Colabs at the GitHub link. This Saturday there are no classes, so it's a good time to start and catch up with the content!