What an amazing week for multimodal AI art! Plenty of completely new models, new CLIP variants, and a giant dataset! I will let the sheer amount of news speak for itself.
RQ-VAE-Transformer*, Glid-3*, StyleGAN XL*, Make-a-Scene, Disco Diffusion v5.1 upgrade*, CLOOB Guided Diffusion*
ViT-B/16 CLOOB*, KELIP - Korean+English CLIP*, LAION CLIP ViT-B/32*, CLOOB-YFCC-CFG*
A new architecture that does not use diffusion. A spiritual follow-up to VQGAN. Super fast. Super VRAM hungry.
Got the newest text-to-image, RQ-VAE-Transformer (https://t.co/bQdkIM9eAg, released 25/03) running locally
— multimodal ai art (@multimodalart) March 26, 2022
Remarks:
- Big. 3.9B params model released
- Heavy. Needs >=32GB VRAM
- Fast. 256 samples in < 10 sec
"an illustration of an alien forest" pic.twitter.com/mRWg4RL9Fj
A combination of GLIDE, CLIP, and Latent Diffusion. The released checkpoint is a mid-training model that, despite being small, is already powerful for photorealistic generation.
Got the new text-to-image, GLID-3 (https://t.co/Uko74Us1Y6, released 29/03) running locally
— multimodal ai art (@multimodalart) March 29, 2022
Remarks:
- Mixed approach: a bit of GLIDE, Latent Diffusion and CLIP
- Current model: a mid-training 600M parameters
- Colab available (https://t.co/PFFW7hzv8k)
"a panda holding a beer" pic.twitter.com/v8jVpXzwID
Likely not immediately, but we are discussing it. We are thinking of ways to minimize the potential misuse of such models.
— Devi Parikh (@deviparikh) March 29, 2022
Disco Diffusion 5.1 is out https://t.co/uiZUj3Ivff
— Roope Rainisto (@rainisto) March 31, 2022
Of note: video_init_seed_continuity option has been added for video source mode. i.e. stop changing the seed during the video.
Here's an example - 50% skip steps. Night and day. I'm going to have fun with this one. pic.twitter.com/ZvtAVkqMDN
"jumping off a diving board into a swimming pool of gold coins, trending on artstation"
— John David Pressman (@jd_pressman) March 8, 2022
(CLOOB Guided CLIP Conditioned Diffusion [cc12m_1]) pic.twitter.com/SOvjzRg7Ln
Mind-blowing CLIP-guided 3D mesh deformation and stylization.
I am sharing a lightweight ClipMatrix demo https://t.co/Escy0xLNgp
— Nikolay Jetchev (@NJetchev) March 29, 2022
I hope that many will try it, modify the code, and create inspiring text-controlled 3D art. Have fun with the Colab, and let us make a great AI art community around this tool.
"Rusty Metal Robot"#3Dart #AIart pic.twitter.com/o0gqtPWUq0
Big models like this can be used to train: