This (longer) week in multimodal AI art (14/Jun - 24/Jun)

Follow on Twitter, come hang on Discord, consider supporting on Patreon. Try MindsEye, our image generation GUI

This issue covers more than a week because a few readers asked for the newsletter to go back to coming out at the end of the week instead of the beginning!

Text-to-Image synthesizers:

- Google Parti announced (Webpage, Paper)

by Google
Parti (Pathways Autoregressive Text-to-Image) is a text-to-image model developed at Google alongside Imagen. It doesn't use diffusion models - rather, it scales up the Transformer + VQGAN approach of DALL-E 1 and its open source replicas (dalle-pytorch, ruDALLE, DALL-E Mini). I got very excited when I saw they made a GitHub repo for it, thinking they might release the code in the future. After reading the paper, that is unfortunately not the case: they won't release it, citing the 'risks and limitations' discussed in the paper as the reason.

- CogView 2 released (GitHub, Spaces)

by Tsinghua University
We reported on the CogView2 paper two weeks ago - and now they have released the code! It is the biggest public model currently available, competitive with DALL-E 2 and Imagen despite being perceptually slightly inferior. You can try it right now on Hugging Face Spaces.

- Latent Majesty Diffusion updated to 1.6 (GitHub, Colab)

by multimodalart and Dango233
We released version 1.6 of Latent Majesty Diffusion, keeping it competitive in high-quality open source text-to-image generation. Try it out!

- PixelArt Diffusion updated to v.3 (GitHub, Colab)

by KaliYuga
A new version of the exciting pixel art diffusion model!

- CLIPandPaste released (GitHub, Colab)

Another way (after CLIP-CLOP) to make collages using CLIP! A very fun approach. (PS: this came out the previous week, but I forgot to include it in the newsletter!)

Text-to-3D updates:

- CLIP-NeRF baseline (GitHub)

by Ruixiang JIANG
A non-strict reproduction of the CLIP-NeRF paper, for stylizing 3D scenes with text!


New CLIP and CLIP-like models:

- RegionCLIP released (GitHub, Spaces)

by Microsoft
Microsoft released RegionCLIP - an extension of CLIP that matches individual regions of an image, rather than the whole image, to text, so it can classify what is in each region. I wonder how the community could also use it for image generation.

Learning AI Art:

- AIAIArt course (GitHub, Discord)

AIAIArt is a free and open source AI art course by John Whitaker. Live classes run on Twitch for the next few Saturdays at 4 PM UTC. All previous classes are recorded and remain available, with Google Colab notebooks, via the GitHub link.