An implementation of: Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
You need GPT4-Azure or Gemini Pro to use it. Local LLM support is still being worked on.
Neat! I’ve known that Regional Prompter is powerful, but it’s too much of a pain for me to bother using. Hopefully this makes it easier.
New Lemmy Post: zydxt/sd-webui-rpg-diffusionmaster: RPG-DiffusionMaster Extension for A1111 (https://lemmy.dbzer0.com/post/13707306)
Tagging: #StableDiffusion
(Replying in the OP of this thread (NOT THIS BOT!) will appear as a comment in the Lemmy discussion.)
I am a FOSS bot. Check my README: https://github.com/db0/lemmy-tagginator/blob/main/README.md