Reminder for Krita plugin users and VRAMlets in general about new VRAM optimization:
>github.com/Doggettx/stable-diffusion/tree/autocast-improvements
>If you want to use it in another fork, just grab the following 2 files and overwrite them in the fork. Make a backup of them first in case something goes wrong
>ldm\modules\attention.py
>ldm\modules\diffusionmodules\model.py
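If you'd rather script the swap than copy by hand, a minimal sketch in Python (both checkout paths are assumptions, adjust to wherever your repos live):

import shutil
from pathlib import Path

# paths here are assumptions: DOGGETTX is the Doggettx checkout,
# FORK is the fork you want to patch
DOGGETTX = Path("stable-diffusion-doggettx")
FORK = Path("stable-diffusion-fork")

for rel in ["ldm/modules/attention.py", "ldm/modules/diffusionmodules/model.py"]:
    target = FORK / rel
    shutil.copy(target, str(target) + ".bak")  # back up the fork's copy first
    shutil.copy(DOGGETTX / rel, target)        # overwrite with Doggettx's version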
Vanilla stable diffusion:
one 512x512 image: 9.57it/s
batch of four 512x512 images: 2.99it/s
Doggettx fork:
one 512x512 image: 7.54it/s
batch of four 512x512 images: 2.18it/s
Charles Carter
Thanks. It's interesting to see hands and artists show up in the outputs, working on what I'm generating. My prompt doesn't include that explicitly, but it's in every fourth image or so. Sometimes brushes, inkwells.
Why are my Pepes getting censored? This is very consistent, not a one-off thing. I also get many censored results when doing portraits of a bald guy. Very annoying.
Second one is: A Roman temple in the middle of a futuristic city, bustling with people, at night, electric sparks, dark, neon lighting, split complementary colors, retro futurism, circuitry, machines, hyperrealistic, extremely detailed art, masterpiece, concept art, Greg Rutkowski
100% agree. Or at least, just limit yourself. Another user compared it to watching almost identical cars pass by on the highway and that really nails it.
Zachary Martin
I don't have the exact prompt saved, but it was pretty much:
nude naked breasts cleavage Daisy Ridley as Rey Skywalker, desert, movie still, film grain, desert, Star Wars, The Force Awakens, 4k
Then I fixed the faces in GFPGAN and upscaled and fixed them again in CodeFormer.
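For anyone curious, the GFPGAN side of that pass looks roughly like this in Python (a sketch; the model path, upscale factor, and filenames are assumptions, and CodeFormer would be a separate pass):

import cv2
from gfpgan import GFPGANer

# rough sketch of a GFPGAN face-restoration pass; model path, upscale
# factor, and filenames are all assumptions
restorer = GFPGANer(model_path="GFPGANv1.3.pth", upscale=2)
img = cv2.imread("render.png")
# enhance returns (cropped_faces, restored_faces, restored_img)
_, _, restored = restorer.enhance(img, paste_back=True)
cv2.imwrite("render_restored.png", restored)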
Well, only trust what you can test yourself, especially on the chans :)
Joseph Ward
It's been a fun one.
>A scene of infinite ink being poured over itself, fantasy, intricate, elegant, kinetic, by Tooth Wu, cinematic, hyperrealism, highly detailed, colorful
Make sure you play with all of the samplers and mess with the CFG, too. There's really no wrong approach here, none that I've found.
I didn't notice a difference in speed with a 1650, but it requires much more VRAM for me than the old opt-split-attention: 512x512 went from 1.7 GB to more than 2 GB, and 512x768 used to fit in 2.1 GB but now OOMs (I have 4 GB). Same results both with Voldy's commit and with Doggettx's latest version in the separate branch; I had to roll back Voldy's commit to keep using the old optimization.
If it saves enough memory for you to run bigger batches, it may be faster per image overall (e.g. from the numbers above, if each it covers the whole batch, 2.99 it/s × 4 images ≈ 12 image-steps/s vs 9.57 for a single image). Still need to test. For those with good cards it's probably not worth it.
it's ok, every thread we've had some fag spamming stuff. at least big titty anime girls are not that bad
Jack Nelson
You made this?
Jordan Sanders
>French maid, professional photo shoot
The subject fills the frame by default, meaning heads get cut off due to cropping in the training data. I was hoping it would be zoomed out further by default since that would reduce cropping issues.
Sounds right, but you have to encode the prompt first and repeat or slice it to the correct batch_size.
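Something like this, as a minimal sketch against the CompVis-style get_learned_conditioning API (the already-loaded model object and the placeholder prompt are assumptions):

# minimal sketch assuming the CompVis repo's API and an already-loaded `model`;
# the prompt and batch_size are placeholders
batch_size = 4
c = model.get_learned_conditioning(["your prompt here"])  # encode once -> [1, 77, 768]
c = c.repeat(batch_size, 1, 1)  # repeat the single conditioning to batch_size
# or, if you already encoded a larger batch, slice it down instead:
# c = c[:batch_size]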
David Allen
not at lower sample counts that others do fine with
Wyatt Wood
Hilariously, with --opt-split-attention, I'm getting almost the same speed as with no opt.
one 512x512 image: 9.25it/s
batch of four 512x512 images: 2.94it/s
Dominic Richardson
>I can't imagine any of the training data was tagged with "poorly drawn face" or "bad anatomy," so I doubt those negative prompt terms are helping.
He thinks the trainers were not training the model on 4K FULL HD, 8K, 32K, 256K FULLER HD. anon....
anyone tried to get a middle finger? I've tried all the variations I can think of, but ended up resorting to editing and inpainting. it keeps trying to flip the hand over.
If it fits in memory without splitting, it won't split, so there will be no difference apart from maybe a minimal hiccup from a slightly different layout compared to the vanilla implementation.
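For the curious, the splitting it refers to is roughly this (a sketch of the idea, not the repo's actual code; chunk_size and the tensor shapes are assumptions):

import torch

# sketch of split attention: compute softmax(q @ k^T / sqrt(d)) @ v in
# slices along the query dimension, so the full [seq_q, seq_k] attention
# matrix never has to live in memory at once
def split_attention(q, k, v, chunk_size=1024):
    scale = q.shape[-1] ** -0.5
    out = torch.empty_like(q)  # assumes v has the same head dim as q
    for i in range(0, q.shape[1], chunk_size):
        attn = torch.softmax(q[:, i:i + chunk_size] @ k.transpose(-2, -1) * scale, dim=-1)
        out[:, i:i + chunk_size] = attn @ v
    return out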
Ethan Myers
Worse, you're not even getting paid. Just hitting the rat-brain pleasure button over and over.
I get a clear downgrade in performance when using his repo fully.
Obviously I ran without that flag.
Wyatt Sullivan
Is that upscaled in the first place, or was it rendered at that size? The street width and some of the elements, like the weird lamps in the middle of the street, look wrong in any case. Since the model was trained at 512x512, trying to render at a much larger size almost always screws up your composition, faggot. Here, for instance, is a simple prompt, "John Berkey Sci-Fi": at 512x512 the scene composition makes sense and the elements are clear and recognizable; it's always a spaceship or a space station, or a spaceship around a space station.
>spend 2 hours training an embedding on the textual inversion colab
>it's fucking shit
Anyone have tips for doing this right? How important is the initializer token? I was getting better results without the embedding.
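On the initializer token: in the usual textual inversion setup it's just the starting value for the new embedding, so picking a word close to your concept gives training a much better starting point. A rough sketch of that mechanic (the placeholder token, initializer word, and CLIP checkpoint are all assumptions):

from transformers import CLIPTokenizer, CLIPTextModel

# rough sketch of textual inversion setup; token names and the CLIP
# checkpoint are assumptions
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

tokenizer.add_tokens("<my-concept>")                      # new placeholder token
new_id = tokenizer.convert_tokens_to_ids("<my-concept>")
init_id = tokenizer.encode("painting", add_special_tokens=False)[0]  # initializer; must be a single token

text_encoder.resize_token_embeddings(len(tokenizer))
emb = text_encoder.get_input_embeddings().weight.data
emb[new_id] = emb[init_id].clone()  # new embedding starts as a copy; training then tunes only this row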