Add support for APG (adaptive projected guidance) + unconditional SLG #593

stduhpf · 2025-02-12T02:09:16Z

Implements this paper: https://arxiv.org/abs/2410.02416

TLDR:

APG is a set of 3 modifications for CFG:

reverse momentum: the CFG delta is steered away from (or closer to) the previous step's CFG delta (--apg-momentum)
normalization: the L2 norm of the CFG delta is clamped to some "norm threshold" value (--apg-nt)
projection: the component of the CFG delta (out_cond - out_uncond) that points in the same "direction" as out_cond is removed by orthogonal projection. The final delta is linearly interpolated between the projected delta (eta = 0) and the original delta (eta = 1) with the parameter "eta" (--apg-eta)

Then the guidance update is computed like in normal CFG: output = out_cond + (cfg_scale-1)*delta

No extra forward pass is required, so the performance cost is negligible.
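The three steps above can be sketched in a few lines of plain Python. This is only an illustration of the description in this PR, not the actual C++ code: the function name `apg_step`, its signature, and the use of flat lists instead of tensors are all assumptions made for the example.

```python
import math

def apg_step(out_cond, out_uncond, cfg_scale, prev_avg=None,
             momentum=0.0, norm_threshold=0.0, eta=1.0):
    """Hypothetical helper: one APG guidance update on flat lists of floats."""
    # CFG delta, with the sign that matches output = out_cond + (cfg_scale-1)*delta
    delta = [c - u for c, u in zip(out_cond, out_uncond)]

    # 1. reverse momentum: mix in the running average from the previous step
    if prev_avg is not None:
        delta = [d + momentum * a for d, a in zip(delta, prev_avg)]
    running_avg = list(delta)  # carried over to the next step

    # 2. normalization: clamp the L2 norm (disabled when threshold <= 0)
    if norm_threshold > 0:
        norm = math.sqrt(sum(d * d for d in delta))
        if norm > norm_threshold:
            delta = [d * norm_threshold / norm for d in delta]

    # 3. projection: split delta into parts parallel/orthogonal to out_cond
    cond_sq = sum(c * c for c in out_cond)
    coef = sum(d * c for d, c in zip(delta, out_cond)) / cond_sq if cond_sq > 0 else 0.0
    parallel = [coef * c for c in out_cond]
    delta = [(d - p) + eta * p for d, p in zip(delta, parallel)]

    # guidance update, same as normal CFG
    output = [c + (cfg_scale - 1) * d for c, d in zip(out_cond, delta)]
    return output, running_avg

# With the neutral settings (momentum 0, threshold 0, eta 1) this is plain CFG.
print(apg_step([1.0, 0.0], [0.0, -1.0], cfg_scale=3.0)[0])           # [3.0, 2.0]
print(apg_step([1.0, 0.0], [0.0, -1.0], cfg_scale=3.0, eta=0.0)[0])  # [1.0, 2.0]
```

Everything here works on values that CFG already computes, which is why no extra forward pass is needed.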

Thanks mostly to the normalization, but also to the projection, this makes it possible to take advantage of very large CFG scales without getting deep-fried output images. I'm not sure how useful the reverse momentum really is, but it was in the paper so I added it too (I think it prevents the CFG from going too much "in the same direction" at every step?).

Usage

[your usual command with cfg here] --apg-eta 0 --apg-nt 5 --apg-momentum -0.5

Recommended values:

eta: between 0 and 1, closer to 0 seems better. In the paper, they recommend setting it to 0 altogether (any real value is supported though, including negatives; setting it to 1 neutralizes the effect)
norm threshold: between 1 and 25 depending on the model/prompts (setting it to zero or a negative value disables the thresholding)
momentum: preferably negative, ideally between 0 and -1 (again, any value is technically supported; setting it to 0 neutralizes the effect)

Feel free to play around with the settings, going outside of the recommended ranges can have interesting effects, especially with eta and momentum.

Tips

To help you figure out the right setting for the norm threshold, you can set the SD_LOG_CFG_DELTA_NORM environment variable to "ON" (for example, in Windows PowerShell: $env:SD_LOG_CFG_DELTA_NORM="ON"). Then you can run your model with normal CFG.

This will print a number at each step on the terminal/logger (corresponding to the unclamped L2 norm of the CFG delta).

Pick a number that's within the range of printed values as the base norm threshold (the median, for example).
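As a small illustration of picking the median, here is a sketch that assumes you copied the printed norms into a list by hand. The numbers are made up for the example and do not come from any real run.

```python
from statistics import median

# Hypothetical per-step CFG delta norms, copied by hand from the log output.
logged_norms = [12.4, 9.8, 7.1, 5.6, 4.9, 4.2, 3.8]

# Use the median as the base norm threshold.
base_norm_threshold = median(logged_norms)
print(base_norm_threshold)  # 5.6
```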

If you want to use a CFG scale that's above the recommended one for the model you're using, I recommend using something like this formula to update the threshold accordingly:

apg-nt = base_norm_threshold*(recommended_cfg_scale-1)/(cfg_scale-1)

The ideal parameters will depend on the prompt and other settings, but they will most likely stay in the same order of magnitude for the same model.
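The formula keeps the product (cfg_scale-1)*threshold constant, so the largest possible guidance update stays the same size when the CFG scale goes up. A minimal sketch, where the function name and the example values are hypothetical:

```python
def scaled_norm_threshold(base_norm_threshold, recommended_cfg_scale, cfg_scale):
    """Hypothetical helper: rescale the norm threshold for a higher CFG scale."""
    if cfg_scale <= 1:
        # (cfg_scale - 1) would be zero or negative; the formula does not apply
        raise ValueError("cfg_scale must be greater than 1")
    return base_norm_threshold * (recommended_cfg_scale - 1) / (cfg_scale - 1)

# Example: base threshold 10 found at the recommended CFG scale of 7,
# now running at CFG scale 13.
print(scaled_norm_threshold(10.0, 7.0, 13.0))  # 5.0
```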

I also added an experimental smoothing parameter (--apg-nt-smoothing) for the normalization. In the paper they're using a "saturate" function ($\min(1, threshold/norm)$), which has two potential issues: it has a kink (it is not continuously differentiable), and it is not invertible, as all input values above $1$ get mapped to $1$.

This experimental feature replaces the $\min(1,x)$ function with $\frac{x}{\left(1+x^{\frac{1}{p}}\right)^{p}}$, which is smooth and invertible. It is equivalent to $f(x)=x$ for small values of $x$ (just like the min) and converges to the original $\min(1,x)$ as the value of $p$ goes to $0$.

A good value of the smoothing parameter would map the upper bound of the CFG delta norms to somewhere in the [0.95, 0.99] range with this formula. There is no closed-form formula to find a good value, but you can just try things and see how it goes.
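To get a feel for the smooth function before trying values, here is a small Python sketch that compares it with $\min(1,x)$. The function names are hypothetical, and the log-space evaluation is just one way to avoid overflow for small $p$, not necessarily how the PR computes it. It also does not assume anything about how --apg-nt-smoothing is mapped to $p$.

```python
import math

def saturate(x):
    # the "saturate" function from the paper
    return min(1.0, x)

def smooth_saturate(x, p):
    """Hypothetical helper: x / (1 + x**(1/p))**p for x >= 0 and p > 0."""
    if x <= 0:
        return 0.0
    t = math.log(x) / p  # log of x**(1/p)
    # numerically stable log(1 + exp(t))
    softplus = max(t, 0.0) + math.log1p(math.exp(-abs(t)))
    return x * math.exp(-p * softplus)

# Smaller p gets closer to min(1, x); larger p gives a softer knee around x = 1.
for p in (0.5, 0.1, 0.01):
    print(p, [round(smooth_saturate(x, p), 4) for x in (0.1, 0.5, 1.0, 2.0, 10.0)])
```

At $x=1$ the smooth function evaluates to $2^{-p}$, which is a quick way to see how far a given $p$ is from the hard clamp.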

I made this to experiment with the values for picking the threshold parameters: https://www.desmos.com/calculator/7sir5unorl

Edit: I also added unconditional SLG (--slg-uncond) (I stole the idea from deepbeepmeep/Wan2GP#61)

Just a simpler version of SLG (Skip Layer Guidance, introduced in #451) for DiT models.

Default SLG requires a third forward pass of the network with some layers skipped. This increases the computing time by a bit under 50% for the SLG steps, which isn't ideal.

Unconditional SLG skips layers during the same unconditional pass used for CFG/APG. It seems to be about as effective as normal SLG, but it's even faster than plain CFG, thanks to the layers being skipped.

Downside: it's less flexible, --slg-scale should be kept to 0 and --cfg-scale now controls both the CFG and the SLG.
Upside: It's faster.
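A rough back-of-the-envelope sketch of where these numbers come from, assuming the cost of a forward pass is proportional to the number of layers actually executed (a simplification). The layer counts are hypothetical and not taken from any specific model.

```python
def relative_cost(total_layers, skipped_layers):
    """Hypothetical helper: per-step cost relative to plain CFG (2 full passes)."""
    kept = (total_layers - skipped_layers) / total_layers
    cfg = 2.0                # cond pass + uncond pass
    slg = 2.0 + kept         # plus a third pass with some layers skipped
    slg_uncond = 1.0 + kept  # cond pass + uncond pass with layers skipped
    return slg / cfg, slg_uncond / cfg

# Example: 24 layers, 3 of them skipped.
print(relative_cost(24, 3))  # (1.4375, 0.9375)
```

With these made-up numbers, default SLG costs about 44% more than plain CFG on the SLG steps, while unconditional SLG costs about 6% less.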

Setting both --slg-scale != 0 and --slg-uncond at the same time will most likely degrade image quality while using more compute. It's possible, but not recommended. (Maybe it could be worth investigating skipping different sets of layers with normal SLG and unconditional SLG, but that is getting too far out of scope for this PR.)

wbruna · 2025-04-04T11:01:18Z

APG works even with distilled models. I was able to get good LCM generations with 4+ CFG, and negatives.

stduhpf force-pushed the apg branch from 371dbba to 2194a0f Compare February 22, 2025 15:21

stduhpf changed the title ~~Add support for APG (adaptive projected guidance)~~ Add support for APG (adaptive projected guidance) + unconditional SLG Mar 13, 2025

stduhpf force-pushed the apg branch from d8ea903 to 24e8455 Compare March 13, 2025 23:49

stduhpf added 7 commits March 14, 2025 00:50

apg: first implementation

a5dbce5

refactor guidance params in lib

02114c2

main: add apg support

98e056b

add apg settings to image params

e64b3b8

Fix cfg 1 crash

98064d0

Fix CI build

6baa3a6

apg: add experimental threshold smoothing parameter

fb44a88

stduhpf force-pushed the apg branch from 24e8455 to f3c2c64 Compare March 13, 2025 23:51

add uncond slg variant

8408ee1

fix default slg params

stduhpf force-pushed the apg branch from f3c2c64 to 8408ee1 Compare March 14, 2025 11:13

stduhpf mentioned this pull request May 18, 2025

Override text encoders for unet models #682

Open

apg: add SD_LOG_CFG_DELTA_NORM

36f9bd4
