Fix SDXL t2i adapters expect BGR instead of RGB #7205

Merged: 3 commits, Oct 29, 2024

dunkeroni
Contributor

Summary

Added a conversion that triggers on SDXL to swap the color channel order of T2I adapter image inputs, which is required for OpenPose to work correctly. The TencentARC T2I adapters were trained on cv2 preprocessor outputs in BGR format. As far as I can tell, those are the only publicly available SDXL T2I adapters at this time, and OpenPose is the only one whose input is not greyscale. Changing the channel order has no effect on canny/depth/etc.
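The channel swap itself is small. Below is a minimal sketch, assuming the control image is available as an H x W x 3 NumPy array; `rgb_to_bgr` is a hypothetical helper name used for illustration, not the function added in this PR. It also shows why greyscale inputs such as canny or depth maps are unaffected: reversing three identical channels changes nothing.

```python
import numpy as np


def rgb_to_bgr(image: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an H x W x 3 array (RGB <-> BGR).

    Hypothetical helper for illustration only.
    """
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an H x W x 3 array")
    # Reversing the last axis swaps the first and third channels;
    # ascontiguousarray turns the reversed view into a normal array.
    return np.ascontiguousarray(image[:, :, ::-1])


# A pure-red RGB pixel ends up with its 255 in the last channel,
# which is where red lives in BGR order.
red_rgb = np.array([[[255, 0, 0]]], dtype=np.uint8)
red_bgr = rgb_to_bgr(red_rgb)

# A greyscale map stored as three identical channels is unchanged.
grey = np.full((2, 2, 3), 128, dtype=np.uint8)
grey_swapped = rgb_to_bgr(grey)
```

Applying the same function twice returns the original image, so it converts in either direction.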

The fix is applied to both the main backend and the as-yet-unused modular backend. I also snuck in a fix for the progress bar so the modular backend doesn't crash before denoising begins.

Related Issues / Discussions

https://www.reddit.com/r/StableDiffusion/comments/1f854k3/psa_fixing_sdxl_t2iadapter_openpose/

QA Instructions

Before:
image

After:
image

Merge Plan

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)

@github-actions github-actions bot added the python, invocations, and backend labels Oct 26, 2024
@psychedelicious psychedelicious merged commit 47168b5 into invoke-ai:main Oct 29, 2024
14 checks passed
psychedelicious added a commit that referenced this pull request Nov 1, 2024
…7215)

## Summary

This change mimics the UNet padding strategy to align T2I feature maps
with the latents during denoising. It also slightly adjusts the crop and
scale logic so that, when padding is needed, the control still matches
the input image without shifting.
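
To illustrate why odd sizes need care, here is a rough sketch of the size arithmetic. It is not the code from this change. It assumes a latent scale factor of 8, downsampling that halves each side with ceiling rounding, and padding added only on the bottom and right edges up to a multiple of 64. The names `ceil_halvings` and `bottom_right_padding` are hypothetical.

```python
import math

VAE_SCALE = 8  # assumption: latent side length = pixel side length / 8


def ceil_halvings(size: int, levels: int) -> list[int]:
    """Side lengths after repeatedly halving with ceiling rounding."""
    sizes = [size]
    for _ in range(levels):
        sizes.append(math.ceil(sizes[-1] / 2))
    return sizes


def bottom_right_padding(height: int, width: int, multiple: int) -> tuple[int, int]:
    """Pixels to add on the bottom and right to reach the next multiple."""
    return (-height) % multiple, (-width) % multiple


# A 1032 px side gives an odd latent side, so each halving rounds up.
latent_heights = ceil_halvings(1032 // VAE_SCALE, levels=2)

# Padding a 1032x1024 control image up to multiples of 64.
pad_h, pad_w = bottom_right_padding(1032, 1024, multiple=64)
```

If the adapter feature maps are computed without matching this rounding, they come out a row or column short of the corresponding UNet features, which is the misalignment described above.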

## Related Issues / Discussions


## QA Instructions

Image generated at 1032x1024

![image](https://github.com/user-attachments/assets/7ea579e4-61dc-4b6b-aa84-33d676d160c6)

Image generated at 1080x1040 to prove feature alignment.

![image](https://github.com/user-attachments/assets/ee6e5b6a-d0d5-474d-9fc4-f65c104964bd)

Edge artifacts on the bottom and right are a result of SDXL's unet
padding, and t2i influence will be cut off in those regions.

## Merge Plan

Contingent on #7205.
Currently the Canvas UI prevents users from generating at resolutions
that are not multiples of 64 while T2I adapter layers are active. Will
leave this as a draft until that is fixed.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _Documentation added / updated (if applicable)_
@dunkeroni dunkeroni deleted the sdxl_t2i_bgr branch June 8, 2025 18:25
3 participants