Hey everyone!
We are excited to announce the launch of something new today!
We wanted users to experiment with how models like Stable Diffusion can 'reimagine' images. Doing so lets people see how the world looks through the eyes of models like Stable Diffusion, and helps them understand the models' biases and limitations.
Stable Diffusion Reimagine allows you to generate multiple variations of a single image. There is no longer any need for complex prompts: simply upload an image and create as many variations as you want. It is not trying to recreate the same face, person, or object as the input. Instead, Stable Diffusion Reimagine creates a *new* image, *inspired* by the original one.
✨ TRY IT FOR FREE ✨
https://clipdrop.co/stable-diffu...
Stable Diffusion Reimagine is based on a new algorithm created by stability.ai.
The classic text-to-image Stable Diffusion model is trained to be conditioned on text inputs.
This version replaces the original text encoder with an image encoder, so instead of generating images from a text prompt, it generates them from an input image. Noise is added to the embedding produced by the image encoder, which is what creates the variation.
This approach produces similar-looking images with different details and compositions. Unlike the image-to-image algorithm, the source image is first fully encoded, so the generator does not use a single pixel from the original one!
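For those who want to play with this kind of image-variation setup in code once the model is released, here is a minimal sketch using the Hugging Face diffusers library. To be clear, the pipeline class, the checkpoint name, and the noise_level knob below are assumptions on my part (based on Stability AI's public unCLIP-style releases), not something confirmed in this post:

```python
# Minimal sketch of image-variation generation with Hugging Face diffusers.
# Assumption: the released weights load into the StableUnCLIP image-to-image
# pipeline; the checkpoint name below is a guess, not stated in this post.
import torch
from diffusers import StableUnCLIPImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableUnCLIPImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-unclip",  # assumed checkpoint name
    torch_dtype=torch.float16,
).to("cuda")

# The source image is fully encoded by the image encoder;
# no pixels from the original are reused by the generator.
source = load_image("photo.png")

# noise_level controls how much noise is added to the image embedding:
# higher values give variations that drift further from the source.
variations = pipe(source, noise_level=100, num_images_per_prompt=4).images
for i, img in enumerate(variations):
    img.save(f"variation_{i}.png")
```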
This model will soon be open-sourced on StabilityAI’s GitHub:
https://github.com/Stability-AI/...
We hope you will like it!
Congrats on the launch! My primary feedback is that the outputs seem blurry/out of focus compared to my photographic inputs. I realize you're replacing the text encoder, so prompts can't be used to achieve photorealistic composition, but I wonder if there's a path to achieving compositional consistency some other way?