mirror of
https://github.com/invoke-ai/InvokeAI.git
synced 2025-01-07 03:17:05 +08:00
Global replace [ \t]+$, add "GB" (#1751)
* "GB"
* Replace [ \t]+$ global

Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
parent
4fd97ceddd
commit
7d8d4bcafb
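The pattern in the commit title, `[ \t]+$`, matches a run of spaces or tabs at the end of a line. The commit does not record which tool was used for the global replace, so the following is only a minimal Python sketch of the same substitution; the helper name `strip_trailing_whitespace` is hypothetical.

```python
import re

# Same pattern as in the commit title. re.MULTILINE makes `$` match at the
# end of every line, not only at the end of the whole string.
TRAILING_WS = re.compile(r"[ \t]+$", re.MULTILINE)


def strip_trailing_whitespace(text: str) -> str:
    """Remove spaces and tabs at the end of each line, keeping the newlines."""
    return TRAILING_WS.sub("", text)


if __name__ == "__main__":
    sample = "3. Unzip the file. \n5. Wait a while, until it is done.\t\n"
    print(repr(strip_trailing_whitespace(sample)))
```

This explains why most changed lines in the hunks below look identical before and after: they differ only in invisible trailing whitespace.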
2
.gitattributes
vendored
@@ -1,4 +1,4 @@
 # Auto normalizes line endings on commit so devs don't need to change local settings.
-# Only affects text files and ignores other file types.
+# Only affects text files and ignores other file types.
 # For more info see: https://www.aleksandrhovhannisyan.com/blog/crlf-vs-lf-normalizing-line-endings-in-git/
 * text=auto

@@ -1,4 +1,4 @@
-<img src="docs/assets/invoke_ai_banner.png" align="center">
+<img src="docs/assets/invoke_ai_banner.png" align="center">

 Invoke-AI is a community of software developers, researchers, and user
 interface experts who have come together on a voluntary basis to build
@@ -81,5 +81,5 @@ area. Disputes are resolved by open and honest communication.

 ## Signature

-This document has been collectively crafted and approved by the current InvokeAI team members, as of 28 Nov 2022: **lstein** (Lincoln Stein), **blessedcoolant**, **hipsterusername** (Kent Keirsey), **Kyle0654** (Kyle Schouviller), **damian0815**, **mauwii** (Matthias Wild), **Netsvetaev** (Artur Netsvetaev), **psychedelicious**, **tildebyte**, and **keturn**. Although individuals within the group may hold differing views on particular details and/or their implications, we are all in agreement about its fundamental statements, as well as their significance and importance to this project moving forward.
+This document has been collectively crafted and approved by the current InvokeAI team members, as of 28 Nov 2022: **lstein** (Lincoln Stein), **blessedcoolant**, **hipsterusername** (Kent Keirsey), **Kyle0654** (Kyle Schouviller), **damian0815**, **mauwii** (Matthias Wild), **Netsvetaev** (Artur Netsvetaev), **psychedelicious**, **tildebyte**, and **keturn**. Although individuals within the group may hold differing views on particular details and/or their implications, we are all in agreement about its fundamental statements, as well as their significance and importance to this project moving forward.

10
README.md
@@ -53,11 +53,11 @@ For full installation and upgrade instructions, please see:

 1. Go to the bottom of the [Latest Release Page](https://github.com/invoke-ai/InvokeAI/releases/latest)
 2. Download the .zip file for your OS (Windows/macOS/Linux).
-3. Unzip the file.
+3. Unzip the file.
 4. If you are on Windows, double-click on the `install.bat` script. On macOS, open a Terminal window, drag the file `install.sh` from Finder into the Terminal, and press return. On Linux, run `install.sh`.
-5. Wait a while, until it is done.
+5. Wait a while, until it is done.
 6. The folder where you ran the installer from will now be filled with lots of files. If you are on Windows, double-click on the `invoke.bat` file. On macOS, open a Terminal window, drag `invoke.sh` from the folder into the Terminal, and press return. On Linux, run `invoke.sh`
-7. Press 2 to open the "browser-based UI", press enter/return, wait a minute or two for Stable Diffusion to start up, then open your browser and go to http://localhost:9090.
+7. Press 2 to open the "browser-based UI", press enter/return, wait a minute or two for Stable Diffusion to start up, then open your browser and go to http://localhost:9090.
 8. Type `banana sushi` in the box on the top left and click `Invoke`:

 <div align="center"><img src="docs/assets/invoke-web-server-1.png" width=640></div>
@@ -161,9 +161,9 @@ problems and other issues.
 # Contributing

 Anyone who wishes to contribute to this project, whether documentation, features, bug fixes, code
-cleanup, testing, or code reviews, is very much encouraged to do so.
+cleanup, testing, or code reviews, is very much encouraged to do so.

-To join, just raise your hand on the InvokeAI Discord server (#dev-chat) or the GitHub discussion board.
+To join, just raise your hand on the InvokeAI Discord server (#dev-chat) or the GitHub discussion board.

 If you are unfamiliar with how
 to contribute to GitHub projects, here is a

@@ -21,7 +21,7 @@ This model card focuses on the model associated with the Stable Diffusion model,

 # Uses

-## Direct Use
+## Direct Use
 The model is intended for research purposes only. Possible research areas and
 tasks include

@@ -68,11 +68,11 @@ Using the model to generate content that is cruel to individuals is a misuse of
 considerations.

 ### Bias
-While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
-Stable Diffusion v1 was trained on subsets of [LAION-2B(en)](https://laion.ai/blog/laion-5b/),
-which consists of images that are primarily limited to English descriptions.
-Texts and images from communities and cultures that use other languages are likely to be insufficiently accounted for.
-This affects the overall output of the model, as white and western cultures are often set as the default. Further, the
+While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
+Stable Diffusion v1 was trained on subsets of [LAION-2B(en)](https://laion.ai/blog/laion-5b/),
+which consists of images that are primarily limited to English descriptions.
+Texts and images from communities and cultures that use other languages are likely to be insufficiently accounted for.
+This affects the overall output of the model, as white and western cultures are often set as the default. Further, the
 ability of the model to generate content with non-English prompts is significantly worse than with English-language prompts.


@@ -84,7 +84,7 @@ The model developers used the following dataset for training the model:
 - LAION-2B (en) and subsets thereof (see next section)

 **Training Procedure**
-Stable Diffusion v1 is a latent diffusion model which combines an autoencoder with a diffusion model that is trained in the latent space of the autoencoder. During training,
+Stable Diffusion v1 is a latent diffusion model which combines an autoencoder with a diffusion model that is trained in the latent space of the autoencoder. During training,

 - Images are encoded through an encoder, which turns images into latent representations. The autoencoder uses a relative downsampling factor of 8 and maps images of shape H x W x 3 to latents of shape H/f x W/f x 4
 - Text prompts are encoded through a ViT-L/14 text-encoder.
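
As a worked instance of the shape mapping quoted in the hunk above: with downsampling factor f = 8, an H x W x 3 image becomes an H/f x W/f x 4 latent. The sketch below only does that arithmetic. The function name `latent_shape` is hypothetical, and it assumes H and W are multiples of f.

```python
def latent_shape(height: int, width: int, f: int = 8, latent_channels: int = 4):
    """Return the (H/f, W/f, C) latent shape for an H x W x 3 input image."""
    if height % f or width % f:
        raise ValueError(f"height and width must be multiples of {f}")
    return (height // f, width // f, latent_channels)


if __name__ == "__main__":
    print(latent_shape(512, 512))  # (64, 64, 4)
```

For the 512x512 training resolution mentioned in the model card, the latent is therefore 64 x 64 x 4.
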
@@ -108,12 +108,12 @@ filtered to images with an original size `>= 512x512`, estimated aesthetics scor
 - **Batch:** 32 x 8 x 2 x 4 = 2048
 - **Learning rate:** warmup to 0.0001 for 10,000 steps and then kept constant

-## Evaluation Results
+## Evaluation Results
 Evaluations with different classifier-free guidance scales (1.5, 2.0, 3.0, 4.0,
 5.0, 6.0, 7.0, 8.0) and 50 PLMS sampling
 steps show the relative improvements of the checkpoints:

-![pareto](assets/v1-variants-scores.jpg)
+![pareto](assets/v1-variants-scores.jpg)

 Evaluated using 50 PLMS steps and 10000 random prompts from the COCO2017 validation set, evaluated at 512x512 resolution. Not optimized for FID scores.
 ## Environmental Impact

@@ -43,7 +43,7 @@ def get_canvas_generation_mode(
 )

 """
-Mask images are white in areas where no change should be made, black where changes
+Mask images are white in areas where no change should be made, black where changes
 should be made.
 """

@@ -31,7 +31,7 @@ stable-diffusion-1.4:
   width: 512
   height: 512
 waifu-diffusion-1.3:
-  description: Stable Diffusion 1.4 fine tuned on anime-styled images (4.27)
+  description: Stable Diffusion 1.4 fine tuned on anime-styled images (4.27 GB)
   repo_id: hakurei/waifu-diffusion-v1-3
   config: v1-inference.yaml
   file: model-epoch09-float32.ckpt

@@ -107,4 +107,4 @@ lightning:
 benchmark: True
 max_steps: 4000000
 # max_steps: 4000
-
+

@@ -107,4 +107,4 @@ lightning:
 benchmark: False
 max_steps: 6200
 # max_steps: 4000
-
+

@@ -1,4 +1,4 @@
-The Unified Canvas is a tool designed to streamline and simplify the process of composing an image using Stable Diffusion. It offers artists all of the available Stable Diffusion generation modes (Text To Image, Image To Image, Inpainting, and Outpainting) as a single unified workflow. The flexibility of the tool allows you to tweak and edit image generations, extend images beyond their initial size, and to create new content in a freeform way both inside and outside of existing images.
+The Unified Canvas is a tool designed to streamline and simplify the process of composing an image using Stable Diffusion. It offers artists all of the available Stable Diffusion generation modes (Text To Image, Image To Image, Inpainting, and Outpainting) as a single unified workflow. The flexibility of the tool allows you to tweak and edit image generations, extend images beyond their initial size, and to create new content in a freeform way both inside and outside of existing images.

 This document explains the basics of using the Unified Canvas, introducing you to its features and tools one by one. It also describes some of the more advanced tools available to power users of the Canvas.

@@ -21,7 +21,7 @@ Accepting generations will commit the new generation to the **Base Layer**. You
 The **Mask Layer** consists of any masked sections that have been created to inform Inpainting generations. You can paint a new mask, or edit an existing mask, using the Brush tool and the Eraser with the Mask layer set as your Active layer. Any masked areas will only affect generation inside of the current bounding box.

 ### Bounding Box
-When generating a new image, Invoke will process and apply new images within the area denoted by the **Bounding Box**. The Width & Height settings of the Bounding Box, as well as its location within the Unified Canvas and pixels or empty space that it encloses, determine how new invocations are generated - see [Inpainting & Outpainting](#inpainting-and-outpainting) below. The Bounding Box can be moved and resized using the Move (V) tool. It can also be resized using the Bounding Box options in the Options Panel. By using these controls you can generate larger or smaller images, control which sections of the image are being processed, as well as control Bounding Box tools like the Bounding Box fill/erase.
+When generating a new image, Invoke will process and apply new images within the area denoted by the **Bounding Box**. The Width & Height settings of the Bounding Box, as well as its location within the Unified Canvas and pixels or empty space that it encloses, determine how new invocations are generated - see [Inpainting & Outpainting](#inpainting-and-outpainting) below. The Bounding Box can be moved and resized using the Move (V) tool. It can also be resized using the Bounding Box options in the Options Panel. By using these controls you can generate larger or smaller images, control which sections of the image are being processed, as well as control Bounding Box tools like the Bounding Box fill/erase.

 ### <a name="inpainting-and-outpainting"></a> Inpainting & Outpainting
 "Inpainting" means asking the AI to refine part of an image while leaving the rest alone. For example, updating a portrait of your grandmother to have her wear a biker's jacket.
@@ -48,9 +48,9 @@ To get started with the Unified Canvas, you will want to generate a new base lay

 From there, you can consider the following techniques to augment your image:
 * **New Images**: Move the bounding box to an empty area of the Canvas, type in your prompt, and Invoke, to generate a new image using the Text to Image function.
-* **Image Correction**: Use the color picker and brush tool to paint corrections on the image, switch to the Mask layer, and brush a mask over your painted area to use **Inpainting**. You can also use the **ImageToImage** generation method to invoke new interpretations of the image.
+* **Image Correction**: Use the color picker and brush tool to paint corrections on the image, switch to the Mask layer, and brush a mask over your painted area to use **Inpainting**. You can also use the **ImageToImage** generation method to invoke new interpretations of the image.
 * **Image Expansion**: Move the bounding box to include a portion of your initial image, and a portion of transparent/empty pixels, then Invoke using a prompt that describes what you'd like to see in that area. This will Outpaint the image. You'll typically find more coherent results if you keep about 50-60% of the original image in the bounding box. Make sure that the Image To Image Strength slider is set to a high value - you may need to set it higher than you are used to.
-* **New Content on Existing Images**: If you want to add new details or objects into your image, use the brush tool to paint a sketch of what you'd like to see on the image, switch to the Mask layer, and brush a mask over your painted area to use **Inpainting**. If the masked area is small, consider using a smaller bounding box to take advantage of Invoke's automatic Scaling features, which can help to produce better details.
+* **New Content on Existing Images**: If you want to add new details or objects into your image, use the brush tool to paint a sketch of what you'd like to see on the image, switch to the Mask layer, and brush a mask over your painted area to use **Inpainting**. If the masked area is small, consider using a smaller bounding box to take advantage of Invoke's automatic Scaling features, which can help to produce better details.
 * **And more**: There are a number of creative ways to use the Canvas, and the above are just starting points. We're excited to see what you come up with!


@@ -82,27 +82,27 @@ Features with non-obvious behavior are detailed below, in order to provide clari
 ## Toolbar

 ### Mask Options
-* **Enable Mask** - This flag can be used to Enable or Disable the currently painted mask. If you have painted a mask, but you don't want it affect the next invocation, but you *also* don't want to delete it, then you can set this option to Disable. When you want the mask back, set this back to Enable.
+* **Enable Mask** - This flag can be used to Enable or Disable the currently painted mask. If you have painted a mask, but you don't want it affect the next invocation, but you *also* don't want to delete it, then you can set this option to Disable. When you want the mask back, set this back to Enable.
 * **Preserve Masked Area** - When enabled, Preserve Masked Area inverts the effect of the Mask on the Inpainting process. Pixels in masked areas will be kept unchanged, and unmasked areas will be regenerated.

 ### Creative Tools
-* **Brush - Base/Mask Modes** - The Brush tool switches automatically between different modes of operation for the Base and Mask layers respectively.
-    * On the Base layer, the brush will directly paint on the Canvas using the color selected on the Brush Options menu.
+* **Brush - Base/Mask Modes** - The Brush tool switches automatically between different modes of operation for the Base and Mask layers respectively.
+    * On the Base layer, the brush will directly paint on the Canvas using the color selected on the Brush Options menu.
     * On the Mask layer, the brush will create a new mask. If you're finding the mask difficult to see over the existing content of the Unified Canvas, you can change the color it is drawn with using the color selector on the Mask Options dropdown.
 * **Erase Bounding Box** - On the Base layer, erases all pixels within the Bounding Box.
 * **Fill Bounding Box** - On the Base layer, fills all pixels within the Bounding Box with the currently selected color.

 ### Canvas Tools
 * **Move Tool** - Allows for manipulation of the Canvas view (by dragging on the Canvas, outside the bounding box), the Bounding Box (by dragging the edges of the box), or the Width/Height of the Bounding Box (by dragging one of the 9 directional handles).
-* **Reset View** - Click to re-orients the view to the center of the Bounding Box.
+* **Reset View** - Click to re-orients the view to the center of the Bounding Box.
 * **Merge Visible** - If your browser is having performance problems drawing the image in the Unified Canvas, click this to consolidate all of the information currently being rendered by your browser into a merged copy of the image. This lowers the resource requirements and should improve performance.

 ## Seam Correction
-When doing Inpainting or Outpainting, Invoke needs to merge the pixels generated by Stable Diffusion into your existing image. To do this, the area around the `seam` at the boundary between your image and the new generation is automatically blended to produce a seamless output. In a fully automatic process, a mask is generated to cover the seam, and then the area of the seam is Inpainted.
+When doing Inpainting or Outpainting, Invoke needs to merge the pixels generated by Stable Diffusion into your existing image. To do this, the area around the `seam` at the boundary between your image and the new generation is automatically blended to produce a seamless output. In a fully automatic process, a mask is generated to cover the seam, and then the area of the seam is Inpainted.

 Although the default options should work well most of the time, sometimes it can help to alter the parameters that control the seam Inpainting. A wider seam and a blur setting of about 1/3 of the seam have been noted as producing consistently strong results (e.g. 96 wide and 16 blur - adds up to 32 blur with both sides). Seam strength of 0.7 is best for reducing hard seams.
 * **Seam Size** - The size of the seam masked area. Set higher to make a larger mask around the seam.
-* **Seam Blur** - The size of the blur that is applied on *each* side of the masked area.
+* **Seam Blur** - The size of the blur that is applied on *each* side of the masked area.
 * **Seam Strength** - The Image To Image Strength parameter used for the Inpainting generation that is applied to the seam area.
 * **Seam Steps** - The number of generation steps that should be used to Inpaint the seam.
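
The Seam Correction rule of thumb quoted above (total blur of about 1/3 of the seam, with Seam Blur applied on each side) can be checked with a few lines of arithmetic. This is only a sketch of that rule, not code from Invoke, and the function name `suggested_seam_blur` is hypothetical.

```python
def suggested_seam_blur(seam_size: int) -> int:
    """Per-side blur such that both sides together cover about 1/3 of the seam."""
    total_blur = seam_size / 3      # blur summed over both sides of the seam
    return round(total_blur / 2)    # Seam Blur is applied on each side


if __name__ == "__main__":
    print(suggested_seam_blur(96))  # 16
```

With a Seam Size of 96 this gives a Seam Blur of 16, or 32 across both sides, matching the example in the text.
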
@@ -39,7 +39,7 @@ Looking for a short version? Here's a TL;DR in 3 tables.
 !!! tip "suggestions"

     For most use cases, `K_LMS`, `K_HEUN` and `K_DPM_2` are the best choices (the latter 2 run 0.5x as quick, but tend to converge 2x as quick as `K_LMS`). At very low steps (≤ `-s8`), `K_HEUN` and `K_DPM_2` are not recommended. Use `K_LMS` instead.
-
+
     For variability, use `K_EULER_A` (runs 2x as quick as `K_DPM_2_A`).

 ---

@@ -100,7 +100,7 @@ directory
 The original Stable Diffusion version 1.4 weight file (4.27 GB)
 Download? [n] n
 [4] waifu-diffusion-1.3:
-Stable Diffusion 1.4 fine tuned on anime-styled images (4.27)
+Stable Diffusion 1.4 fine tuned on anime-styled images (4.27 GB)
 Download? [n] y
 [5] ft-mse-improved-autoencoder-840000:
 StabilityAI improved autoencoder fine-tuned for human faces (recommended; 335 MB) (recommended)

@@ -64,7 +64,7 @@ steps:
 It should look like the follwing:

 ```
-Python 3.9.5 (default, Nov 23 2021, 15:27:38)
+Python 3.9.5 (default, Nov 23 2021, 15:27:38)
 [GCC 9.3.0] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> from patchmatch import patch_match

@@ -1 +0,0 @@
-020_INSTALL_MANUAL.md

429
docs/installation/INSTALL_MANUAL.md
Normal file
@@ -0,0 +1,429 @@
---
title: Manual Installation
---

<figure markdown>
# :fontawesome-brands-linux: Linux | :fontawesome-brands-apple: macOS | :fontawesome-brands-windows: Windows
</figure>

!!! warning "This is for advanced Users"

    who are already experienced with using conda or pip

## Introduction

You have two choices for manual installation, the [first one](#Conda_method)
based on the Anaconda3 package manager (`conda`), and
[a second one](#PIP_method) which uses basic Python virtual environment (`venv`)
commands and the PIP package manager. Both methods require you to enter commands
on the terminal, also known as the "console".

On Windows systems you are encouraged to install and use the
[Powershell](https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell-on-windows?view=powershell-7.3),
which provides compatibility with Linux and Mac shells and nice features such as
command-line completion.

### Conda method

1. Check that your system meets the
    [hardware requirements](index.md#Hardware_Requirements) and has the
    appropriate GPU drivers installed. In particular, if you are a Linux user
    with an AMD GPU installed, you may need to install the
    [ROCm driver](https://rocmdocs.amd.com/en/latest/Installation_Guide/Installation-Guide.html).

    InvokeAI does not yet support Windows machines with AMD GPUs due to the lack
    of ROCm driver support on this platform.

    To confirm that the appropriate drivers are installed, run `nvidia-smi` on
    NVIDIA/CUDA systems, and `rocm-smi` on AMD systems. These should return
    information about the installed video card.

    Macintosh users with MPS acceleration, or anybody with a CPU-only system,
    can skip this step.

2. You will need to install Anaconda3 and Git if they are not already
    available. Use your operating system's preferred package manager, or
    download the installers manually. You can find them here:

    - [Anaconda3](https://www.anaconda.com/)
    - [git](https://git-scm.com/downloads)

3. Clone the [InvokeAI](https://github.com/invoke-ai/InvokeAI) source code from
    GitHub:

    ```bash
    git clone https://github.com/invoke-ai/InvokeAI.git
    ```

    This will create InvokeAI folder where you will follow the rest of the
    steps.

4. Enter the newly-created InvokeAI folder:

    ```bash
    cd InvokeAI
    ```

    From this step forward make sure that you are working in the InvokeAI
    directory!

5. Select the appropriate environment file:

    We have created a series of environment files suited for different operating
    systems and GPU hardware. They are located in the
    `environments-and-requirements` directory:

    <figure markdown>

    | filename | OS |
    | :----------------------: | :----------------------------: |
    | environment-lin-amd.yml | Linux with an AMD (ROCm) GPU |
    | environment-lin-cuda.yml | Linux with an NVIDIA CUDA GPU |
    | environment-mac.yml | Macintosh |
    | environment-win-cuda.yml | Windows with an NVIDA CUDA GPU |

    </figure>

    Choose the appropriate environment file for your system and link or copy it
    to `environment.yml` in InvokeAI's top-level directory. To do so, run
    following command from the repository-root:

    !!! Example ""

        === "Macintosh and Linux"

            !!! todo "Replace `xxx` and `yyy` with the appropriate OS and GPU codes as seen in the table above"

            ```bash
            ln -sf environments-and-requirements/environment-xxx-yyy.yml environment.yml
            ```

            When this is done, confirm that a file `environment.yml` has been linked in
            the InvokeAI root directory and that it points to the correct file in the
            `environments-and-requirements`.

            ```bash
            ls -la
            ```

        === "Windows"

            !!! todo " Since it requires admin privileges to create links, we will use the copy command to create your `environment.yml`"

            ```cmd
            copy environments-and-requirements\environment-win-cuda.yml environment.yml
            ```

            Afterwards verify that the file `environment.yml` has been created, either via the
            explorer or by using the command `dir` from the terminal

            ```cmd
            dir
            ```

    !!! warning "Do not try to run conda on directly on the subdirectory environments file. This won't work. Instead, copy or link it to the top-level directory as shown."

6. Create the conda environment:

    ```bash
    conda env update
    ```

    This will create a new environment named `invokeai` and install all InvokeAI
    dependencies into it. If something goes wrong you should take a look at
    [troubleshooting](#troubleshooting).

7. Activate the `invokeai` environment:

    In order to use the newly created environment you will first need to
    activate it

    ```bash
    conda activate invokeai
    ```

    Your command-line prompt should change to indicate that `invokeai` is active
    by prepending `(invokeai)`.

8. Pre-Load the model weights files:

    !!! tip

        If you have already downloaded the weights file(s) for another Stable
        Diffusion distribution, you may skip this step (by selecting "skip" when
        prompted) and configure InvokeAI to use the previously-downloaded files. The
        process for this is described in [here](INSTALLING_MODELS.md).

    ```bash
    python scripts/configure_invokeai.py
    ```

    The script `configure_invokeai.py` will interactively guide you through the
    process of downloading and installing the weights files needed for InvokeAI.
    Note that the main Stable Diffusion weights file is protected by a license
    agreement that you have to agree to. The script will list the steps you need
    to take to create an account on the site that hosts the weights files,
    accept the agreement, and provide an access token that allows InvokeAI to
    legally download and install the weights files.

    If you get an error message about a module not being installed, check that
    the `invokeai` environment is active and if not, repeat step 5.

9. Run the command-line- or the web- interface:

    !!! example ""

        !!! warning "Make sure that the conda environment is activated, which should create `(invokeai)` in front of your prompt!"

        === "CLI"

            ```bash
            python scripts/invoke.py
            ```

        === "local Webserver"

            ```bash
            python scripts/invoke.py --web
            ```

        === "Public Webserver"

            ```bash
            python scripts/invoke.py --web --host 0.0.0.0
            ```

    If you choose the run the web interface, point your browser at
    http://localhost:9090 in order to load the GUI.

10. Render away!

    Browse the [features](../features/CLI.md) section to learn about all the things you
    can do with InvokeAI.

    Note that some GPUs are slow to warm up. In particular, when using an AMD
    card with the ROCm driver, you may have to wait for over a minute the first
    time you try to generate an image. Fortunately, after the warm up period
    rendering will be fast.

11. Subsequently, to relaunch the script, be sure to run "conda activate
    invokeai", enter the `InvokeAI` directory, and then launch the invoke
    script. If you forget to activate the 'invokeai' environment, the script
    will fail with multiple `ModuleNotFound` errors.
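
The file choice in step 5 is a plain lookup from operating system and GPU type to a filename. As a minimal sketch, the table can be written as a dictionary. The filenames are copied from the table, while the function name `pick_environment_file` and the `"amd"`/`"cuda"` labels are hypothetical, and GPU detection is deliberately left to the caller.

```python
from typing import Optional

# Filenames copied from the table in step 5. OS names follow the values
# returned by platform.system(): "Linux", "Darwin" (macOS) and "Windows".
ENVIRONMENT_FILES = {
    ("Linux", "amd"): "environment-lin-amd.yml",
    ("Linux", "cuda"): "environment-lin-cuda.yml",
    ("Darwin", None): "environment-mac.yml",
    ("Windows", "cuda"): "environment-win-cuda.yml",
}


def pick_environment_file(system: str, gpu: Optional[str]) -> str:
    """Return the environment file listed for this OS and GPU kind."""
    # The table lists a single Macintosh file, so the GPU kind is ignored there.
    key = (system, None) if system == "Darwin" else (system, gpu)
    try:
        return ENVIRONMENT_FILES[key]
    except KeyError:
        raise ValueError(
            f"no environment file listed for {system!r} with GPU {gpu!r}"
        ) from None


if __name__ == "__main__":
    print(pick_environment_file("Linux", "cuda"))  # environment-lin-cuda.yml
```

A combination missing from the table, such as Windows with an AMD GPU, raises `ValueError`, which mirrors the note in step 1 that this platform is not yet supported.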

## Updating to newer versions of the script

This distribution is changing rapidly. If you used the `git clone` method
(step 5) to download the InvokeAI directory, then to update to the latest and
greatest version, launch the Anaconda window, enter `InvokeAI` and type:

```bash
git pull
conda env update
python scripts/configure_invokeai.py --no-interactive #optional
```

This will bring your local copy into sync with the remote one. The last step may
be needed to take advantage of new features or released models. The
`--no-interactive` flag will prevent the script from prompting you to download
the big Stable Diffusion weights files.
|
||||
|
||||
## pip Install
|
||||
|
||||
To install InvokeAI with only the PIP package manager, please follow these
|
||||
steps:
|
||||
|
||||
1. Make sure you are using Python 3.9 or higher. The rest of the install
|
||||
procedure depends on this:
|
||||
|
||||
```bash
|
||||
python -V
|
||||
```
|
||||
|
||||
2. Install the `virtualenv` tool if you don't have it already:
|
||||
|
||||
```bash
|
||||
pip install virtualenv
|
||||
```
|
||||
|
||||
3. From within the InvokeAI top-level directory, create and activate a virtual
|
||||
environment named `invokeai`:
|
||||
|
||||
```bash
|
||||
virtualenv invokeai
|
||||
source invokeai/bin/activate
|
||||
```
|
||||
|
||||
4. Pick the correct `requirements*.txt` file for your hardware and operating
|
||||
system.
|
||||
|
||||
We have created a series of environment files suited for different operating
|
||||
systems and GPU hardware. They are located in the
|
||||
`environments-and-requirements` directory:
|
||||
|
||||
<figure markdown>
|
||||
|
||||
| filename | OS |
|
||||
| :---------------------------------: | :-------------------------------------------------------------: |
|
||||
| requirements-lin-amd.txt | Linux with an AMD (ROCm) GPU |
|
||||
| requirements-lin-arm64.txt | Linux running on arm64 systems |
|
||||
| requirements-lin-cuda.txt | Linux with an NVIDIA (CUDA) GPU |
|
||||
| requirements-mac-mps-cpu.txt | Macintoshes with MPS acceleration |
|
||||
| requirements-lin-win-colab-cuda.txt | Windows with an NVIDA (CUDA) GPU<br>(supports Google Colab too) |
|
||||
|
||||
</figure>
|
||||
|
||||
Select the appropriate requirements file, and make a link to it from
|
||||
`requirements.txt` in the top-level InvokeAI directory. The command to do
|
||||
this from the top-level directory is:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Macintosh and Linux"
|
||||
|
||||
!!! info "Replace `xxx` and `yyy` with the appropriate OS and GPU codes."
|
||||
|
||||
```bash
|
||||
ln -sf environments-and-requirements/requirements-xxx-yyy.txt requirements.txt
|
||||
```
|
||||
|
||||
=== "Windows"
|
||||
|
||||
!!! info "on Windows, admin privileges are required to make links, so we use the copy command instead"
|
||||
|
||||
```cmd
copy environments-and-requirements\requirements-lin-win-colab-cuda.txt requirements.txt
```
|
||||
|
||||
!!! warning
|
||||
|
||||
Please do not link or copy `environments-and-requirements/requirements-base.txt`.
|
||||
This is a base requirements file that does not have the platform-specific
|
||||
libraries. Also, be sure to link or copy the platform-specific file to
|
||||
a top-level file named `requirements.txt` as shown here. Running pip on
|
||||
a requirements file in a subdirectory will not work as expected.
|
||||
|
||||
When this is done, confirm that a file named `requirements.txt` has been
|
||||
created in the InvokeAI root directory and that it points to the correct
|
||||
file in `environments-and-requirements`.
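The check described above can also be scripted. The following is a minimal sketch, not part of InvokeAI: the helper name `describe_requirements` is hypothetical, and it assumes it is run from (or pointed at) the InvokeAI top-level directory. It reports whether `requirements.txt` is missing, a link (and where the link resolves), or a plain copy.

```python
import os


def describe_requirements(root="."):
    """Report what requirements.txt in `root` is and where it points.

    Hypothetical helper for illustration only; not shipped with InvokeAI.
    """
    path = os.path.join(root, "requirements.txt")
    # lexists is true even for a dangling symlink, so a broken link is not
    # mistaken for a missing file.
    if not os.path.lexists(path):
        return "missing"
    if os.path.islink(path):
        # realpath follows the link so a wrong or dangling target is easy to spot.
        target = os.path.realpath(path)
        if not os.path.exists(target):
            return f"broken link -> {target}"
        return f"link -> {target}"
    # On Windows the file is copied rather than linked, so this is the expected case.
    return "regular file (copied)"


if __name__ == "__main__":
    print(describe_requirements("."))
```

For a linked file, the reported target should end with the platform-specific name you chose from `environments-and-requirements`. For a copied file, compare its contents against the intended source by hand.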
|
||||
|
||||
5. Run PIP
|
||||
|
||||
Be sure that the `invokeai` environment is active before doing this:
|
||||
|
||||
```bash
pip install --prefer-binary -r requirements.txt
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
Here are some common issues and their suggested solutions.
|
||||
|
||||
### Conda
|
||||
|
||||
#### Conda fails before completing `conda update`
|
||||
|
||||
The usual source of these errors is a package incompatibility. While we have
|
||||
tried to minimize these, over time packages get updated and sometimes introduce
|
||||
incompatibilities.
|
||||
|
||||
We suggest that you search
|
||||
[Issues](https://github.com/invoke-ai/InvokeAI/issues) or the "bugs-and-support"
|
||||
channel of the [InvokeAI Discord](https://discord.gg/ZmtBAhwWhy).
|
||||
|
||||
You may also try to install the broken packages manually using PIP. To do this,
|
||||
activate the `invokeai` environment, and run `pip install` with the name and
|
||||
version of the package that is causing the incompatibility. For example:
|
||||
|
||||
```bash
pip install test-tube==0.7.5
```
|
||||
|
||||
You can keep doing this until all requirements are satisfied and the `invoke.py`
|
||||
script runs without errors. Please report to
|
||||
[Issues](https://github.com/invoke-ai/InvokeAI/issues) what you were able to do
|
||||
to work around the problem so that others can benefit from your investigation.
|
||||
|
||||
#### Conda environment creation fails on macOS
|
||||
|
||||
If creating the Conda environment fails with an `lmdb` error, the most likely cause is Clang.
Run `brew config` to see which Clang is installed on your Mac. If Clang isn't installed, that is what causes the error.
Start by installing the additional Xcode command line tools, then run `brew install llvm`.
|
||||
|
||||
```bash
xcode-select --install
brew install llvm
```
|
||||
|
||||
If `brew config` shows that Clang is installed, update to the latest llvm and try creating the environment again.
|
||||
|
||||
#### `configure_invokeai.py` or `invoke.py` crashes at an early stage
|
||||
|
||||
This is usually due to an incomplete or corrupted Conda install. Make sure you
|
||||
have linked to the correct environment file and run `conda update` again.
|
||||
|
||||
If the problem persists, a more extreme measure is to clear Conda's caches and
|
||||
remove the `invokeai` environment:
|
||||
|
||||
```bash
conda deactivate
conda env remove -n invokeai
conda clean -a
conda update
```
|
||||
|
||||
This removes all cached library files, including ones that may have been
|
||||
corrupted somehow. (This is not supposed to happen, but it does anyway.)
|
||||
|
||||
#### `invoke.py` crashes at a later stage
|
||||
|
||||
If the CLI or web site was working OK, but something unexpected happens
|
||||
later on during the session, you've encountered a code bug that is probably
|
||||
unrelated to an install issue. Please search
|
||||
[Issues](https://github.com/invoke-ai/InvokeAI/issues), file a bug report, or
|
||||
ask for help on [Discord](https://discord.gg/ZmtBAhwWhy).
|
||||
|
||||
#### My renders are running very slowly
|
||||
|
||||
You may have installed the wrong torch (machine learning) package, and the
|
||||
system is running on CPU rather than the GPU. To check, look at the log messages
|
||||
that appear when `invoke.py` is first starting up. One of the earlier lines
|
||||
should say `Using device type cuda`. On AMD systems, it will also say "cuda",
|
||||
and on Macintoshes, it should say "mps". If instead the message says it is
|
||||
running on "cpu", then you may need to install the correct torch library.
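The same check can be made without starting `invoke.py`. The snippet below is a minimal sketch, not InvokeAI's own device-selection code: the function name `detect_torch_device` is hypothetical, and it only asks the installed torch package which accelerator it can see. Run it with the `invokeai` environment active.

```python
def detect_torch_device():
    """Return 'cuda', 'mps', 'cpu', or 'torch not installed'.

    Hypothetical helper for illustration; it approximates the startup log
    message but is not the code InvokeAI uses.
    """
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        # ROCm builds of torch also report through the cuda API, which is
        # why AMD systems show "cuda" as well.
        return "cuda"
    # Older torch releases have no mps backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print(detect_torch_device())
```

If this prints `cpu` on a machine that has a supported GPU, the installed torch build is the likely culprit.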
|
||||
|
||||
You may be able to fix this by installing a different torch library. Here are
|
||||
the magic incantations for Conda and PIP.
|
||||
|
||||
!!! todo "For CUDA systems"
|
||||
|
||||
- conda
|
||||
|
||||
```bash
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
```
|
||||
|
||||
- pip
|
||||
|
||||
```bash
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
```
|
||||
|
||||
!!! todo "For AMD systems"
|
||||
|
||||
- conda
|
||||
|
||||
```bash
conda activate invokeai
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.2/
```
|
||||
|
||||
- pip
|
||||
|
||||
```bash
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.2/
```
|
||||
|
||||
More information and troubleshooting tips can be found at https://pytorch.org.
|
@ -3,10 +3,10 @@ info:
|
||||
title: Stable Diffusion
|
||||
description: |-
|
||||
TODO: Description Here
|
||||
|
||||
|
||||
Some useful links:
|
||||
- [Stable Diffusion Dream Server](https://github.com/lstein/stable-diffusion)
|
||||
|
||||
|
||||
license:
|
||||
name: MIT License
|
||||
url: https://github.com/lstein/stable-diffusion/blob/main/LICENSE
|
||||
@ -36,7 +36,7 @@ paths:
|
||||
description: successful operation
|
||||
content:
|
||||
image/png:
|
||||
schema:
|
||||
schema:
|
||||
type: string
|
||||
format: binary
|
||||
'404':
|
||||
@ -66,7 +66,7 @@ paths:
|
||||
description: successful operation
|
||||
content:
|
||||
image/png:
|
||||
schema:
|
||||
schema:
|
||||
type: string
|
||||
format: binary
|
||||
'404':
|
||||
|
1
frontend/dist/index.html
vendored
@ -15,7 +15,6 @@
|
||||
|
||||
<body>
|
||||
<div id="root"></div>
|
||||
|
||||
<script nomodule>!function(){var e=document,t=e.createElement("script");if(!("noModule"in t)&&"onbeforeload"in t){var n=!1;e.addEventListener("beforeload",(function(e){if(e.target===t)n=!0;else if(!e.target.hasAttribute("nomodule")||!n)return;e.preventDefault()}),!0),t.type="module",t.src=".",e.head.appendChild(t),t.remove()}}();</script>
|
||||
<script nomodule crossorigin id="vite-legacy-polyfill" src="./assets/polyfills-legacy-dde3a68a.js"></script>
|
||||
<script nomodule crossorigin id="vite-legacy-entry" data-src="./assets/index-legacy-b98e060c.js">System.import(document.getElementById('vite-legacy-entry').getAttribute('data-src'))</script>
|
||||
|
@ -341,7 +341,7 @@ class Args(object):
|
||||
|
||||
if not hasattr(cmd_switches,name) and not hasattr(arg_switches,name):
|
||||
raise AttributeError
|
||||
|
||||
|
||||
value_arg,value_cmd = (None,None)
|
||||
try:
|
||||
value_cmd = getattr(cmd_switches,name)
|
||||
@ -397,7 +397,7 @@ class Args(object):
|
||||
description=
|
||||
"""
|
||||
Generate images using Stable Diffusion.
|
||||
Use --web to launch the web interface.
|
||||
Use --web to launch the web interface.
|
||||
Use --from_file to load prompts from a file path or standard input ("-").
|
||||
Otherwise you will be dropped into an interactive command prompt (type -h for help.)
|
||||
Other command-line arguments are defaults that can usually be overridden
|
||||
@ -1052,7 +1052,7 @@ def metadata_dumps(opt,
|
||||
Given an Args object, returns a dict containing the keys and
|
||||
structure of the proposed stable diffusion metadata standard
|
||||
https://github.com/lstein/stable-diffusion/discussions/392
|
||||
This is intended to be turned into JSON and stored in the
|
||||
This is intended to be turned into JSON and stored in the
|
||||
"sd
|
||||
'''
|
||||
|
||||
@ -1135,7 +1135,7 @@ def args_from_png(png_file_path) -> list[Args]:
|
||||
meta = ldm.invoke.pngwriter.retrieve_metadata(png_file_path)
|
||||
except AttributeError:
|
||||
return [legacy_metadata_load({},png_file_path)]
|
||||
|
||||
|
||||
try:
|
||||
return metadata_loads(meta)
|
||||
except:
|
||||
@ -1234,4 +1234,4 @@ def legacy_metadata_load(meta,pathname) -> Args:
|
||||
opt.prompt = ''
|
||||
opt.seed = 0
|
||||
return opt
|
||||
|
||||
|
||||
|
@ -119,11 +119,11 @@ class Concepts(object):
|
||||
self.download_concept(concept_name)
|
||||
path = os.path.join(self._concept_path(concept_name), file_name)
|
||||
return path if os.path.exists(path) else None
|
||||
|
||||
|
||||
def concept_is_downloaded(self, concept_name)->bool:
|
||||
concept_directory = self._concept_path(concept_name)
|
||||
return os.path.exists(concept_directory)
|
||||
|
||||
|
||||
def download_concept(self,concept_name)->bool:
|
||||
repo_id = self._concept_id(concept_name)
|
||||
dest = self._concept_path(concept_name)
|
||||
@ -136,7 +136,7 @@ class Concepts(object):
|
||||
|
||||
os.makedirs(dest, exist_ok=True)
|
||||
succeeded = True
|
||||
|
||||
|
||||
bytes = 0
|
||||
def tally_download_size(chunk, size, total):
|
||||
nonlocal bytes
|
||||
|
@ -21,7 +21,7 @@ class Embiggen(Generator):
|
||||
def generate(self,prompt,iterations=1,seed=None,
|
||||
image_callback=None, step_callback=None,
|
||||
**kwargs):
|
||||
|
||||
|
||||
scope = choose_autocast(self.precision)
|
||||
make_image = self.get_make_image(
|
||||
prompt,
|
||||
@ -39,7 +39,7 @@ class Embiggen(Generator):
|
||||
results.append([image, seed])
|
||||
if image_callback is not None:
|
||||
image_callback(image, seed, prompt_in=prompt)
|
||||
seed = self.new_seed()
|
||||
seed = self.new_seed()
|
||||
return results
|
||||
|
||||
@torch.no_grad()
|
||||
@ -179,9 +179,9 @@ class Embiggen(Generator):
|
||||
# Clamp values to max 255
|
||||
if distanceToLR > 255:
|
||||
distanceToLR = 255
|
||||
#Place the pixel as invert of distance
|
||||
#Place the pixel as invert of distance
|
||||
agradientC.putpixel((x, y), round(255 - distanceToLR))
|
||||
|
||||
|
||||
# Create alternative asymmetric diagonal corner to use on "tailing" intersections to prevent hard edges
|
||||
# Fits for a left-fading gradient on the bottom side and full opacity on the right side.
|
||||
agradientAsymC = Image.new('L', (256, 256))
|
||||
|
@ -62,7 +62,7 @@ class Omnibus(Img2Img,Txt2Img):
|
||||
|
||||
if init_image is not None and mask_image is not None: # inpainting
|
||||
masked_image = init_image * (1 - mask_image) # masked image is the image masked by mask - masked regions zero
|
||||
|
||||
|
||||
elif init_image is not None: # img2img
|
||||
scope = choose_autocast(self.precision)
|
||||
|
||||
@ -99,7 +99,7 @@ class Omnibus(Img2Img,Txt2Img):
|
||||
device=model.device,
|
||||
num_samples=num_samples,
|
||||
)
|
||||
|
||||
|
||||
c = model.cond_stage_model.encode(batch["txt"])
|
||||
c_cat = list()
|
||||
for ck in model.concat_keys:
|
||||
@ -164,10 +164,10 @@ class Omnibus(Img2Img,Txt2Img):
|
||||
|
||||
def sample_to_image(self, samples)->Image.Image:
|
||||
gen_result = super().sample_to_image(samples).convert('RGB')
|
||||
|
||||
|
||||
if self.pil_image is None or self.pil_mask is None:
|
||||
return gen_result
|
||||
|
||||
corrected_result = super(Img2Img, self).repaste_and_color_correct(gen_result, self.pil_image, self.pil_mask, self.mask_blur_radius)
|
||||
|
||||
|
||||
return corrected_result
|
||||
|
@ -9,8 +9,8 @@ class InitImageResizer():
|
||||
def resize(self,width=None,height=None) -> Image:
|
||||
"""
|
||||
Return a copy of the image resized to fit within
|
||||
a box width x height. The aspect ratio is
|
||||
maintained. If neither width nor height are provided,
|
||||
a box width x height. The aspect ratio is
|
||||
maintained. If neither width nor height are provided,
|
||||
then returns a copy of the original image. If one or the other is
|
||||
provided, then the other will be calculated from the
|
||||
aspect ratio.
|
||||
@ -19,7 +19,7 @@ class InitImageResizer():
|
||||
that it can be passed to img2img()
|
||||
"""
|
||||
im = self.image
|
||||
|
||||
|
||||
ar = im.width/float(im.height)
|
||||
|
||||
# Infer missing values from aspect ratio
|
||||
@ -44,7 +44,7 @@ class InitImageResizer():
|
||||
# no resize necessary, but return a copy
|
||||
if im.width == width and im.height == height:
|
||||
return im.copy()
|
||||
|
||||
|
||||
# otherwise resize the original image so that it fits inside the bounding box
|
||||
resized_image = self.image.resize((rw,rh),resample=Image.Resampling.LANCZOS)
|
||||
return resized_image
|
||||
|
@ -1,5 +1,5 @@
|
||||
'''
|
||||
Manage a cache of Stable Diffusion model files for fast switching.
|
||||
Manage a cache of Stable Diffusion model files for fast switching.
|
||||
They are moved between GPU and CPU as necessary. If CPU memory falls
|
||||
below a preset minimum, the least recently used model will be
|
||||
cleared and loaded from disk when next needed.
|
||||
@ -51,7 +51,7 @@ class ModelCache(object):
|
||||
identifier.
|
||||
'''
|
||||
return model_name in self.config
|
||||
|
||||
|
||||
def get_model(self, model_name:str):
|
||||
'''
|
||||
Given a model named identified in models.yaml, return
|
||||
@ -66,7 +66,7 @@ class ModelCache(object):
|
||||
if model_name not in self.models: # make room for a new one
|
||||
self._make_cache_room()
|
||||
self.offload_model(self.current_model)
|
||||
|
||||
|
||||
if model_name in self.models:
|
||||
requested_model = self.models[model_name]['model']
|
||||
print(f'>> Retrieving model {model_name} from system RAM cache')
|
||||
@ -92,7 +92,7 @@ class ModelCache(object):
|
||||
print(f'** restoring {self.current_model}')
|
||||
self.get_model(self.current_model)
|
||||
return
|
||||
|
||||
|
||||
self.current_model = model_name
|
||||
self._push_newest_model(model_name)
|
||||
return {
|
||||
@ -191,7 +191,7 @@ class ModelCache(object):
|
||||
omega[model_name] = config
|
||||
if clobber:
|
||||
self._invalidate_cached_model(model_name)
|
||||
|
||||
|
||||
def _load_model(self, model_name:str):
|
||||
"""Load and initialize the model from configuration variables passed at object creation time"""
|
||||
if model_name not in self.config:
|
||||
@ -254,7 +254,7 @@ class ModelCache(object):
|
||||
model.to(self.device)
|
||||
# model.to doesn't change the cond_stage_model.device used to move the tokenizer output, so set it here
|
||||
model.cond_stage_model.device = self.device
|
||||
|
||||
|
||||
model.eval()
|
||||
|
||||
for module in model.modules():
|
||||
@ -274,7 +274,7 @@ class ModelCache(object):
|
||||
)
|
||||
|
||||
return model, width, height, model_hash
|
||||
|
||||
|
||||
def offload_model(self, model_name:str) -> None:
|
||||
'''
|
||||
Offload the indicated model to CPU. Will call
|
||||
@ -290,7 +290,7 @@ class ModelCache(object):
|
||||
gc.collect()
|
||||
if self._has_cuda():
|
||||
torch.cuda.empty_cache()
|
||||
|
||||
|
||||
def scan_model(self, model_name, checkpoint):
|
||||
# scan model
|
||||
print(f'>> Scanning Model: {model_name}')
|
||||
@ -320,7 +320,7 @@ class ModelCache(object):
|
||||
if least_recent_model is not None:
|
||||
del self.models[least_recent_model]
|
||||
gc.collect()
|
||||
|
||||
|
||||
def print_vram_usage(self) -> None:
|
||||
if self._has_cuda:
|
||||
print('>> Current VRAM usage: ','%4.2fG' % (torch.cuda.memory_allocated() / 1e9))
|
||||
@ -355,12 +355,12 @@ class ModelCache(object):
|
||||
if model_name in self.stack:
|
||||
self.stack.remove(model_name)
|
||||
self.models.pop(model_name,None)
|
||||
|
||||
|
||||
def _model_to_cpu(self,model):
|
||||
if self.device != 'cpu':
|
||||
model.cond_stage_model.device = 'cpu'
|
||||
model.first_stage_model.to('cpu')
|
||||
model.cond_stage_model.to('cpu')
|
||||
model.cond_stage_model.to('cpu')
|
||||
model.model.to('cpu')
|
||||
return model.to('cpu')
|
||||
else:
|
||||
@ -390,7 +390,7 @@ class ModelCache(object):
|
||||
with contextlib.suppress(ValueError):
|
||||
self.stack.remove(model_name)
|
||||
self.stack.append(model_name)
|
||||
|
||||
|
||||
def _has_cuda(self) -> bool:
|
||||
return self.device.type == 'cuda'
|
||||
|
||||
|
@ -10,7 +10,7 @@ class Restoration():
|
||||
else:
|
||||
print('>> GFPGAN Disabled')
|
||||
gfpgan = None
|
||||
|
||||
|
||||
# Load CodeFormer
|
||||
codeformer = self.load_codeformer()
|
||||
if codeformer.codeformer_model_exists:
|
||||
@ -18,7 +18,7 @@ class Restoration():
|
||||
else:
|
||||
print('>> CodeFormer Disabled')
|
||||
codeformer = None
|
||||
|
||||
|
||||
return gfpgan, codeformer
|
||||
|
||||
# Face Restore Models
|
||||
|
@ -14,7 +14,7 @@ class CodeFormerRestoration():
|
||||
|
||||
if not os.path.isabs(codeformer_dir):
|
||||
codeformer_dir = os.path.join(Globals.root, codeformer_dir)
|
||||
|
||||
|
||||
self.model_path = os.path.join(codeformer_dir, codeformer_model_path)
|
||||
self.codeformer_model_exists = os.path.isfile(self.model_path)
|
||||
|
||||
@ -35,9 +35,9 @@ class CodeFormerRestoration():
|
||||
from ldm.invoke.restoration.codeformer_arch import CodeFormer
|
||||
from torchvision.transforms.functional import normalize
|
||||
from PIL import Image
|
||||
|
||||
|
||||
cf_class = CodeFormer
|
||||
|
||||
|
||||
cf = cf_class(
|
||||
dim_embd=512,
|
||||
codebook_size=1024,
|
||||
|
@ -119,7 +119,7 @@ class TransformerSALayer(nn.Module):
|
||||
tgt_mask: Optional[Tensor] = None,
|
||||
tgt_key_padding_mask: Optional[Tensor] = None,
|
||||
query_pos: Optional[Tensor] = None):
|
||||
|
||||
|
||||
# self attention
|
||||
tgt2 = self.norm1(tgt)
|
||||
q = k = self.with_pos_embed(tgt2, query_pos)
|
||||
@ -159,7 +159,7 @@ class Fuse_sft_block(nn.Module):
|
||||
|
||||
@ARCH_REGISTRY.register()
|
||||
class CodeFormer(VQAutoEncoder):
|
||||
def __init__(self, dim_embd=512, n_head=8, n_layers=9,
|
||||
def __init__(self, dim_embd=512, n_head=8, n_layers=9,
|
||||
codebook_size=1024, latent_size=256,
|
||||
connect_list=['32', '64', '128', '256'],
|
||||
fix_modules=['quantize','generator']):
|
||||
@ -179,14 +179,14 @@ class CodeFormer(VQAutoEncoder):
|
||||
self.feat_emb = nn.Linear(256, self.dim_embd)
|
||||
|
||||
# transformer
|
||||
self.ft_layers = nn.Sequential(*[TransformerSALayer(embed_dim=dim_embd, nhead=n_head, dim_mlp=self.dim_mlp, dropout=0.0)
|
||||
self.ft_layers = nn.Sequential(*[TransformerSALayer(embed_dim=dim_embd, nhead=n_head, dim_mlp=self.dim_mlp, dropout=0.0)
|
||||
for _ in range(self.n_layers)])
|
||||
|
||||
# logits_predict head
|
||||
self.idx_pred_layer = nn.Sequential(
|
||||
nn.LayerNorm(dim_embd),
|
||||
nn.Linear(dim_embd, codebook_size, bias=False))
|
||||
|
||||
|
||||
self.channels = {
|
||||
'16': 512,
|
||||
'32': 256,
|
||||
@ -221,7 +221,7 @@ class CodeFormer(VQAutoEncoder):
|
||||
enc_feat_dict = {}
|
||||
out_list = [self.fuse_encoder_block[f_size] for f_size in self.connect_list]
|
||||
for i, block in enumerate(self.encoder.blocks):
|
||||
x = block(x)
|
||||
x = block(x)
|
||||
if i in out_list:
|
||||
enc_feat_dict[str(x.shape[-1])] = x.clone()
|
||||
|
||||
@ -266,7 +266,7 @@ class CodeFormer(VQAutoEncoder):
|
||||
fuse_list = [self.fuse_generator_block[f_size] for f_size in self.connect_list]
|
||||
|
||||
for i, block in enumerate(self.generator.blocks):
|
||||
x = block(x)
|
||||
x = block(x)
|
||||
if i in fuse_list: # fuse after i-th block
|
||||
f_size = str(x.shape[-1])
|
||||
if w>0:
|
||||
|
@ -13,19 +13,19 @@ class GFPGAN():
|
||||
self,
|
||||
gfpgan_model_path='models/gfpgan/GFPGANv1.4.pth'
|
||||
) -> None:
|
||||
|
||||
|
||||
if not os.path.isabs(gfpgan_model_path):
|
||||
gfpgan_model_path=os.path.abspath(os.path.join(Globals.root,gfpgan_model_path))
|
||||
self.model_path = gfpgan_model_path
|
||||
self.gfpgan_model_exists = os.path.isfile(self.model_path)
|
||||
|
||||
|
||||
if not self.gfpgan_model_exists:
|
||||
print('## NOT FOUND: GFPGAN model not found at ' + self.model_path)
|
||||
return None
|
||||
|
||||
|
||||
def model_exists(self):
|
||||
return os.path.isfile(self.model_path)
|
||||
|
||||
|
||||
def process(self, image, strength: float, seed: str = None):
|
||||
if seed is not None:
|
||||
print(f'>> GFPGAN - Restoring Faces for image seed:{seed}')
|
||||
|
@ -51,7 +51,7 @@ class Outcrop(object):
|
||||
color_match = True,
|
||||
force_outpaint = True, # this just stops the warning about erased regions
|
||||
)
|
||||
|
||||
|
||||
# swap sampler back
|
||||
self.generate.sampler = curr_sampler
|
||||
return result
|
||||
|
@ -16,7 +16,7 @@ class Outpaint(object):
|
||||
def wrapped_callback(img,seed,**kwargs):
|
||||
image_callback(img,seed,use_prefix=prefix,**kwargs)
|
||||
|
||||
|
||||
|
||||
return self.generate.prompt2image(
|
||||
prompt,
|
||||
seed = seed,
|
||||
|
@ -67,7 +67,7 @@ class ESRGAN():
|
||||
|
||||
# REALSRGAN expects a BGR np array; make array and flip channels
|
||||
bgr_image_array = np.array(image, dtype=np.uint8)[...,::-1]
|
||||
|
||||
|
||||
output, _ = upsampler.enhance(
|
||||
bgr_image_array,
|
||||
outscale=upsampler_scale,
|
||||
|
@ -13,7 +13,7 @@ from basicsr.utils.registry import ARCH_REGISTRY
|
||||
|
||||
def normalize(in_channels):
|
||||
return torch.nn.GroupNorm(num_groups=32, num_channels=in_channels, eps=1e-6, affine=True)
|
||||
|
||||
|
||||
|
||||
@torch.jit.script
|
||||
def swish(x):
|
||||
@ -210,15 +210,15 @@ class AttnBlock(nn.Module):
|
||||
# compute attention
|
||||
b, c, h, w = q.shape
|
||||
q = q.reshape(b, c, h*w)
|
||||
q = q.permute(0, 2, 1)
|
||||
q = q.permute(0, 2, 1)
|
||||
k = k.reshape(b, c, h*w)
|
||||
w_ = torch.bmm(q, k)
|
||||
w_ = torch.bmm(q, k)
|
||||
w_ = w_ * (int(c)**(-0.5))
|
||||
w_ = F.softmax(w_, dim=2)
|
||||
|
||||
# attend to values
|
||||
v = v.reshape(b, c, h*w)
|
||||
w_ = w_.permute(0, 2, 1)
|
||||
w_ = w_.permute(0, 2, 1)
|
||||
h_ = torch.bmm(v, w_)
|
||||
h_ = h_.reshape(b, c, h, w)
|
||||
|
||||
@ -270,18 +270,18 @@ class Encoder(nn.Module):
|
||||
def forward(self, x):
|
||||
for block in self.blocks:
|
||||
x = block(x)
|
||||
|
||||
|
||||
return x
|
||||
|
||||
|
||||
class Generator(nn.Module):
|
||||
def __init__(self, nf, emb_dim, ch_mult, res_blocks, img_size, attn_resolutions):
|
||||
super().__init__()
|
||||
self.nf = nf
|
||||
self.ch_mult = ch_mult
|
||||
self.nf = nf
|
||||
self.ch_mult = ch_mult
|
||||
self.num_resolutions = len(self.ch_mult)
|
||||
self.num_res_blocks = res_blocks
|
||||
self.resolution = img_size
|
||||
self.resolution = img_size
|
||||
self.attn_resolutions = attn_resolutions
|
||||
self.in_channels = emb_dim
|
||||
self.out_channels = 3
|
||||
@ -315,24 +315,24 @@ class Generator(nn.Module):
|
||||
blocks.append(nn.Conv2d(block_in_ch, self.out_channels, kernel_size=3, stride=1, padding=1))
|
||||
|
||||
self.blocks = nn.ModuleList(blocks)
|
||||
|
||||
|
||||
|
||||
def forward(self, x):
|
||||
for block in self.blocks:
|
||||
x = block(x)
|
||||
|
||||
|
||||
return x
|
||||
|
||||
|
||||
|
||||
@ARCH_REGISTRY.register()
|
||||
class VQAutoEncoder(nn.Module):
|
||||
def __init__(self, img_size, nf, ch_mult, quantizer="nearest", res_blocks=2, attn_resolutions=[16], codebook_size=1024, emb_dim=256,
|
||||
beta=0.25, gumbel_straight_through=False, gumbel_kl_weight=1e-8, model_path=None):
|
||||
super().__init__()
|
||||
logger = get_root_logger()
|
||||
self.in_channels = 3
|
||||
self.nf = nf
|
||||
self.n_blocks = res_blocks
|
||||
self.in_channels = 3
|
||||
self.nf = nf
|
||||
self.n_blocks = res_blocks
|
||||
self.codebook_size = codebook_size
|
||||
self.embed_dim = emb_dim
|
||||
self.ch_mult = ch_mult
|
||||
@ -363,11 +363,11 @@ class VQAutoEncoder(nn.Module):
|
||||
self.kl_weight
|
||||
)
|
||||
self.generator = Generator(
|
||||
self.nf,
|
||||
self.nf,
|
||||
self.embed_dim,
|
||||
self.ch_mult,
|
||||
self.n_blocks,
|
||||
self.resolution,
|
||||
self.ch_mult,
|
||||
self.n_blocks,
|
||||
self.resolution,
|
||||
self.attn_resolutions
|
||||
)
|
||||
|
||||
@ -432,4 +432,4 @@ class VQGANDiscriminator(nn.Module):
|
||||
raise ValueError(f'Wrong params!')
|
||||
|
||||
def forward(self, x):
|
||||
return self.main(x)
|
||||
return self.main(x)
|
||||
|
@ -1,5 +1,5 @@
|
||||
import torch.nn as nn
|
||||
|
||||
|
||||
def _conv_forward_asymmetric(self, input, weight, bias):
|
||||
"""
|
||||
Patch for Conv2d._conv_forward that supports asymmetric padding
|
||||
@ -27,4 +27,4 @@ def configure_model_padding(model, seamless, seamless_axes):
|
||||
if hasattr(m, 'asymmetric_padding_mode'):
|
||||
del m.asymmetric_padding_mode
|
||||
if hasattr(m, 'asymmetric_padding'):
|
||||
del m.asymmetric_padding
|
||||
del m.asymmetric_padding
|
||||
|
@ -61,7 +61,7 @@ def build_opt(post_data, seed, gfpgan_model_exists):
|
||||
broken = True
|
||||
break
|
||||
opt.with_variations.append([seed, weight])
|
||||
|
||||
|
||||
if broken:
|
||||
raise CanceledException
|
||||
|
||||
@ -99,7 +99,7 @@ class DreamServer(BaseHTTPRequestHandler):
|
||||
self.send_header("Content-type", "application/json")
|
||||
self.end_headers()
|
||||
output = []
|
||||
|
||||
|
||||
log_file = os.path.join(self.outdir, "legacy_web_log.txt")
|
||||
if os.path.exists(log_file):
|
||||
with open(log_file, "r") as log:
|
||||
|
@ -45,7 +45,7 @@ def build_opt(post_data, seed, gfpgan_model_exists):
|
||||
broken = True
|
||||
break
|
||||
opt.with_variations.append([seed, weight])
|
||||
|
||||
|
||||
if broken:
|
||||
raise CanceledException
|
||||
|
||||
@ -84,7 +84,7 @@ class DreamServer(BaseHTTPRequestHandler):
|
||||
self.send_header("Content-type", "application/json")
|
||||
self.end_headers()
|
||||
output = []
|
||||
|
||||
|
||||
log_file = os.path.join(self.outdir, "dream_web_log.txt")
|
||||
if os.path.exists(log_file):
|
||||
with open(log_file, "r") as log:
|
||||
|
@ -2,13 +2,13 @@
|
||||
assignment of masks via text prompt using clipseg.
|
||||
|
||||
Here is typical usage:
|
||||
|
||||
|
||||
from ldm.invoke.txt2mask import Txt2Mask, SegmentedGrayscale
|
||||
from PIL import Image
|
||||
|
||||
txt2mask = Txt2Mask(self.device)
|
||||
segmented = txt2mask.segment(Image.open('/path/to/img.png'),'a bagel')
|
||||
|
||||
|
||||
# this will return a grayscale Image of the segmented data
|
||||
grayscale = segmented.to_grayscale()
|
||||
|
||||
@ -45,7 +45,7 @@ class SegmentedGrayscale(object):
|
||||
def __init__(self, image:Image, heatmap:torch.Tensor):
|
||||
self.heatmap = heatmap
|
||||
self.image = image
|
||||
|
||||
|
||||
def to_grayscale(self,invert:bool=False)->Image:
|
||||
return self._rescale(Image.fromarray(np.uint8(255 - self.heatmap * 255 if invert else self.heatmap * 255)))
|
||||
|
||||
|
@ -113,7 +113,7 @@ class Sampler(object):
|
||||
'ddim_sigmas_for_original_num_steps',
|
||||
sigmas_for_original_sampling_steps,
|
||||
)
|
||||
|
||||
|
||||
@torch.no_grad()
|
||||
def stochastic_encode(self, x0, t, use_original_steps=False, noise=None):
|
||||
# fast, but does not allow for exact reconstruction
|
||||
@ -340,7 +340,7 @@ class Sampler(object):
|
||||
x_dec = x_latent
|
||||
x0 = init_latent
|
||||
self.prepare_to_sample(t_enc=total_steps, all_timesteps_count=all_timesteps_count, **kwargs)
|
||||
|
||||
|
||||
for i, step in enumerate(iterator):
|
||||
index = total_steps - i - 1
|
||||
ts = torch.full(
|
||||
@ -373,7 +373,7 @@ class Sampler(object):
|
||||
t_next = ts_next,
|
||||
step_count=len(self.ddim_timesteps)
|
||||
)
|
||||
|
||||
|
||||
x_dec, pred_x0, e_t = outs
|
||||
if img_callback:
|
||||
img_callback(x_dec,i)
|
||||
@ -385,7 +385,7 @@ class Sampler(object):
|
||||
return torch.randn(shape, device=self.device)
|
||||
else:
|
||||
return x_T
|
||||
|
||||
|
||||
def p_sample(
|
||||
self,
|
||||
img,
|
||||
@ -423,10 +423,10 @@ class Sampler(object):
|
||||
timesteps that will be used for sampling, depending on the t_enc in img2img.
|
||||
'''
|
||||
return self.ddim_timesteps[:ddim_steps]
|
||||
|
||||
|
||||
def q_sample(self,x0,ts):
|
||||
'''
|
||||
Returns self.model.q_sample(x0,ts). Is overridden in the k* samplers to
|
||||
Returns self.model.q_sample(x0,ts). Is overridden in the k* samplers to
|
||||
return self.model.inner_model.q_sample(x0,ts)
|
||||
'''
|
||||
return self.model.q_sample(x0,ts)
|
||||
|
@ -220,7 +220,7 @@ class AttnBlock(nn.Module):
|
||||
|
||||
if mem_required > mem_free_total:
|
||||
steps = 2**(math.ceil(math.log(mem_required / mem_free_total, 2)))
|
||||
|
||||
|
||||
slice_size = q.shape[1] // steps if (q.shape[1] % steps) == 0 else q.shape[1]
|
||||
|
||||
else:
|
||||
@ -228,7 +228,7 @@ class AttnBlock(nn.Module):
|
||||
slice_size = 1
|
||||
else:
|
||||
slice_size = min(q.shape[1], math.floor(2**30 / (q.shape[0] * q.shape[1])))
|
||||
|
||||
|
||||
for i in range(0, q.shape[1], slice_size):
|
||||
end = i + slice_size
|
||||
|
||||
|
@ -241,7 +241,7 @@ class EmbeddingManager(nn.Module):
|
||||
# both will be stored in this dictionary
|
||||
for term in self.string_to_param_dict.keys():
|
||||
term = term.strip('<').strip('>')
|
||||
self.concepts_loaded[term] = True
|
||||
self.concepts_loaded[term] = True
|
||||
print(f'>> Current embedding manager terms: {", ".join(self.string_to_param_dict.keys())}')
|
||||
|
||||
def _expand_directories(self, paths:list[str]):
|
||||
|
@ -548,7 +548,7 @@ class WeightedFrozenCLIPEmbedder(FrozenCLIPEmbedder):
|
||||
|
||||
#print(f"assembled tokens for '{fragments}' into tensor of shape {lerped_embeddings.shape}")
|
||||
|
||||
# append to batch
|
||||
# append to batch
|
||||
batch_z = lerped_embeddings.unsqueeze(0) if batch_z is None else torch.cat([batch_z, lerped_embeddings.unsqueeze(0)], dim=1)
|
||||
batch_tokens = tokens.unsqueeze(0) if batch_tokens is None else torch.cat([batch_tokens, tokens.unsqueeze(0)], dim=1)
|
||||
|
||||
|
@ -98,7 +98,7 @@ def _get_paths_from_images(path):
|
||||
|
||||
"""
|
||||
# --------------------------------------------
|
||||
# split large images into small images
|
||||
# split large images into small images
|
||||
# --------------------------------------------
|
||||
"""
|
||||
|
||||
|
@ -221,7 +221,7 @@ def rand_perlin_2d(shape, res, device, fade = lambda t: 6*t**5 - 15*t**4 + 10*t*
|
||||
grid = torch.stack(torch.meshgrid(torch.arange(0, res[0], delta[0]), torch.arange(0, res[1], delta[1]), indexing='ij'), dim = -1).to(device) % 1
|
||||
|
||||
rand_val = torch.rand(res[0]+1, res[1]+1)
|
||||
|
||||
|
||||
angles = 2*math.pi*rand_val
|
||||
gradients = torch.stack((torch.cos(angles), torch.sin(angles)), dim = -1).to(device)
|
||||
|
||||
@ -249,7 +249,7 @@ def ask_user(question: str, answers: list):
|
||||
def debug_image(debug_image, debug_text, debug_show=True, debug_result=False, debug_status=False ):
|
||||
if not debug_status:
|
||||
return
|
||||
|
||||
|
||||
image_copy = debug_image.copy()
|
||||
ImageDraw.Draw(image_copy).text(
|
||||
(5, 5),
|
||||
@ -261,4 +261,4 @@ def debug_image(debug_image, debug_text, debug_show=True, debug_result=False, de
|
||||
image_copy.show()
|
||||
|
||||
if debug_result:
|
||||
return image_copy
|
||||
return image_copy
|
||||
|
4
main.py
@ -474,7 +474,7 @@ class ImageLogger(Callback):
|
||||
self.check_frequency(check_idx)
|
||||
and hasattr( # batch_idx % self.batch_freq == 0
|
||||
pl_module, 'log_images'
|
||||
)
|
||||
)
|
||||
and callable(pl_module.log_images)
|
||||
and self.max_images > 0
|
||||
):
|
||||
@ -868,7 +868,7 @@ if __name__ == '__main__':
|
||||
if hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
|
||||
trainer_opt.accelerator = 'mps'
|
||||
trainer_opt.detect_anomaly = False
|
||||
|
||||
|
||||
trainer = Trainer.from_argparse_args(trainer_opt, **trainer_kwargs)
|
||||
trainer.logdir = logdir ###
|
||||
|
||||
|
@@ -24,7 +24,7 @@ for f in filenames:
     except PermissionError:
         sys.stderr.write(f'{f} could not be opened due to inadequate permissions\n')
         continue
-
+



@@ -398,7 +398,7 @@ SAMPLER_CHOICES = [
 def create_argv_parser():
     parser = argparse.ArgumentParser(
         description="""Generate images using Stable Diffusion.
-        Use --web to launch the web interface.
+        Use --web to launch the web interface.
         Use --from_file to load prompts from a file path or standard input ("-").
         Otherwise you will be dropped into an interactive command prompt (type -h for help.)
         Other command-line arguments are defaults that can usually be overridden
@@ -8,9 +8,9 @@ from functools import partial
 import torch

 def get_placeholder_loop(placeholder_string, embedder, use_bert):
-
+
     new_placeholder = None
-
+
     while True:
         if new_placeholder is None:
             new_placeholder = input(f"Placeholder string {placeholder_string} was already used. Please enter a replacement string: ")
@@ -21,7 +21,7 @@ def get_placeholder_loop(placeholder_string, embedder, use_bert):

         if token is not None:
             return new_placeholder, token
-
+
 def get_clip_token_for_string(tokenizer, string):
     batch_encoding = tokenizer(
         string,
@@ -37,7 +37,7 @@ def get_clip_token_for_string(tokenizer, string):

     if torch.count_nonzero(tokens - 49407) == 2:
         return tokens[0, 1]
-
+
     return None

 def get_bert_token_for_string(tokenizer, string):
@@ -53,16 +53,16 @@ if __name__ == "__main__":
     parser = argparse.ArgumentParser()

     parser.add_argument(
-        "--root_dir",
-        type=str,
+        "--root_dir",
+        type=str,
         default='.',
         help="Path to the InvokeAI install directory containing 'models', 'outputs' and 'configs'."
     )

     parser.add_argument(
-        "--manager_ckpts",
-        type=str,
-        nargs="+",
+        "--manager_ckpts",
+        type=str,
+        nargs="+",
         required=True,
         help="Paths to a set of embedding managers to be merged."
     )
@@ -90,7 +90,7 @@ if __name__ == "__main__":

     EmbeddingManager = partial(EmbeddingManager, embedder, ["*"])

-    string_to_token_dict = {}
+    string_to_token_dict = {}
     string_to_param_dict = torch.nn.ParameterDict()

     placeholder_to_src = {}
@@ -54,7 +54,7 @@ function loadPriorResults() {
         appendOutput(src, seed, metadata, true);
     });
 });
-
+
 // Load until page is full
 if (!priorResultsLoadState.initialized) {
     if (document.body.scrollHeight <= window.innerHeight) {
@@ -9,7 +9,7 @@ function toBase64(file) {

 function appendOutput(src, seed, config) {
     let outputNode = document.createElement("figure");
-
+
     let variations = config.with_variations;
     if (config.variation_amount > 0) {
         variations = (variations ? variations + ',' : '') + seed + ':' + config.variation_amount;