Stable Diffusion AMD Linux ROCm
Stable-Diffusion-AMD-Linux-Low-VRAM
This repository contains instructions on how to host your own AI for image generation using stable diffusion with an 8GB VRAM AMD GPU on Linux. This is an affordable and efficient alternative to using Google Colab, which can be quite expensive.
Watch this YouTube video to learn how to install stable diffusion and make it work on your AMD GPU using ROCm. Please note that each GPU is unique, and the launch parameters required may vary. However, the launch parameters used in the video are as follows:
--no-half --always-batch-cond-uncond --opt-sub-quad-attention --medvram --disable-nan-check
For a complete list of launch parameters, check out the Optimizations wiki.
->---NEW FEATURE SHOWCASE & HOWTO---<-
Notable: Inpainting/Outpainting, Live generation preview, Tiling, Upscaling, <4gb VRAM support, Negative prompts, CLIP
==(Basic) CPU-only guide available Here==
==Japanese guide here 日本語ガイド==
Special thanks to all anons who contributed.
If you want to download my VAE, you can.
Prerequisites
To get started, you'll need a Linux install and an AMD GPU with at least 8GB of VRAM.
Installation
Download the driver for your AMD GPU from AMD's website.
Add yourself to the render and video groups using the following commands:
sudo usermod -a -G render yourusername
sudo usermod -a -G video yourusername
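Group changes only take effect in new login sessions. As a quick check, this sketch inspects the output of `id -nG` (the `user_in_group` helper is hypothetical, not part of any AMD tooling):

```shell
# user_in_group: hypothetical helper -- check whether a space-separated
# group list (as printed by `id -nG`) contains a given group name.
user_in_group() {
  echo "$1" | tr ' ' '\n' | grep -qx "$2"
}

groups_list="$(id -nG)"
if user_in_group "$groups_list" render && user_in_group "$groups_list" video; then
  echo "render/video group membership is active"
else
  echo "log out and back in (or reboot) so the new groups take effect"
fi
```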
Confirm that you have Python 3 installed by typing the following command into the terminal:
python3 --version
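If you prefer to check the version programmatically, here is a small sketch (the guide later installs Python 3.10.6; the 3.8 cutoff below matches the oldest version the venv steps in this guide mention):

```shell
# Print the Python version and flag anything older than 3.8.
python3 - <<'EOF'
import sys
ver = ".".join(map(str, sys.version_info[:3]))
print("Python", ver, "- OK" if sys.version_info >= (3, 8) else "- too old, upgrade first")
EOF
```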
Install ROCm by running the following command:
sudo amdgpu-install --usecase=rocm --no-dkms
Reboot your system using the following command:
sudo reboot
After rebooting, confirm that your GPU is recognized by running the following command:
rocminfo
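rocminfo prints a long report covering every agent (CPU and GPU). A shorter check is to filter for the gfx target names, which identify the GPU architecture; this is a sketch, and `list_gfx_targets` is just a hypothetical helper name:

```shell
# list_gfx_targets: hypothetical helper -- pull the unique gfx architecture
# names (e.g. gfx1030 on RX 6800/6900 class cards) out of rocminfo's report.
list_gfx_targets() {
  rocminfo | grep -oE 'gfx[0-9a-f]+' | sort -u
}
# Usage: list_gfx_targets
# If nothing is printed, the GPU was not picked up by ROCm.
```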
Clone the stable diffusion GUI repository:
sudo apt-get install git
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
OR just download the WebUI repo .zip HERE and extract it.
(git clone is easier to update, since you can just type git pull in the repo directory with CMD or Git Bash.)
Note: In this guide, /stable-diffusion-webui and /stable-diffusion-webui-master refer to the same folder; the .zip version is just called master.
Step 3: Download the 1.4 AI model from huggingface (requires signup) or HERE
(torrent magnet)
==(NEW 9/7)== Alternate 1.4 Waifu model trained on an additional 56k Danbooru images HERE (mirror)
(torrent magnet)
(Note: several GB larger than the normal model, see instructions below for pruning)
comparison
Step 4: Rename your .ckpt file to "model.ckpt", and place it in stable-diffusion-webui-master
Step 5 (Optional): Download and place GFPGANv1.3.pth into the master webUI directory
(GFPGAN automatically corrects realistic faces)
Step 6: Install Python 3.10.6 (page)
Step 7: Run webui-user.bat from Windows Explorer. Run it as normal user, not as administrator.
==Usage:==
!!! info RUNNING ON 4GB (And lower!)
==These parameters are also useful for regular users who want to make larger images or batch sizes!==
It is possible to drastically reduce VRAM usage with some modifications:
Step 1: Edit webui-user.bat
Step 2: After COMMANDLINE_ARGS= , enter your desired parameters:
Example: COMMANDLINE_ARGS=--medvram --opt-split-attention
If you have Python 3.8 installed, make sure you have VENV capabilities by running the following command (replace with your Python version if necessary):
apt install python3.8-venv
Install pip3 and wheel, and update them using the following commands:
sudo apt install python3-pip
python -m pip install --upgrade pip wheel
Download any stable diffusion model you like and put it in the models/Stable-diffusion folder. You can find models at CIViTAI, which is also a great source of prompts.
For better performance, upgrade to the latest stable kernel by running the following commands:
sudo apt-get update
sudo apt-get dist-upgrade
sudo reboot
cd stable-diffusion-webui
python -m venv venv
source venv/bin/activate
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.4.2
Note: This is to be installed in the VENV, not on the OS!
pip list | grep 'torch'
The output should show Torch, torchvision, and torchaudio version numbers with ROCM tagged at the end.
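One extra sanity check worth doing inside the venv: ROCm wheels carry a "+rocm" suffix in their version string (e.g. 1.13.1+rocm5.4.2), so you can test for it directly. This is a sketch; `is_rocm_torch` is a hypothetical helper:

```shell
# is_rocm_torch: hypothetical helper -- true if a torch version string
# comes from a ROCm wheel (those carry a "+rocmX.Y.Z" local version tag).
is_rocm_torch() {
  case "$1" in
    *+rocm*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Query the installed torch; "|| true" keeps the script going if it is missing.
torch_ver="$(python -c 'import torch; print(torch.__version__)' 2>/dev/null || true)"
if is_rocm_torch "$torch_ver"; then
  echo "ROCm build of torch detected: $torch_ver"
else
  echo "warning: torch missing or not a ROCm build: '$torch_ver'"
fi
```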

Optimize VRAM usage with the --medvram and --lowvram launch arguments. Use --always-batch-cond-uncond with the --lowvram and --medvram options to prevent bad quality. If your results turn out to be black images, your card probably does not support float16, so use --precision full (at the cost of more VRAM).
If you have 4GB VRAM and want to make larger images, or you get an out of memory error with --medvram,
use --medvram --opt-split-attention instead.
If you have 4GB VRAM and you still get an out of memory error,
use --lowvram --always-batch-cond-uncond --opt-split-attention instead.
If you have 2GB VRAM or 4GB VRAM and want to make images larger (but slower) than you can with --medvram,
use --lowvram --opt-split-attention.
If you have ==more VRAM== and want to make larger images than you can usually make,
just use --opt-split-attention.
(Generation will be moderately slower but some swear on these parameters)
-Otherwise, do not use any of these-
==NOTES:==
You can use --precision full in addition to other flags, and the model will take much more space in VRAM.
Benchmark different options. The options that enabled generating nice images (1024x1024) upscaled to 4K were:
--no-half --always-batch-cond-uncond --opt-sub-quad-attention --medvram --disable-nan-check
Launch the GUI with the following command:
python launch.py --opt-sub-quad-attention --medvram --disable-nan-check --always-batch-cond-uncond --no-half
Note: These options may differ for other graphics card models.
The time taken to generate 1024x1024 img2img was 1m 16.33s, and 1024x1024 hires fix was 1m 39s. Generating base images takes 10-20s.
->==-----TROUBLESHOOTING-----==<-
If something does not work, check the Torch, torchvision, and torchaudio versions:
pip list | grep 'torch'
The version numbers should have ROCM tagged at the end.
If you feel you broke something and want to reinstall from scratch, delete these directories: venv, repositories.
->==-----TIPS-----==<-
Hover over UI elements for informative tooltips
The outpainting script requires HIGH DENOISING to work properly (eg. 0.9)
Outpainting benefits from high steps
Usage after install
Saving a prompt as a style allows you to select it later from the styles dropdown. Prompts can be deleted by accessing styles.csv.
(This can be helpful if you find a combination that generates really good images and want to repeatedly use it with varied subjects.)
Every time you want to use stable diffusion, run the following commands:
cd stable-diffusion-webui
python -m venv venv
source venv/bin/activate
python launch.py --opt-sub-quad-attention --medvram --disable-nan-check --always-batch-cond-uncond --no-half
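The usage commands above can be wrapped into one small launcher. This is a sketch: `launch_webui` is a hypothetical helper, it assumes the venv already exists from the install steps, and the flags are the ones used in this guide (they may need adjusting for your card):

```shell
# launch_webui: hypothetical wrapper -- activate the venv in the given
# checkout and start the WebUI with this guide's launch flags.
launch_webui() {
  cd "$1" || { echo "no such directory: $1" >&2; return 1; }
  . venv/bin/activate
  python launch.py --opt-sub-quad-attention --medvram --disable-nan-check \
    --always-batch-cond-uncond --no-half
}
# Usage: launch_webui "$HOME/stable-diffusion-webui"
```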
Watch out for VRAM usage and system temps with the following commands:
sudo radeontop
watch -n 1 sensors
Adjust your fan curve if your temps are too high (above 70C).
In this youtube video, I show the process of generating images with high details and a 4K end size. Using the same keywords as a generated image in img2img produces interesting variants.
->==-----Launching Different Models-----==<-
If you have multiple models installed and you want to switch between them conveniently, you can make another .bat with the model specified after COMMANDLINE_ARGS=--ckpt
Example: COMMANDLINE_ARGS=--ckpt wd-v1-2-full-emma.ckpt
->==-----Pruning Waifu .ckpt-----==<-
The Waifu diffusion model is normally 7GB due to redundant training data,
but it can be reduced to 3.6GB without any loss of quality, reducing RAM usage and loading time drastically.
python prune.py
->==-----Changing UI Defaults-----==<-
You can change the WebUI's default settings by editing ui-config.json.
->==-----Auto-update-----==<-
Note: This only applies to those who used git clone to install the repo, and not those who used the .zip
You can set your script to automatically update by editing webui-user.bat
Add git pull one line above call webui.bat
Save
->==-----Running Online-----==<-
->==-----Enabling Negative Prompts-----==<-
Negative prompts are a powerful tool to remove unwanted features and elements from your generations
They should be enabled by default, but if not:
COMMANDLINE_ARGS=--show-negative-prompt
!!! info RUNNING ON WINDOWS 7/CONDA
==(You can also try this method if the traditional install isn't working)==
Windows 7 does not allow for directly installing the version of Python recommended in this guide on its own.
However, it does allow for installing the latest versions of Python within Conda:
Follow all the same steps from the main guide, up to Step 5
Download Miniconda HERE. Download Miniconda 3
Install Miniconda in the default location. Install for all users.
Uncheck "Register Miniconda as the system Python 3.9" unless you want to
Open Anaconda Prompt (miniconda3)
In Miniconda, navigate to the /stable-diffusion-webui-master folder wherever you downloaded it, using "cd" to jump folders.
(Or just type "cd" followed by a space, and then drag the folder into the Anaconda prompt.)
Type the following commands to make an environment and install the necessary dependencies:
conda create --name qwe
(You can name it whatever you want instead of qwe)
conda activate qwe
conda install python
conda install git
webui.bat
(Note: it may seem like it's stuck on "Installing torch" in the beginning. This is normal and should take 10-15 minutes)
==It should now be ready to use==
==NOTE:== On Windows 7, you may get "api-ms-win-core-path-l1-1-0.dll is missing" in Conda.
This is because new versions of Python and programs that rely on Python (Like Blender, etc.) require a system file only present in newer versions of Windows
Luckily, it has been backported to be compatible with W7, and can be downloaded HERE (Github page)
Unzip and copy the x86 .dll into C:\Windows\SysWOW64 and the x64 .dll into C:\Windows\System32 and reboot, then Python should install successfully
RUNNING:
cd /stable-diffusion-webui-master
conda activate qwe
webui-user.bat
!!! info EXTRAS
--OLD MODEL--
The original v1.3 leaked model from June can be downloaded here:
https://drinkordiecdn.lol/sd-v1-3-full-ema.ckpt
Backup Download: https://download1980.mediafire.com/3nu6nlhy92ag/wnlyj8vikn2kpzn/sd-v1-3-full-ema.ckpt
Torrent Magnet: https://rentry.co/6gocs
--OLD GUIDE--
The original hlky guide (replaced as of 9/8/22) is here: https://rentry.org/GUItard
Japanese hlky guide https://news.livedoor.com/article/detail/22794512/
The original guide (replaced as of 8/25/22) is here: https://rentry.org/kretard
->APPROXIMATE RENDER TIME BY GPU (50 steps)<-
->SAMPLER COMPARISON<-