Stable Diffusion AMD Linux ROCm
Stable-Diffusion-AMD-Linux-Low-VRAM
This repository contains instructions on how to host your own AI for image generation using stable diffusion with an 8GB VRAM AMD GPU on Linux. This is an affordable and efficient alternative to using Google Colab, which can be quite expensive.
Watch this YouTube video to learn how to install stable diffusion and make it work on your AMD GPU using ROCm. Please note that each GPU is unique, and the launch parameters required may vary. However, the launch parameters used in the video are as follows:
--no-half --always-batch-cond-uncond --opt-sub-quad-attention --medvram --disable-nan-check
For a complete list of launch parameters, check out the Optimizations wiki.
->---NEW FEATURE SHOWCASE & HOWTO---<-
Notable: Inpainting/Outpainting, Live generation preview, Tiling, Upscaling, <4gb VRAM support, Negative prompts, CLIP
==(Basic) CPU-only guide available Here==
==Japanese guide here 日本語ガイド==
Special thanks to all anons who contributed.
If you want to download my VAE, you can.
Prerequisites
To get started, you'll need a Linux install and an AMD GPU with at least 8GB of VRAM.
Installation
Download the driver for your AMD GPU from AMD's website.
Add yourself to the render and video groups using the following commands:
sudo usermod -a -G render yourusername
sudo usermod -a -G video yourusername
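Group changes only take effect in new login sessions. As a quick check, this sketch inspects the output of `id -nG` (the `user_in_group` helper is hypothetical, not part of any AMD tooling):

```shell
# user_in_group: hypothetical helper -- check whether a space-separated
# group list (as printed by `id -nG`) contains a given group name.
user_in_group() {
  echo "$1" | tr ' ' '\n' | grep -qx "$2"
}

groups_list="$(id -nG)"
if user_in_group "$groups_list" render && user_in_group "$groups_list" video; then
  echo "render/video group membership is active"
else
  echo "log out and back in (or reboot) so the new groups take effect"
fi
```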
Confirm that you have Python 3 installed by typing the following command into the terminal:
python3 --version
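If you prefer to check the version programmatically, here is a small sketch (the guide later installs Python 3.10.6; the 3.8 cutoff below matches the oldest version the venv steps in this guide mention):

```shell
# Print the Python version and flag anything older than 3.8.
python3 - <<'EOF'
import sys
ver = ".".join(map(str, sys.version_info[:3]))
print("Python", ver, "- OK" if sys.version_info >= (3, 8) else "- too old, upgrade first")
EOF
```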
Install ROCm by running the following command:
sudo amdgpu-install --usecase=rocm --no-dkms
Reboot your system using the following command:
sudo reboot
After rebooting, confirm that your GPU is recognized by running the following command:
rocminfo
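rocminfo prints a long report covering every agent (CPU and GPU). A shorter check is to filter for the gfx target names, which identify the GPU architecture; this is a sketch, and `list_gfx_targets` is just a hypothetical helper name:

```shell
# list_gfx_targets: hypothetical helper -- pull the unique gfx architecture
# names (e.g. gfx1030 on RX 6800/6900 class cards) out of rocminfo's report.
list_gfx_targets() {
  rocminfo | grep -oE 'gfx[0-9a-f]+' | sort -u
}
# Usage: list_gfx_targets
# If nothing is printed, the GPU was not picked up by ROCm.
```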
Clone the stable diffusion GUI repository:
sudo apt-get install git
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
OR just download the WebUI repo .zip HERE and extract it.
(git clone is easier to update, since you can just type git pull in the repo directory with CMD or Git Bash.)
Note: In this guide, /stable-diffusion-webui and /stable-diffusion-webui-master refer to the same folder; the .zip version is just called master.
Step 3: Download the 1.4 AI model from huggingface (requires signup) or HERE
(torrent magnet)
==(NEW 9/7)== Alternate 1.4 Waifu model trained on an additional 56k Danbooru images HERE (mirror)
(torrent magnet)
(Note: several GB larger than the normal model, see instructions below for pruning)
comparison
Step 4: Rename your .ckpt file to "model.ckpt", and place it in stable-diffusion-webui-master
Step 5 (Optional): Download and place GFPGANv1.3.pth into the master webUI directory
(GFPGAN automatically corrects realistic faces)
Step 6: Install Python 3.10.6 (page)
Step 7: Run webui-user.bat from Windows Explorer. Run it as normal user, not as administrator.
==Usage:==
!!! info RUNNING ON 4GB (And lower!)
==These parameters are also useful for regular users who want to make larger images or batch sizes!==
It is possible to drastically reduce VRAM usage with some modifications:
Step 1: Edit webui-user.bat
Step 2: After COMMANDLINE_ARGS= , enter your desired parameters:
Example: COMMANDLINE_ARGS=--medvram --opt-split-attention
If you have Python 3.8 installed, make sure you have VENV capabilities by running the following command (replace with your Python version if necessary):
apt install python3.8-venv
Install pip3 and wheel, and update them using the following commands:
sudo apt install python3-pip
python -m pip install --upgrade pip wheel
Download any stable diffusion model you like and put it in the models/Stable-diffusion folder. You can find models at CIViTAI, which is also a great source of prompts.
For better performance, upgrade to the latest stable kernel by running the following commands:
sudo apt-get update
sudo apt-get dist-upgrade
sudo reboot
cd stable-diffusion-webui
python -m venv venv
source venv/bin/activate
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/rocm5.4.2
Note: This is to be installed in the VENV, not on the OS!
pip list | grep 'torch'
The output should show Torch, torchvision, and torchaudio version numbers with ROCM tagged at the end.
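One extra sanity check worth doing inside the venv: ROCm wheels carry a "+rocm" suffix in their version string (e.g. 1.13.1+rocm5.4.2), so you can test for it directly. This is a sketch; `is_rocm_torch` is a hypothetical helper:

```shell
# is_rocm_torch: hypothetical helper -- true if a torch version string
# comes from a ROCm wheel (those carry a "+rocmX.Y.Z" local version tag).
is_rocm_torch() {
  case "$1" in
    *+rocm*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Query the installed torch; "|| true" keeps the script going if it is missing.
torch_ver="$(python -c 'import torch; print(torch.__version__)' 2>/dev/null || true)"
if is_rocm_torch "$torch_ver"; then
  echo "ROCm build of torch detected: $torch_ver"
else
  echo "warning: torch missing or not a ROCm build: '$torch_ver'"
fi
```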

Optimize VRAM usage with the --medvram and --lowvram launch arguments. Use --always-batch-cond-uncond with the --lowvram and --medvram options to prevent bad quality. If your results turn out to be black images, your card probably does not support float16, so use --precision full (at the cost of more VRAM).
If you have 4GB VRAM and want to make larger images, or you get an out of memory error with --medvram,
use --medvram --opt-split-attention instead.
If you have 4GB VRAM and you still get an out of memory error,
use --lowvram --always-batch-cond-uncond --opt-split-attention instead.
If you have 2GB VRAM or 4GB VRAM and want to make images larger (but slower) than you can with --medvram,
use --lowvram --opt-split-attention.
If you have ==more VRAM== and want to make larger images than you can usually make,
just use --opt-split-attention.
(Generation will be moderately slower but some swear on these parameters)
-Otherwise, do not use any of these-
==NOTES:==
You can use --precision full in addition to other flags, and the model will take much more space in VRAM.
Benchmark different options. The options that enabled generating nice images (1024x1024) upscaled to 4K were:
--no-half --always-batch-cond-uncond --opt-sub-quad-attention --medvram --disable-nan-check
Launch the GUI with the following command:
python launch.py --opt-sub-quad-attention --medvram --disable-nan-check --always-batch-cond-uncond --no-half
Note: These options may differ for other graphics card models.
The time taken to generate 1024x1024 img2img was 1m 16.33s, and 1024x1024 hires fix was 1m 39s. Generating base images takes 10-20s.
->==-----TROUBLESHOOTING-----==<-
If something does not work, check the Torch, torchvision, and torchaudio versions:
pip list | grep 'torch'
The version numbers should have ROCM tagged at the end.
If you feel you broke something and want to reinstall from scratch, delete these directories: venv, repositories.
->==-----TIPS-----==<-
Hover over UI elements for informative tooltips
The outpainting script requires HIGH DENOISING to work properly (eg. 0.9)
Outpainting benefits from high steps
Usage after install
Saving a prompt as a style allows you to select it later from the styles dropdown. Prompts can be deleted by accessing styles.csv.
(This can be helpful if you find a combination that generates really good images and want to repeatedly use it with varied subjects.)
Every time you want to use stable diffusion, run the following commands:
cd stable-diffusion-webui
python -m venv venv
source venv/bin/activate
python launch.py --opt-sub-quad-attention --medvram --disable-nan-check --always-batch-cond-uncond --no-half
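The usage commands above can be wrapped into one small launcher. This is a sketch: `launch_webui` is a hypothetical helper, it assumes the venv already exists from the install steps, and the flags are the ones used in this guide (they may need adjusting for your card):

```shell
# launch_webui: hypothetical wrapper -- activate the venv in the given
# checkout and start the WebUI with this guide's launch flags.
launch_webui() {
  cd "$1" || { echo "no such directory: $1" >&2; return 1; }
  . venv/bin/activate
  python launch.py --opt-sub-quad-attention --medvram --disable-nan-check \
    --always-batch-cond-uncond --no-half
}
# Usage: launch_webui "$HOME/stable-diffusion-webui"
```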
Watch out for VRAM usage and system temps with the following commands:
sudo radeontop
watch -n 1 sensors
Adjust your fan curve if your temps are too high (above 70C).
In this youtube video, I show the process of generating images with high details and a 4K end size. Using the same keywords as a generated image in img2img produces interesting variants.
->==-----Launching Different Models-----==<-
If you have multiple models installed and you want to switch between them conveniently, you can make another .bat with the model specified after COMMANDLINE_ARGS=--ckpt
Example: COMMANDLINE_ARGS=--ckpt wd-v1-2-full-emma.ckpt
->==-----Pruning Waifu .ckpt-----==<-
The Waifu diffusion model is normally 7GB due to redundant training data,
but it can be reduced to 3.6GB without any loss of quality, reducing RAM usage and loading time drastically.
python prune.py
->==-----Changing UI Defaults-----==<-
You can change the WebUI's default settings by editing ui-config.json.
->==-----Auto-update-----==<-
Note: This only applies to those who used git clone to install the repo, and not those who used the .zip
You can set your script to automatically update by editing webui-user.bat
Add git pull one line above call webui.bat
Save
->==-----Running Online-----==<-
->==-----Enabling Negative Prompts-----==<-
Negative prompts are a powerful tool to remove unwanted features and elements from your generations
They should be enabled by default, but if not:
COMMANDLINE_ARGS=--show-negative-prompt
!!! info RUNNING ON WINDOWS 7/CONDA
==(You can also try this method if the traditional install isn't working)==
Windows 7 does not allow for directly installing the version of Python recommended in this guide on its own.
However, it does allow for installing the latest versions of Python within Conda:
Follow all the same steps from the main guide, up to Step 5
Download Miniconda HERE. Download Miniconda 3
Install Miniconda in the default location. Install for all users.
Uncheck "Register Miniconda as the system Python 3.9" unless you want to
Open Anaconda Prompt (miniconda3)
In Miniconda, navigate to the /stable-diffusion-webui-master folder wherever you downloaded it, using "cd" to jump folders.
(Or just type "cd" followed by a space, and then drag the folder into the Anaconda prompt.)
Type the following commands to make an environment and install the necessary dependencies:
conda create --name qwe
(You can name it whatever you want instead of qwe)
conda activate qwe
conda install python
conda install git
webui.bat
(Note: it may seem like it's stuck on "Installing torch" in the beginning. This is normal and should take 10-15 minutes)
==It should now be ready to use==
==NOTE:== On Windows 7, you may get "api-ms-win-core-path-l1-1-0.dll is missing" in Conda.
This is because new versions of Python and programs that rely on Python (Like Blender, etc.) require a system file only present in newer versions of Windows
Luckily, it has been backported to be compatible with W7, and can be downloaded HERE (Github page)
Unzip and copy the x86 .dll into C:\Windows\SysWOW64 and the x64 .dll into C:\Windows\System32 and reboot, then Python should install successfully
RUNNING:
cd /stable-diffusion-webui-master
conda activate qwe
webui-user.bat
!!! info EXTRAS
--OLD MODEL--
The original v1.3 leaked model from June can be downloaded here:
https://drinkordiecdn.lol/sd-v1-3-full-ema.ckpt
Backup Download: https://download1980.mediafire.com/3nu6nlhy92ag/wnlyj8vikn2kpzn/sd-v1-3-full-ema.ckpt
Torrent Magnet: https://rentry.co/6gocs
--OLD GUIDE--
The original hlky guide (replaced as of 9/8/22) is here: https://rentry.org/GUItard
Japanese hlky guide https://news.livedoor.com/article/detail/22794512/
The original guide (replaced as of 8/25/22) is here: https://rentry.org/kretard
->APPROXIMATE RENDER TIME BY GPU (50 steps)<-
->SAMPLER COMPARISON<-