Introducing LoRA: A faster way to fine-tune Stable Diffusion

Last year, DreamBooth was released: a way to train Stable Diffusion on your own objects or styles.

A few short months later, Simo Ryu created a new image generation model that applies a technique called LoRA to Stable Diffusion. Similar to DreamBooth, LoRA lets you train Stable Diffusion using just a few images, and it generates new output images with those objects or styles. Unlike DreamBooth, LoRA is fast: while DreamBooth takes around twenty minutes to run and produces models that are several gigabytes, LoRA trains in as little as eight minutes and produces models that are around 5MB.

LoRA stands for Low-Rank Adaptation, a mathematical technique to reduce the number of parameters that are trained. You can think of it like creating a diff of the model, instead of saving the whole thing. LoRA was developed by researchers at Microsoft, and Simo has applied it to Stable Diffusion. Check out the README for Simo’s inference model on GitHub and the paper on arXiv to learn more about how it works.
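To make that concrete: instead of updating a full weight matrix during fine-tuning, LoRA freezes it and trains two small low-rank matrices whose product is added on top. Here’s a rough PyTorch sketch of the idea (illustrative only, not Simo’s actual implementation):

import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (sketch)."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # the original weights stay frozen

        # The "diff": B @ A has the same shape as base.weight, but only
        # rank * (in_features + out_features) trainable parameters.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # update starts at zero
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

Only A and B get trained and saved, which is why a LoRA file weighs megabytes rather than gigabytes.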

We’ve been collaborating with Simo to get LoRA up on Replicate. You can now train LoRA models in the cloud with a single API call. Unlike DreamBooth where you had to wait for a model to push and boot up, LoRA predictions run instantly with no cold boots.


What’s unique about LoRA?

LoRA has a few differences from DreamBooth that make it especially appealing as an alternative:

  • Faster training: Training a new concept with LoRA takes just a few minutes.
  • Smaller outputs: Trained LoRA outputs are much smaller than DreamBooth outputs. This makes them easier to share, store, and re-use.
  • Multiple concepts: You can combine multiple trained concepts in a single image. (This feature is still experimental, but we’re working on improving it. 🧪)
  • Faster image generation: When you train your own DreamBooth model on Replicate, the model only stays warm when you’re actively using it. With LoRA, you’re not running your own model, but rather running the one cloneofsimo/lora model, which is always on and ready to serve predictions.
  • Better at styles, worse at faces: Based on our experimentation, LoRA seems to do a better job with styles than DreamBooth, but faces aren’t as good; they land in the uncanny valley rather than looking precisely like the person. Your results might be better than ours, though, so let us know how you get on.

How to use LoRA

🐴 To get an idea of what's possible, check out the LoRA examples page, where you can play around with some of our pretrained concepts like Bob Ross, Pokemon, South Park, Caravaggio, and more.

To train your own reusable LoRA concept, you’ll do the following:

  1. Gather training images in a zip file.
  2. Upload your training images to a publicly accessible URL.
  3. Use one of LoRA’s training models to train your concept.
  4. Save the URL of your trained output.
  5. Use LoRA’s prediction model to generate new images with your trained concept.

Step 1: Gather training images

To train a new LoRA concept, create a zip file with a few images of the same face, object, or style. 5-10 images are enough, but for styles you may get better results if you have 20-100 examples. Many of the recommendations for training DreamBooth also apply to LoRA. The training images can be JPGs or PNGs.

💡 Give your zip file a meaningful name, as it will be included as part of the filename of the trained output. This will make it easier to identify and differentiate from other training outputs later.
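If you’d rather script this step, here’s a minimal sketch using Python’s standard zipfile module; the folder and file names are placeholders:

import zipfile
from pathlib import Path

image_dir = Path("training-images")    # placeholder folder of training images
zip_path = Path("bob-ross-style.zip")  # meaningful name: it ends up in the output filename

with zipfile.ZipFile(zip_path, "w") as zf:
    for image in sorted(image_dir.iterdir()):
        if image.suffix.lower() in {".jpg", ".jpeg", ".png"}:
            zf.write(image, arcname=image.name)  # store flat, with no directory prefix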

Step 2: Upload training images

LoRA’s training model expects your images to be accessible over HTTP at a public URL. You can use a service like Google Drive, Amazon S3, or GitHub Pages to host your zip file.
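For example, if you already use Amazon S3, uploading with boto3 looks roughly like this; the bucket name is a placeholder, and the bucket (or object) must allow public reads:

import boto3

zip_path = "/path/to/my-training-images.zip"
bucket = "my-training-bucket"  # placeholder; must permit public reads
key = "my-training-images.zip"

s3 = boto3.client("s3")
s3.upload_file(zip_path, bucket, key)

zip_url = f"https://{bucket}.s3.amazonaws.com/{key}"
print(zip_url)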

You can upload files to Replicate if you don’t have a cloud bucket to store the zip file. Here’s a Python script that uploads a file to Replicate and returns a URL that you can use in the training model:

import os
import requests

zip_path = "/path/to/my-training-images.zip"
zip_filename = zip_path.split("/")[-1]

# Upload inputs to cloud storage.
# You can skip this step if your zip file is already on the internet
# and accessible over HTTP.
upload_response = requests.post(
    "https://dreambooth-api-experimental.replicate.com/v1/upload/" + zip_filename,
    headers={"Authorization": "Token " + os.environ["REPLICATE_API_TOKEN"]},
).json()

with open(zip_path, "rb") as f:
    requests.put(upload_response["upload_url"], data=f)

zip_url = upload_response["serving_url"]

Step 3: Train your concept

There are two LoRA training models on Replicate:

  • replicate/lora-training has preset options for face/object/style that we’ve found to be optimal for those use cases.
  • replicate/lora-advanced-training lets you set the options yourself if you want full control of the model.

Start by using the lora-training model to train your concept. Here’s an example Python script that uses the training model to train a new concept:

import replicate

# Zip file containing input images, hosted somewhere on the internet
zip_url = "https://my-storage/my-input.zip"

# Train the model
lora_url = replicate.run(
    "replicate/lora-training:b2a308762e36ac48d16bfadc03a65493fe6e799f429f7941639a6acec5b276cc",
    input={"instance_data": zip_url, "task": "style"},
)

Step 4: Save the URL of your trained output

The output of each training run is a single .safetensors file at an HTTPS URL that we host indefinitely.

For example, https://replicate.delivery/pbxt/S8wVSt0vXr5mEFDjP5XkmMPjLPCaDmv1Rw6AzRMDEhoFqqGE/tmp_fs4evyhbob-ross.safetensors

Copy the URL of that trained concept file from your prediction response so you can use it as an input to LoRA’s prediction model.
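If you’re scripting the whole flow, you don’t need to copy anything by hand: the replicate.run(...) training call in step 3 returns that URL directly. A minimal sketch (the file name is arbitrary):

# lora_url was returned by the training call in step 3
print("Trained concept weights:", lora_url)

# Persist the URL so the concept can be reused in later sessions
with open("trained-concepts.txt", "a") as f:
    f.write(lora_url + "\n")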

Step 5: Generate images

Now that you’ve got a trained concept, it’s time to generate some new images! You can generate an image based on a single trained concept, or use multiple trained concepts together.

The prediction model replicate/lora requires two inputs:

  • prompt: A prompt that contains the string <1> where the trained concept should be, e.g. an astronaut riding a horse in the style of <1>. Use <2>, <3> if you’re passing multiple URLs to the lora_urls input.
  • lora_urls: The URL or URLs of the trained LoRA concept(s) you copied in the previous step. You can pass a single URL, or a list of URLs separated by a pipe character |. Passing multiple URLs combines multiple concepts into a single image, as sketched in the second example below.

You can run LoRA’s prediction model from your browser on Replicate, or call it from the API. Here’s an example Python script that uses the API to generate a new image:

import replicate

lora_url = "https://replicate.delivery/pbxt/S8wVSt0vXr5mEFDjP5XkmMPjLPCaDmv1Rw6AzRMDEhoFqqGE/tmp_fs4evyhbob-ross.safetensors"

output_url = replicate.run(
    "replicate/lora:97ec1b97e5e6a6476e45ba7211d368509bbf39c30a927e39637f3cb98b36ac91",
    input={
        "prompt": "a painting of dinosaur in the style of <1>",
        "lora_urls": lora_url,
    },
)
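To combine concepts, pass multiple weight URLs separated by the pipe character and reference them as <1>, <2>, and so on in your prompt. Here’s a sketch; the two URLs are placeholders you’d replace with .safetensors URLs from your own training runs:

import replicate

# Placeholder URLs: substitute .safetensors URLs from your own training runs
style_url = "https://replicate.delivery/pbxt/EXAMPLE1/my-style.safetensors"
object_url = "https://replicate.delivery/pbxt/EXAMPLE2/my-object.safetensors"

output_url = replicate.run(
    "replicate/lora:97ec1b97e5e6a6476e45ba7211d368509bbf39c30a927e39637f3cb98b36ac91",
    input={
        # <1> maps to the first URL, <2> to the second
        "prompt": "a photo of <2> in the style of <1>",
        "lora_urls": style_url + "|" + object_url,
    },
)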

Next steps

In the next couple of weeks we’ll add support for training LoRA on Stable Diffusion 2.1, inpainting, and other cool things. Let us know your ideas!

If you want to share your LoRA models with the community or see what others come up with, join the #lora channel in our Discord.
