Create your Lilypad module
This guide will walk you through creating a basic sentiment analysis module using `create-lilypad-module` and `distilbert/distilbert-base-uncased-finetuned-sst-2-english` (referred to as Distilbert from now on). We will be referring back to the model's Hugging Face page throughout this guide, so it's best to keep it open and accessible.
Input:
Output:
If you prefer to follow along with a video guide, you can view our live workshop below!
To build and run a module on Lilypad Network, you'll need the Lilypad CLI, Python, and Docker installed on your machine, as well as GitHub and Docker Hub accounts.
For this guide, we'll be using `create-lilypad-module`, which requires `pip` and uses Python.
The first thing you'll need for your module is a local model to use.
A basic outline for downloading a model from Hugging Face is provided in `scripts/download_models.py`. The structure of the script and the methods for downloading a model can differ between models and libraries. It's important to tailor the process to the specific requirements of the model you're working with.
You can get started by attempting to run the `download_models.py` script.
Since the script hasn't been properly configured yet, it will return an error and point you to the file.
Open `scripts/download_models.py` and you will see some `TODO` comments with instructions. Let's go through each of them in order. You can remove each `TODO` comment after completing the task.
First, we have a reminder to update our `requirements.txt` file, which is used by the `Dockerfile` to install the module's dependencies. On the next line is a commented-out `import` statement.
To find the dependencies that our model requires, we can refer back to Distilbert's Hugging Face page and click on the "Use this model" dropdown, where you will see the 🤗 Transformers library as an option. Click it.
Most (but not all) models that utilize machine learning use the 🤗 Transformers library, which provides APIs and tools to easily download and train pretrained models.
You should see a handy modal explaining how to use the model with the `transformers` library. For most models, you'd want to use this. However, Distilbert has a specific tokenizer and model class. Close the modal and scroll to the "How to Get Started With the Model" section of the model card. We're going to use this instead.
For now, let's look at the top 2 lines of the provided code block:
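From the model card, the top of the example looks like this (reproduced here for reference; double-check it against the model card itself):

```python
import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
```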
Notice that `torch` is also being used. Copy the `transformers` import statement and paste it over the existing import statement in our `download_models.py` file.
Now open `requirements.txt`:
These are 2 of the most common libraries when working with models. Similar to the `import` statement in the `download_models.py` file, they are provided by default for convenience, but commented out because, although they are common, not every model will use them.
Since this model happens to use both of these libraries, we can uncomment both lines and close the file after saving.
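After uncommenting, the two lines in `requirements.txt` should look something like this (your generated file may pin specific versions):

```
torch
transformers
```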
`torch` is PyTorch, the machine learning framework that the model and tokenizer run on.
Return to the `download_models.py` file and look for the next `TODO` comment.
If we take a look at the Distilbert Hugging Face page, we can use the copy button next to the name of the model to get the `MODULE_IDENTIFIER`. Paste that in as the value.
For our use case, it should look like this:
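With the identifier copied from the model page, the finished line in `download_models.py` would be:

```python
# The Hugging Face model identifier, copied from the model page
MODULE_IDENTIFIER = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
```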
You're almost ready to download the model. All you need to do now is replace the following 2 lines after the `TODO` comment:
Instead of using `AutoTokenizer` and `AutoModelForSequenceClassification`, replace those with the `DistilBertTokenizer` and `DistilBertForSequenceClassification` we imported.
The script is now configured! Try running the command again.
The `models` directory should now appear in your project. 🎉
No matter which model you are using, be sure to thoroughly read the model's documentation to learn how to properly download and use the model locally.
Now for the fun part, it's time to start using the model!
This time, we'll get started by running the `run_module` script.
You should see an error with some instructions.
Let's tackle the `run_inference.py` script first. This is where your module's primary logic and functionality should live. There is a `TODO` comment near the top of the file.
We've already updated the `requirements.txt` file, so we can skip that step. Go ahead and uncomment the `import` statements and replace the `transformers` line with the `DistilBertTokenizer` and `DistilBertForSequenceClassification` imports.
We should refer back to the "How to Get Started With the Model" section of Distilbert's model card to figure out how to use the model.
Let's implement this in our `run_inference` script. Scroll down to the `main()` function and you'll see another `TODO` comment.
Same as before, uncomment and replace `AutoTokenizer` with `DistilBertTokenizer` and `AutoModelForSeq2SeqLM` with `DistilBertForSequenceClassification`. This is now functionally identical to the first 2 lines of code from Distilbert's example.
Below that, the `tokenizer` and `model` are passed into the `run_job()` function. Let's scroll back up and take a look at that function. This is where we'll want to implement the rest of the code from Distilbert's example. The `inputs` are already functionally identical, so let's adjust the `output`.
From the Distilbert model card, copy all of the code below the `inputs` variable declaration and paste it over the `output` variable declaration in your module's code.
All we need to do from here is set the `output` to the last line we pasted.
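Putting it together, the finished `run_job()` should look roughly like this sketch (the parameter names and surrounding scaffolding generated by `create-lilypad-module` may differ):

```python
import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

def run_job(text, tokenizer, model):
    # Tokenize the input text into PyTorch tensors
    inputs = tokenizer(text, return_tensors="pt")
    # Run inference without tracking gradients
    with torch.no_grad():
        logits = model(**inputs).logits
    # The model's label mapping turns the top class id into "POSITIVE" or "NEGATIVE"
    predicted_class_id = logits.argmax().item()
    output = model.config.id2label[predicted_class_id]
    return output
```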
That's everything we'll need for the module's source code!
We still need to finish step 2 that the error in the console gave us earlier. Open the `run_module.py` script.
Find the `TODO` comment and delete the code block underneath it.
Before you can run your module, you need to build the Docker image. You can do so with the following command:
You should see the following response in the console:
Open the `constants.py` file; it should look like this:
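A filled-in version might look like this; the usernames and repository names below are placeholders, and your generated file may contain other constants as well:

```python
# Docker Hub repository for the module's image (placeholder username)
DOCKER_REPO = "yourusername/sentiment-analysis"

# Public GitHub repository hosting the module's code (placeholder)
MODULE_REPO = "github.com/yourusername/sentiment-analysis"
```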
For now, we'll be testing the module locally, so all we need to worry about is the `DOCKER_REPO` variable. We'll use `MODULE_REPO` when it's time to run the module on Lilypad Network. For help or more information, view the configuration documentation.
You should be able to successfully build the Docker image now.
In the module's `Dockerfile`, you'll find 3 `COPY` instructions.
These instructions bring the `requirements.txt` file, the `src` directory, and the `models` directory into the Docker image. It's important to remember that any modifications to these files or directories will necessitate a rebuild of the module's Docker image to ensure the changes are reflected in the container.
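The three instructions look something like this (the exact destination paths in your generated `Dockerfile` may differ):

```dockerfile
COPY requirements.txt .
COPY src ./src
COPY models ./models
```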
It's finally time to see your module in action.
Let's start by running it locally.
The CLI should ask you for an input. Enter whatever you like and hit enter. The module will analyze the sentiment of your input and write the results to `outputs/result.json`.
You just used a local LLM! 🎉
Before you can run the module on Lilypad Network, you'll need to push the Docker image to Docker Hub.
While the Docker image is being built and pushed, you should configure the rest of the variables in `constants.py`. Make sure that you push your code to a public GitHub repository.
Since these variables are only used in scripts, and not in any `src` code that ends up in the Docker image, we won't need to rebuild the image after making these changes.
The last thing we'll need to do is edit the Lilypad module configuration file, `lilypad_module.json.tmpl`. For the purposes of this module, the default configuration is mostly correct. However, the `"Image"` field needs to be configured.
Replace the default value with your Docker Hub username, module image, and tag.
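For example, with a placeholder username, image name, and tag:

```json
"Image": "yourusername/sentiment-analysis:v1.0.0"
```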
Once your Docker image is pushed to Docker Hub and your most recent code is pushed to a public GitHub repository, you can test your module on Lilypad's DemoNet by replacing the `--local` flag with `--demonet`.
You can also remove the `--demonet` flag and supply your `WEB3_PRIVATE_KEY` to run the module on Lilypad's IncentiveNet.
You just used an LLM on Lilypad's decentralized network! 🎉
Now anyone who has the Lilypad CLI installed can also run your module: