
Creating Your Module

Create your Lilypad module

Getting Started

This guide will walk you through creating a basic sentiment analysis module using create-lilypad-module and distilbert/distilbert-base-uncased-finetuned-sst-2-english (which will be referred to as Distilbert from now on). We will be referring back to the Hugging Face page throughout this guide, so it's best to keep it open and accessible.

Input:

lilypad run --network demonet github.com/DevlinRocha/lilypad-module-sentiment:main --web3-private-key 0ec38dd1ee0898dae8460b269859b4fb3cb519b35d82014c909ec4741c790831 -i input="LILYPAD IS AWESOME"

Output:

{
  "input": "LILYPAD IS AWESOME",
  "sentiment": "POSITIVE",
  "status": "success"
}

If you prefer to follow along with a video guide, you can view our live workshop below! πŸ‘‡

create-lilypad-module live workshop

Prerequisites

To build and run a module on Lilypad Network, you'll need the Lilypad CLI, Python, and Docker installed on your machine, as well as GitHub and Docker Hub accounts.

For this guide, we'll be using create-lilypad-module, which requires pip and uses Python.
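You can quickly verify the tooling from your terminal (version numbers will vary; the lilypad version subcommand assumes a recent Lilypad CLI install):

python --version
pip --version
docker --version
lilypad version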

Downloading The Model

The first thing you'll need for your module is a local model to use.

A basic outline for downloading a model from Hugging Face is provided in scripts/download_models.py. The structure of the script and the methods for downloading a model can differ between models and libraries. It’s important to tailor the process to the specific requirements of the model you're working with.

You can get started by attempting to run the download_models.py script.

python -m scripts.download_models

Since the script hasn't been properly configured yet, it will return an error and point you to the file.

❌ Error: Model download script is not configured.
πŸ‘‰ /scripts/download_models.py

Open scripts/download_models.py and you will see some TODO comments with instructions. Let's go through each of them in order. You can remove each TODO comment after completing the task.

# TODO: Update ../requirements.txt
# from transformers import AutoTokenizer, AutoModelForSequenceClassification

First, we have a reminder to update our requirements.txt file, which the Dockerfile uses to install the module's dependencies. On the next line is a commented-out import statement.

To find the dependencies that our model requires, we can refer back to Distilbert's Hugging Face page and click on the "Use this model" dropdown, where you will see the πŸ€— Transformers library as an option. Click it.

Most (but not all) machine learning models on Hugging Face use the πŸ€— Transformers library, which provides APIs and tools to easily download and train pretrained models.
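For reference, the library's high-level route for a model like this is its pipeline API. This sketch shows the general idea (we won't use it in this guide):

from transformers import pipeline

# Downloads the model (if needed) and wraps it in a text-classification pipeline
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("LILYPAD IS AWESOME"))
# e.g. [{'label': 'POSITIVE', 'score': ...}]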

You should see a handy modal explaining how to use the model with the Transformers library. For most models, you'd want to use this. However, Distilbert has a specific tokenizer and model class. Close the modal and scroll to the How to Get Started With the Model section of the model card. We're going to use this instead.

For now, let's look at the top 2 lines of the provided code block:

import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

Notice that torch is also being used. Copy the transformers import statement and paste it over the existing import statement in our download_models.py file.

Now open requirements.txt:

# torch==2.4.1
# transformers==4.47.1

These are 2 of the most common libraries when working with models. Like the import statement in the download_models.py file, they are provided by default for convenience but left commented out, because although they are common, not every model uses them.

Since this model happens to use both of these libraries, we can uncomment both lines and close the file after saving.
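With both lines uncommented, requirements.txt should now read:

torch==2.4.1
transformers==4.47.1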

torch is PyTorch's core package: an open source machine learning framework that provides tensor computation with GPU acceleration and tools for building and running neural networks.

Return to the download_models.py file, and look for the next TODO comment.

# TODO: Set this to your model's Hugging Face identifier
MODEL_IDENTIFIER = ""

If we take a look at the Distilbert Hugging Face page, we can use the copy button next to the name of the model to get the MODEL_IDENTIFIER. Paste that in as the value.

For our use case, it should look like this:

MODEL_IDENTIFIER = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"

You're almost ready to download the model. All you need to do now is replace the following 2 lines after the TODO comment:

# TODO: Initialize `model` and `tokenizer`
# tokenizer = AutoTokenizer.from_pretrained(MODEL_IDENTIFIER)
# model = AutoModelForSequenceClassification.from_pretrained(MODEL_IDENTIFIER)

Instead of using AutoTokenizer and AutoModelForSequenceClassification, replace those with the DistilBertTokenizer and DistilBertForSequenceClassification we imported.

tokenizer = DistilBertTokenizer.from_pretrained(MODEL_IDENTIFIER)
model = DistilBertForSequenceClassification.from_pretrained(MODEL_IDENTIFIER)
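Putting the pieces together, the configured portion of download_models.py should now look like this (a sketch; the code that saves everything into the models directory is already provided by the template):

from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

MODEL_IDENTIFIER = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"

# Download the tokenizer and model weights from Hugging Face
tokenizer = DistilBertTokenizer.from_pretrained(MODEL_IDENTIFIER)
model = DistilBertForSequenceClassification.from_pretrained(MODEL_IDENTIFIER)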

The script is now configured! Try running the command again.

python -m scripts.download_models

The models directory should now appear in your project. πŸŽ‰

No matter which model you are using, be sure to thoroughly read the model's documentation to learn how to properly download and use the model locally.

Building Your Module

Now for the fun part, it's time to start using the model!

This time we'll get started by running the run_module script.

python -m scripts.run_module

You should see an error with some instructions.

❌ Error: No job configured. Implement the module's job before running the module.
        1. Implement job module
                πŸ‘‰ /src/run_inference.py
        2. Delete this code block
                πŸ‘‰ /scripts/run_module.py

1. Implement Job Module

Let's tackle the run_inference.py script first. This is where your module's primary logic and functionality should live. There is a TODO comment near the top of the file.

# TODO: Update ../requirements.txt
# import torch
# from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

We've already updated the requirements.txt file, so we can skip that step. Go ahead and uncomment the import statements and replace the transformers line with DistilBertTokenizer and DistilBertForSequenceClassification.
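After the edit, the imports should read:

import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification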

We should refer back to the "How to Get Started With the Model" section of Distilbert's model card to figure out how to use the model.

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class_id = logits.argmax().item()
model.config.id2label[predicted_class_id]

Let's implement this in our run_inference script. Scroll down to the main() function and you'll see another TODO comment.

# TODO: Initialize `model` and `tokenizer`
# tokenizer = AutoTokenizer.from_pretrained(MODEL_DIRECTORY)
# model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_DIRECTORY)

Same as before, uncomment and replace AutoTokenizer with DistilBertTokenizer and AutoModelForSeq2SeqLM with DistilBertForSequenceClassification. This is now functionally identical to the first 2 lines of code from Distilbert's example.
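After the swap, those lines should read:

tokenizer = DistilBertTokenizer.from_pretrained(MODEL_DIRECTORY)
model = DistilBertForSequenceClassification.from_pretrained(MODEL_DIRECTORY)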

Below that, the tokenizer and model are passed into the run_job() function. Let's scroll back up and take a look at the function. This is where we'll want to implement the rest of the code from Distilbert's example. The inputs are already functionally identical, so let's adjust the output.

From the Distilbert model card, copy all of the code below the inputs variable declaration and paste it over the output variable declaration in your module's code.

inputs = tokenizer(
    input,
    return_tensors="pt",
    truncation=True,
    padding=True,
)

with torch.no_grad():
    logits = model(**inputs).logits

predicted_class_id = logits.argmax().item()
model.config.id2label[predicted_class_id]

return output

All we need to do from here is set output to the value of the last line we pasted.

output = model.config.id2label[predicted_class_id]

return output
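Putting it all together, run_job() should now look roughly like this (a sketch; the exact function signature and any JSON wrapping of the result come from the template):

def run_job(input, model, tokenizer):
    # Tokenize the user's input for the model
    inputs = tokenizer(
        input,
        return_tensors="pt",
        truncation=True,
        padding=True,
    )

    # Run inference without computing gradients
    with torch.no_grad():
        logits = model(**inputs).logits

    # Map the highest-scoring class id to its label ("POSITIVE" or "NEGATIVE")
    predicted_class_id = logits.argmax().item()
    output = model.config.id2label[predicted_class_id]

    return output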

That's everything we'll need for the module's source code!

2. Delete Code Block

We still need to finish step 2 from the error message we saw in the console earlier. Open the run_module.py script.

Find the TODO comment and delete the code block underneath.

# TODO: Remove the following print and sys.exit statements and create the module job.
print(
    "❌ Error: No job configured. Implement the module's job before running the module.",
    file=sys.stderr,
    flush=True,
)
print("\t1. Implement job module")
print("\t\tπŸ‘‰ /src/run_inference.py")
print("\t2. Delete this code block")
print("\t\tπŸ‘‰ /scripts/run_module.py")
sys.exit(1)

Preparing Your Module

Before you can run your module, you'll need to build the Docker image. Run the following command:

python -m scripts.docker_build

You should see the following response in the console:

❌ Error: DOCKER_REPO is not set in config/constants.py.

Open the config/constants.py file. It should look like this:

# TODO: Set the Docker Hub repository before pushing the image.
# Example: "devlinrocha/lilypad-module-sentiment"
DOCKER_REPO = ""

# TODO: Set the tag for the Docker image.
# Example: "latest", "v1.0", or a commit SHA
DOCKER_TAG = "latest"

# TODO: Set the GitHub repository URL where your module is stored.
# Example: "github.com/devlinrocha/lilypad-module-sentiment".
MODULE_REPO = ""

# TODO: Specify the target branch name or commit hash.
# Example: "main" or "c3ed392c11060337cae010862b1af160cd805e67"
TARGET_COMMIT = "main"

For now, we'll be testing the module locally, so all we need to worry about is the DOCKER_REPO variable. We'll use MODULE_REPO when it's time to run the module on Lilypad Network. For help or more information, view the configuration documentation.

You should be able to successfully build the Docker image now.

In the module's Dockerfile, you'll find 3 COPY instructions.

COPY requirements.txt .
COPY src /src
COPY models /models

These instructions copy the requirements.txt file, the src directory, and the models directory into the Docker image. Remember that any modifications to these files or directories require rebuilding the module's Docker image for the changes to be reflected in the container.

Running Your Module

It's finally time to see your module in action.

Local

Let's start by running it locally.

python -m scripts.run_module --local

The CLI should ask you for an input. Enter whatever you like and press Enter. The module will analyze the sentiment of your input and write the results to outputs/result.json.

{
  "input": "LILYPAD IS AWESOME",
  "result": "POSITIVE",
  "status": "success"
}

You just ran a language model locally! πŸŽ‰

Lilypad Network

Before you can run the module on Lilypad Network, you'll need to push the Docker image to Docker Hub.

python -m scripts.docker_build --push

While the Docker image is being built and pushed, you should configure the rest of the variables in constants.py. Make sure that you push your code to a public GitHub repository.

Since these variables are only used in scripts, and not in any src code that ends up in the Docker image, we won't need to rebuild after making these changes.
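A fully configured constants.py might look like this (using the example values from the file's comments; substitute your own Docker Hub and GitHub repositories):

DOCKER_REPO = "devlinrocha/lilypad-module-sentiment"
DOCKER_TAG = "latest"
MODULE_REPO = "github.com/devlinrocha/lilypad-module-sentiment"
TARGET_COMMIT = "main"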

The last thing we'll need to do is edit the Lilypad module configuration file, lilypad_module.json.tmpl. For the purposes of this module, the default configuration is mostly correct. However, the "Image" field needs to be configured.

Replace the default value with your Docker Hub username, module image, and tag.

"Image": "devlinrocha/lilypad-module-sentiment:latest"

Once your Docker image is pushed to Docker Hub and your most recent code is pushed to a public GitHub repository, you can test your module on Lilypad's DemoNet by replacing the --local flag with --demonet:

python -m scripts.run_module --demonet

You can also remove the --demonet flag and supply your WEB3_PRIVATE_KEY to run the module on Lilypad's IncentiveNet.

You just ran a language model on Lilypad's decentralized network! πŸŽ‰

Now anyone who has the Lilypad CLI installed can also run your module:

lilypad run github.com/<MODULE_REPO>:<TARGET_COMMIT> --web3-private-key <WEB3_PRIVATE_KEY>
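For example, filling in the values for the module built in this guide (with your own private key):

lilypad run github.com/DevlinRocha/lilypad-module-sentiment:main --web3-private-key <WEB3_PRIVATE_KEY> -i input="LILYPAD IS AWESOME"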
