Use create-lilypad-module to create Lilypad modules
`create-lilypad-module` is an officially supported package that provides a simple scaffolding for building Lilypad modules. It offers a modern Docker setup with minimal configuration.
The following guide uses the "Hugging Face" template, which is more advanced.
If you are new to module building, it is highly recommended you use the "Ollama" template first.
There is currently no guide in our docs for the "Ollama" template. We will work on adding one soon, but the directions in the README should be sufficient.
Getting Started: install and run create-lilypad-module
Folder Structure: output and explanation of `create-lilypad-module` files
Configuration: requirements and explanations of Lilypad module configuration
Creating Your Module: a step-by-step guide on how to create a simple Lilypad module using create-lilypad-module
Getting started with `create-lilypad-module`
First, you'll need to install the package:
pip install create-lilypad-module
If you've previously installed `create-lilypad-module`, you should upgrade to ensure that you're using the latest version:
pip install --upgrade create-lilypad-module
Now run `create-lilypad-module`:
create-lilypad-module
The CLI will ask for the name of your project. Alternatively, you can run:
create-lilypad-module project_name
cd project_name
Output
project_name
├── config
│ └── constants.py
├── scripts
│ ├── docker_build.py
│ ├── download_models.py
│ └── run_module.py
├── src
│ └── run_inference.py
├── .dockerignore
├── .env
├── .gitignore
├── Dockerfile
├── lilypad_module.json.tmpl
├── README.md
└── requirements.txt
The folder structure output from using `create-lilypad-module`
After creation, your project should look like this:
project_name
├── config
│ └── constants.py
├── scripts
│ ├── docker_build.py
│ ├── download_models.py
│ └── run_module.py
├── src
│ └── run_inference.py
├── .dockerignore
├── .env
├── .gitignore
├── Dockerfile
├── lilypad_module.json.tmpl
├── README.md
└── requirements.txt
For the module to run, these files must exist with exact filenames:
src/run_inference.py
The `Dockerfile` `ENTRYPOINT`.
If you change this file's name or location, you must also update the `ENTRYPOINT` in your `Dockerfile` and `lilypad_module.json.tmpl` file to match.
config/constants.py
The configuration file that stores the `DOCKER_REPO`, `DOCKER_TAG`, `MODULE_REPO`, and `TARGET_COMMIT`.
If you change this file's name or location, you must also update the `import` statements in `scripts/docker_build.py` and `scripts/run_module.py`.
Dockerfile
Required to build your module into a Docker image and push the image to Docker Hub, where it can be accessed by the Lilypad Network.
requirements.txt
Used by the `Dockerfile` to install dependencies required by your module.
Technically, this file can be deleted or renamed, but this naming convention is highly recommended as an industry standard best practice.
lilypad_module.json.tmpl
The Lilypad configuration file.
You can delete or rename the other files.
You may create subdirectories inside `src`. For faster builds and smaller Docker images, only files inside `src` are copied by Docker. You need to put any files required to run your module inside `src`, otherwise Docker won't copy them.
You can create more top-level directories. They will not be included in the final Docker image so you can use them for things like documentation.
If you have Git installed and your project is not part of a larger repository, then a new repository will be initialized, resulting in an additional top-level `.git` directory.
Configure your Lilypad module
After bootstrapping your module, additional configuration is required to run it.
.env
WEB3_PRIVATE_KEY = ""
WEB3_PRIVATE_KEY
🚨 DO NOT SHARE THIS KEY 🚨
The private key for the wallet that will be used to run the job.
This is required to run the module on Lilypad Network.
Using a separate development wallet is highly recommended. The wallet must have enough LP tokens and Arbitrum Sepolia ETH to fund the job.
config/constants.py
DOCKER_REPO = ""
MODULE_REPO = ""
TARGET_COMMIT = ""
DOCKER_REPO
The Docker Hub repository storing the container image of the module code.
This is required to push the image to Docker Hub and run the module on Lilypad Network.
e.g. "<dockerhub_username>/<dockerhub_image>"
DOCKER_TAG
The specific tag of the `DOCKER_REPO` containing the module code.
Default: "latest"
MODULE_REPO
The URL of the GitHub repository storing the `lilypad_module.json.tmpl` file. The repository must be public.
The `lilypad_module.json.tmpl` file points to a `DOCKER_REPO`, and Lilypad runs the module from that image.
e.g. "github.com/<github_username>/<github_repo>"
TARGET_COMMIT
The git branch or commit hash that contains the `lilypad_module.json.tmpl` file you want to run.
Use `git log` to easily find commit hashes.
Default: "main"
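For reference, a filled-in `config/constants.py` might look like this (the repository names below are placeholders for this guide, not values to copy verbatim):
# config/constants.py -- example values only; substitute your own repositories
DOCKER_REPO = "<dockerhub_username>/lilypad-module-sentiment"
DOCKER_TAG = "latest"
MODULE_REPO = "github.com/<github_username>/lilypad-module-sentiment"
TARGET_COMMIT = "main"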
Your module will be bootstrapped with some handy scripts to help you download the model(s) for your module, build and push Docker images, and run your module locally or on Lilypad Network. Some additional configuration may be required.
In the project directory, you can run:
python -m scripts.download_models
A basic outline for downloading a model from Hugging Face is provided, but the structure of the script and the methods for downloading a model can differ between models and libraries. It’s important to tailor the process to the specific requirements of the model you're working with.
Most (but not all) models that utilize machine learning use the 🤗 Transformers library, which provides APIs and tools to easily download and train pretrained models.
No matter which model you are using, be sure to thoroughly read the documentation to learn how to properly download and use the model locally.
python -m scripts.docker_build
Builds and optionally publishes a Docker image for the module to use.
For most use cases, this script should be sufficient and won't require any configuration or modification (aside from setting your `DOCKER_REPO` and `DOCKER_TAG`).
`--push` Flag
Running the script with `--push` passed in pushes the Docker image to Docker Hub.
`--no-cache` Flag
Running the script with `--no-cache` passed in builds the Docker image without using the cache. This flag is useful if you need a fresh build to debug caching issues, force system or dependency updates, pull the latest base image, or ensure clean builds in CI/CD pipelines.
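For example, the two flags are used like this:
python -m scripts.docker_build --push        # build the image and push it to Docker Hub
python -m scripts.docker_build --no-cache    # rebuild from scratch, ignoring cached layers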
python -m scripts.run_module
This script is provided for convenience to speed up development. It is equivalent to running the Lilypad module with the provided input and private key (no private key is required when running the module locally). Depending on how your module works, you may need to change the default behavior of this script.
`--local` Flag
Running the script with `--local` passed in runs the Lilypad module Docker image locally instead of on the Lilypad Network.
`--demonet` Flag
Running the script with `--demonet` passed in runs the Lilypad module Docker image on Lilypad's Demonet.
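Typical usage looks like this:
python -m scripts.run_module --local     # run the module's Docker image locally
python -m scripts.run_module --demonet   # run the module on Lilypad's Demonet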
lilypad_module.json.tmpl
The default `lilypad_module.json.tmpl` file is below. Make sure to update the Docker `Image` to point to your Docker Hub image with the correct tag.
The default `lilypad_module.json.tmpl` should work for low complexity modules. If your module requires additional resources (such as a GPU), make sure to configure the applicable fields.
{
  "machine": { "gpu": 1, "cpu": 8000, "ram": 16000 },
  "gpus": [{ "vram": 24576 }],
  "job": {
    "APIVersion": "V1beta1",
    "Spec": {
      "Deal": { "Concurrency": 1 },
      "Docker": {
        "Entrypoint": [
          "/app/src/run_model", {{ .request }}
        ],
        "Image": "DOCKER_HUB_USERNAME/DOCKER_IMAGE@INDEX_DIGEST"
      },
      "Engine": "Docker",
      "Network": { "Type": "None" },
      "Outputs": [{ "Name": "outputs", "Path": "/outputs" }],
      "Resources": { "GPU": "1", "CPU": "8", "Memory": "16Gb" },
      "Timeout": 1800,
      "Verifier": "Noop"
    }
  }
}
Machine: Specifies the system resources.
GPUs: Specifies the minimum VRAM required.
Job: Specifies the job details.
APIVersion: Specifies the API version for the job.
Metadata: Specifies the metadata for the job.
Spec: Contains the detailed job specifications.
Deal: Sets the concurrency to 1, ensuring only one job instance runs at a time.
Docker: Configures the Docker container for the job.
WorkingDirectory: Defines the working directory of the Docker image.
Entrypoint: Defines the command(s) to be executed in the container as part of its initial startup runtime.
EnvironmentVariables: This can be used to set environment variables for the container's runtime. In the example above, we use Go templating to set the `INPUT` variable dynamically from the CLI.
Image: Specifies the image to be used (`DOCKERHUB_USERNAME/IMAGE:TAG`).
Engine: Sets the container runtime (Default: `"Docker"`).
Network: Specifies that the container does not require networking (Default: `"Type": "None"`).
Outputs: Specifies the name and path of the directory that will store module outputs.
Resources: Specifies additional resources.
Timeout: Sets the maximum duration for the job (Default: `600` [10 minutes]).
Create your Lilypad module
This guide will walk you through creating a basic sentiment analysis module using `create-lilypad-module` and `distilbert/distilbert-base-uncased-finetuned-sst-2-english` (referred to as Distilbert from now on). We will be referring back to the Hugging Face page throughout this guide, so it's best to keep it open and accessible.
Input:
lilypad run --network demonet github.com/DevlinRocha/lilypad-module-sentiment:main --web3-private-key 0ec38dd1ee0898dae8460b269859b4fb3cb519b35d82014c909ec4741c790831 -i input="LILYPAD IS AWESOME"
Output:
{
  "input": "LILYPAD IS AWESOME",
  "sentiment": "POSITIVE",
  "status": "success"
}
If you prefer to follow along with a video guide, you can view our live workshop below! 👇
To build and run a module on Lilypad Network, you'll need to have the Lilypad CLI, Python and Docker on your machine, as well as GitHub and Docker Hub accounts.
For this guide, we'll be using `create-lilypad-module`, which requires `pip` and uses Python.
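A quick way to confirm the core tooling is available (these commands only print versions; any reasonably recent versions should work):
python --version
pip --version
docker --version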
The first thing you'll need for your module is a local model to use.
A basic outline for downloading a model from Hugging Face is provided in `scripts/download_models.py`. The structure of the script and the methods for downloading a model can differ between models and libraries. It's important to tailor the process to the specific requirements of the model you're working with.
You can get started by attempting to run the `download_models.py` script.
python -m scripts.download_models
Since the script hasn't been properly configured yet, it will return an error and point you to the file.
❌ Error: Model download script is not configured.
👉 /scripts/download_models.py
Open `scripts/download_models.py` and you will see some `TODO` comments with instructions. Let's go through them each in order. You can remove each `TODO` comment after completing the task.
# TODO: Update ../requirements.txt
# from transformers import AutoTokenizer, AutoModelForSequenceClassification
First we have a reminder to update our `requirements.txt` file, which is used by the `Dockerfile` to install the module's dependencies. On the next line is a commented out `import` statement.
To find the dependencies that our model requires, we can refer back to Distilbert's Hugging Face page and click on the "Use this model" dropdown, where you will see the 🤗 Transformers library as an option. Click it.
You should see a handy modal explaining how to use the model with the `Transformers` library. For most models, you'd want to use this. However, Distilbert has a specific tokenizer and model class. Close the modal and scroll to the "How to Get Started With the Model" section of the model card. We're going to use this instead.
For now, let's look at the top 2 lines of the provided code block:
import torch
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification
Notice that `torch` is also being used. Copy the `transformers` import statement and paste it over the existing import statement in our `download_models.py` file.
Now open `requirements.txt`:
# torch==2.4.1
# transformers==4.47.1
These are 2 of the most common libraries when working with models. Similar to the `import` statement in the `download_models.py` file, they are provided by default for convenience, but commented out because, although they are common, not every model will use them.
Since this model happens to use both of these libraries, we can uncomment both lines and close the file after saving.
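After uncommenting, the file should simply read:
torch==2.4.1
transformers==4.47.1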
Return to the `download_models.py` file, and look for the next `TODO` comment.
# TODO: Set this to your model's Hugging Face identifier
MODEL_IDENTIFIER = ""
If we take a look at the Distilbert Hugging Face page, we can use the copy button next to the name of the model to get the `MODEL_IDENTIFIER`. Paste that in as the value.
For our use case, it should look like this:
MODEL_IDENTIFIER = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
You're almost ready to download the model. All you need to do now is replace the following 2 lines after the `TODO` comment:
# TODO: Initialize `model` and `tokenizer`
# tokenizer = AutoTokenizer.from_pretrained(MODEL_IDENTIFIER)
# model = AutoModelForSequenceClassification.from_pretrained(MODEL_IDENTIFIER)
Instead of using `AutoTokenizer` and `AutoModelForSequenceClassification`, replace those with the `DistilBertTokenizer` and `DistilBertForSequenceClassification` we imported.
tokenizer = DistilBertTokenizer.from_pretrained(MODEL_IDENTIFIER)
model = DistilBertForSequenceClassification.from_pretrained(MODEL_IDENTIFIER)
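For reference, the configured portion of `download_models.py` now amounts to something like this (the rest of the generated script, including where it saves the downloaded files, is left as scaffolded and isn't shown here):
from transformers import DistilBertTokenizer, DistilBertForSequenceClassification

# Hugging Face identifier for the model used in this guide
MODEL_IDENTIFIER = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"

# Download the tokenizer and model weights from Hugging Face
tokenizer = DistilBertTokenizer.from_pretrained(MODEL_IDENTIFIER)
model = DistilBertForSequenceClassification.from_pretrained(MODEL_IDENTIFIER)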
The script is now configured! Try running the command again.
python -m scripts.download_models
The `models` directory should now appear in your project. 🎉
Now for the fun part, it's time to start using the model!
This time we'll get started by running the `run_module` script.
python -m scripts.run_module
You should see an error with some instructions.
❌ Error: No job configured. Implement the module's job before running the module.
1. Implement job module
👉 /src/run_inference.py
2. Delete this code block
👉 /scripts/run_module.py
Let's tackle the `run_inference.py` script first. This is where your module's primary logic and functionality should live. There is a `TODO` comment near the top of the file.
# TODO: Update ../requirements.txt
# import torch
# from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
We've already updated the `requirements.txt` file, so we can skip that step. Go ahead and uncomment the `import` statements and replace the `transformers` line with the `DistilBertTokenizer` and `DistilBertForSequenceClassification` imports.
We should refer back to the "How to Get Started With the Model" section of Distilbert's model card to figure out how to use the model.
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class_id = logits.argmax().item()
model.config.id2label[predicted_class_id]
Let's implement this into our `run_inference` script. Scroll down to the `main()` function and you'll see another `TODO` comment.
# TODO: Initialize `model` and `tokenizer`
# tokenizer = AutoTokenizer.from_pretrained(MODEL_DIRECTORY)
# model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_DIRECTORY)
Same as before, uncomment and replace `AutoTokenizer` with `DistilBertTokenizer` and `AutoModelForSeq2SeqLM` with `DistilBertForSequenceClassification`. This is now functionally identical to the first 2 lines of code from Distilbert's example.
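After those replacements, the two lines should read:
tokenizer = DistilBertTokenizer.from_pretrained(MODEL_DIRECTORY)
model = DistilBertForSequenceClassification.from_pretrained(MODEL_DIRECTORY)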
Below that, the `tokenizer` and `model` are passed into the `run_job()` function. Let's scroll back up and take a look at the function. This is where we'll want to implement the rest of the code from Distilbert's example. The `inputs` are already functionally identical, so let's adjust the `output`.
From the Distilbert model card, copy all of the code below the `inputs` variable declaration, and paste it over the `output` variable declaration in your module's code.
inputs = tokenizer(
    input,
    return_tensors="pt",
    truncation=True,
    padding=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class_id = logits.argmax().item()
model.config.id2label[predicted_class_id]
return output
All we need to do from here is set `output` to the last line we pasted.
output = model.config.id2label[predicted_class_id]
return output
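Putting it together, the body of `run_job()` should now look roughly like this (the exact function signature and any error handling come from the generated template, so treat the signature below as an assumption rather than the literal file):
def run_job(input, model, tokenizer):  # signature is assumed; keep whatever the template generates
    # Tokenize the user's input for the model
    inputs = tokenizer(
        input,
        return_tensors="pt",
        truncation=True,
        padding=True,
    )
    # Run inference without tracking gradients
    with torch.no_grad():
        logits = model(**inputs).logits
    # Map the highest-scoring class ID to its label (e.g. POSITIVE / NEGATIVE)
    predicted_class_id = logits.argmax().item()
    output = model.config.id2label[predicted_class_id]
    return output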
That's everything we'll need for the module's source code!
We still need to finish step 2 that the error in the console gave us earlier. Open the `run_module.py` script.
Find the `TODO` comment and delete the code block underneath.
# TODO: Remove the following print and sys.exit statements and create the module job.
print(
    "❌ Error: No job configured. Implement the module's job before running the module.",
    file=sys.stderr,
    flush=True,
)
print("\t1. Implement job module")
print("\t\t👉 /src/run_inference.py")
print("\t2. Delete this code block")
print("\t\t👉 /scripts/run_module.py")
sys.exit(1)
Before you can run your module, you need to build the Docker image. Run the following command:
python -m scripts.docker_build
You should see the following response in the console:
❌ Error: DOCKER_REPO is not set in config/constants.py.
Open the `constants.py` file; it should look like this:
# TODO: Set the Docker Hub repository before pushing the image.
# Example: "devlinrocha/lilypad-module-sentiment"
DOCKER_REPO = ""
# TODO: Set the tag for the Docker image.
# Example: "latest", "v1.0", or a commit SHA
DOCKER_TAG = "latest"
# TODO: Set the GitHub repository URL where your module is stored.
# Example: "github.com/devlinrocha/lilypad-module-sentiment".
MODULE_REPO = ""
# TODO: Specify the target branch name or commit hash.
# Example: "main" or "c3ed392c11060337cae010862b1af160cd805e67"
TARGET_COMMIT = "main"
For now, we'll be testing the module locally, so all we need to worry about is the `DOCKER_REPO` variable. We'll use `MODULE_REPO` when it's time to run the module on Lilypad Network. For help or more information, view the configuration documentation.
You should be able to successfully build the Docker image now.
It's finally time to see your module in action.
Let's start by running it locally.
python -m scripts.run_module --local
The CLI should ask you for an input. Enter whatever you like and hit enter. The module will analyze the sentiment of your input and write the results to `outputs/result.json`.
{
  "input": "LILYPAD IS AWESOME",
  "result": "POSITIVE",
  "status": "success"
}
You just used a local LLM! 🎉
Before you can run the module on Lilypad Network, you'll need to push the Docker image to Docker Hub.
python -m scripts.docker_build --push
While the Docker image is being built and pushed, you should configure the rest of the variables in `constants.py`. Make sure that you push your code to a public GitHub repository.
The last thing we'll need to do is edit the Lilypad module configuration file, `lilypad_module.json.tmpl`. For the purposes of this module, the default configuration is mostly correct. However, the `"Image"` field needs to be configured.
Replace the default value with your Docker Hub username, module image, and tag.
"Image": "devlinrocha/lilypad-module-sentiment:latest"
Once your Docker image is pushed to Docker Hub and your most recent code is pushed to a public GitHub repository, you can test your module on Lilypad's DemoNet by replacing the `--local` flag with `--demonet`:
python -m scripts.run_module --demonet
You can also remove the `--demonet` flag and supply your `WEB3_PRIVATE_KEY` to run the module on Lilypad's IncentiveNet.
You just used an LLM on Lilypad's decentralized network! 🎉
Now anyone who has the Lilypad CLI installed can also run your module:
lilypad run github.com/<MODULE_REPO>:<TARGET_COMMIT> --web3-private-key <WEB3_PRIVATE_KEY>
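For example, the sentiment module built in this guide could be run like this (assuming the same `-i input` parameter used earlier; substitute your own repository, branch, and private key):
lilypad run github.com/DevlinRocha/lilypad-module-sentiment:main --web3-private-key <WEB3_PRIVATE_KEY> -i input="LILYPAD IS AWESOME"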