Build a Job Module
How to build your own compute job for Lilypad
A Lilypad module is a Git repository that allows you to perform various tasks using predefined templates and inputs. This guide will walk you through creating a Lilypad module, including defining a JSON template, handling inputs, and following best practices.
For a more in-depth look at building modules, refer to this end-to-end guide.
Modules on Lilypad
Below are a few examples of modules you can run on Lilypad. From language models to image generators and fun utilities, the network supports a growing list of AI modules.
To view the full list of available modules on Lilypad, please check out the awesome-lilypad repo!
Module Structure
Start by creating a Git repository for your Lilypad module. The module's versions will be represented as Git tags. Below is the basic structure of a Lilypad Module.
```
your-module/
├── model-directory            # Stores locally downloaded model files
├── download_model.[py/js/etc] # Script to download model files locally
├── requirements.txt           # Module dependencies
├── Dockerfile                 # Container definition
├── run_script.[py/js/etc]     # Main execution script
├── lilypad_module.json.tmpl   # Lilypad configuration
└── README.md                  # Documentation
```
Prepare Your Model
- Download model files
- Handle all dependencies (`requirements.txt`)
- Implement input/output through environment variables
- Write outputs to the `/outputs` directory
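The input/output convention above can be sketched generically in Python. Note that the variable names and the `transform` callback below are illustrative, not part of any Lilypad API: inputs arrive as environment variables, and results must be written under `/outputs`.

```python
import json
import os

def run_job(transform, output_dir="/outputs"):
    """Read inputs from environment variables, run the job,
    and write the result as JSON under the outputs directory."""
    # Inputs arrive as environment variables, with sensible defaults
    prompt = os.environ.get("PROMPT", "default prompt")

    result = transform(prompt)

    # Results must land in /outputs so Lilypad can collect them
    os.makedirs(output_dir, exist_ok=True)
    output_path = os.path.join(output_dir, "result.json")
    with open(output_path, "w") as f:
        json.dump({"result": result}, f)
    return output_path
```

The steps below fill in the model-specific parts of this pattern.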
1. Download the model locally
To use a model offline, you first need to download it and store it in a local directory. This guarantees that your code can load the model without requiring an internet connection. Here's a simple process to achieve this:
- Install required libraries
- Use a script to download the model (e.g. `python download_model.py`)
- Verify that the model files are in your directory
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

def download_model():
    model_name = "<namespace>/<model_identifier>"

    # Ensure you have a directory named 'model' in your current
    # working directory, or specify a different path
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Save the tokenizer and model locally
    tokenizer.save_pretrained('./model')
    model.save_pretrained('./model')

if __name__ == "__main__":
    download_model()
```
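To make the verification step concrete, a small helper like the following can confirm that the download landed where the run script expects it. The helper name and the required-file list are illustrative; the actual file names vary by model.

```python
import os

def verify_model_dir(path, required=("config.json",)):
    """Return True if `path` is a directory containing every required file."""
    if not os.path.isdir(path):
        return False
    present = set(os.listdir(path))
    return all(name in present for name in required)
```

For example, `verify_model_dir("./model")` after running the download script should return `True` once the model files are in place.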
2. Create a run script (for example, run_model.py) to be used in conjunction with Docker
```python
import os
import json
from transformers import AutoModel, AutoTokenizer

def main():
    # Load model and tokenizer from the local directory
    model_path = '/model'  # Path to the local model directory
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModel.from_pretrained(model_path)

    # Get inputs from environment variables
    input_var = os.environ.get('INPUT_VAR', 'default')

    # Your model code here
    result = your_model_function(input_var, model, tokenizer)

    # Save outputs
    output_path = '/outputs/result.json'
    with open(output_path, 'w') as f:
        json.dump({'result': result}, f)

if __name__ == "__main__":
    main()
```
3. Create a Dockerfile that functions with your run script
```dockerfile
# Use a specific base image
FROM base-image:version

# Set working directory
WORKDIR /workspace

# Install system dependencies
RUN apt-get update && apt-get install -y \
    your-dependencies && \
    rm -rf /var/lib/apt/lists/*

# Install model requirements
RUN pip install your-requirements

# Environment variables for running offline and using the local model
# HF_HOME points to the directory where the model files are stored
ENV HF_HOME=/model
ENV TRANSFORMERS_OFFLINE=1

# Create necessary directories
RUN mkdir -p /outputs

# Copy execution script
COPY run_script.* /workspace/

# Set entrypoint
ENTRYPOINT ["command", "/workspace/run_script"]
```
4. Build and Publish Image
To make sure your Docker image is compatible with Lilypad, you need to define the architecture explicitly during the build process. This is particularly important if you are building the image on a system like macOS, which uses a different architecture (darwin/arm64) than Lilypad's infrastructure (linux/amd64).
The examples below are for building, tagging and pushing an image to DockerHub, but you can use any platform you prefer for hosting the image.
For Linux:
```shell
docker buildx build -t <USERNAME>/<MODULE_NAME>:<MODULE_TAG> --push .
```
For macOS:
```shell
docker buildx build \
  --platform linux/amd64 \
  -t <USERNAME>/<MODULE_NAME>:<MODULE_TAG> \
  --push \
  .
```
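Before publishing, it can also help to smoke-test the container locally. The commands below are a sketch using the placeholder names from the steps above: the environment variable, image tag, and output file name all depend on your own module.

```shell
# Run the container locally, passing an input and mounting an outputs directory
mkdir -p outputs
docker run --rm \
  -e INPUT_VAR="test value" \
  -v "$(pwd)/outputs:/outputs" \
  <USERNAME>/<MODULE_NAME>:<MODULE_TAG>

# Inspect the result the run script wrote
cat outputs/result.json
```

If the container runs to completion offline and writes its result under `outputs/`, it should behave the same way on a Lilypad resource provider.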
5. Create a lilypad_module.json.tmpl Template
```json
{
  "machine": {
    "gpu": 1,      # Set to 0 if GPU is not needed
    "cpu": 1000,   # CPU allocation
    "ram": 8000    # Minimum RAM needed to run the module
  },
  "gpus": [ { "vram": 24576 }, { "vram": 40960 } ],  # VRAM in MB; the solver defaults to the largest entry
  "job": {
    "APIVersion": "V1beta1",
    "Spec": {
      "Deal": {
        "Concurrency": 1
      },
      "Docker": {
        "Entrypoint": ["command", "/workspace/run_script"],
        "WorkingDirectory": "/workspace",
        "EnvironmentVariables": [
          # Environment variables with defaults
          {{ if .var_name }}"VAR_NAME={{ js .var_name }}"{{ else }}"VAR_NAME=default_value"{{ end }}
        ],
        # Specify the Docker image to use for this module
        "Image": "repo-owner/repo-name:tag"
      },
      "Engine": "Docker",
      "Network": {
        "Type": "None"
      },
      "Outputs": [
        {
          "Name": "outputs",
          "Path": "/outputs"
        }
      ],
      "PublisherSpec": {
        "Type": "ipfs"
      },
      "Resources": {
        "GPU": "1"   # Must match machine.gpu
      },
      "Timeout": 1800
    }
  }
}
```
Environment Variables
Format in the template:
```
{{ if .variable }}"VARNAME={{ js .variable }}"{{ else }}"VARNAME=default"{{ end }}
```
Usage in the CLI:
```shell
lilypad run repo:tag -i variable=value
```
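For a module that accepts more than one input, the same pattern repeats once per variable. The `prompt` and `steps` names and their defaults below are illustrative, not part of any existing module:

```
{{ if .prompt }}"PROMPT={{ js .prompt }}"{{ else }}"PROMPT=a default prompt"{{ end }},
{{ if .steps }}"STEPS={{ js .steps }}"{{ else }}"STEPS=20"{{ end }}
```

These would then be set from the CLI with repeated `-i` flags, e.g. `lilypad run repo:tag -i prompt="a lilypad on a pond" -i steps=30`; any variable omitted on the command line falls back to its template default.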
Formatting your module run command
During development, you will need to use the Git hash to test your module. This allows you to verify that your module functions correctly and produces the expected results.
Below is a working Lilypad module run command for reference; you can use it to run a Lilypad job from the Lilypad CLI:
Test Module before running on Lilypad
Use the following command syntax to run your Module on Lilypad Testnet.
```shell
lilypad run github.com/Lilypad-Tech/module-sdxl:6cf06f4038f1cff01a06c4eabc8135fd9835a78a --web3-private-key <your-private-key> -i prompt="a lilypad floating on a pond"
```
If the job run appears to be stuck after a few minutes (sometimes it takes time for the Module to download to the RP node), cancel the job and try again. Open a ticket in Discord with any issues that persist.
Run Module on Lilypad
```shell
lilypad run github.com/noryev/module-sdxl-ipfs:ae17e969cadab1c53d7cabab1927bb403f02fd2a -i prompt="your prompt here"
```
Examples
Here are some example Lilypad modules for reference:
- Cowsay: Lilypad "Hello World" example
- Llama2: Text to text
- SDXL-turbo pipeline: Text to image generation

Deprecated examples:
- lora-training: An example module for LoRA training tasks.
- lora-inference: An example module for LoRA inference tasks.
- duckdb: An example module related to DuckDB.
These examples can help you understand how to structure your Lilypad modules and follow best practices.
Conclusion
In this guide, we've covered the essential steps to create a Lilypad module, including defining a JSON template, handling inputs, and testing your module. By following these best practices, you can build reliable and reusable modules for Lilypad.
For more information and additional examples, refer to the official Lilypad documentation and the Cowsay example module.