Anura (from Ancient Greek: ἀν-, an-, meaning "without," and οὐρά, ourá, meaning "tail") is the order of amphibians that includes all modern frogs and toads. It is also the name of Lilypad's official AI inference API.
To use the Lilypad API, visit the Anura website to get an API key.
If you are using an API client such as Bruno or Postman, you can use our provided collections below.
GET /api/v1/jobs/:id
- Get status and details of a specific job
POST /api/v1/cowsay
- Create a new cowsay job
Request body: {"message": "text to display"}
GET /api/v1/cowsay/:id/results
- Get results of a cowsay job
POST /api/v1/ollama
- Create a new Ollama job
GET /api/v1/ollama/:id/results
- Get results of an Ollama job
POST /api/v1/chat/completions
- Stream chat completions
GET /api/v1/models
- Get available models
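The cowsay endpoints above can be exercised with a minimal sketch. The base URL below is an assumption (check the Anura docs for the real one), and the API key is a placeholder; only the request body shape comes from the endpoint list.

```python
import json
import urllib.request

API_KEY = "your_api_key_here"                  # placeholder: get one from the Anura website
BASE = "https://anura-testnet.lilypad.tech"    # assumed base URL; confirm in the docs

# Create a cowsay job with the request body shown above.
create = urllib.request.Request(
    f"{BASE}/api/v1/cowsay",
    data=json.dumps({"message": "text to display"}).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)
# Sending the request (uncomment to run against the live API):
# with urllib.request.urlopen(create) as resp:
#     job = json.loads(resp.read())
# The response should include a job ID; results are then available from
# GET /api/v1/cowsay/:id/results once the job completes.
```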
To see which models are available, send a GET request to /api/v1/models.
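A minimal sketch of that request using only the Python standard library. The base URL is an assumption (confirm it in the Anura docs), and the API key is a placeholder:

```python
import urllib.request

API_KEY = "your_api_key_here"                  # placeholder: get one from the Anura website
BASE = "https://anura-testnet.lilypad.tech"    # assumed base URL; confirm in the docs

# List the models available for inference.
req = urllib.request.Request(
    f"{BASE}/api/v1/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
# Uncomment to run against the live API:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())
```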
POST /api/v1/chat/completions
This endpoint provides a streaming interface for chat completions using Server-Sent Events (SSE).
Request Headers
- Content-Type: application/json (required)
- Accept: text/event-stream (recommended for streaming)
- Authorization: Bearer <your_api_key>
Request Body
- model (string, required): ID of the model to use (e.g., "llama2:7b")
- messages (array, required): Array of message objects representing the conversation
- max_tokens (integer): Maximum number of tokens to generate
- temperature (number): Controls randomness; 0 is deterministic, higher values are more random
Response Format
The response is a stream of Server-Sent Events (SSE) with the following format:
Processing updates:
Content delivery:
Completion marker:
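The exact event payloads are not reproduced here; the following is a parsing sketch under the assumption that the stream consists of `data: <json>` lines carrying the delta fields, ended by a literal `[DONE]` completion marker (verify both against the live API):

```python
import json

# Illustrative sample of an SSE stream (assumed format, not real API output).
sample_stream = [
    'data: {"message": {"role": "assistant", "content": "Hello"}, "done": false}',
    'data: {"message": {"role": "assistant", "content": "!"}, "done": false}',
    'data: [DONE]',
]

chunks = []
for line in sample_stream:
    if not line.startswith("data: "):
        continue  # skip keep-alives, comments, and blank separator lines
    payload = line[len("data: "):]
    if payload == "[DONE]":
        break  # assumed completion marker
    event = json.loads(payload)
    chunks.append(event["message"]["content"])

text = "".join(chunks)  # accumulated assistant reply
```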
Example Request
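A hedged sketch of a request, built with the Python standard library. The base URL is an assumption (confirm it in the Anura docs), the API key is a placeholder, and the body uses only the parameters documented above:

```python
import json
import urllib.request

API_KEY = "your_api_key_here"                  # placeholder: get one from the Anura website
BASE = "https://anura-testnet.lilypad.tech"    # assumed base URL; confirm in the docs

body = {
    "model": "llama2:7b",  # a model ID from GET /api/v1/models
    "messages": [
        {"role": "user", "content": "Write a haiku about frogs."}
    ],
    "max_tokens": 256,
    "temperature": 0.7,
}

req = urllib.request.Request(
    f"{BASE}/api/v1/chat/completions",
    data=json.dumps(body).encode(),
    headers={
        "Content-Type": "application/json",
        "Accept": "text/event-stream",
        "Authorization": f"Bearer {API_KEY}",
    },
    method="POST",
)
# Uncomment to stream the response from the live API:
# with urllib.request.urlopen(req) as resp:
#     for raw in resp:
#         print(raw.decode().rstrip())
```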
Response Codes
- 200 OK: Request successful, stream begins
- 400 Bad Request: Invalid request parameters
- 401 Unauthorized: Invalid or missing API key
- 404 Not Found: Requested model not found
- 500 Internal Server Error: Server error processing request
Response Object Fields
The delta event data contains the following fields:
- model: The model used for generation
- created_at: Timestamp when the response was created
- message: Contains the assistant's response
- message.role: Always "assistant" for responses
- message.content: The generated text content
- done_reason: Reason for completion (e.g., "stop", "length")
- done: Boolean indicating if generation is complete
- total_duration: Total processing time in nanoseconds
- load_duration: Time taken to load the model in nanoseconds
- prompt_eval_count: Number of tokens in the prompt
- prompt_eval_duration: Time taken to process the prompt in nanoseconds
- eval_count: Number of tokens generated
- eval_duration: Time taken for generation in nanoseconds
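Because the duration fields are reported in nanoseconds, deriving a throughput figure requires a unit conversion. A small worked example with illustrative (not real) values:

```python
# Illustrative values, not real API output.
eval_count = 120                # tokens generated
eval_duration = 2_400_000_000   # generation time in nanoseconds

# Convert nanoseconds to seconds before dividing.
tokens_per_second = eval_count / (eval_duration / 1e9)
```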
Conversation Context
The API supports multi-turn conversations by including previous messages in the request:
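A sketch of such a request body: the earlier turns are resent in order, so the model can resolve references in the follow-up question. The model ID is illustrative:

```python
# Resend prior turns so the model has the conversation history.
conversation = [
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What is its population?"},  # "its" resolves via context
]
body = {"model": "llama2:7b", "messages": conversation}
```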
This allows for contextual follow-up questions and maintaining conversation history.
This code lets us communicate directly with the Lilypad Solver. As we progress toward mainnet, the Solver will gain functionality that allows jobs to submit information and to perform authorization via smart contract calls.
First, post a job to Ollama Completions (a one-shot inference command to an LLM).
You should see messages like Job status: 1, which means a deal has been agreed on the Lilypad network. Job status: 2 means the results are ready (though they should be sent immediately).
You can use another terminal to check job status while the job is running.
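A sketch of that status check against GET /api/v1/jobs/:id. The base URL is an assumption (confirm it in the Anura docs), and the job ID is a placeholder for the one returned when the job was created:

```python
import json
import urllib.request

API_KEY = "your_api_key_here"                  # placeholder: get one from the Anura website
BASE = "https://anura-testnet.lilypad.tech"    # assumed base URL; confirm in the docs
JOB_ID = "example-job-id"                      # placeholder: ID returned at job creation

# Poll the status of a running job.
req = urllib.request.Request(
    f"{BASE}/api/v1/jobs/{JOB_ID}",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
# Uncomment to query the live API:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```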
Once your job has run, you should get output like this: