Last Updated: 3/9/2026
CLI Reference
Complete reference for the Pie command-line interface.
Overview
```
pie [OPTIONS] <COMMAND>
```

Pie provides commands for:

- Running inferlets: serve, run
- Configuration: config init, config show, config update
- Model management: model list, model download, model remove
- Diagnostics: doctor
- Authentication: auth
serve
Start the Pie engine server.
```
pie serve [OPTIONS]
```

Options
| Option | Description |
|---|---|
| -c, --config <PATH> | Path to TOML configuration file |
| --host <HOST> | Override host address |
| --port <PORT> | Override port |
| --no-auth | Disable authentication |
| -v, --verbose | Enable verbose logging |
| --cache-dir <PATH> | Cache directory path |
| --log-dir <PATH> | Log directory path |
| -i, --interactive | Enable interactive shell mode |
| -m, --monitor | Launch real-time TUI monitor |
Examples
```
# Start server with defaults
pie serve

# Interactive mode
pie serve -i

# Monitor mode with custom port
pie serve -m --port 9000

# Disable auth for development
pie serve --no-auth -v
```

run
Run an inferlet with a one-shot Pie engine.
```
pie run [OPTIONS] [INFERLET] [-- ARGS...]
```

Arguments
| Argument | Description |
|---|---|
| INFERLET | Inferlet name from registry (e.g., std/text-completion@0.1.0) |
| ARGS... | Arguments passed to the inferlet (after --) |
Options
| Option | Description |
|---|---|
| -p, --path <PATH> | Path to a local .wasm inferlet file |
| -c, --config <PATH> | Path to TOML configuration file |
| --log <PATH> | Path to log file |
Examples
```
# Run from registry
pie run text-completion -- --prompt "Hello world"

# Run local inferlet
pie run --path ./my_inferlet.wasm -- --arg value

# With custom config
pie run -c ./config.toml text-completion -- --prompt "Test"
```

doctor
Check system health and configuration.
```
pie doctor
```

Verifies:
- Configuration file exists
- Models are available
- GPU/device accessibility
- Backend connectivity
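In scripts, the command's exit status can gate the rest of a pipeline. A minimal sketch of that pattern; the run_if_healthy helper is illustrative (not part of Pie), and both commands are passed in as arguments so the logic can be exercised even without Pie installed:

```shell
#!/usr/bin/env sh
# Run a server command only if a health check passes first.
# In practice the check would be `pie doctor` and the serve
# command `pie serve`; both are parameters here.
run_if_healthy() {
    check_cmd="$1"
    serve_cmd="$2"
    if $check_cmd; then
        $serve_cmd
    else
        echo "health check failed; not starting server" >&2
        return 1
    fi
}

# Example: run_if_healthy "pie doctor" "pie serve"
```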
config
Manage configuration files.
config init
Create a default configuration file.
```
pie config init [OPTIONS]
```

| Option | Description |
|---|---|
| --path <PATH> | Custom config file path |
Creates ~/.pie/config.toml with default settings.
config show
Display current configuration.
```
pie config show [OPTIONS]
```

| Option | Description |
|---|---|
| --path <PATH> | Custom config file path |
config update
Update configuration values.
```
pie config update [OPTIONS]
```

Engine Options
| Option | Description |
|---|---|
| --host <HOST> | Network host to bind to |
| --port <PORT> | Network port to bind to |
| --enable-auth / --disable-auth | Toggle authentication |
| --verbose / --no-verbose | Toggle verbose logging |
| --cache-dir <PATH> | Cache directory path |
| --log-dir <PATH> | Log directory path |
| --registry <URL> | Inferlet registry URL |
Model Options
| Option | Description |
|---|---|
| --hf-repo <REPO> | HuggingFace model repository |
| --device <DEVICES> | Device assignment (e.g., cuda:0 or cuda:0,cuda:1) |
| --activation-dtype <DTYPE> | Activation dtype (bfloat16, float16) |
| --weight-dtype <DTYPE> | Weight dtype |
| --kv-page-size <SIZE> | KV cache page size |
| --max-batch-tokens <N> | Maximum batch tokens |
| --gpu-mem-utilization <FLOAT> | GPU memory utilization (0.0-1.0) |
| --use-cuda-graphs / --no-use-cuda-graphs | Toggle CUDA graphs |
Telemetry Options
| Option | Description |
|---|---|
| --telemetry / --no-telemetry | Enable/disable OpenTelemetry |
| --telemetry-endpoint <URL> | OTLP endpoint for traces |
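Because these flags compose, a provisioning script can bring a fresh machine to a known configuration in a few calls. A sketch with a dry-run switch; the run helper and DRY_RUN variable are illustrative conveniences, not part of Pie, and the script defaults to dry-run so nothing executes by accident:

```shell
#!/usr/bin/env sh
# Illustrative provisioning helper: echoes each command when DRY_RUN=1
# (the default here, for safety), executes it when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

run pie config update --hf-repo "Qwen/Qwen2.5-7B-Instruct"
run pie config update --device "cuda:0,cuda:1" --gpu-mem-utilization 0.9
run pie model download "Qwen/Qwen2.5-7B-Instruct"
run pie doctor
```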
Examples
```
# Change model
pie config update --hf-repo "Qwen/Qwen2.5-7B-Instruct"

# Multi-GPU setup
pie config update --device "cuda:0,cuda:1"

# Disable auth
pie config update --disable-auth

# Multiple updates
pie config update --port 9000 --verbose --gpu-mem-utilization 0.9
```

model
Manage models from HuggingFace.
model list
List locally cached models.
```
pie model list
```

Output shows:
- ✓ Compatible models (supported by Pie)
- ○ Other cached models
model download
Download a model from HuggingFace.
```
pie model download <REPO_ID>
```

| Argument | Description |
|---|---|
| REPO_ID | HuggingFace repository ID |
Examples
```
pie model download meta-llama/Llama-3.2-1B-Instruct
pie model download Qwen/Qwen2.5-7B-Instruct
pie model download deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
```

model remove
Remove a locally cached model.
```
pie model remove <REPO_ID>
```

Prompts for confirmation before deletion.
auth
Authentication management commands.
```
pie auth <SUBCOMMAND>
```

Manage SSH keys and authentication settings for the Pie server.
Configuration File
The configuration file (~/.pie/config.toml) structure:
```
[engine]
host = "127.0.0.1"
port = 8080
enable_auth = true
verbose = false
cache_dir = "~/.pie/cache"
log_dir = "~/.pie/logs"
registry = "https://registry.pie-project.org/"

[[model]]
hf_repo = "meta-llama/Llama-3.2-1B-Instruct"
device = ["cuda:0"]
activation_dtype = "bfloat16"
weight_dtype = "bfloat16"
kv_page_size = 16
max_batch_tokens = 8192
gpu_mem_utilization = 0.9
use_cuda_graphs = true

[telemetry]
enabled = false
endpoint = "http://localhost:4317"
```

Environment Variables
| Variable | Description |
|---|---|
| PIE_HOME | Override default Pie home directory (~/.pie) |
| PIE_CONFIG | Override default config file path |
| HF_HOME | HuggingFace cache directory |
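Wrapper scripts can mirror the same override-then-default resolution. A sketch of that fallback logic; the exact precedence Pie applies internally is an assumption based on the table above:

```shell
#!/usr/bin/env sh
# Resolve Pie's home and config paths: honor the override variables
# if set, otherwise fall back to the documented defaults.
pie_home="${PIE_HOME:-$HOME/.pie}"
pie_config="${PIE_CONFIG:-$pie_home/config.toml}"

echo "home:   $pie_home"
echo "config: $pie_config"
```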
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | General error |
| 130 | Interrupted (Ctrl+C) |
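Automation can branch on these codes. A small dispatch sketch; describe_exit is an illustrative helper, not a Pie command:

```shell
#!/usr/bin/env sh
# Map an exit status from a pie invocation to a short description,
# following the exit-code table above.
describe_exit() {
    case "$1" in
        0)   echo "success" ;;
        130) echo "interrupted (Ctrl+C)" ;;
        *)   echo "error (code $1)" ;;
    esac
}

# Usage: pie run ...; describe_exit "$?"
```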