Last Updated: 3/9/2026


Server Mode

While pie run is great for quick tests, production use cases need a persistent server.

Start the Server

Launch Pie in server mode:

    pie serve

Output:

    ╭─ Pie Engine (server) ────────────────────────╮
    │ Host    127.0.0.1:8080                       │
    │ Model   meta-llama/Llama-3.2-1B-Instruct     │
    │ Device  cuda:0                               │
    ╰──────────────────────────────────────────────╯

    ✓ Backend started on cuda:0
    ✓ Engine listening on ws://127.0.0.1:8080

The server is now ready to accept client connections.
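
When the server is launched from a script instead of a terminal, there is no banner to watch for. A minimal sketch of one way to wait for readiness is to poll the TCP port until it accepts a connection. The helper name `wait_for_port` and its default timeouts are illustrative assumptions, not part of Pie. A successful connect only shows that the port is open; it does not perform the WebSocket handshake or authenticate.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; not provided by Pie.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Opening and immediately closing a TCP connection is enough to
            # tell that something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: the server is not up yet.
            time.sleep(interval)
    return False
```

With the default address from the output above, `wait_for_port("127.0.0.1", 8080)` would return `True` as soon as the engine starts listening.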

Interactive Mode

For development and testing, use interactive mode:

    pie serve -i

This gives you a shell to run inferlets directly:

    Type 'help' for commands, ↑/↓ for history

    pie> run text-completion --prompt "Hello world"
    Hello world! How are you today?

    pie> help
    Available commands:
      run <inferlet> [args] - Run an inferlet
      list                  - List running instances
      exit                  - Shutdown and exit

Monitor Mode

For real-time performance monitoring:

    pie serve -m

This launches a TUI dashboard showing:

  • Active requests
  • Throughput (tokens/sec)
  • Memory usage
  • Batch statistics

Command-Line Options

    Option              Description
    --config, -c        Path to config file
    --host              Override host address
    --port              Override port
    --no-auth           Disable authentication
    --verbose, -v       Enable verbose logging
    --interactive, -i   Interactive shell mode
    --monitor, -m       TUI monitor mode

Examples:

    # Custom port, no auth
    pie serve --port 9000 --no-auth

    # Verbose logging
    pie serve -v

    # Custom config file
    pie serve -c /path/to/config.toml
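
If the server is started from a Python script (for example in a test harness), the same options can be assembled programmatically. The sketch below builds an argument list using only the flags listed in the table; the function name `build_serve_command` is hypothetical, and it assumes `--host` takes its value as a separate argument in the same way `--port` does in the examples.

```python
from typing import List, Optional


def build_serve_command(
    config: Optional[str] = None,
    host: Optional[str] = None,
    port: Optional[int] = None,
    no_auth: bool = False,
    verbose: bool = False,
) -> List[str]:
    """Assemble a `pie serve` argument list from the documented options.

    Hypothetical helper for illustration; not provided by Pie.
    """
    cmd = ["pie", "serve"]
    if config is not None:
        cmd += ["--config", config]
    if host is not None:
        # Assumption: --host takes a value the same way --port does.
        cmd += ["--host", host]
    if port is not None:
        cmd += ["--port", str(port)]
    if no_auth:
        cmd.append("--no-auth")
    if verbose:
        cmd.append("--verbose")
    return cmd
```

The resulting list can be passed directly to `subprocess.Popen`, which avoids shell quoting issues with paths that contain spaces.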

Connecting Clients

Once the server is running, connect with a client:

    import asyncio
    from pie import PieClient

    async def main():
        # Connect to the address printed by `pie serve`
        async with PieClient("ws://127.0.0.1:8080") as client:
            await client.authenticate("username")
            # ... use the client

    asyncio.run(main())

See Client Basics for more.

Graceful Shutdown

Press Ctrl+C to shut down:

    ^C
    Shutting down...
    ✓ Shutdown complete

Pie will:

  1. Stop accepting new connections
  2. Wait for running inferlets to complete
  3. Terminate backends
  4. Clean up resources
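
When the server runs as a child process of a script, there is no terminal in which to press Ctrl+C. On POSIX systems Ctrl+C delivers `SIGINT`, so a supervising script can send that signal itself. The sketch below is an assumption-laden example, not part of Pie: `stop_gracefully` is a hypothetical helper, the 30-second default is arbitrary, and it assumes the server treats a programmatically sent `SIGINT` the same as a keyboard interrupt.

```python
import signal
import subprocess


def stop_gracefully(proc: subprocess.Popen, timeout: float = 30.0) -> int:
    """Send SIGINT (the signal Ctrl+C sends) and wait for the process to exit.

    Hypothetical helper for illustration; POSIX only. Returns the exit code.
    """
    proc.send_signal(signal.SIGINT)
    try:
        # Give the process time to finish its own shutdown sequence.
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Last resort: the process did not exit in time, so force-kill it.
        # This skips any cleanup the process would otherwise perform.
        proc.kill()
        return proc.wait()
```

Because the server waits for running inferlets to complete before exiting, the timeout should be chosen to be longer than the longest inferlet expected to be in flight.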

Next Steps