LLM Chat Server
An HTTP server that accepts plain-text prompts via POST /chat and streams the LLM response token by token back to the caller. The LLM connection is declared as a model and shared across all concurrent requests.
Running
melodium run 01_text_llm_chat/Compo.toml --api_key sk-... --model claude-sonnet-4-6
$ curl -X POST http://127.0.0.1:8080/chat -d "What is Mélodium?"
Mélodium is a dataflow programming language…
How it works
Two models are instantiated at startup:
model server: HttpServer(host=|from_ipv4(|localhost_ipv4()), port=port)
model llm: ChatLlm(api_key=api_key, model=model)
ChatLlm wraps RemoteLlm with a fixed system prompt and token limit. The backend field selects the provider ("anthropic", "openai", etc.); switching providers means changing only backend and model in the model definition, and the rest of the pipeline is unaffected.
The chat sub-treatment
Each incoming connection is handled by the chat sub-treatment, which is a straight three-step pipeline:
Self.data -> decode.data,text -> llmStream.prompt,token -> encode.text,data -> Self.data
- decode converts raw request bytes to UTF-8 text
- llmStream sends the text as a prompt and emits tokens as a Stream<string> as they arrive
- encode converts each token back to bytes and forwards them directly into connection.data
Because llmStream emits tokens as a stream, they reach the HTTP response as they are produced — no buffering, no explicit async logic.
Prompt text and LLM errors are independently forwarded to loggers via separate connections, without interrupting the token stream.
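As a rough analogy only, the three-step pipeline described above can be sketched with Python generators. This is not Mélodium code and not how the runtime is implemented. The names decode, llm_stream, encode and chat are hypothetical, and the LLM is replaced by a stand-in that echoes the prompt word by word. The sketch also simplifies by collecting the whole request body before decoding.

```python
from typing import Iterable, Iterator


def decode(chunks: Iterable[bytes]) -> str:
    """Join raw request bytes and decode them as UTF-8 text."""
    return b"".join(chunks).decode("utf-8")


def llm_stream(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for the LLM: yields tokens one at a time.

    A real backend would yield tokens as the provider sends them; here the
    prompt is simply echoed back word by word.
    """
    for word in prompt.split():
        yield word + " "


def encode(tokens: Iterable[str]) -> Iterator[bytes]:
    """Convert each token to UTF-8 bytes as soon as it is available."""
    for token in tokens:
        yield token.encode("utf-8")


def chat(request_body: Iterable[bytes]) -> Iterator[bytes]:
    """Chain decode -> llm_stream -> encode, one stage per pipeline step."""
    return encode(llm_stream(decode(request_body)))


if __name__ == "__main__":
    # Each chunk is printed as soon as the previous stage yields it.
    for chunk in chat([b"What is ", b"M\xc3\xa9lodium?"]):
        print(chunk)
```

Because every stage is a generator, a chunk reaches the consumer as soon as the stage before it yields, which loosely mirrors how tokens flow from llmStream through encode into the response without an intermediate buffer.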
Dependencies
[dependencies]
std = "0.10.1" # core flows, logging, data structures
http = "0.10.1" # HTTP server and client
net = "0.10.1" # IP address helpers
encoding = "0.10.1" # UTF-8 encode / decode
ml = "0.10.1" # LLM, STT, TTS and local model inference