This repository contains a Python client script for interacting with an LLM served by `llama-server` over its HTTP API. The client sends user messages and receives model responses through various API routes (e.g. `/completions` or `/v1/chat/completions`), with support for streaming responses.
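
For example, a minimal non-streaming request to the OpenAI-compatible `/v1/chat/completions` route might look like the sketch below. It assumes the `requests` package is installed and that `llama-server` is listening on its default address, `http://localhost:8080`; the URL and message are illustrative:

```python
import requests

# Assumed local llama-server address; 8080 is the default port,
# adjust to match your own setup.
BASE_URL = "http://localhost:8080"

def chat(message: str) -> str:
    """Send a single user message to the OpenAI-compatible chat route."""
    resp = requests.post(
        f"{BASE_URL}/v1/chat/completions",
        json={"messages": [{"role": "user", "content": message}]},
        timeout=60,
    )
    resp.raise_for_status()
    # The response follows the OpenAI chat-completions schema.
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Hello! What can you do?"))
```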
It is intended for testing, experimenting, and interacting with a local llama.cpp server. This is a simple chat-client implementation meant to show how to work with the API and process messages.
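
Streaming uses server-sent events: with `"stream": true` in the request body, the server emits `data:` lines and terminates the stream with `data: [DONE]`. A minimal sketch of consuming that stream, under the same assumptions as above:

```python
import json
import requests

BASE_URL = "http://localhost:8080"  # assumed local llama-server address

def chat_stream(message: str):
    """Yield response fragments as they arrive via server-sent events."""
    with requests.post(
        f"{BASE_URL}/v1/chat/completions",
        json={
            "messages": [{"role": "user", "content": message}],
            "stream": True,
        },
        stream=True,
        timeout=60,
    ) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            # SSE lines look like: data: {...}; skip keep-alives and blanks.
            if not line or not line.startswith(b"data: "):
                continue
            payload = line[len(b"data: "):]
            if payload == b"[DONE]":
                break
            delta = json.loads(payload)["choices"][0]["delta"]
            if "content" in delta:
                yield delta["content"]

if __name__ == "__main__":
    for chunk in chat_stream("Tell me a short joke."):
        print(chunk, end="", flush=True)
    print()
```

Printing each fragment as it arrives is what gives the familiar token-by-token chat experience instead of waiting for the full response.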