2.2.1 Backend: Ollama
Handle: ollama
URL: http://localhost:33821
Ergonomic wrapper around llama.cpp with plenty of QoL features.
Ollama is one of the default services, so you don't need to specify anything special to start it.
# Will pull images and start the service
harbor up
# [Optional] monitor the logs
harbor logs ollama
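For a quick sanity check that the service is reachable, you can also query Ollama's HTTP API on the default host port shown above (configurable via OLLAMA_HOST_PORT, see below):
# Should return the running Ollama version as JSON
curl http://localhost:33821/api/version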
- Harbor will automatically connect ollama to Open WebUI
- Harbor's ollama instance is configured to pull default models (see configuration below)
- By default, the Ollama cache is stored in the ~/.ollama directory, but you can configure it to match your host's Ollama cache location (see OLLAMA_CACHE in configuration below)
You can discover new models via Ollama's model library.
Management of the models is possible right from the Open WebUI Admin Settings. The models are stored in the global ollama cache on your local machine.
Alternatively, you can use the ollama CLI itself.
# Show the list of available models
harbor ollama list
harbor ollama ls
# Pull a new model
harbor ollama pull phi4
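The same passthrough works for any other Ollama command, for example running or removing the model you just pulled:
# Chat with the model interactively
harbor ollama run phi4
# Remove it when it's no longer needed
harbor ollama rm phi4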
You can also pull models directly from the HuggingFace Hub.
# [Optional] will open HF model search with "llama gguf" query
harbor hf find llama gguf
# Pull the model
# "hf.co" is a special prefix telling Ollama to pull from HuggingFace
# "unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF" is the HuggingFace repository name
# "Q8_0" points to the "DeepSeek-R1-Distill-Llama-8B-Q8_0.gguf" file in the repository
harbor ollama pull hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q8_0
If you encounter any issues with the ollama service, you can check the logs with harbor logs ollama.
# Check the logs
harbor logs ollama
If you need to restart the service, you can do it with harbor restart ollama.
# Restart the service
harbor restart ollama
You'll find many more examples and scenarios covered in the troubleshooting guide.
You can use the full ollama CLI when the ollama service is running.
# Ollama service should be running to access the cli
harbor ollama --help
# See the environment variables
# supported by ollama service
harbor ollama serve --help
# Access Ollama CLI commands
harbor ollama version
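Other Ollama CLI subcommands work the same way, for example inspecting a model or checking which models are currently loaded:
# Show a model's parameters, template and license
harbor ollama show phi4
# List models currently loaded into memory
harbor ollama ps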
See the environment variables configuration guide.
# Env variable reference
harbor ollama serve --help
In addition to that, some high-level configuration options are available via harbor config:
# See all harbor config properties for ollama
harbor config ls | grep OLLAMA
# Configure ollama version, accepts a docker tag
harbor config set ollama.version 0.3.7-rc5-rocm
# Available configuration options
# Location of Ollama's own cache, either global
# or relative to $(harbor home)
OLLAMA_CACHE ~/.ollama
# The port on the host machine where the ollama service
# will be available
OLLAMA_HOST_PORT 33821
# The Docker image to use
OLLAMA_VERSION latest
# This URL will be given to the services connected to Ollama
# you can substitute with your own custom URL to switch
# Harbor to your own Ollama deployment
OLLAMA_INTERNAL_URL http://ollama:11434
# A comma-separated list of models to pull/update on startup
# The default value is also used in default configs for dependent services
OLLAMA_DEFAULT_MODELS mxbai-embed-large:latest
# Maps to the "OLLAMA_CONTEXT_LENGTH" environment variable
# for the container and sets global default ctx length for all models.
# Don't use "harbor config" to set this as it won't be propagated to the env var.
OLLAMA_CONTEXT_LENGTH 4096
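The lowercase dotted keys used by harbor config map to the uppercase variables listed above. As a hedged example, the startup model list could be changed like this (the ollama.default_models key name is assumed by analogy with ollama.version shown earlier):
# Pull/update these models on every startup
# (key name assumed to map to OLLAMA_DEFAULT_MODELS)
harbor config set ollama.default_models "mxbai-embed-large:latest,phi4:latest"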
Follow Harbor's environment variables guide to set arbitrary variables for the ollama service.
Retrieve the endpoint for the ollama service with:
# For the host machine
harbor url ollama
# For the LAN
harbor url -a ollama
# For the Docker network
harbor url -i ollama
Additionally, you can find a small HTTP playbook in the http-catalog folder.
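These URLs can be combined with Ollama's HTTP API directly; for example, listing the locally available models:
# List local models via the Ollama API
curl "$(harbor url ollama)/api/tags"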
You can create custom modelfiles to run models with specific parameters. Harbor will run Ollama CLI within the same folder you called it from, so all the files will be available.
# 1. Create a new modelfile at a convenient location
touch mymodel.Modelfile
# 2. Edit the modelfile with your favorite editor
open mymodel.Modelfile
# 3. Import the model into Ollama
harbor ollama create -f mymodel.Modelfile mymodel
The most common use case is to override default Ollama parameters with custom values, for example to increase the default context length.
You can also source your modelfile from an existing one to use as a template.
harbor ollama show modelname:latest --modelfile > mymodel.modelfile
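As a minimal sketch, a modelfile that raises the context length of an existing model could look like the following (the model name, file name, and values are illustrative):
# Sample "phi4-16k.Modelfile": reuse an existing model
# and raise its context window
FROM phi4:latest
PARAMETER num_ctx 16384
# Import and run it
harbor ollama create -f phi4-16k.Modelfile phi4-16k
harbor ollama run phi4-16k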
Starting from Ollama v0.5.13, you can set a global default context length for all models. To do so with Harbor, you have two options:
# Option 1. Use CLI alias
# - Get current config value
harbor ollama ctx
# - Set the new value (4096 in this case)
harbor ollama ctx 4096
# Option 2. Use "harbor env"
# - Get the current value
harbor env ollama OLLAMA_CONTEXT_LENGTH
# - Set the new value (4096 in this case)
harbor env ollama OLLAMA_CONTEXT_LENGTH 4096
Note that harbor ollama ctx is synchronised to the env, but not vice versa.
It's possible to switch all related services in Harbor to use an external Ollama instance. To do so, change the OLLAMA_INTERNAL_URL configuration.
Note
The URL is relative to the Docker network; use host.docker.internal or 172.17.0.1 instead of localhost.
# Set the custom Ollama URL
harbor config set ollama.internal_url http://172.17.0.1:11434
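After changing the URL, the dependent services need to pick up the new value, for example by restarting the stack:
# Re-create the services so they reconnect to the new Ollama URL
harbor restart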
By default, Ollama stores the models in the ~/.ollama directory. You can change the location by setting the OLLAMA_CACHE environment variable.
# Set the custom cache location
harbor config set ollama.cache /path/to/custom/cache
The path must be either absolute or relative to the harbor home directory.
When the model is compatible with Ollama's HF integration, pull it directly:
harbor ollama pull hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q8_0
Tip
You can then copy the model to have a more convenient ID:
harbor ollama cp hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q8_0 r1-8b
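After copying, the shorter ID can be used anywhere the original one could:
# Run the model under its new, shorter name
harbor ollama run r1-8b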
When it's not compatible, import it directly from a file:
# 1. Download the model
harbor hf download flowaicom/Flow-Judge-v0.1-GGUF
# 2. Locate the gguf file
# The gguf file is located in the model directory
harbor find Flow-Judge-v0.1 | grep .gguf
# /home/user/.cache/huggingface/hub/models--flowaicom--Flow-Judge-v0.1-GGUF/snapshots/3ca...575/flow-judge-v0.1-Q4_K_M.gguf
# 3. Translate the path
# Harbor mounts HF cache to Ollama service
# /home/user/.cache/huggingface -> /root/.cache/huggingface
# The path becomes:
# /root/.cache/huggingface/hub/models--flowaicom--Flow-Judge-v0.1-GGUF/snapshots/3ca...575/flow-judge-v0.1-Q4_K_M.gguf
# 4. Create a modelfile
# You can use any convenient folder to store modelfiles
# By default, Harbor has a directory for modelfiles: ollama/modelfiles
# Below are a few options for quickly accessing the directory
harbor vscode # Open Harbor workspace in VS Code and go from there
open $(harbor home) # Open Harbor workspace in default file manager
open $(harbor home)/ollama/modelfiles # This is the directory for modelfiles
code $(harbor home)/ollama/modelfiles # Open the directory in VS Code
# 5. Sample modelfile contents
# TIP: Use original base modelfile as a reference:
# harbor ollama show --modelfile <model name>
# Save as "<your name>.Modelfile" in the modelfiles directory
FROM /root/.cache/huggingface/hub/models--flowaicom--Flow-Judge-v0.1-GGUF/snapshots/3ca...575/flow-judge-v0.1-Q4_K_M.gguf
# 6. Create the model
# 6.1. From Harbor's modelfiles directory
harbor ollama create -f /modelfiles/<your name>.Modelfile <your name>
# 6.2. From current directory
harbor ollama create -f ./<your name>.Modelfile <your name>
# Successful output example
# 13:27:37 [INFO] Service ollama is running. Executing command...
# transferring model data 100%
# using existing layer sha256:939...815
# creating new layer sha256:aaa...169
# writing manifest
# success
# 7. Check the model
harbor ollama run <your name>
# 8. To upload to ollama.com, follow official tutorial
# on sharing the models:
# https://github.com/ollama/ollama/blob/main/docs/import.md#sharing-your-model-on-ollamacom
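As an optional extra check, the imported model should also show up in the regular model list:
# The new model should be listed alongside the others
harbor ollama ls | grep "<your name>"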