
Commit ba27d9c

chore: v0.3.4
1 parent: 546aaf4

13 files changed: +65 −20 lines

.scripts/seed.ts (+1 −1)

@@ -6,7 +6,7 @@ import * as toml from 'jsr:@std/toml';
 import * as path from 'jsr:@std/path';
 import * as collections from "jsr:@std/collections/deep-merge";

-const VERSION = "0.3.3";
+const VERSION = "0.3.4";

 type ValueSeed = {
   // Path relative to the project root
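
`.scripts/seed.ts` appears to be the script that propagates values such as `VERSION` into the manifests bumped in this commit. A sketch of invoking it, assuming a Deno runtime (implied by the `jsr:` import specifiers); the `-A` permissions flag is an assumption, not something documented in this diff:

```bash
# jsr: specifiers imply Deno; -A grants the file access a seeding
# script typically needs (exact flags are an assumption)
deno run -A .scripts/seed.ts
```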

app/package.json (+1 −1)

@@ -1,7 +1,7 @@
 {
   "name": "@avcodes/harbor-app",
   "private": true,
-  "version": "0.3.3",
+  "version": "0.3.4",
   "type": "module",
   "scripts": {
     "dev": "vite",

app/src-tauri/Cargo.toml (+1 −1)

@@ -1,7 +1,7 @@

 [package]
 name = "harbor-app"
-version = "0.3.3"
+version = "0.3.4"
 description = "A companion app for Harbor LLM toolkit"
 authors = ["av"]
 edition = "2021"

app/src-tauri/tauri.conf.json (+1 −1)

@@ -1,7 +1,7 @@
 {
   "$schema": "https://schema.tauri.app/config/2.0.0-rc",
   "productName": "Harbor",
-  "version": "0.3.3",
+  "version": "0.3.4",
   "identifier": "com.harbor.app",
   "build": {
     "beforeDevCommand": "bun run dev",

app/src/serviceMetadata.ts (+2 −1)

@@ -477,6 +477,7 @@ export const serviceMetadata: Record<string, Partial<HarborService>> = {
   llamaswap: {
     name: 'llama-swap',
     tags: [HST.satellite, HST.api],
-    wikiUrl: '',
+    wikiUrl: 'https://github.com/av/harbor/wiki/2.3.40-Satellite-llamaswap',
+    tooltip: 'Runs multiple llama.cpp servers on demand for seamless switching between them.',
   }
 };

boost/README.md (+24 −5)

@@ -1,7 +1,7 @@
-> Handle: `boost`
+> Handle: `boost`<br/>
 > URL: <http://localhost:34131/>

-![Screenshot of boost bench](../docs/boost.png) <small>`g1` and `rcn` optimizer modules compared to original LLMs. [BBH256](https://gist.github.com/av/18cc8138a0acbe1b30f51e8bb19add90) task, run with [Harbor Bench](../docs/5.1.-Harbor-Bench)</small>
+![splash](../docs/harbor-boost.png)

 `boost` is a service that acts as an optimizing LLM proxy. It takes your inputs, and pre-processes them before sending them to the downstream API.

@@ -12,14 +12,21 @@ Features that make Harbor's `boost` special:
 * 🎭 `boost` can serve as a plain LLM proxy (multiple downstream APIs behind a single endpoint)
 * ✍️ `boost` is scriptable, you can write your own modules

+![Short overview of boost behavior](../docs/boost-behavior.png)
+
 The main focus, of course are the workflows that can help improve the LLM output in specific scenarios. Here are some examples of what's possible with `boost`:

-* When "random" is mentioned in the message, `klmbr` will rewrite 35% of message characters to increase the entropy and produce more diverse completion
-* Launch self-reflection reasoning chain when the message ends with a question mark
+* Add R1-like reasoning to [any LLM](https://www.reddit.com/r/LocalLLaMA/comments/1ixckba/making_older_llms_llama_2_and_gemma_1_reason/)
+* When "random" is mentioned in the message, [`klmbr`](#klmbr---boost-llm-creativity) will rewrite 35% of message characters to increase the entropy and produce more diverse completion
+* Launch [self-reflection reasoning](#rcn---recursive-certainty-validation) chain when the message ends with a question mark
 * Expand the conversation context with the "inner monologue" of the model, where it can iterate over your question a few times before giving the final answer
 * Apply a specific LLM personality if the message contains a specific keyword
+* Add external memory to your interactions with a specific model
+* Make your LLM [pass a skill check](https://www.reddit.com/r/LocalLLaMA/comments/1jaqylp/llm_must_pass_a_skill_check_to_talk_to_me/) before replying to you
+
+Boost is scriptable, you can provision your own modules with the workflows suitable for your needs. See [Custom Modules](#custom-modules) section for more information.

-Moreover, boost is scriptable, you can provision your own modules with the workflows suitable for your needs. See [Custom Modules](#custom-modules) section for more information.
+![Screenshot of boost bench](../docs/boost.png) <small>`g1` and `rcn` optimizer modules compared to original LLMs. [BBH256](https://gist.github.com/av/18cc8138a0acbe1b30f51e8bb19add90) task, run with [Harbor Bench](../docs/5.1.-Harbor-Bench)</small>

 `boost` operates at the OpenAI-compatible API level, so can be used with any LLM backend that accepts OpenAI API requests. You can also plug `boost` into the UIs that are compatible with OpenAI API.

@@ -43,6 +50,7 @@ Moreover, boost is scriptable, you can provision your own modules with the workf
 * [`supersummer` - Super Summarization](#supersummer---super-summarization)
 * [`r0` - R1-like reasoning chains](#r0---r1-like-reasoning-chains)
 * [`markov` - token completion graph](#markov---token-completion-graph)
+* [`dnd` - skill check](#dnd---skill-check)
 * Custom Modules (not configurable, mostly examples, but can still be enabled)
   * [discussurl](https://github.com/av/harbor/blob/main/boost/src/custom_modules/discussurl.py) - parse mentioned URLs and add them to the context
   * [meow](https://github.com/av/harbor/blob/main/boost/src/custom_modules/meow.py) - the model ignores all previous instructions and just meows

@@ -510,6 +518,17 @@

 There's no configuration for this module yet.

+#### `dnd` - skill check
+
+⚠️ This module is experimental and only compatible with Open WebUI as a client due to its support of custom artifacts.
+
+When serving the completion, LLM will first invent a skill check it must pass to address your message. Then, the workflow will roll a dice determining if the model passes the check or not and will guide the model to respond accordingly.
+
+```bash
+# Enable the module
+harbor boost modules add dnd
+```
+
 ### API

 `boost` works as an OpenAI-compatible API proxy. It'll query configured downstream services for which models they serve and provide "boosted" wrappers in its own API.
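
Since `boost` speaks the OpenAI-compatible API, the behavior described above can be exercised with plain `curl`. A minimal sketch, assuming boost runs on its default port 34131 (from the README header) and exposes the standard `/v1` paths; the `dnd-llama3.1:8b` model ID is hypothetical, as boosted wrapper names depend on your downstream models:

```bash
# List the "boosted" model wrappers that boost exposes
curl http://localhost:34131/v1/models

# Request a completion through a boosted model; the model ID below is
# hypothetical - pick a real one from the /v1/models response
curl http://localhost:34131/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "dnd-llama3.1:8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```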

compose.x.llamaswap.cdi.yml (+12)

@@ -0,0 +1,12 @@
+# This file is generated by seed-cdi.ts script,
+# any updates will be overwritten.
+services:
+  llamaswap:
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: cdi
+              capabilities: [gpu]
+              device_ids:
+                - nvidia.com/gpu=all
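
This new override requests GPUs via the Container Device Interface rather than the legacy `nvidia` runtime, so the host must have CDI specs registered for a name like `nvidia.com/gpu=all` to resolve. A sketch for verifying that, assuming the NVIDIA Container Toolkit is installed on the host:

```bash
# List CDI devices registered with the NVIDIA Container Toolkit
nvidia-ctk cdi list

# If the list is empty, (re)generate the CDI spec on the host
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```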

compose.x.traefik.llamaswap.yml (+13)

@@ -0,0 +1,13 @@
+# This file is generated by seed-traefik.ts script,
+# any updates will be overwritten.
+services:
+  llamaswap:
+    labels:
+      - "traefik.enable=true"
+      - "traefik.http.routers.llamaswap.rule=Host(`llamaswap.${HARBOR_TRAEFIK_DOMAIN}`)"
+      - "traefik.http.services.llamaswap.loadbalancer.server.port=${HARBOR_LLAMASWAP_HOST_PORT}"
+      - "traefik.http.routers.llamaswap.entrypoints=web"
+      - "traefik.http.routers.llamaswap.service=llamaswap"
+
+    networks:
+      - traefik-public
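
These labels route any request whose Host header matches `llamaswap.<HARBOR_TRAEFIK_DOMAIN>` through Traefik's `web` entrypoint to the port named in `HARBOR_LLAMASWAP_HOST_PORT`. A sketch for testing the rule without DNS, assuming the hypothetical domain `harbor.local` and the common default of the `web` entrypoint listening on port 80; the `/v1/models` path assumes llama-swap's OpenAI-compatible surface:

```bash
# Override the Host header so the llamaswap router rule matches;
# harbor.local is a placeholder for your HARBOR_TRAEFIK_DOMAIN
curl -H "Host: llamaswap.harbor.local" http://localhost:80/v1/models
```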

docs/2.-Services.md (+5 −5)

@@ -47,15 +47,12 @@ Visual programming for AI language models
 - [Open WebUI](https://github.com/av/harbor/wiki/2.1.1-Frontend:-Open-WebUI) <span style="opacity: 0.5;">`Frontend`</span><br/>
 widely adopted and feature rich web interface for interacting with LLMs. Supports OpenAI-compatible and Ollama backends, multi-users, multi-model chats, custom prompts, TTS, Web RAG, RAG, and much much more.

-- [oterm](https://github.com/av/harbor/wiki/2.1.12-Frontend-oterm) <span style="opacity: 0.5;">`CLI`, `Frontend`</span><br/>
+- [oterm](https://github.com/av/harbor/wiki/2.1.12-Frontend-oterm) <span style="opacity: 0.5;">`Frontend`, `CLI`</span><br/>
 The text-based terminal client for Ollama.

 - [Parllama](https://github.com/av/harbor/wiki/2.1.7-Frontend:-parllama) <span style="opacity: 0.5;">`Frontend`</span><br/>
 TUI for Ollama

-- [RAGLite](https://github.com/av/harbor/wiki/2.3.39-Satellite-RAGLite) <span style="opacity: 0.5;">`Satellite`, `Frontend`</span><br/>
-Python toolkit for Retrieval-Augmented Generation (RAG)
-
 # Backends

 This section covers services that provide the LLM inference capabilities.

@@ -172,6 +169,9 @@ LLM proxy that can aggregate multiple inference APIs together into a single endp
 - [LitLytics](https://github.com/av/harbor/wiki/2.3.21-Satellite:-LitLytics) <span style="opacity: 0.5;">`Satellite`, `Partial Support`, `Workflows`</span><br/>
 Simple analytics platform that leverages LLMs to automate data analysis.

+- [llama-swap](https://github.com/av/harbor/wiki/2.3.40-Satellite-llamaswap) <span style="opacity: 0.5;">`Satellite`, `API`</span><br/>
+Runs multiple llama.cpp servers on demand for seamless switching between them.
+
 - [lm-evaluation-harness](https://github.com/av/harbor/wiki/2.3.17-Satellite:-lm-evaluation-harness) <span style="opacity: 0.5;">`Satellite`, `CLI`, `Eval`</span><br/>
 A de-facto standard framework for the few-shot evaluation of language models.

@@ -208,7 +208,7 @@ Test your prompts, agents, and RAGs. A developer-friendly local tool for testing
 - [Qdrant](https://github.com/av/harbor/wiki/2.3.26-Satellite:-Qdrant) <span style="opacity: 0.5;">`Satellite`</span><br/>
 Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine.

-- [RAGLite](https://github.com/av/harbor/wiki/2.3.39-Satellite-RAGLite) <span style="opacity: 0.5;">`Satellite`, `Frontend`</span><br/>
+- [RAGLite](https://github.com/av/harbor/wiki/2.3.39-Satellite-RAGLite) <span style="opacity: 0.5;">`Satellite`</span><br/>
 Python toolkit for Retrieval-Augmented Generation (RAG)

 - [Repopack](https://github.com/av/harbor/wiki/2.3.22-Satellite:-Repopack) <span style="opacity: 0.5;">`Satellite`, `CLI`</span><br/>

docs/2.3.40-Satellite-llamaswap.md (+1 −1)

@@ -3,7 +3,7 @@
 > Handle: `llamaswap`<br/>
 > URL: [http://localhost:34401](http://localhost:34401)

-llama-swap is a light weight, transparent proxy server that provides automatic model swapping to llama.cpp's server.
+llama-swap is a lightweight, transparent proxy server that provides automatic model swapping to llama.cpp's server.

 ### Starting

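Per the description above, llama-swap presents a single OpenAI-compatible endpoint and starts or swaps the underlying llama.cpp server on demand. A sketch of trying it out, assuming Harbor's usual `harbor up <service>` convention and the port 34401 from the doc header:

```bash
# Start the llamaswap service (assumes the usual `harbor up` convention)
harbor up llamaswap

# List available models; requesting a completion for one of them makes
# llama-swap launch (or swap to) the matching llama.cpp server
curl http://localhost:34401/v1/models
```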

harbor.sh (+1 −1)

@@ -4162,7 +4162,7 @@ run_gptme_command() {
 # ========================================================================

 # Globals
-version="0.3.3"
+version="0.3.4"
 harbor_repo_url="https://github.com/av/harbor.git"
 harbor_release_url="https://api.github.com/repos/av/harbor/releases/latest"
 delimiter="|"

package.json (+1 −1)

@@ -1,6 +1,6 @@
 {
   "name": "@avcodes/harbor",
-  "version": "0.3.3",
+  "version": "0.3.4",
   "description": "Effortlessly run LLM backends, APIs, frontends, and services with one command.",
   "private": false,
   "author": "av <[email protected]> (https://av.codes)",

pyproject.toml (+2 −2)

(Large diffs are not rendered by default.)
