A wrapper around llama.cpp that provides an HTTP server with verifiable inference capabilities. Blama enables verifiable AI inference, ensuring transparency and trust in model outputs.
- High Performance: Built on top of the optimized llama.cpp engine
- RESTful API: Easy-to-use HTTP server interface
- Model Support: Compatible with GGUF format models
- C++ compiler with C++17 support
- CMake 3.14+
- Git
- Start the server:
./blama-server path/to/your/model.gguf
- Make complete text requests:
curl -X POST http://localhost:7331/complete \
-H "Content-Type: application/json" \
-d '{
"prompt": 'The first man to',
"max_tokens": 100
}'
- Verify completion results:
curl -X POST http://localhost:7331/verify_completion \
-H "Content-Type: application/json" \
-d '{
"request": <Here should be added the request to /complete>,
"response": <Here should be added the response from /complete>
}'
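The two endpoints above can be chained programmatically: save the body you sent to `/complete` together with the body it returned, and submit both to `/verify_completion`. A minimal sketch in Python, assuming the field names shown in the curl examples (the `token_steps` key and the response layout here are placeholders, not Blama's actual schema):

```python
import json

def build_verify_payload(request: dict, response: dict) -> str:
    """Wrap the original /complete request and its response into the
    body expected by /verify_completion, per the curl example above."""
    return json.dumps({"request": request, "response": response})

# The request body from the /complete example above.
complete_request = {"prompt": "The first man to", "max_tokens": 100}

# In a real run this would be the JSON body returned by
# POST http://localhost:7331/complete; the schema is assumed here.
complete_response = {"text": "...", "token_steps": []}

payload = build_verify_payload(complete_request, complete_response)
print(payload)
```

The resulting `payload` string is what you would POST to `http://localhost:7331/verify_completion`.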
Read more in the document here.
Blama implements a verification system that checks model predictions against the output logits recorded at each token generation step.
- Each inference request generates an array of token generation steps. Each step carries an array of logits (top 10) taken from the context.
- The same request + response is then sent back for verification.
- Each verification request creates the same model and fills the context with the response's token steps. During context filling, the same token steps are produced again, but with the logits from the current context.
- The logits from the request are compared against those returned during context filling. The algorithm can be checked here.
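The comparison in the last step can be sketched as follows. This is an illustrative sketch only, not Blama's actual algorithm (see the linked source for that); the data layout (a list of steps, each a list of `(token_id, logit)` pairs) and the tolerance value are assumptions:

```python
def logits_match(recorded_steps, replayed_steps, tol=1e-3):
    """Compare the top-10 logits recorded at generation time with the
    logits produced while re-filling the context during verification.

    recorded_steps / replayed_steps: list of steps, one per generated
    token; each step is a list of (token_id, logit) pairs.
    """
    if len(recorded_steps) != len(replayed_steps):
        return False
    for recorded, replayed in zip(recorded_steps, replayed_steps):
        # A full implementation would also reject steps whose
        # top-k lists differ in length.
        for (tok_a, logit_a), (tok_b, logit_b) in zip(recorded, replayed):
            if tok_a != tok_b or abs(logit_a - logit_b) > tol:
                return False
    return True

steps = [[(42, 10.5), (7, 9.8)], [(100, 12.1)]]
print(logits_match(steps, steps))                                    # → True
print(logits_match(steps, [[(42, 10.5), (7, 5.0)], [(100, 12.1)]]))  # → False
```

A small tolerance is used rather than exact equality because floating-point logits can differ slightly across hardware and build configurations.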
- Any GGUF-format model supported by llama.cpp
# List available presets
cmake --list-presets
# Configure with a preset
cmake --preset debug
# Build with a preset
cmake --build --preset debug
- llama.cpp for the high-performance inference engine
- Meta AI for the Llama model architecture
- The open source community for contributions and feedback
- Issues: GitHub Issues
Note: This project is under active development. APIs may change between versions.