
Cross-platform GPU LLM inference with WebGPU and wgmath.

License: Apache-2.0 (LICENSE-APACHE.txt) or MIT (LICENSE-MIT.txt)

dimforge/wgml


wgml − local GPU inference on every platform

crates.io


wgml is a set of Rust libraries exposing WebGPU shaders and kernels for local inference of Large Language Models (LLMs) on the GPU. It is cross-platform and runs on the web. wgml can be used as a Rust library to assemble your own transformer from the provided operators (and to write your own operators on top of it).

Aside from the library, two binary crates are provided:

  • wgml-bench is a basic benchmarking utility for measuring matrix-multiplication times across various quantization formats.
  • wgml-chat is a basic chat GUI application for loading GGUF files and chatting with the model. It can be run natively or in the browser. Check out its README for details on how to run it. You can also try it from your browser with the online demo.

⚠️ wgml is still under heavy development and may be lacking some important features. Contributions are welcome!

