Interest in Project 14 - Accelerating Inference of NNCF-Compressed LLMs with Triton #29520
arkhamHack started this conversation in Google Summer of Code
Hi @alexsu52 @AlexanderDokuchaev,
I am Avigyan Sinha, a junior ML developer; I graduated last year. This project looks really interesting, and I’d love to understand more about its scope and technical expectations.
I have worked with CUDA kernels as well as LLM pipelines and architectures, and I hope to contribute well to this project; I would appreciate some guidance on getting started.
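To give a concrete sense of the kind of kernel work I mean, here is a small toy sketch I put together (my own illustration, not NNCF's actual kernels or compressed-weight layout): a Triton kernel that dequantizes INT8 weights with a per-tensor scale and zero point into fp16, which is roughly the shape of the on-the-fly dequantization that compressed-LLM inference needs.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def dequant_int8_kernel(q_ptr, out_ptr, scale, zero_point, n_elements,
                        BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of elements.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    # Load quantized INT8 weights, masking out-of-range lanes.
    q = tl.load(q_ptr + offsets, mask=mask, other=0)
    # Dequantize: x = (q - zero_point) * scale, then cast down to fp16.
    x = ((q.to(tl.float32) - zero_point) * scale).to(tl.float16)
    tl.store(out_ptr + offsets, x, mask=mask)


def dequant_int8(q: torch.Tensor, scale: float, zero_point: float) -> torch.Tensor:
    # q: contiguous INT8 CUDA tensor; returns an fp16 tensor of the same shape.
    out = torch.empty_like(q, dtype=torch.float16)
    n = q.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    dequant_int8_kernel[grid](q, out, scale, zero_point, n, BLOCK_SIZE=1024)
    return out
```

In a real kernel the dequantization would of course be fused with the matmul and follow whatever group-wise scale/zero-point layout NNCF emits; this is only meant to show familiarity with the programming model.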
There are a few things I’d like to clarify about the project as well.
This project lines up well with my interest in LLM optimization and performance engineering, and I’d love to contribute to making compressed-model inference faster across different hardware platforms. Looking forward to your thoughts!
Best,
Avigyan Sinha