Sandermage

🏠 Working from home

4 followers · 1 following

Popular repositories

genesis-vllm-patches (Public)

Production-grade runtime patches for vLLM (45+ patches) — Qwen3.6-35B-A3B-FP8 hybrid GDN+MoE on NVIDIA Ampere (SM 80-86). 127 tok/s MTP free-form, 99 tok/s suffix tool-call (max 175). TurboQuant k8…

Python · 29 stars · 1 fork

vllm (Public)

Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python