llm-d

llm-d enables high-performance distributed inference in production on Kubernetes

Welcome to llm-d: a Kubernetes-native high-performance distributed LLM inference framework


Built on vLLM, Kubernetes, and the Inference Gateway, llm-d offers modular solutions for distributed inference, including KV-cache-aware routing and disaggregated (prefill/decode) serving, with the aim of fast time-to-value and competitive performance per dollar.
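
To make "KV-cache-aware routing" concrete: the scheduler steers each request toward the replica that already holds the longest matching prefix of the prompt in its KV cache, so prefill work can be skipped. The sketch below is purely illustrative, with hypothetical names and arbitrary weights; llm-d's actual scheduler is implemented as an Inference Gateway extension and weighs many more signals.

```python
# Illustrative sketch of KV-cache-aware routing (not llm-d's real code).
# Idea: prefer the replica whose KV cache already covers the longest
# prefix of the incoming prompt, balanced against its current load.
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    queue_depth: int                          # outstanding requests
    cached_blocks: set[str] = field(default_factory=set)  # prefix-block hashes


def block_hashes(prompt: str, block_size: int = 16) -> list[str]:
    """Hash growing prefixes in fixed-size steps, mirroring how vLLM
    tracks its KV cache in per-block units keyed by prefix."""
    return [str(hash(prompt[:end]))
            for end in range(block_size, len(prompt) + 1, block_size)]


def score(replica: Replica, prompt: str) -> float:
    blocks = block_hashes(prompt)
    hits = 0
    for h in blocks:                          # only consecutive prefix hits count
        if h not in replica.cached_blocks:
            break
        hits += 1
    cache_score = hits / max(len(blocks), 1)  # fraction of prompt already cached
    load_penalty = 0.1 * replica.queue_depth  # arbitrary weight for this sketch
    return cache_score - load_penalty


def route(replicas: list[Replica], prompt: str) -> Replica:
    """Pick the replica with the best cache-hit/load trade-off."""
    return max(replicas, key=lambda r: score(r, prompt))


pods = [Replica("pod-0", queue_depth=2), Replica("pod-1", queue_depth=0)]
pods[0].cached_blocks.update(block_hashes("You are a helpful assistant. "))
print(route(pods, "You are a helpful assistant. What is llm-d?").name)  # pod-0
```

A production scorer also accounts for session affinity and prefill/decode roles, but the prefix-matching intuition carries over.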

🚀 Quick Start Guide

New to llm-d? Here's how to get started:

  1. Join our Slack 💬 - Get your invite and visit llm-d.slack.com
  2. Explore our code 📂 - Browse the GitHub Organization
  3. Join a meeting 📅 - Add the community calendar
  4. Pick your area 🎯 - Browse the Special Interest Groups

📚 Key Resources

💬 Communication Channels

🗓️ Regular Meetings

All meetings are open to the public! 🌟

  • 📅 Weekly Standup: Every Wednesday at 12:30pm ET - Project updates and open discussion
  • 🎯 SIG Meetings: Various times throughout the week - See SIG details for schedules

Join to participate, ask questions, or just listen and learn!

🎯 Special Interest Groups (SIGs)

Want to dive deeper into specific areas? Our Special Interest Groups are focused teams working on different aspects of llm-d:

  • Inference Scheduler - Intelligent request routing and load balancing
  • Benchmarking - Performance testing and optimization
  • PD-Disaggregation - Prefill/decode separation patterns (see the sketch below)
  • KV-Disaggregation - KV caching and distributed storage
  • Installation - Kubernetes integration and deployment
  • Autoscaling - Traffic-aware autoscaling and resource management
  • Observability - Monitoring, logging, and metrics

View more SIG Details →
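
As a taste of what the PD-Disaggregation SIG works on, the sketch below splits serving into its two phases: a compute-bound prefill step that builds the KV cache for the whole prompt, and a memory-bandwidth-bound decode step that generates tokens one at a time. Separating them lets each pool be scaled and provisioned independently. Everything here is hypothetical stand-in code; in llm-d the phases run in separate vLLM pods and KV-cache blocks are transferred between them, not Python objects.

```python
# Conceptual sketch of disaggregated (prefill/decode) serving: a hypothetical,
# single-process stand-in for what really happens across separate vLLM pods.

def prefill(prompt_tokens: list[int]) -> dict:
    """Compute-bound phase: one forward pass over the full prompt
    produces the KV cache. Runs on a dedicated prefill pod."""
    return {"kv": f"<cache for {len(prompt_tokens)} prompt tokens>"}


def decode(kv_cache: dict, max_new_tokens: int) -> list[int]:
    """Bandwidth-bound phase: autoregressive generation, one token per
    step, reusing and extending the transferred KV cache."""
    generated: list[int] = []
    for step in range(max_new_tokens):
        next_token = step  # stand-in for a real model forward pass
        generated.append(next_token)
    return generated


# The router sends the prompt to a prefill worker first; then the KV
# cache (not the raw prompt) moves to a decode worker for generation.
cache = prefill(prompt_tokens=[101, 2023, 2003, 102])
tokens = decode(cache, max_new_tokens=8)
print(len(tokens), "tokens generated")
```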

🤝 How to Contribute

Getting Involved

Contributing Code

  1. Read Guidelines: Review our Code of Conduct and contribution process
  2. Sign Commits: All commits require DCO sign-off; run git commit -s to add the Signed-off-by trailer automatically

Ways to Contribute

  • 🐛 Bug fixes and small features - Submit PRs directly to component repos
  • 🚀 New features with APIs - Submit a project proposal first
  • 📚 Documentation - Help improve guides and examples
  • 🧪 Testing & Benchmarking - Contribute to our test coverage
  • 💡 Experimental features - Start in llm-d-incubation org

🔒 Security & Safety

🌐 Connect With Us

Follow llm-d for updates, discussions, and community highlights on Slack, X, Bluesky, LinkedIn, Reddit, and YouTube.

❓ Need Help?

Questions? Ideas? Just want to chat? We're here to help! The llm-d community team is friendly and responsive.


License: Apache 2.0

Pinned Repositories

  1. llm-d (Shell) - Achieve state-of-the-art inference performance with modern accelerators on Kubernetes
  2. llm-d-inference-scheduler (Go) - Inference scheduler for llm-d
  3. llm-d-kv-cache (Go) - Distributed KV cache scheduling & offloading libraries
  4. llm-d-benchmark (Python) - llm-d benchmark scripts and tooling

Repositories

The organization hosts 18 public repositories. Beyond the pinned projects above, they include:
  • llm-d-workload-variant-autoscaler (Go) - Variant optimization autoscaler for distributed inference workloads
  • llm-d-prism (JavaScript) - Performance analysis for distributed inference systems
  • llm-d.github.io (JavaScript) - Website for llm-d; builds the site published at llm-d.ai
  • llm-d-inference-sim (Go) - A lightweight, configurable, real-time simulator that mimics the behavior of vLLM without GPUs or running actual heavy models
  • llm-d-python-template (Makefile) - Python project template for llm-d repos; use "Use this template" to create a new Python project with standard CI, linting, Prow, and governance
  • llm-d-inference-payload-processor (Makefile) - Inference payload processor for llm-d
