Chi Wang: Hands-On LLM Serving and Optimization, Paperback
Hands-On LLM Serving and Optimization
- Hosting LLMs at Scale
You can order this title now; it will ship to you as soon as it becomes available.
- Publisher:
- O'Reilly Media, 06/2026
- Binding:
- Paperback
- Language:
- English
- ISBN-13:
- 9798341621497
- Item number:
- 12567153
- Length:
- 300 pages
- Publication date:
- June 2, 2026
- Note
Please note: this item is not in German!
Blurb
Large language models (LLMs) are rapidly becoming the backbone of AI-driven applications. Without proper optimization, however, LLMs can be expensive to run, slow to serve, and prone to performance bottlenecks. As the demand for real-time AI applications grows, along comes Hands-On Serving and Optimizing LLM Models, a comprehensive guide to the complexities of deploying and optimizing LLMs at scale.
In this hands-on book, authors Chi Wang and Peiheng Hu take a real-world approach backed by practical examples and code, and assemble essential strategies for designing robust infrastructures that are equal to the demands of modern AI applications. Whether you're building high-performance AI systems or looking to enhance your knowledge of LLM optimization, this indispensable book will serve as a pillar of your success.
- Learn the key principles for designing a model-serving system tailored to popular business scenarios
- Understand the common challenges of hosting LLMs at scale while minimizing costs
- Pick up practical techniques for optimizing LLM serving performance
- Build a model-serving system that meets specific business requirements
- Improve LLM serving throughput and reduce latency
- Host LLMs in a cost-effective manner, balancing performance and resource efficiency