2025 OCP APAC Summit
Tuesday August 5, 2025 3:45pm - 4:00pm PDT
Inference for large language models (LLMs) is computationally intensive, and the efficient management and reuse of intermediate data, known as the KV Cache, are crucial for performance. In this presentation, we propose a novel architecture leveraging NTT's photonics-based networking technology, "IOWN APN (All-Photonics Network)," to enable low-latency, high-bandwidth sharing of large-scale KV Cache among geographically distributed data centers. By exploiting the unique capabilities of IOWN APN, the proposed KV Cache sharing system significantly improves inference throughput and power efficiency, paving the way for reduced environmental impact and more sustainable operational models for LLM inference. Through this presentation, we aim to engage the OCP community in a discussion of wide-area distributed AI computing based on open standards.
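As a rough illustration of the sharing idea in the abstract, the Python sketch below keys cached attention K/V data by a hash of the token prefix, so a site can reuse a prefix another site has already computed instead of re-running prefill. All names here (KVCacheStore, fetch_or_compute, the blob format) are hypothetical, not the presenters' API; a real system would move per-layer tensors over the photonic link rather than whole-prefix blobs.

```python
# Minimal sketch of prefix-keyed KV Cache sharing between two sites.
# Hypothetical names throughout; the talk abstract does not specify an API.
import hashlib
from dataclasses import dataclass, field


@dataclass
class KVCacheStore:
    """Per-site store mapping token-prefix hashes to opaque K/V blobs."""
    entries: dict = field(default_factory=dict)

    @staticmethod
    def prefix_key(token_ids: list[int]) -> str:
        # Stable key for a token prefix; real systems would key per
        # model, layer, and block, not per whole prefix.
        return hashlib.sha256(bytes(str(token_ids), "utf-8")).hexdigest()

    def put(self, token_ids: list[int], kv_blob: bytes) -> None:
        self.entries[self.prefix_key(token_ids)] = kv_blob

    def get(self, token_ids: list[int]) -> bytes | None:
        return self.entries.get(self.prefix_key(token_ids))


def fetch_or_compute(local: KVCacheStore, remote: KVCacheStore,
                     token_ids: list[int]) -> bytes:
    """Try the local cache, then the remote site, else recompute.

    On a low-latency, high-bandwidth link (the role IOWN APN plays in
    the proposal), the remote fetch can be cheaper than recomputation.
    """
    blob = local.get(token_ids)
    if blob is not None:
        return blob                       # local hit
    blob = remote.get(token_ids)
    if blob is not None:
        local.put(token_ids, blob)        # keep a local copy for next time
        return blob                       # remote hit over the network
    # Cache miss everywhere: stand-in for the expensive prefill pass.
    blob = b"kv:" + bytes(str(token_ids), "utf-8")
    local.put(token_ids, blob)
    return blob


if __name__ == "__main__":
    site_a, site_b = KVCacheStore(), KVCacheStore()
    prompt = [101, 2023, 2003, 1037]
    fetch_or_compute(site_a, site_b, prompt)   # computed once at site A
    fetch_or_compute(site_b, site_a, prompt)   # reused at site B, no prefill
    print("site B has prefix:", site_b.get(prompt) is not None)
```

The throughput and power-efficiency gains claimed in the abstract come from exactly this substitution: a network fetch of an existing KV Cache entry in place of recomputing the prefill, which only pays off when the inter-site link is fast enough, hence the emphasis on the all-photonics network.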
TaiNEX 2 - 701 E
