2025 OCP APAC Summit
Tuesday August 5, 2025 3:45pm - 4:00pm PDT
Inference for large language models (LLMs) is computationally intensive, and the efficient management and reuse of intermediate data, known as the KV Cache, are crucial for performance. In this presentation, we propose a novel architecture leveraging NTT's photonics-based networking technology, "IOWN APN (All-Photonics Network)," to enable low-latency, high-bandwidth sharing of large-scale KV Cache among geographically distributed data centers. By exploiting the unique capabilities of IOWN APN, the proposed KV Cache sharing system significantly improves inference throughput and power efficiency, paving the way for reduced environmental impact and more sustainable operational models for LLM inference. Through this presentation, we aim to engage the OCP community in a discussion of wide-area distributed AI computing based on open standards.
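As a rough illustration of the sharing idea in the abstract, the Python sketch below keys cached attention K/V data by a hash of the token prefix, so a site can reuse a prefix another site has already computed instead of re-running prefill. All names here (KVCacheStore, fetch_or_compute, the blob format) are hypothetical, not the presenters' API; a real system would move per-layer tensors over the photonic link rather than whole-prefix blobs.

```python
# Minimal sketch of prefix-keyed KV Cache sharing between two sites.
# Hypothetical names throughout; the talk abstract does not specify an API.
import hashlib
from dataclasses import dataclass, field


@dataclass
class KVCacheStore:
    """Per-site store mapping token-prefix hashes to opaque K/V blobs."""
    entries: dict = field(default_factory=dict)

    @staticmethod
    def prefix_key(token_ids: list[int]) -> str:
        # Stable key for a token prefix; real systems would key per
        # model, layer, and block, not per whole prefix.
        return hashlib.sha256(bytes(str(token_ids), "utf-8")).hexdigest()

    def put(self, token_ids: list[int], kv_blob: bytes) -> None:
        self.entries[self.prefix_key(token_ids)] = kv_blob

    def get(self, token_ids: list[int]) -> bytes | None:
        return self.entries.get(self.prefix_key(token_ids))


def fetch_or_compute(local: KVCacheStore, remote: KVCacheStore,
                     token_ids: list[int]) -> bytes:
    """Try the local cache, then the remote site, else recompute.

    On a low-latency, high-bandwidth link (the role IOWN APN plays in
    the proposal), the remote fetch can be cheaper than recomputation.
    """
    blob = local.get(token_ids)
    if blob is not None:
        return blob                       # local hit
    blob = remote.get(token_ids)
    if blob is not None:
        local.put(token_ids, blob)        # keep a local copy for next time
        return blob                       # remote hit over the network
    # Cache miss everywhere: stand-in for the expensive prefill pass.
    blob = b"kv:" + bytes(str(token_ids), "utf-8")
    local.put(token_ids, blob)
    return blob


if __name__ == "__main__":
    site_a, site_b = KVCacheStore(), KVCacheStore()
    prompt = [101, 2023, 2003, 1037]
    fetch_or_compute(site_a, site_b, prompt)   # computed once at site A
    fetch_or_compute(site_b, site_a, prompt)   # reused at site B, no prefill
    print("site B has prefix:", site_b.get(prompt) is not None)
```

The throughput and power-efficiency gains claimed in the abstract come from exactly this substitution: a network fetch of an existing KV Cache entry in place of recomputing the prefill, which only pays off when the inter-site link is fast enough, hence the emphasis on the all-photonics network.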
TaiNEX 2 - 701 E
