As we continue to push LLM model sizes ever larger, distributed training has become an essential component of AI infrastructure. However, as we build clusters whose power demands exceed what a single datacenter region can supply, new networking challenges emerge. This talk explores the complexities of networking in a multi-region distributed training environment, where data must be transmitted over long distances between datacenters. We will discuss the current state of distributed training, the limitations of traditional networking approaches, and the innovative solutions being developed to address these challenges.