As data traffic continues to surge across AI networks, the need for higher bandwidth and efficient signal connectivity is critical. With the 200G/lambda generation well on its way to production, focus is quickly shifting to the next 400G/lane SerDes, which represents a significant leap in interconnect performance. This advancement enables interconnects capable of reaching 3.2Tbps and beyond by aggregating fewer, faster lanes, while balancing cost, power consumption, and footprint per bit. In this presentation, we delve into high-speed protocols such as Ethernet, UALink, and Ultra Ethernet – exploring the first use cases where 400G/lane SerDes will potentially be deployed. We’ll take a deeper look at different modulation formats, weighing their benefits and challenges. Special attention will be given to the adoption of optical connectivity. We aim to provide a comprehensive overview of the options available and justify their use in modern cloud service architectures.
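As a back-of-envelope illustration of the lane aggregation described above (a minimal sketch; the per-lane rates chosen are assumptions for illustration, not figures from the presentation):

```python
import math

# Illustrative lane-count arithmetic only; the rates below are assumed
# example values, not numbers from the presentation.
def lanes_needed(port_gbps: float, lane_gbps: float) -> int:
    """SerDes lanes required to reach a target aggregate port rate."""
    return math.ceil(port_gbps / lane_gbps)

for lane_rate in (100, 200, 400):  # Gb/s per lane
    print(f"3.2 Tb/s at {lane_rate}G/lane -> {lanes_needed(3200, lane_rate)} lanes")
# 100G/lane needs 32 lanes, 200G/lane needs 16, 400G/lane needs 8:
# fewer, faster lanes are what make 3.2 Tb/s ports tractable in cost,
# power, and footprint per bit.
```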
The rapid growth of AI chips has increased computational demands, and future high-performance computing (HPC) systems are expected to integrate multiple high-power chips, resulting in total power consumption of over 2.5kW and individual chip power densities exceeding 200W/cm². To tackle these challenges, advanced cooling technologies are essential to lower thermal resistance and efficiently dissipate heat. In this paper, we explore innovative structural designs for cold plates that address critical thermal management challenges for next-generation AI systems, as well as the corresponding thermal test vehicle that can generate different power densities.
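To see why the cited power density is so demanding, consider a rough sizing sketch; the temperature budget below is an assumed example value, not a number from the paper:

```python
# Back-of-envelope sizing of the thermal-resistance budget implied by a
# given heat flux; the 40 K budget is an illustrative assumption.
def max_areal_resistance(delta_t_k: float, flux_w_per_cm2: float) -> float:
    """Maximum allowable die-to-coolant areal thermal resistance
    (K*cm^2/W) for a given temperature budget and heat flux."""
    return delta_t_k / flux_w_per_cm2

budget = 40.0  # K between die and coolant (assumption)
flux = 200.0   # W/cm^2, the density cited in the abstract
print(max_areal_resistance(budget, flux))  # 0.2 K*cm^2/W
# Cold plate, TIM, and spreading losses must all fit inside this very
# small budget, which is why lower-resistance structures matter.
```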
Recent advances in large-scale AI models have placed increasing pressure on the underlying compute architecture to deliver not only raw performance but also programmability and efficiency at scale. This talk introduces the Tensor Contraction Processor (TCP), a novel architecture that reconceptualizes tensor contraction as the central computational primitive, enabling a broader class of operations beyond traditional matrix multiplication. We will present the motivation behind this architectural shift, its implications for compiler design and runtime scheduling, and findings related to performance and energy efficiency. The discussion will also explore how exposing tensor contraction at the hardware level opens opportunities for more expressive and seamless execution strategies, potentially reducing data movement and improving utilization. We will share key learnings from scaling the chip across servers and racks, highlight intersections with relevant OCP Project areas, and discuss how these insights are informing our product roadmap.
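To ground the idea that tensor contraction subsumes matrix multiplication, here is a minimal NumPy sketch of the primitive (illustrative only, not vendor code for the TCP):

```python
import numpy as np

# Tensor contraction generalizes matrix multiplication: matmul is the
# special case of contracting a single shared index.
A = np.random.rand(4, 5)
B = np.random.rand(5, 6)
assert np.allclose(np.einsum("ik,kj->ij", A, B), A @ B)

# A richer contraction: attention-like scores contract the feature axis d
# while preserving the batch axis b and sequence axes q, k.
Q = np.random.rand(2, 8, 16)  # (b, q, d)
K = np.random.rand(2, 8, 16)  # (b, k, d)
scores = np.einsum("bqd,bkd->bqk", Q, K)
print(scores.shape)           # (2, 8, 8)
```

Exposing this one primitive in hardware lets a compiler map both cases (and many others) onto the same datapath rather than lowering everything to matmul.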
I will walk through the topic from single-server-node design to data-centre-level design under economies of scale, covering areas such as mechanical, thermal, and power where we can differentiate ourselves.
As technological progress and innovation continue to shape server products, Wiwynn introduces a reinforced chassis with a novel embossed pattern design to reduce material consumption and carbon footprint. This paper presents the development process, from pattern optimization using Finite Element Analysis (FEA) to real-world static and dynamic mechanical testing for verification. Through this approach, Wiwynn successfully developed an embossed pattern, enabling the replacement of the original heavy chassis with a thinner and lighter design. In current applications, this innovation has reduced material usage by at least 16.7% and lowered carbon emissions by approximately 15.9%, while achieving a 4.2% cost reduction. This lightweight, cost-effective, and sustainable chassis design reinforces Wiwynn’s commitment to sustainable server solutions and offers potential for further development.
Nepal is in the early stages of digitisation. Following the political reforms, infrastructure, digitisation, and e-governance projects are scaling widely; however, the infrastructure that was expected to scale has not kept pace with demand. Sustainable solutions and scalability have not been priorities, as awareness is still being built, yet basic data center design around fintech and healthcare technology is scaling widely. Working closely with ministries and government projects, we see clear requirements within the government, but the right direction and roadmap are needed; a national-level blueprint document covering AI, healthcare, fintech, and a national interoperability project is in the pipeline. The interoperability layer requires substantial resources to build: a health-related interoperability layer has been completed, and we are now pursuing the national IOL. OpenMRS, OpenStack, OpenHIM, OpenHIE, Ubuntu, Nutanix, and Dell are key players.
As the MHS standard continues to grow, the need to complete the remaining elements of the solution becomes critical. Intel and UNEEC have been following the Edge-MHS standardization effort and are developing off-the-shelf chassis solutions that can easily enable the Edge-MHS building blocks.
As CPU core counts grow significantly, the demand for hardware partitioning has become evident. Hardware partitioning can improve security, multi-tasking ability, and resource efficiency for each CPU. In this paper, we’d like to share Wiwynn’s concept of a Hardware Partitioning (HPAR) architecture, which can be implemented in a multi-CPU system with a single DC-SCM. With the help of an assistant BMC, the BMC has access to each CPU, and a dual-socket system can boot as either a single node or dual nodes. The HPAR method creates strict boundaries between sockets, which reduces the risk of unauthorized access or data leakage between partitions. Each partition can also perform different tasks on one system simultaneously, optimizing hardware utilization by segmenting workloads.
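A toy model of the partitioning idea (names and structure here are hypothetical illustrations, not Wiwynn's actual HPAR implementation):

```python
from dataclasses import dataclass

# Hypothetical sketch of single- vs dual-node partitioning; this does not
# reflect Wiwynn's real HPAR firmware or BMC interfaces.
@dataclass
class Partition:
    name: str
    sockets: list[int]
    workload: str | None = None

def configure(mode: str) -> list[Partition]:
    """Boot a dual-socket system as one combined node or two isolated nodes."""
    if mode == "single":
        return [Partition("node0", sockets=[0, 1])]
    if mode == "dual":
        # Strict boundary: each partition owns exactly one socket.
        return [Partition("node0", [0]), Partition("node1", [1])]
    raise ValueError(mode)

parts = configure("dual")
parts[0].workload, parts[1].workload = "database", "inference"
for p in parts:
    print(p.name, p.sockets, p.workload)
```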
The future of artificial intelligence (AI) demands continuously higher performance, greater efficiency, and increasing scalability in modern data centers. As a designer of advanced server CPUs and specialized AI accelerators, AMD plays a crucial role in addressing these priorities. AMD delivers leading high-performance computing solutions, from advanced chiplet architecture and server design to rack and data center infrastructure, to meet AI market demands.
This session delves into the critical design considerations and emerging challenges associated with immersion cooling for high-speed signals in data centers. Key topics include the electrical characterization of cooling liquids, the performance benefits of improved thermal environments, and the impact of immersion fluids on high-speed interconnects—from individual components to entire signal channels. The discussion also covers design optimization strategies tailored for submerged environments. Finally, the session highlights the current state of industry readiness and the technical hurdles that must be addressed to ensure reliable high-speed signaling under immersion cooling conditions.
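A first-order sketch of why the fluid matters electrically: propagation delay scales with the square root of the surrounding medium's relative permittivity. The fluid permittivity below is an assumed example value, not a measured figure from the session:

```python
import math

# First-order effect of the surrounding medium on an exposed channel;
# the immersion-fluid permittivity is an illustrative assumption.
C = 299_792_458  # speed of light, m/s

def delay_ps_per_mm(eps_r: float) -> float:
    """Propagation delay in ps/mm for a wave in a medium of permittivity eps_r."""
    return math.sqrt(eps_r) / C * 1e9

for medium, eps in (("air", 1.0006), ("example immersion fluid", 2.0)):
    print(f"{medium}: {delay_ps_per_mm(eps):.2f} ps/mm")
# Replacing air with a higher-permittivity fluid slows the wave and lowers
# the impedance of exposed structures (connectors, cables), which is why
# channels must be re-characterized for submerged operation.
```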
As memory capacity and bandwidth demands continue to rise, system designs are pushing toward higher memory density—particularly in dual-socket server platforms. This session will explore the thermal design challenges and considerations involved in supporting a 2-socket, 32-DIMM configuration on the latest Intel® Xeon® platform within a standard 19-inch rack chassis. In such configurations, DIMM pitch is constrained to 0.25"–0.27", significantly increasing the complexity of memory cooling. We will present thermal evaluation results based on Intel-developed CPU and DDR5 Thermal Test Vehicles (TTVs), which simulate real-world heat profiles and airflow interactions.
For decades the motherboard ecosystem has toiled in the service of the steady tick/tock beat of server processor roadmaps. That was then - this is now! Today there are multiple processor lines from a larger set of processor makers than ever before in the server industry. The complexity of server processor complexes has skyrocketed, increasing board layers, design rules, and all manner of motherboard attributes.
The DC-MHS standards come at the right time. Motherboards (now transformed into HPMs) can be produced much more efficiently when originated by the processor manufacturers. The advent of the HPM reduces costs, increases the diversity of systems, and generally allows the ecosystem to innovate around the processor complex, including baseboard management. This comes at exactly the time when the design aperture seemed to be closing on server system vendors. DC-MHS standards have created a whole new opportunity to build thriving horizontal ecosystems.
Sean Varley leads the Solutions group at Ampere Computing. His group is responsible for building out vertical solutions on Ampere server platforms which includes strategic business relationships, business planning and solution definition in the rapidly evolving Cloud and Edge server...
This talk presents the integration of OpenBMC with Arm Fixed Virtual Platforms (FVP) to prototype manageability features aligned with SBMR compliance. It showcases lessons from virtual platform development, sensor telemetry, and Redfish-based remote management, enabling early-stage validation without physical hardware.
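A hedged sketch of the kind of Redfish telemetry walk such a virtual platform enables; the host, port, and chassis layout are placeholders for a local OpenBMC/bmcweb instance, and only standard Redfish collection URIs are assumed:

```python
import requests

# Placeholder endpoint and OpenBMC's well-known default dev credentials;
# adjust for your own FVP/bmcweb setup.
BMC = "https://localhost:2443"
AUTH = ("root", "0penBmc")

s = requests.Session()
s.auth, s.verify = AUTH, False  # self-signed cert on a dev platform

root = s.get(f"{BMC}/redfish/v1").json()
print(root.get("RedfishVersion"))

# Enumerate chassis and dump any sensor readings exposed under them.
for member in s.get(f"{BMC}/redfish/v1/Chassis").json().get("Members", []):
    chassis = member["@odata.id"]
    sensors = s.get(f"{BMC}{chassis}/Sensors").json()
    for sref in sensors.get("Members", []):
        reading = s.get(f"{BMC}{sref['@odata.id']}").json()
        print(reading.get("Name"), reading.get("Reading"), reading.get("ReadingUnits"))
```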
Google contributed the Advanced PCIe Enclosure Compatible (APEC) Form Factor to OCP in 2024. APEC is an electro-mechanical interface standard intended to advance the PCIe CEM standard with increased PCIe lane count, bandwidth, power, and management capability for use cases that need more advanced capabilities. This session will go deeper into the progress we have made, including the test methodology and challenges, as well as our next steps to keep this moving forward. To make this happen, Google has developed end-to-end testing modules to qualify signals at both the PCIe root complex and the endpoint based on APEC. We will guide you through how the test module was designed, from SI and layout-routing considerations toward the goal of test efficiency and automation.
The dimensions of the next Intel platform have increased compared to preceding ones, primarily due to the higher pin count required to improve the signal-to-noise ratio in both PCI Express 6.0 and DDR5. This change creates difficulties in arranging two processors, each with 16 DDR5 channels, on a standard 19-inch rack. In response to this issue, Intel has embarked on a strategic initiative to accommodate this challenge, which involves a proposal to reduce the distance between DDR5 connectors (a.k.a. DIMM pitch) as well as the processor’s keep-out zone. To increase the DDR routing space underneath the DIMM connector’s pin-field area after shrinking the DIMM-to-DIMM pitch, VIPPO (Via In Pad Plated Over) PCB (Printed Circuit Board) technology is used. These technologies significantly enhance signal quality when embracing the next-generation MCRDIMM (Multiplexer Combined Ranks DIMM).
Wiwynn collaborates with Intel through the Open IP program to integrate a 1OU computing server into Intel’s single-phase immersion cooling tank, following the OCP ORv3 standard. The system uses Perstorp’s Synmerse DC synthetic ester coolant to thoroughly evaluate thermal performance under high-power workloads. In this study, CPUs are stressed up to 550W TDP, while researchers examine how variables such as CDU pumping frequency, inlet coolant temperature, and different heatsink types impact cooling effectiveness. Results are compared to those of traditional air cooling systems under similar operating conditions. The goal of this analysis is to optimize immersion cooling approaches, providing valuable insights for improving thermal management in high-performance computing and modern data centers.
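A quick figure of merit for the kind of comparison described above; the case and inlet temperatures below are assumed example values, not measured results from the study:

```python
# Effective case-to-coolant thermal resistance at a given power; the
# temperatures are illustrative assumptions, only the 550 W TDP comes
# from the abstract.
def case_to_coolant_resistance(t_case: float, t_inlet: float, power_w: float) -> float:
    """Thermal resistance (K/W) from CPU case to coolant inlet."""
    return (t_case - t_inlet) / power_w

r = case_to_coolant_resistance(t_case=75.0, t_inlet=40.0, power_w=550.0)
print(f"{r*1000:.1f} mK/W")  # ~63.6 mK/W budget for heatsink + fluid path
```

Sweeping CDU pump frequency, inlet temperature, and heatsink type, as the study does, amounts to measuring how each variable moves this resistance relative to air cooling.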