ROCm 7.2.0 released

ROCm 7.2.0 has officially launched, bringing a host of improvements designed to enhance compatibility with the latest AMD hardware, operating systems, and system components. Key features include support for the new RDNA4 architecture GPUs, such as the Radeon AI PRO R9600D and RX 9060 XT LP, while continuing to support RDNA3 models like the RX 7700. The update also introduces additional virtualization support for SLES 15 SP7, particularly for AMD Instinct MI355X and MI350X GPUs.

A significant enhancement in this release is the Node Power Management (NPM) feature, which optimizes power distribution among multiple AMD GPUs in a single server node, improving efficiency for data center environments. On the performance side, users report increased throughput and reduced latency when running models like Llama 3.1 on MI355X GPUs. Similar optimizations target other workloads, including the GLM-4.6 model and the DeepEP communication library, specifically on AMD Instinct MI300X hardware.

In terms of development tools, ROCm 7.2.0 has refined its HIP runtime environment, enhancing the doorbell mechanism for managing data transfer notifications between CPUs and GPUs. This improvement parallels NVIDIA's CUDA Graph optimizations, facilitating easier management of graph-based workloads. Additional technical improvements include accelerated memory set operations and better handling of asynchronous GPU jobs, collectively enhancing computational throughput.

The update also expands the available APIs, with new HIP API additions for GPU memory management and execution control, as well as improved backend communication through rocSHMEM, which enhances inter-GPU communication within and across nodes. Furthermore, tools optimized for life sciences workloads have been refreshed, and the deep learning stack has received updates covering frameworks like JAX and libraries like MIOpen and MIGraphX, along with progress on ONNX support.

For troubleshooting and guidance, the release notes include several fixes, such as enhancements to the ROCm Runfile Installer and an expanded examples repository. Documentation has been updated to ensure clarity and support for users navigating the new features.
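After an upgrade, a quick sanity check is to confirm which ROCm version is actually installed. The following is a minimal Python sketch, not part of the release itself. It assumes the version string is stored in /opt/rocm/.info/version, which is a common default location on Linux but may differ depending on how ROCm was installed. The helper names (parse_rocm_version, installed_rocm_version, meets_minimum) are hypothetical and exist only for illustration.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

# Assumption: a default Linux install keeps its version string here.
# Adjust the path if ROCm was installed to a different prefix.
DEFAULT_VERSION_FILE = Path("/opt/rocm/.info/version")

Version = Tuple[int, int, int]


def parse_rocm_version(text: str) -> Version:
    """Extract (major, minor, patch) from a string such as '7.2.0' or '7.2.0-43'."""
    match = re.match(r"\s*(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized ROCm version string: {text!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return int(major), int(minor), int(patch or 0)


def installed_rocm_version(version_file: Path = DEFAULT_VERSION_FILE) -> Optional[Version]:
    """Return the installed version, or None if the file is missing or unreadable."""
    try:
        return parse_rocm_version(version_file.read_text())
    except (OSError, ValueError):
        return None


def meets_minimum(version: Optional[Version], minimum: Version = (7, 2, 0)) -> bool:
    """True if a version was found and it is at least `minimum`."""
    return version is not None and version >= minimum


if __name__ == "__main__":
    found = installed_rocm_version()
    print("ROCm version:", found, "| meets 7.2.0:", meets_minimum(found))
```

Comparing versions as integer tuples avoids the pitfalls of string comparison, where "7.10.0" would sort before "7.2.0".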

In summary, ROCm 7.2.0 represents a significant step forward for AMD's software ecosystem, bolstering performance and usability for developers and data scientists alike, while continually expanding its support for cutting-edge hardware and applications. As the demand for advanced computing solutions grows, ROCm's commitment to innovation positions it as a competitive player in the GPU computing landscape.

The latest version of ROCm, 7.2.0, has been released with several enhancements aimed at improving support for new AMD hardware and operating systems. This update adds support for newer graphics processing units (GPUs), such as RDNA4 architecture models, along with expanded virtualization support and performance optimizations across different GPU types and applications. The release also features technical improvements such as faster memory set operations and better handling of asynchronous GPU jobs, which together aim to increase computational throughput on AMD graphics processors. Additionally, the update includes new API functionality, refreshed tools for life sciences workloads, and updates to deep learning frameworks like JAX.

ROCm 7.2.0 released @ Linux Compatible