ROCm 6.4.3 released

AMD has launched ROCm 6.4.3, a notable update aimed at enhancing performance and usability for developers using AMD Radeon PRO and Radeon GPUs. This release tackles various issues, including performance drops in communication operations linked to increased latency in certain RCCL applications and failures in queue preemption due to scheduler constraints in the AMDGPU driver. Additionally, ROCm SMI has been updated to improve GPU data loading, while the ROCm documentation has been refined to better serve a diverse audience, now featuring five new tutorials tailored for AI developers. These tutorials cover advanced topics such as inference techniques, deployment strategies, and GPU optimization.

The update signifies a broader shift within ROCm, as AMD is streamlining components and transitioning certain tools to new repositories to enhance integration and compatibility with existing systems. Key changes include the migration of AMD SMI to the AMDTools repository and the phasing out of tools like ROCTracer and ROCProfiler. The update also emphasizes the deprecation of several features, including AMDGPU wavefront size compiler macros and specific ROCm Object Tooling tools, with plans to integrate their functionalities into newer solutions.

In the realm of deep learning, ROCm continues to support various frameworks, including Taichi and Megablocks, enhancing its ecosystem for high-performance computing. The release notes detail the updates, known issues, and upcoming changes, ensuring developers are well-informed about the evolving landscape of ROCm.

Looking ahead, users can expect further enhancements in future releases, particularly regarding the HIP runtime API, which aims to achieve greater alignment with CUDA APIs and improve overall efficiency. For detailed information on compatibility, changes, and tutorials, developers are encouraged to consult the ROCm documentation.

In summary, ROCm 6.4.3 marks a significant step forward in AMD's commitment to advancing the capabilities of its GPU ecosystem for deep learning and HPC applications. The ongoing updates and strategic changes reflect AMD's focus on providing developers with robust tools and support, paving the way for innovative applications and improved performance in computational tasks.

AMD has released ROCm 6.4.3, a significant release that addresses multiple issues and features updates for AMD Radeon PRO and Radeon GPU drivers, enhancements to ROCm SMI, and improvements to the ROCm documentation. It fixes a problem that degraded the performance of communication operations through increased latency in certain RCCL applications, as well as a problem with scheduler constraints in the AMDGPU driver that could cause queue preemption to fail during workload execution. The ROCm documentation continues to be updated to offer clearer and more comprehensive guidance for a diverse range of user needs and use cases.

The release includes five new tutorials designed for AI developers. Four cover inference (ChatQnA vLLM deployment and performance evaluation, text-to-video generation with ComfyUI, DeepSeek Janus Pro on CPU or GPU, and DeepSeek-R1 with vLLM V1), and one covers GPU development and optimization. AMD ROCm offers a robust ecosystem for deep learning development, now with support for Taichi and for Megablocks, a streamlined library designed for mixture-of-experts training, along with updated information on hardware and library support. Operating system and hardware support is unchanged in this release.

ROCm 6.4.3 released @ Linux Compatible