
Installing AMD GPU Drivers, ROCm, and HIP on Ubuntu Linux

This guide explains how to install the AMD GPU drivers, the ROCm (Radeon Open Compute) stack, and HIP on Ubuntu. ROCm enables machine learning and AI workloads on modern AMD GPUs, while HIP lets applications offload compute-heavy graphics work to the GPU, for example rendering in Blender.

Attention

AMD GPUs on HOSTKEY are guaranteed to work ONLY on Ubuntu 24.04 LTS!

System Preparation

Before starting the installation, make sure the system meets the requirements:

  1. OS Check: run cat /etc/os-release; the output should contain VERSION_ID="24.04".

  2. Kernel Check: run uname -r; a Linux kernel version of 6.13 or later is required. If yours is older, install the latest available mainline kernel:

    sudo add-apt-repository ppa:cappelikan/ppa -y
    sudo apt update && sudo apt install -y mainline
    sudo mainline install-latest
    sudo reboot
    

  3. System Update:

    sudo apt update && sudo apt upgrade -y
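
The two checks above can be sketched as a small preflight helper. This is an illustrative sketch, not part of the official procedure: the function names version_ge and preflight are hypothetical, and the comparison assumes GNU sort with the -V (version sort) option, which Ubuntu ships.

```shell
#!/bin/sh
# Preflight sketch: check the OS release and kernel version named in this guide.
# version_ge and preflight are hypothetical helper names.

# version_ge A B: succeeds when version A >= version B (GNU sort -V ordering).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# preflight: print one line per check; return non-zero if any check fails.
preflight() {
  required_os="24.04"
  required_kernel="6.13"
  os_info="unknown unknown"
  if [ -r /etc/os-release ]; then
    # Source the file in a subshell so its variables do not leak into the caller.
    os_info=$(. /etc/os-release && printf '%s %s' "${ID:-unknown}" "${VERSION_ID:-unknown}")
  fi
  kernel=$(uname -r)
  rc=0

  if [ "$os_info" = "ubuntu $required_os" ]; then
    echo "OS: $os_info, OK"
  else
    echo "OS: $os_info, expected ubuntu $required_os"
    rc=1
  fi

  if version_ge "$kernel" "$required_kernel"; then
    echo "Kernel: $kernel, OK"
  else
    echo "Kernel: $kernel, older than $required_kernel"
    rc=1
  fi
  return "$rc"
}
```

After sourcing the snippet, calling preflight prints the result of each check, and its exit status can gate the remaining steps.
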
    

Manual ROCm Installation

  1. Installation of dependencies:

    sudo apt install -y wget gnupg2 build-essential dkms curl
    
  2. Cleaning old packages (recommended):

    sudo dpkg --configure -a
    sudo apt remove --purge -y rocminfo
    sudo apt purge -y 'rocm*' 'amdgpu*' 'graphics*' 'hip*'
    sudo apt autoremove -y
    sudo apt clean
    sudo rm -rf /etc/apt/sources.list.d/amdgpu* /etc/apt/sources.list.d/rocm* /etc/apt/sources.list.d/graphics*
    sudo apt update
    
  3. Adding the ROCm "latest" repository:

    sudo install -d -m 0755 /usr/share/keyrings
    wget -qO- https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor | sudo tee /usr/share/keyrings/rocm-archive-keyring.gpg >/dev/null
    echo "deb [arch=amd64 signed-by=/usr/share/keyrings/rocm-archive-keyring.gpg] https://repo.radeon.com/rocm/apt/latest/ noble main" | sudo tee /etc/apt/sources.list.d/rocm.list >/dev/null
    sudo apt update
    
  4. Installing the ROCm stack:

    sudo apt install -y rocm-dev rocm-libs rocm-hip-sdk rocm-smi-lib amd-smi-lib rocminfo
    
  5. Creating a symlink /opt/rocm:

    ROCM_DIR=$(ls -d /opt/rocm-[0-9]* 2>/dev/null | sort -V | tail -n 1)
    sudo ln -sfn "$ROCM_DIR" /opt/rocm
    echo "ROCm installed: $(basename "$ROCM_DIR")"
    
  6. Access rights configuration:

    sudo usermod -aG render,video $USER
    
  7. Configuring paths in ~/.bashrc:

    ROCM_VER=$(basename "$ROCM_DIR" | sed 's/rocm-//')
    cat >> ~/.bashrc << EOF
    
    # AMD ROCm Paths
    if [ -d "/opt/rocm-${ROCM_VER}" ]; then
    export PATH="/opt/rocm-${ROCM_VER}/bin:\$PATH"
    export LD_LIBRARY_PATH="/opt/rocm-${ROCM_VER}/hip/lib:/opt/rocm-${ROCM_VER}/lib:\$LD_LIBRARY_PATH"
    export ROCM_PATH="/opt/rocm-${ROCM_VER}"
    export HIP_CLANG_PATH="/opt/rocm-${ROCM_VER}/llvm/bin"
    fi
    EOF
    source ~/.bashrc
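
Once the steps above are done, a quick sanity check can confirm that the symlink, the group membership, and the PATH changes took effect. The sketch below is only an illustration: has_word and check_rocm_env are hypothetical helper names, and it assumes hipcc was installed into /opt/rocm/bin by the packages from step 4.

```shell
#!/bin/sh
# Post-install sanity check sketch (helper names are hypothetical).

# has_word WORD LIST: succeeds when WORD is one of the space-separated items in LIST.
has_word() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# check_rocm_env: report on the /opt/rocm symlink, the render/video groups, and PATH.
check_rocm_env() {
  rc=0

  if [ -d /opt/rocm/bin ]; then
    echo "/opt/rocm -> $(readlink -f /opt/rocm)"
  else
    echo "/opt/rocm/bin not found; repeat the symlink step"
    rc=1
  fi

  # id -nG lists the groups active in the current session, which change
  # only after a new login even though usermod has already run.
  groups_now=$(id -nG)
  for g in render video; do
    if has_word "$g" "$groups_now"; then
      echo "group $g: active in this session"
    else
      echo "group $g: not active yet; log out and back in, or reboot"
      rc=1
    fi
  done

  if command -v hipcc >/dev/null 2>&1; then
    echo "hipcc: $(command -v hipcc)"
  else
    echo "hipcc not on PATH; open a new shell or source ~/.bashrc"
    rc=1
  fi
  return "$rc"
}
```

Running check_rocm_env right after step 7 will usually report the groups as not yet active, because group changes apply only to new login sessions.
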
    

Installation Check

After the installation is complete and the system has rebooted, verify that the drivers are working correctly. First, "wake up" the card by switching its runtime power control to on:

echo on | sudo tee /sys/class/drm/card0/device/power/control

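
The command above assumes the GPU is card0. On a host with several graphics devices the index can differ, so one option is to look up the cards whose PCI vendor ID is 0x1002 (AMD). The following is a sketch under that assumption; amd_cards is a hypothetical helper name.

```shell
#!/bin/sh
# amd_cards [DRM_ROOT]: print the power/control path of every AMD card found
# under DRM_ROOT (default /sys/class/drm). Hypothetical helper for illustration.
amd_cards() {
  root=${1:-/sys/class/drm}
  for dev in "$root"/card[0-9]*/device; do
    # Connector entries such as card0-DP-1 have no vendor file and are skipped.
    [ -r "$dev/vendor" ] || continue
    if [ "$(cat "$dev/vendor")" = "0x1002" ]; then
      printf '%s\n' "$dev/power/control"
    fi
  done
}

# Usage (writes to sysfs, so it needs sudo):
#   amd_cards | while read -r ctl; do echo on | sudo tee "$ctl" >/dev/null; done
```
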
  1. rocminfo Tool:

    rocminfo
    
    The command should list available GPUs and their specifications (HSA agents).

    Example output of rocminfo after successful installation of drivers and ROCm
    ROCk module is loaded
    =====================
    HSA System Attributes
    =====================
    Runtime Version:         1.18
    Runtime Ext Version:     1.14
    System Timestamp Freq.:  1000.000000MHz
    Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
    Machine Model:           LARGE
    System Endianness:       LITTLE
    Mwaitx:                  DISABLED
    XNACK enabled:           NO
    DMAbuf Support:          YES
    VMM Support:             NO
    
    ==========
    HSA Agents
    ==========
    *******
    Agent 1
    *******
    Name:                    AMD Ryzen 9 7950X 16-Core Processor
    Uuid:                    CPU-XX
    Marketing Name:          AMD Ryzen 9 7950X 16-Core Processor
    Vendor Name:             CPU
    Feature:                 None specified
    Profile:                 FULL_PROFILE
    Float Round Mode:        NEAR
    Max Queue Number:        0(0x0)
    Queue Min Size:          0(0x0)
    Queue Max Size:          0(0x0)
    Queue Type:              MULTI
    Node:                    0
    Device Type:             CPU
    Cache Info:
        L1:                      32768(0x8000) KB
    Chip ID:                 0(0x0)
    ASIC Revision:           0(0x0)
    Cacheline Size:          64(0x40)
    Max Clock Freq. (MHz):   5881
    BDFID:                   0
    Internal Node ID:        0
    Compute Unit:            32
    SIMDs per CU:            0
    Shader Engines:          0
    Shader Arrs. per Eng.:   0
    WatchPts on Addr. Ranges:1
    Memory Properties:
    Features:                None
    Pool Info:
        Pool 1
        Segment:                 GLOBAL; FLAGS: FINE GRAINED
        Size:                    130980620(0x7ce9b0c) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:4KB
        Alloc Alignment:         4KB
        Accessible by all:       TRUE
        Pool 2
        Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
        Size:                    130980620(0x7ce9b0c) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:4KB
        Alloc Alignment:         4KB
        Accessible by all:       TRUE
        Pool 3
        Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
        Size:                    130980620(0x7ce9b0c) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:4KB
        Alloc Alignment:         4KB
        Accessible by all:       TRUE
        Pool 4
        Segment:                 GLOBAL; FLAGS: COARSE GRAINED
        Size:                    130980620(0x7ce9b0c) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:4KB
        Alloc Alignment:         4KB
        Accessible by all:       TRUE
    ISA Info:
    *******
    Agent 2
    *******
    Name:                    gfx1036
    Uuid:                    GPU-XX
    Marketing Name:          AMD Radeon Graphics
    Vendor Name:             AMD
    Feature:                 KERNEL_DISPATCH
    Profile:                 BASE_PROFILE
    Float Round Mode:        NEAR
    Max Queue Number:        128(0x80)
    Queue Min Size:          64(0x40)
    Queue Max Size:          131072(0x20000)
    Queue Type:              MULTI
    Node:                    1
    Device Type:             GPU
    Cache Info:
        L1:                      16(0x10) KB
        L2:                      256(0x100) KB
    Chip ID:                 5710(0x164e)
    ASIC Revision:           1(0x1)
    Cacheline Size:          64(0x40)
    Max Clock Freq. (MHz):   2200
    BDFID:                   2560
    Internal Node ID:        1
    Compute Unit:            2
    SIMDs per CU:            2
    Shader Engines:          1
    Shader Arrs. per Eng.:   1
    WatchPts on Addr. Ranges:4
    Coherent Host Access:    FALSE
    Memory Properties:       APU
    Features:                KERNEL_DISPATCH
    Fast F16 Operation:      TRUE
    Wavefront Size:          32(0x20)
    Workgroup Max Size:      1024(0x400)
    Workgroup Max Size per Dimension:
        x                        1024(0x400)
        y                        1024(0x400)
        z                        1024(0x400)
    Max Waves Per CU:        32(0x20)
    Max Work-item Per CU:    1024(0x400)
    Grid Max Size:           4294967295(0xffffffff)
    Grid Max Size per Dimension:
        x                        2147483647(0x7fffffff)
        y                        65535(0xffff)
        z                        65535(0xffff)
    Max fbarriers/Workgrp:   32
    Packet Processor uCode:: 18
    SDMA engine uCode::      1
    IOMMU Support::          None
    Pool Info:
        Pool 1
        Segment:                 GLOBAL; FLAGS: COARSE GRAINED
        Size:                    65490308(0x3e74d84) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:2048KB
        Alloc Alignment:         4KB
        Accessible by all:       FALSE
        Pool 2
        Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
        Size:                    65490308(0x3e74d84) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:2048KB
        Alloc Alignment:         4KB
        Accessible by all:       FALSE
        Pool 3
        Segment:                 GROUP
        Size:                    64(0x40) KB
        Allocatable:             FALSE
        Alloc Granule:           0KB
        Alloc Recommended Granule:0KB
        Alloc Alignment:         0KB
        Accessible by all:       FALSE
    ISA Info:
        ISA 1
        Name:                    amdgcn-amd-amdhsa--gfx1036
        Machine Models:          HSA_MACHINE_MODEL_LARGE
        Profiles:                HSA_PROFILE_BASE
        Default Rounding Mode:   NEAR
        Default Rounding Mode:   NEAR
        Fast f16:                TRUE
        Workgroup Max Size:      1024(0x400)
        Workgroup Max Size per Dimension:
            x                        1024(0x400)
            y                        1024(0x400)
            z                        1024(0x400)
        Grid Max Size:           4294967295(0xffffffff)
        Grid Max Size per Dimension:
            x                        2147483647(0x7fffffff)
            y                        65535(0xffff)
            z                        65535(0xffff)
        FBarrier Max Size:       32
        ISA 2
        Name:                    amdgcn-amd-amdhsa--gfx10-3-generic
        Machine Models:          HSA_MACHINE_MODEL_LARGE
        Profiles:                HSA_PROFILE_BASE
        Default Rounding Mode:   NEAR
        Default Rounding Mode:   NEAR
        Fast f16:                TRUE
        Workgroup Max Size:      1024(0x400)
        Workgroup Max Size per Dimension:
            x                        1024(0x400)
            y                        1024(0x400)
            z                        1024(0x400)
        Grid Max Size:           4294967295(0xffffffff)
        Grid Max Size per Dimension:
            x                        2147483647(0x7fffffff)
            y                        65535(0xffff)
            z                        65535(0xffff)
        FBarrier Max Size:       32
    *** Done ***
    
  2. rocm-smi Tool:

    rocm-smi
    

    Example output:

    WARNING: AMD GPU device(s) is/are in a low-power state. Check power control/runtime_status
    
    =========================================== ROCm System Management Interface ===========================================
    ===================================================== Concise Info =====================================================
    Device  Node  IDs              Temp    Power    Partitions          SCLK     MCLK     Fan    Perf  PwrCap  VRAM%  GPU%
          (DID,     GUID)  (Edge)  (Avg)    (Mem, Compute, ID)
    ========================================================================================================================
    0       1     0x7551,   64106  67.0°C  184.0W   N/A, N/A, 0         3259Mhz  96Mhz    54.9%  auto  300.0W  31%    100%
    1       2     0x164e,   36957  46.0°C  35.194W  N/A, N/A, 0         N/A      1800Mhz  0%     auto  N/A     3%     0%
    ========================================================================================================================
    ================================================= End of ROCm SMI Log ==================================================
    
  3. amd-smi Tool:

    amd-smi
    

    Example output (note that the GPU integrated into the processor is also listed):

    +------------------------------------------------------------------------------+
    | AMD-SMI 26.2.0+021c61fc    amdgpu version: 6.18.1-061801 ROCm version: 7.1.1 |
    | VBIOS version: 00158746                                                      |
    | Platform: Linux Baremetal                                                    |
    |-------------------------------------+----------------------------------------|
    | BDF                        GPU-Name | Mem-Uti   Temp   UEC       Power-Usage |
    | GPU  HIP-ID  OAM-ID  Partition-Mode | GFX-Uti    Fan               Mem-Usage |
    |=====================================+========================================|
    | 0000:03:00.0    AMD Radeon Graphics | 68 %     82 °C   0           285/300 W |
    |   0       0     N/A             N/A | 95 %    52.94           10406/32624 MB |
    |-------------------------------------+----------------------------------------|
    | 0000:0a:00.0 ...X 16-Core Processor | N/A        N/A   0             N/A/0 W |
    |   1       1     N/A             N/A | N/A        N/A               15/512 MB |
    +-------------------------------------+----------------------------------------+
    +------------------------------------------------------------------------------+
    | Processes:                                                                   |
    |  GPU        PID  Process Name          GTT_MEM  VRAM_MEM  MEM_USAGE     CU % |
    |==============================================================================|
    |    0      12335  ollama                 2.0 MB    9.7 GB    10.1 GB  N/A     |
    |    1      12335  ollama                 2.0 MB   35.2 KB      0.0 B  N/A     |
    +------------------------------------------------------------------------------+
    

These tools show the current load, temperature, and power consumption of each GPU.
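
The rocminfo listing is long, so it can help to reduce it to the GPU agent names. The sketch below assumes the field layout shown in the example output above (a Name: line followed later by a Device Type: line for each agent); gpu_agents is a hypothetical helper name.

```shell
#!/bin/sh
# gpu_agents: read rocminfo output on stdin and print the Name of every agent
# whose Device Type is GPU. Hypothetical helper for illustration.
gpu_agents() {
  awk '
    # Remember the most recent plain "Name:" field (not "Marketing Name:").
    /^[ \t]*Name:/        { name = $2 }
    # Print the remembered name when the agent turns out to be a GPU.
    /^[ \t]*Device Type:/ { if ($3 == "GPU") print name }
  '
}

# Usage:
#   rocminfo | gpu_agents
```

For the example output above, this would reduce the listing to the single GPU agent name, gfx1036.
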

Note

amd-smi is gradually replacing rocm-smi as the primary monitoring utility in newer ROCm releases.
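
Because of this transition, monitoring scripts may want to prefer amd-smi and fall back to rocm-smi when only the older tool is present. A minimal sketch, with smi_tool as a hypothetical helper name:

```shell
#!/bin/sh
# smi_tool: print the name of the first monitoring CLI found on PATH,
# preferring amd-smi over rocm-smi. Returns non-zero if neither exists.
smi_tool() {
  for tool in amd-smi rocm-smi; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      return 0
    fi
  done
  return 1
}

# Usage:
#   tool=$(smi_tool) && "$tool"
```
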

Working with Docker

If you are using Docker, you need to install the tooling to pass GPUs into containers:

sudo apt install -y rocm-gdb rocm-container-toolkit
sudo systemctl restart docker
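
For a quick test of GPU access from a container, a common approach is to pass the /dev/kfd and /dev/dri device nodes directly with plain docker run flags. Note that this is the generic device-passing method, not the toolkit's own runtime integration. In the sketch below, run_rocm_container is a hypothetical helper, the image name rocm/rocm-terminal in the usage comment is an assumption, and --group-add video may need adjusting to match the groups that own the device nodes on your host.

```shell
#!/bin/sh
# run_rocm_container IMAGE [CMD...]: start a throwaway container with the AMD
# GPU device nodes passed through. With DRY_RUN=1 the command is only printed.
# Hypothetical helper for illustration.
run_rocm_container() {
  image=$1
  shift
  # /dev/kfd is the ROCm compute interface; /dev/dri holds the render nodes.
  set -- docker run --rm \
    --device=/dev/kfd --device=/dev/dri \
    --group-add video \
    "$image" "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage (image name is an assumption):
#   run_rocm_container rocm/rocm-terminal rocminfo
```

If rocminfo inside the container lists the same GPU agents as on the host, device access is working.
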

One‑Click Automatic Installation

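The Bash script below automates the whole process. It detects the latest ROCm version, installs the drivers and the rocminfo utility, and configures the paths. Save it to a file on your server and run it with bash; because the script calls exit on errors, pasting it directly into an interactive shell may close your session.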

#!/bin/bash
set -euo pipefail

# Universal AMD GPU + ROCm ("latest") installer for Ubuntu 24.04+

# FLAGS (enable/disable steps here)

DO_APT_UPGRADE=1

DO_OS_POLICY_CHECK=1                 # Enforce policy: only Ubuntu 24.04 LTS
ALLOWED_UBUNTU_VERSIONS=("24.04")

DO_KERNEL_POLICY_CHECK=1          # Enforce policy "kernel >= REQUIRED_KERNEL_MM"
DO_INSTALL_MAINLINE_KERNEL=1      # If kernel is lower, try installing a mainline kernel
REQUIRED_KERNEL_MM="6.13"

DO_GRUB_PARAMS=0                  # Add GRUB params (conservatively disabled by default)
GRUB_PARAMS=("amdgpu.gpu_recovery=1" "amdgpu.runpm=0" "amdgpu.ppfeaturemask=0xffffffff")

DO_PURGE_OLD_PACKAGES=1           # Remove old rocm/amdgpu/hip packages (best effort)
DO_SETUP_ROCM_REPO=1              # Add ROCm repository
DO_INSTALL_ROCM=1                 # Install rocm-dev/rocm-libs/...
DO_LINK_OPT_ROCM=1                # Make /opt/rocm -> /opt/rocm-X.Y.Z (if found)

DO_USER_GROUPS=1                  # Add user to render,video
DO_BASHRC_PATH=1                  # Add /opt/rocm/bin to PATH via ~/.bashrc

DO_OLLAMA_AMDGPU_IDS_WORKAROUND=1 # Create amdgpu.ids link for some Ollama builds
DO_GPU_POWER_CONTROL_ON=1         # Best effort: set power/control=on (if available)


# Start
echo "Starting AMD ROCm installation..."

# Dependency checks
for cmd in lspci wget gpg curl lsb_release; do
  if ! command -v "$cmd" >/dev/null 2>&1; then
    echo "Missing dependency: $cmd"
    exit 1
  fi
done


# Check this is Ubuntu (robust: don't grep raw file)

. /etc/os-release
if [[ "${ID:-}" != "ubuntu" ]]; then
  echo "This script is intended for Ubuntu. Exiting."
  exit 1
fi


# Restrict script to specific Ubuntu releases (see ALLOWED_UBUNTU_VERSIONS above)
if [[ "${DO_OS_POLICY_CHECK}" -eq 1 ]]; then
  UBUNTU_VERSION="$(lsb_release -rs)"  # e.g. 24.04

  ok=0
  for v in "${ALLOWED_UBUNTU_VERSIONS[@]}"; do
    if [[ "${UBUNTU_VERSION}" == "${v}" ]]; then
      ok=1
      break
    fi
  done

  if [[ "${ok}" -ne 1 ]]; then
    echo "Unsupported Ubuntu version: ${UBUNTU_VERSION}"
    echo "Allowed versions: ${ALLOWED_UBUNTU_VERSIONS[*]}"
    exit 1
  fi
fi

# Detect AMD GPU (vendor 1002)
AMD_GPU_LINES="$(lspci -nn | grep -iE 'vga|3d' | grep -i '1002:' || true)"
if [[ -z "${AMD_GPU_LINES}" ]]; then
  echo "No AMD GPU detected (vendor 1002)."
  exit 1
fi
echo "AMD GPUs detected:"
echo "${AMD_GPU_LINES}"

# Update/upgrade
if [[ "${DO_APT_UPGRADE}" -eq 1 ]]; then
  sudo apt update
  sudo apt upgrade -y
fi

# Kernel check/upgrade (script policy)
KERNEL_INSTALLED=0
echo "Current kernel: $(uname -r)"

if [[ "${DO_KERNEL_POLICY_CHECK}" -eq 1 ]]; then
  KERNEL_VERSION="$(uname -r)"
  KERNEL_MM="$(echo "${KERNEL_VERSION}" | sed -nE 's/^([0-9]+)\.([0-9]+).*/\1.\2/p')"

  req_major="${REQUIRED_KERNEL_MM%.*}"
  req_minor="${REQUIRED_KERNEL_MM#*.}"
  cur_major="${KERNEL_MM%.*}"
  cur_minor="${KERNEL_MM#*.}"

  KERNEL_OK=0
  if [[ "${cur_major}" -gt "${req_major}" ]] || \
     [[ "${cur_major}" -eq "${req_major}" && "${cur_minor}" -ge "${req_minor}" ]]; then
    KERNEL_OK=1
  fi

  if [[ "${KERNEL_OK}" -ne 1 ]]; then
    echo "Kernel is older than required by this script policy (>= ${REQUIRED_KERNEL_MM})."
    if [[ "${DO_INSTALL_MAINLINE_KERNEL}" -eq 1 ]]; then
      echo "Installing latest mainline kernel..."
      sudo add-apt-repository ppa:cappelikan/ppa -y 2>/dev/null || true
      sudo apt update
      sudo apt install -y mainline pkexec
      sudo mainline install-latest
      echo "Mainline kernel installed. Reboot required to activate it."
      KERNEL_INSTALLED=1
    else
      echo "Mainline kernel install is disabled by flag DO_INSTALL_MAINLINE_KERNEL=0. Continuing."
    fi
  fi
fi

# Optional: GRUB parameters (append-only)
if [[ "${DO_GRUB_PARAMS}" -eq 1 ]]; then
  GRUB_FILE="/etc/default/grub"
  GRUB_CHANGED=0

  for param in "${GRUB_PARAMS[@]}"; do
    if ! sudo grep -qE "GRUB_CMDLINE_LINUX_DEFAULT=.*\b${param}\b" "${GRUB_FILE}"; then
      sudo cp -a "${GRUB_FILE}" "${GRUB_FILE}.backup.$(date +%F-%H%M%S)"
      sudo sed -i -E "s/^(GRUB_CMDLINE_LINUX_DEFAULT=\")([^\"]*)\"/\1\2 ${param}\"/" "${GRUB_FILE}"
      echo "Added GRUB param: ${param}"
      GRUB_CHANGED=1
    else
      echo "GRUB param already present: ${param}"
    fi
  done

  if [[ "${GRUB_CHANGED}" -eq 1 ]]; then
    sudo update-grub
    echo "GRUB updated."
  fi
else
  echo "Skipping GRUB parameters (DO_GRUB_PARAMS=0)."
fi

# Best effort: purge old packages/repos
if [[ "${DO_PURGE_OLD_PACKAGES}" -eq 1 ]]; then
  echo "Removing previous ROCm/AMDGPU packages (best effort)..."
  sudo dpkg --configure -a || true
  sudo apt remove --purge -y rocminfo || true
  sudo apt purge -y 'rocm*' 'amdgpu*' 'graphics*' 'hip*' || true
  sudo apt autoremove -y || true
  sudo apt clean || true
  sudo rm -rf /etc/apt/sources.list.d/amdgpu* /etc/apt/sources.list.d/rocm* /etc/apt/sources.list.d/graphics* || true
  sudo apt update || true
else
  echo "Skipping purge old packages (DO_PURGE_OLD_PACKAGES=0)."
fi

# Add ROCm "latest" repository
if [[ "${DO_SETUP_ROCM_REPO}" -eq 1 ]]; then
  echo "Setting up ROCm 'latest' repository..."

  . /etc/os-release

  UBUNTU_CODENAME="${UBUNTU_CODENAME:-${VERSION_CODENAME:-}}"
  if [[ -z "${UBUNTU_CODENAME}" ]]; then
    echo "Cannot detect Ubuntu codename (UBUNTU_CODENAME/VERSION_CODENAME)."
    exit 1
  fi

  sudo install -d -m 0755 /usr/share/keyrings
  wget -qO- https://repo.radeon.com/rocm/rocm.gpg.key \
    | gpg --dearmor \
    | sudo tee /usr/share/keyrings/rocm-archive-keyring.gpg >/dev/null

  echo "deb [arch=amd64 signed-by=/usr/share/keyrings/rocm-archive-keyring.gpg] https://repo.radeon.com/rocm/apt/latest/ ${UBUNTU_CODENAME} main" \
    | sudo tee /etc/apt/sources.list.d/rocm.list >/dev/null

  # Pin repo.radeon.com above Ubuntu
  sudo tee /etc/apt/preferences.d/rocm-pin-600 >/dev/null <<'EOF'
Package: *
Pin: origin repo.radeon.com
Pin-Priority: 600
EOF

else
  echo "Skipping ROCm repo setup (DO_SETUP_ROCM_REPO=0)."
fi

# Install ROCm packages
if [[ "${DO_INSTALL_ROCM}" -eq 1 ]]; then
  echo "Installing ROCm stack..."
  sudo apt update
  # Same package set as step 4 of the manual installation (includes amd-smi-lib)
  sudo apt install -y -o Dpkg::Options::="--force-overwrite" \
    rocm-dev rocm-libs rocm-hip-sdk rocm-smi-lib amd-smi-lib rocminfo
else
  echo "Skipping ROCm install (DO_INSTALL_ROCM=0)."
fi

# /opt/rocm -> /opt/rocm-X.Y.Z
if [[ "${DO_LINK_OPT_ROCM}" -eq 1 ]]; then
  INSTALLED_ROCM_DIR="$(ls -d /opt/rocm-[0-9]* 2>/dev/null | sort -V | tail -n 1 || true)"
  if [[ -n "${INSTALLED_ROCM_DIR}" ]]; then
    REAL_VERSION="$(echo "${INSTALLED_ROCM_DIR}" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' || echo latest)"
    sudo ln -sfn "${INSTALLED_ROCM_DIR}" /opt/rocm
    echo "ROCm detected: ${REAL_VERSION} (${INSTALLED_ROCM_DIR}); linked /opt/rocm -> ${INSTALLED_ROCM_DIR}"
  else
    echo "No /opt/rocm-X.Y.Z directory found; leaving /opt/rocm as-is."
  fi
else
  echo "Skipping /opt/rocm symlink (DO_LINK_OPT_ROCM=0)."
fi

# User groups: render,video
if [[ "${DO_USER_GROUPS}" -eq 1 ]]; then
  TARGET_USER="${SUDO_USER:-$USER}"
  sudo usermod -aG render,video "${TARGET_USER}" || true
  echo "User added to groups: render, video (${TARGET_USER}). Re-login or reboot required."
else
  echo "Skipping user groups (DO_USER_GROUPS=0)."
fi

# PATH + LD_LIBRARY_PATH in ~/.bashrc
if [[ "${DO_BASHRC_PATH}" -eq 1 ]]; then
  TARGET_USER="${SUDO_USER:-$USER}"
  TARGET_HOME="$(getent passwd "${TARGET_USER}" | cut -d: -f6)"
  TARGET_BASHRC="${TARGET_HOME}/.bashrc"
  MARKER="AMD ROCm Paths"

  if [[ ! -f "${TARGET_BASHRC}" ]]; then
    sudo -u "${TARGET_USER}" touch "${TARGET_BASHRC}" || true
  fi

  # Determining the installed ROCm version
  ROCM_VERSION_DIR="$(ls -d /opt/rocm-[0-9]* 2>/dev/null | sort -V | tail -n 1 || true)"
  if [[ -n "${ROCM_VERSION_DIR}" ]]; then
    ROCM_VERSION="$(basename "${ROCM_VERSION_DIR}" | sed 's/rocm-//')"
    echo "Using ROCm version: ${ROCM_VERSION} (${ROCM_VERSION_DIR})"
  else
    ROCM_VERSION="unknown"
    echo "Warning: No /opt/rocm-X.Y.Z found; using generic paths"
  fi

  if ! grep -q "${MARKER}" "${TARGET_BASHRC}" 2>/dev/null; then
    cat >> "${TARGET_BASHRC}" <<EOF

# ${MARKER}
if [ -d "/opt/rocm-${ROCM_VERSION}" ]; then
  export PATH="/opt/rocm-${ROCM_VERSION}/bin:\$PATH"
  export LD_LIBRARY_PATH="/opt/rocm-${ROCM_VERSION}/hip/lib:/opt/rocm-${ROCM_VERSION}/lib:\$LD_LIBRARY_PATH"
  export ROCM_PATH="/opt/rocm-${ROCM_VERSION}"
  export HIP_CLANG_PATH="/opt/rocm-${ROCM_VERSION}/llvm/bin"
fi
EOF
    echo "Added full ROCm paths (PATH+LD_LIBRARY_PATH) to ${TARGET_BASHRC}"
  else
    echo "ROCm PATH block already present in ${TARGET_BASHRC}"
  fi

  # Apply to the current session
  if [[ -n "${ROCM_VERSION_DIR}" ]]; then
    export PATH="${ROCM_VERSION_DIR}/bin:${PATH}"
    export LD_LIBRARY_PATH="${ROCM_VERSION_DIR}/hip/lib:${ROCM_VERSION_DIR}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
    export ROCM_PATH="${ROCM_VERSION_DIR}"
    export HIP_CLANG_PATH="${ROCM_VERSION_DIR}/llvm/bin"
  fi
else
  echo "Skipping .bashrc PATH (DO_BASHRC_PATH=0)."
fi

# Workaround for amdgpu.ids (some Ollama builds)
if [[ "${DO_OLLAMA_AMDGPU_IDS_WORKAROUND}" -eq 1 ]]; then
  if [[ -f /usr/share/libdrm/amdgpu.ids ]]; then
    sudo mkdir -p /opt/amdgpu/share/libdrm
    sudo ln -sf /usr/share/libdrm/amdgpu.ids /opt/amdgpu/share/libdrm/amdgpu.ids
    echo "Created compatibility link: /opt/amdgpu/share/libdrm/amdgpu.ids -> /usr/share/libdrm/amdgpu.ids"
  else
    echo "amdgpu.ids not found at /usr/share/libdrm/amdgpu.ids; skipping workaround."
  fi
else
  echo "Skipping Ollama amdgpu.ids workaround (DO_OLLAMA_AMDGPU_IDS_WORKAROUND=0)."
fi

# Best effort: power/control=on
if [[ "${DO_GPU_POWER_CONTROL_ON}" -eq 1 ]]; then
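  # Test for existence, not writability: the write itself goes through sudo
  if [[ -e /sys/class/drm/card0/device/power/control ]]; then
    if echo on | sudo tee /sys/class/drm/card0/device/power/control >/dev/null; then
      echo "Set /sys/class/drm/card0/device/power/control = on"
    else
      echo "Could not write /sys/class/drm/card0/device/power/control; skipping."
    fi
  else
    echo "/sys/class/drm/card0/device/power/control not found; skipping."
  fi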
else
  echo "Skipping GPU power control (DO_GPU_POWER_CONTROL_ON=0)."
fi

# Final
echo "Installation finished."
if [[ "${KERNEL_INSTALLED}" -eq 1 ]]; then
  echo "Reboot required to activate the new kernel."
else
  echo "Reboot recommended to apply group membership changes."
fi

echo "After reboot, verify:"
echo "  rocminfo"
echo "  amd-smi (if installed)"

Attention

After running the script, you must reboot the server with sudo reboot. The reboot is needed to activate the new group memberships and kernel modules.
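
After the reboot, it can be worth confirming that the system actually booted into the newly installed kernel. The sketch below compares uname -r with the highest version that has a vmlinuz image in /boot. It assumes the usual Ubuntu naming vmlinuz-<version> and GNU sort -V; newest_kernel and kernel_is_current are hypothetical helper names.

```shell
#!/bin/sh
# newest_kernel [BOOT_DIR]: print the highest kernel version that has a
# vmlinuz-<version> image in BOOT_DIR (default /boot).
newest_kernel() {
  boot=${1:-/boot}
  for img in "$boot"/vmlinuz-*; do
    [ -e "$img" ] || continue
    # Strip the directory and the "vmlinuz-" prefix, leaving the version.
    printf '%s\n' "${img##*/vmlinuz-}"
  done | sort -V | tail -n 1
}

# kernel_is_current: succeeds when the running kernel is the newest installed one.
kernel_is_current() {
  [ "$(uname -r)" = "$(newest_kernel)" ]
}

# Usage:
#   kernel_is_current && echo "Running the newest installed kernel" \
#     || echo "Running $(uname -r), newest installed is $(newest_kernel)"
```
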


Some of the content on this page was created or translated using AI.
