Installing AMD GPU Drivers, ROCm, and HIP on Ubuntu Linux

This guide walks through installing the AMD GPU drivers together with the ROCm (Radeon Open Compute) stack and HIP. ROCm enables machine learning and AI workloads on modern AMD GPUs, while HIP accelerates rendering in applications such as Blender.

Attention

On HOSTKEY, AMD GPUs are supported only on Ubuntu 24.04 LTS!

System Preparation

Before starting the installation, make sure the system meets the requirements:

  1. OS check: run cat /etc/os-release; the output must contain VERSION_ID="24.04".
  2. Kernel check: run uname -r; the Linux kernel version must be ≥6.13. If necessary, install the latest available mainline kernel:

    sudo add-apt-repository ppa:cappelikan/ppa -y
    sudo apt update && sudo apt install -y mainline
    sudo mainline install-latest
    sudo reboot
    
  3. System update:

    sudo apt update && sudo apt upgrade -y
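
The OS and kernel checks above can also be scripted. The following is a minimal sketch rather than part of the original procedure: the function names version_ge, kernel_mm, and check_host are illustrative, and the comparison assumes GNU sort with the -V (version sort) option, which Ubuntu provides.

```shell
#!/usr/bin/env bash
# Sketch: scripted versions of the OS and kernel checks above.
# version_ge, kernel_mm and check_host are illustrative names.

# Return 0 if version $1 >= version $2 (uses GNU sort -V ordering).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Reduce a kernel release such as 6.14.0-29-generic to major.minor.
kernel_mm() {
  printf '%s\n' "$1" | sed -nE 's/^([0-9]+)\.([0-9]+).*/\1.\2/p'
}

check_host() {
  local os_file="${1:-/etc/os-release}"
  local required="${2:-6.13}"
  if grep -q '^VERSION_ID="24.04"' "$os_file" 2>/dev/null; then
    echo "OS: Ubuntu 24.04 detected"
  else
    echo "OS: VERSION_ID is not 24.04"
  fi
  if version_ge "$(kernel_mm "$(uname -r)")" "$required"; then
    echo "Kernel: $(uname -r) meets >= $required"
  else
    echo "Kernel: $(uname -r) is older than $required"
  fi
}

check_host
```

Version sort matters here because a plain string comparison would rank 6.9 above 6.13.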
    

Manual ROCm Installation

  1. Install the dependencies:

    sudo apt install -y wget gnupg2 build-essential dkms curl
    
  2. Remove old packages (recommended):

    sudo dpkg --configure -a
    sudo apt remove --purge -y rocminfo
    sudo apt purge -y 'rocm*' 'amdgpu*' 'graphics*' 'hip*'
    sudo apt autoremove -y
    sudo apt clean
    sudo rm -rf /etc/apt/sources.list.d/amdgpu* /etc/apt/sources.list.d/rocm* /etc/apt/sources.list.d/graphics*
    sudo apt update
    
  3. Add the ROCm "latest" repository:

    sudo install -d -m 0755 /usr/share/keyrings
    wget -qO- https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor | sudo tee /usr/share/keyrings/rocm-archive-keyring.gpg >/dev/null
    echo "deb [arch=amd64 signed-by=/usr/share/keyrings/rocm-archive-keyring.gpg] https://repo.radeon.com/rocm/apt/latest/ noble main" | sudo tee /etc/apt/sources.list.d/rocm.list >/dev/null
    sudo apt update
    
  4. Install the ROCm stack:

    sudo apt install -y rocm-dev rocm-libs rocm-hip-sdk rocm-smi-lib amd-smi-lib rocminfo
    
  5. Create the /opt/rocm symbolic link:

    ROCM_DIR=$(ls -d /opt/rocm-[0-9]* 2>/dev/null | sort -V | tail -n 1)
    sudo ln -sfn "$ROCM_DIR" /opt/rocm
    echo "ROCm installed: $(basename "$ROCM_DIR")"
    
  6. Configure access permissions:

    sudo usermod -aG render,video $USER
    
  7. Configure the paths in ~/.bashrc (this reuses the ROCM_DIR variable from step 5, so run it in the same shell session):

    ROCM_VER=$(basename "$ROCM_DIR" | sed 's/rocm-//')
    cat >> ~/.bashrc << EOF
    
    # AMD ROCm Paths
    if [ -d "/opt/rocm-${ROCM_VER}" ]; then
    export PATH="/opt/rocm-${ROCM_VER}/bin:\$PATH"
    export LD_LIBRARY_PATH="/opt/rocm-${ROCM_VER}/hip/lib:/opt/rocm-${ROCM_VER}/lib:\$LD_LIBRARY_PATH"
    export ROCM_PATH="/opt/rocm-${ROCM_VER}"
    export HIP_CLANG_PATH="/opt/rocm-${ROCM_VER}/llvm/bin"
    fi
    EOF
    source ~/.bashrc
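
Steps 5 and 7 both depend on picking the newest /opt/rocm-X.Y.Z directory. The sketch below isolates that selection so it can be tried safely in a throwaway directory; newest_rocm_dir is an illustrative helper name, not something the ROCm packages provide.

```shell
#!/usr/bin/env bash
# Sketch: pick the newest rocm-X.Y.Z directory the way steps 5 and 7 do.
# newest_rocm_dir is an illustrative helper name.

newest_rocm_dir() {
  local base="${1:-/opt}"
  # sort -V orders 6.10.1 after 6.9.0; a plain sort would not.
  ls -d "$base"/rocm-[0-9]* 2>/dev/null | sort -V | tail -n 1
}

# Demonstration in a temporary directory, so nothing under /opt is touched.
demo="$(mktemp -d)"
mkdir -p "$demo/rocm-6.9.0" "$demo/rocm-6.10.1"
picked="$(newest_rocm_dir "$demo")"
echo "newest: $(basename "$picked")"
echo "version: $(basename "$picked" | sed 's/rocm-//')"
rm -rf "$demo"
```

If no versioned directory exists, the function prints nothing, so callers should test for an empty result before creating the symbolic link.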
    

Verifying the Installation

After the installation is complete and the system has been rebooted, verify that the drivers work correctly. First, "wake up" the card with the following command:

echo on | sudo tee /sys/class/drm/card0/device/power/control
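
The command above assumes the AMD GPU is card0. On hosts with more than one graphics device that may not hold, so the following sketch lists the DRM cards whose PCI vendor ID is AMD (0x1002). The helper name amd_cards is illustrative, and the sysfs root is a parameter only so the function can be exercised outside a GPU host.

```shell
#!/usr/bin/env bash
# Sketch: list DRM cards whose PCI vendor ID is AMD (0x1002), instead of
# assuming card0. amd_cards is an illustrative helper name.

amd_cards() {
  local root="${1:-/sys/class/drm}"
  local dev
  for dev in "$root"/card[0-9]*; do
    # Skip connector entries such as card0-HDMI-A-1.
    case "$(basename "$dev")" in *-*) continue ;; esac
    [ -r "$dev/device/vendor" ] || continue
    if [ "$(cat "$dev/device/vendor")" = "0x1002" ]; then
      echo "$dev"
    fi
  done
}

for card in $(amd_cards); do
  state="$(cat "$card/device/power/control" 2>/dev/null || echo unknown)"
  echo "AMD card: $card (power/control: $state)"
done
```

Substitute the reported card path into the wake-up command if it differs from card0.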

  1. The rocminfo tool:

    rocminfo
    
    The command should list the available GPUs and their properties (HSA agents).

    Sample rocminfo output after a successful driver and ROCm installation:
    ROCk module is loaded
    =====================
    HSA System Attributes
    =====================
    Runtime Version:         1.18
    Runtime Ext Version:     1.14
    System Timestamp Freq.:  1000.000000MHz
    Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
    Machine Model:           LARGE
    System Endianness:       LITTLE
    Mwaitx:                  DISABLED
    XNACK enabled:           NO
    DMAbuf Support:          YES
    VMM Support:             NO
    
    ==========
    HSA Agents
    ==========
    *******
    Agent 1
    *******
    Name:                    AMD Ryzen 9 7950X 16-Core Processor
    Uuid:                    CPU-XX
    Marketing Name:          AMD Ryzen 9 7950X 16-Core Processor
    Vendor Name:             CPU
    Feature:                 None specified
    Profile:                 FULL_PROFILE
    Float Round Mode:        NEAR
    Max Queue Number:        0(0x0)
    Queue Min Size:          0(0x0)
    Queue Max Size:          0(0x0)
    Queue Type:              MULTI
    Node:                    0
    Device Type:             CPU
    Cache Info:
        L1:                      32768(0x8000) KB
    Chip ID:                 0(0x0)
    ASIC Revision:           0(0x0)
    Cacheline Size:          64(0x40)
    Max Clock Freq. (MHz):   5881
    BDFID:                   0
    Internal Node ID:        0
    Compute Unit:            32
    SIMDs per CU:            0
    Shader Engines:          0
    Shader Arrs. per Eng.:   0
    WatchPts on Addr. Ranges:1
    Memory Properties:
    Features:                None
    Pool Info:
        Pool 1
        Segment:                 GLOBAL; FLAGS: FINE GRAINED
        Size:                    130980620(0x7ce9b0c) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:4KB
        Alloc Alignment:         4KB
        Accessible by all:       TRUE
        Pool 2
        Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
        Size:                    130980620(0x7ce9b0c) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:4KB
        Alloc Alignment:         4KB
        Accessible by all:       TRUE
        Pool 3
        Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
        Size:                    130980620(0x7ce9b0c) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:4KB
        Alloc Alignment:         4KB
        Accessible by all:       TRUE
        Pool 4
        Segment:                 GLOBAL; FLAGS: COARSE GRAINED
        Size:                    130980620(0x7ce9b0c) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:4KB
        Alloc Alignment:         4KB
        Accessible by all:       TRUE
    ISA Info:
    *******
    Agent 2
    *******
    Name:                    gfx1036
    Uuid:                    GPU-XX
    Marketing Name:          AMD Radeon Graphics
    Vendor Name:             AMD
    Feature:                 KERNEL_DISPATCH
    Profile:                 BASE_PROFILE
    Float Round Mode:        NEAR
    Max Queue Number:        128(0x80)
    Queue Min Size:          64(0x40)
    Queue Max Size:          131072(0x20000)
    Queue Type:              MULTI
    Node:                    1
    Device Type:             GPU
    Cache Info:
        L1:                      16(0x10) KB
        L2:                      256(0x100) KB
    Chip ID:                 5710(0x164e)
    ASIC Revision:           1(0x1)
    Cacheline Size:          64(0x40)
    Max Clock Freq. (MHz):   2200
    BDFID:                   2560
    Internal Node ID:        1
    Compute Unit:            2
    SIMDs per CU:            2
    Shader Engines:          1
    Shader Arrs. per Eng.:   1
    WatchPts on Addr. Ranges:4
    Coherent Host Access:    FALSE
    Memory Properties:       APU
    Features:                KERNEL_DISPATCH
    Fast F16 Operation:      TRUE
    Wavefront Size:          32(0x20)
    Workgroup Max Size:      1024(0x400)
    Workgroup Max Size per Dimension:
        x                        1024(0x400)
        y                        1024(0x400)
        z                        1024(0x400)
    Max Waves Per CU:        32(0x20)
    Max Work-item Per CU:    1024(0x400)
    Grid Max Size:           4294967295(0xffffffff)
    Grid Max Size per Dimension:
        x                        2147483647(0x7fffffff)
        y                        65535(0xffff)
        z                        65535(0xffff)
    Max fbarriers/Workgrp:   32
    Packet Processor uCode:: 18
    SDMA engine uCode::      1
    IOMMU Support::          None
    Pool Info:
        Pool 1
        Segment:                 GLOBAL; FLAGS: COARSE GRAINED
        Size:                    65490308(0x3e74d84) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:2048KB
        Alloc Alignment:         4KB
        Accessible by all:       FALSE
        Pool 2
        Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
        Size:                    65490308(0x3e74d84) KB
        Allocatable:             TRUE
        Alloc Granule:           4KB
        Alloc Recommended Granule:2048KB
        Alloc Alignment:         4KB
        Accessible by all:       FALSE
        Pool 3
        Segment:                 GROUP
        Size:                    64(0x40) KB
        Allocatable:             FALSE
        Alloc Granule:           0KB
        Alloc Recommended Granule:0KB
        Alloc Alignment:         0KB
        Accessible by all:       FALSE
    ISA Info:
        ISA 1
        Name:                    amdgcn-amd-amdhsa--gfx1036
        Machine Models:          HSA_MACHINE_MODEL_LARGE
        Profiles:                HSA_PROFILE_BASE
        Default Rounding Mode:   NEAR
        Default Rounding Mode:   NEAR
        Fast f16:                TRUE
        Workgroup Max Size:      1024(0x400)
        Workgroup Max Size per Dimension:
            x                        1024(0x400)
            y                        1024(0x400)
            z                        1024(0x400)
        Grid Max Size:           4294967295(0xffffffff)
        Grid Max Size per Dimension:
            x                        2147483647(0x7fffffff)
            y                        65535(0xffff)
            z                        65535(0xffff)
        FBarrier Max Size:       32
        ISA 2
        Name:                    amdgcn-amd-amdhsa--gfx10-3-generic
        Machine Models:          HSA_MACHINE_MODEL_LARGE
        Profiles:                HSA_PROFILE_BASE
        Default Rounding Mode:   NEAR
        Default Rounding Mode:   NEAR
        Fast f16:                TRUE
        Workgroup Max Size:      1024(0x400)
        Workgroup Max Size per Dimension:
            x                        1024(0x400)
            y                        1024(0x400)
            z                        1024(0x400)
        Grid Max Size:           4294967295(0xffffffff)
        Grid Max Size per Dimension:
            x                        2147483647(0x7fffffff)
            y                        65535(0xffff)
            z                        65535(0xffff)
        FBarrier Max Size:       32
    *** Done ***
    
  2. The rocm-smi tool:

    rocm-smi
    

    Command output:

    WARNING: AMD GPU device(s) is/are in a low-power state. Check power control/runtime_status
    
    =========================================== ROCm System Management Interface ===========================================
    ===================================================== Concise Info =====================================================
    Device  Node  IDs              Temp    Power    Partitions          SCLK     MCLK     Fan    Perf  PwrCap  VRAM%  GPU%
          (DID,     GUID)  (Edge)  (Avg)    (Mem, Compute, ID)
    ========================================================================================================================
    0       1     0x7551,   64106  67.0°C  184.0W   N/A, N/A, 0         3259Mhz  96Mhz    54.9%  auto  300.0W  31%    100%
    1       2     0x164e,   36957  46.0°C  35.194W  N/A, N/A, 0         N/A      1800Mhz  0%     auto  N/A     3%     0%
    ========================================================================================================================
    ================================================= End of ROCm SMI Log ==================================================
    
  3. The amd-smi tool:

    amd-smi
    

    Command output (note that a second GPU, the one integrated into the processor, is also listed):

    +------------------------------------------------------------------------------+
    | AMD-SMI 26.2.0+021c61fc    amdgpu version: 6.18.1-061801 ROCm version: 7.1.1 |
    | VBIOS version: 00158746                                                      |
    | Platform: Linux Baremetal                                                    |
    |-------------------------------------+----------------------------------------|
    | BDF                        GPU-Name | Mem-Uti   Temp   UEC       Power-Usage |
    | GPU  HIP-ID  OAM-ID  Partition-Mode | GFX-Uti    Fan               Mem-Usage |
    |=====================================+========================================|
    | 0000:03:00.0    AMD Radeon Graphics | 68 %     82 °C   0           285/300 W |
    |   0       0     N/A             N/A | 95 %    52.94           10406/32624 MB |
    |-------------------------------------+----------------------------------------|
    | 0000:0a:00.0 ...X 16-Core Processor | N/A        N/A   0             N/A/0 W |
    |   1       1     N/A             N/A | N/A        N/A               15/512 MB |
    +-------------------------------------+----------------------------------------+
    +------------------------------------------------------------------------------+
    | Processes:                                                                   |
    |  GPU        PID  Process Name          GTT_MEM  VRAM_MEM  MEM_USAGE     CU % |
    |==============================================================================|
    |    0      12335  ollama                 2.0 MB    9.7 GB    10.1 GB  N/A     |
    |    1      12335  ollama                 2.0 MB   35.2 KB      0.0 B  N/A     |
    +------------------------------------------------------------------------------+
    

This lets you see the current load, temperature, and power consumption of the GPUs.

Note

In newer ROCm releases, amd-smi is gradually replacing rocm-smi as the main monitoring tool.
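
For automated checks, the rocminfo output can be reduced to the list of GPU agent names. This is a minimal sketch that assumes the output layout shown in the sample above (a "Name:" field whose value starts with gfx for GPU agents); gpu_targets is an illustrative helper name.

```shell
#!/usr/bin/env bash
# Sketch: extract GPU agent names (gfx targets) from rocminfo output.
# gpu_targets is an illustrative helper; it reads rocminfo text on stdin.

gpu_targets() {
  # GPU agents appear as "Name:   gfx1036". ISA names start with "amdgcn-"
  # and CPU agents carry a processor name, so neither matches.
  awk '$1 == "Name:" && $2 ~ /^gfx/ { print $2 }' | sort -u
}

if command -v rocminfo >/dev/null 2>&1; then
  targets="$(rocminfo | gpu_targets)"
  if [ -n "$targets" ]; then
    echo "GPU agents found:"
    echo "$targets"
  else
    echo "rocminfo ran, but no GPU agent was listed"
  fi
else
  echo "rocminfo is not on PATH"
fi
```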

Working with Docker

If you use Docker, install the tools needed to pass GPUs through to containers:

sudo apt install -y rocm-gdb rocm-container-toolkit
sudo systemctl restart docker
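
Independently of the packages above, a container generally needs the /dev/kfd and /dev/dri device nodes to reach an AMD GPU. The sketch below only prints a docker run command built on that assumption so it can be reviewed before use. The helper name rocm_docker_cmd is illustrative, and the image name rocm/rocm-terminal is an assumption; substitute the image you actually use.

```shell
#!/usr/bin/env bash
# Sketch: build a docker run command that passes the AMD GPU device nodes
# into a container. The image name is an assumption; substitute your own.

rocm_docker_cmd() {
  local image="${1:-rocm/rocm-terminal}"
  if [ "$#" -gt 0 ]; then shift; fi
  # /dev/kfd is the compute interface, /dev/dri holds the render nodes.
  set -- docker run --rm \
    --device=/dev/kfd --device=/dev/dri \
    --group-add video \
    "$image" "$@"
  echo "$*"
}

# Print the command instead of running it.
rocm_docker_cmd rocm/rocm-terminal rocminfo
```

Running rocminfo inside the container is a quick way to confirm that the GPU is visible there as well.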

One-Click Automatic Installation

A Bash script that automates the whole process. It detects the latest ROCm version, installs the drivers and the rocminfo tool, and configures the paths. Copy the script, paste it into your server's command line, and run it.

#!/bin/bash
set -euo pipefail

# Universal AMD GPU + ROCm ("latest") installer for Ubuntu 24.04+

# FLAGS (enable/disable steps here)

DO_APT_UPGRADE=1

DO_OS_POLICY_CHECK=1                 # Enforce policy: only Ubuntu 24.04 LTS
ALLOWED_UBUNTU_VERSIONS=("24.04")

DO_KERNEL_POLICY_CHECK=1          # Enforce policy "kernel >= REQUIRED_KERNEL_MM"
DO_INSTALL_MAINLINE_KERNEL=1      # If kernel is lower, try installing a mainline kernel
REQUIRED_KERNEL_MM="6.13"

DO_GRUB_PARAMS=0                  # Add GRUB params (conservatively disabled by default)
GRUB_PARAMS=("amdgpu.gpu_recovery=1" "amdgpu.runpm=0" "amdgpu.ppfeaturemask=0xffffffff")

DO_PURGE_OLD_PACKAGES=1           # Remove old rocm/amdgpu/hip packages (best effort)
DO_SETUP_ROCM_REPO=1              # Add ROCm repository
DO_INSTALL_ROCM=1                 # Install rocm-dev/rocm-libs/...
DO_LINK_OPT_ROCM=1                # Make /opt/rocm -> /opt/rocm-X.Y.Z (if found)

DO_USER_GROUPS=1                  # Add user to render,video
DO_BASHRC_PATH=1                  # Add /opt/rocm/bin to PATH via ~/.bashrc

DO_OLLAMA_AMDGPU_IDS_WORKAROUND=1 # Create amdgpu.ids link for some Ollama builds
DO_GPU_POWER_CONTROL_ON=1         # Best effort: set power/control=on (if available)


# Start
echo "Starting AMD ROCm installation..."

# Dependency checks
for cmd in lspci wget gpg curl lsb_release; do
  if ! command -v "$cmd" >/dev/null 2>&1; then
    echo "Missing dependency: $cmd"
    exit 1
  fi
done


# Check this is Ubuntu (robust: don't grep raw file)

. /etc/os-release
if [[ "${ID:-}" != "ubuntu" ]]; then
  echo "This script is intended for Ubuntu. Exiting."
  exit 1
fi


# Restrict script to specific Ubuntu releases (only 24.04)
if [[ "${DO_OS_POLICY_CHECK}" -eq 1 ]]; then
  UBUNTU_VERSION="$(lsb_release -rs)"  # e.g. 24.04

  ok=0
  for v in "${ALLOWED_UBUNTU_VERSIONS[@]}"; do
    if [[ "${UBUNTU_VERSION}" == "${v}" ]]; then
      ok=1
      break
    fi
  done

  if [[ "${ok}" -ne 1 ]]; then
    echo "Unsupported Ubuntu version: ${UBUNTU_VERSION}"
    echo "Allowed versions: ${ALLOWED_UBUNTU_VERSIONS[*]}"
    exit 1
  fi
fi

# Detect AMD GPU (vendor 1002)
AMD_GPU_LINES="$(lspci -nn | grep -iE 'vga|3d' | grep -i '1002:' || true)"
if [[ -z "${AMD_GPU_LINES}" ]]; then
  echo "No AMD GPU detected (vendor 1002)."
  exit 1
fi
echo "AMD GPUs detected:"
echo "${AMD_GPU_LINES}"

# Update/upgrade
if [[ "${DO_APT_UPGRADE}" -eq 1 ]]; then
  sudo apt update
  sudo apt upgrade -y
fi

# Kernel check/upgrade (script policy)
KERNEL_INSTALLED=0
echo "Current kernel: $(uname -r)"

if [[ "${DO_KERNEL_POLICY_CHECK}" -eq 1 ]]; then
  KERNEL_VERSION="$(uname -r)"
  KERNEL_MM="$(echo "${KERNEL_VERSION}" | sed -nE 's/^([0-9]+)\.([0-9]+).*/\1.\2/p')"

  req_major="${REQUIRED_KERNEL_MM%.*}"
  req_minor="${REQUIRED_KERNEL_MM#*.}"
  cur_major="${KERNEL_MM%.*}"
  cur_minor="${KERNEL_MM#*.}"

  KERNEL_OK=0
  if [[ "${cur_major}" -gt "${req_major}" ]] || \
     [[ "${cur_major}" -eq "${req_major}" && "${cur_minor}" -ge "${req_minor}" ]]; then
    KERNEL_OK=1
  fi

  if [[ "${KERNEL_OK}" -ne 1 ]]; then
    echo "Kernel is older than required by this script policy (>= ${REQUIRED_KERNEL_MM})."
    if [[ "${DO_INSTALL_MAINLINE_KERNEL}" -eq 1 ]]; then
      echo "Installing latest mainline kernel..."
      sudo add-apt-repository ppa:cappelikan/ppa -y 2>/dev/null || true
      sudo apt update
      sudo apt install -y mainline pkexec
      sudo mainline install-latest
      echo "Mainline kernel installed. Reboot required to activate it."
      KERNEL_INSTALLED=1
    else
      echo "Mainline kernel install is disabled by flag DO_INSTALL_MAINLINE_KERNEL=0. Continuing."
    fi
  fi
fi

# Optional: GRUB parameters (append-only)
if [[ "${DO_GRUB_PARAMS}" -eq 1 ]]; then
  GRUB_FILE="/etc/default/grub"
  GRUB_CHANGED=0

  for param in "${GRUB_PARAMS[@]}"; do
    if ! sudo grep -qE "GRUB_CMDLINE_LINUX_DEFAULT=.*\b${param}\b" "${GRUB_FILE}"; then
      sudo cp -a "${GRUB_FILE}" "${GRUB_FILE}.backup.$(date +%F-%H%M%S)"
      sudo sed -i -E "s/^(GRUB_CMDLINE_LINUX_DEFAULT=\")([^\"]*)\"/\1\2 ${param}\"/" "${GRUB_FILE}"
      echo "Added GRUB param: ${param}"
      GRUB_CHANGED=1
    else
      echo "GRUB param already present: ${param}"
    fi
  done

  if [[ "${GRUB_CHANGED}" -eq 1 ]]; then
    sudo update-grub
    echo "GRUB updated."
  fi
else
  echo "Skipping GRUB parameters (DO_GRUB_PARAMS=0)."
fi

# Best effort: purge old packages/repos
if [[ "${DO_PURGE_OLD_PACKAGES}" -eq 1 ]]; then
  echo "Removing previous ROCm/AMDGPU packages (best effort)..."
  sudo dpkg --configure -a || true
  sudo apt remove --purge -y rocminfo || true
  sudo apt purge -y 'rocm*' 'amdgpu*' 'graphics*' 'hip*' || true
  sudo apt autoremove -y || true
  sudo apt clean || true
  sudo rm -rf /etc/apt/sources.list.d/amdgpu* /etc/apt/sources.list.d/rocm* /etc/apt/sources.list.d/graphics* || true
  sudo apt update || true
else
  echo "Skipping purge old packages (DO_PURGE_OLD_PACKAGES=0)."
fi

# Add ROCm "latest" repository
if [[ "${DO_SETUP_ROCM_REPO}" -eq 1 ]]; then
  echo "Setting up ROCm 'latest' repository..."

  . /etc/os-release

  UBUNTU_CODENAME="${UBUNTU_CODENAME:-${VERSION_CODENAME:-}}"
  if [[ -z "${UBUNTU_CODENAME}" ]]; then
    echo "Cannot detect Ubuntu codename (UBUNTU_CODENAME/VERSION_CODENAME)."
    exit 1
  fi

  sudo install -d -m 0755 /usr/share/keyrings
  wget -qO- https://repo.radeon.com/rocm/rocm.gpg.key \
    | gpg --dearmor \
    | sudo tee /usr/share/keyrings/rocm-archive-keyring.gpg >/dev/null

  echo "deb [arch=amd64 signed-by=/usr/share/keyrings/rocm-archive-keyring.gpg] https://repo.radeon.com/rocm/apt/latest/ ${UBUNTU_CODENAME} main" \
    | sudo tee /etc/apt/sources.list.d/rocm.list >/dev/null

  # Pin repo.radeon.com above Ubuntu
  sudo tee /etc/apt/preferences.d/rocm-pin-600 >/dev/null <<'EOF'
Package: *
Pin: origin repo.radeon.com
Pin-Priority: 600
EOF

else
  echo "Skipping ROCm repo setup (DO_SETUP_ROCM_REPO=0)."
fi

# Install ROCm packages
if [[ "${DO_INSTALL_ROCM}" -eq 1 ]]; then
  echo "Installing ROCm stack..."
  sudo apt update
  sudo apt install -y -o Dpkg::Options::="--force-overwrite" \
    rocm-dev rocm-libs rocm-hip-sdk rocm-smi-lib rocminfo
else
  echo "Skipping ROCm install (DO_INSTALL_ROCM=0)."
fi

# /opt/rocm -> /opt/rocm-X.Y.Z
if [[ "${DO_LINK_OPT_ROCM}" -eq 1 ]]; then
  INSTALLED_ROCM_DIR="$(ls -d /opt/rocm-[0-9]* 2>/dev/null | sort -V | tail -n 1 || true)"
  if [[ -n "${INSTALLED_ROCM_DIR}" ]]; then
    REAL_VERSION="$(echo "${INSTALLED_ROCM_DIR}" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' || echo latest)"
    sudo ln -sfn "${INSTALLED_ROCM_DIR}" /opt/rocm
    echo "ROCm detected: ${REAL_VERSION} (${INSTALLED_ROCM_DIR}); linked /opt/rocm -> ${INSTALLED_ROCM_DIR}"
  else
    echo "No /opt/rocm-X.Y.Z directory found; leaving /opt/rocm as-is."
  fi
else
  echo "Skipping /opt/rocm symlink (DO_LINK_OPT_ROCM=0)."
fi

# User groups: render,video
if [[ "${DO_USER_GROUPS}" -eq 1 ]]; then
  TARGET_USER="${SUDO_USER:-$USER}"
  sudo usermod -aG render,video "${TARGET_USER}" || true
  echo "User added to groups: render, video (${TARGET_USER}). Re-login or reboot required."
else
  echo "Skipping user groups (DO_USER_GROUPS=0)."
fi

# PATH + LD_LIBRARY_PATH in ~/.bashrc
if [[ "${DO_BASHRC_PATH}" -eq 1 ]]; then
  TARGET_USER="${SUDO_USER:-$USER}"
  TARGET_HOME="$(getent passwd "${TARGET_USER}" | cut -d: -f6)"
  TARGET_BASHRC="${TARGET_HOME}/.bashrc"
  MARKER="AMD ROCm Paths"

  if [[ ! -f "${TARGET_BASHRC}" ]]; then
    sudo -u "${TARGET_USER}" touch "${TARGET_BASHRC}" || true
  fi

  # Determining the installed ROCm version
  ROCM_VERSION_DIR="$(ls -d /opt/rocm-[0-9]* 2>/dev/null | sort -V | tail -n 1 || true)"
  if [[ -n "${ROCM_VERSION_DIR}" ]]; then
    ROCM_VERSION="$(basename "${ROCM_VERSION_DIR}" | sed 's/rocm-//')"
    echo "Using ROCm version: ${ROCM_VERSION} (${ROCM_VERSION_DIR})"
  else
    ROCM_VERSION="unknown"
    echo "Warning: No /opt/rocm-X.Y.Z found; using generic paths"
  fi

  if ! grep -q "${MARKER}" "${TARGET_BASHRC}" 2>/dev/null; then
    cat >> "${TARGET_BASHRC}" <<EOF

# ${MARKER}
if [ -d "/opt/rocm-${ROCM_VERSION}" ]; then
  export PATH="/opt/rocm-${ROCM_VERSION}/bin:\$PATH"
  export LD_LIBRARY_PATH="/opt/rocm-${ROCM_VERSION}/hip/lib:/opt/rocm-${ROCM_VERSION}/lib:\$LD_LIBRARY_PATH"
  export ROCM_PATH="/opt/rocm-${ROCM_VERSION}"
  export HIP_CLANG_PATH="/opt/rocm-${ROCM_VERSION}/llvm/bin"
fi
EOF
    echo "Added full ROCm paths (PATH+LD_LIBRARY_PATH) to ${TARGET_BASHRC}"
  else
    echo "ROCm PATH block already present in ${TARGET_BASHRC}"
  fi

  # Apply to the current session
  if [[ -n "${ROCM_VERSION_DIR}" ]]; then
    export PATH="${ROCM_VERSION_DIR}/bin:${PATH}"
    export LD_LIBRARY_PATH="${ROCM_VERSION_DIR}/hip/lib:${ROCM_VERSION_DIR}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
    export ROCM_PATH="${ROCM_VERSION_DIR}"
    export HIP_CLANG_PATH="${ROCM_VERSION_DIR}/llvm/bin"
  fi
else
  echo "Skipping .bashrc PATH (DO_BASHRC_PATH=0)."
fi

# Workaround for amdgpu.ids (some Ollama builds)
if [[ "${DO_OLLAMA_AMDGPU_IDS_WORKAROUND}" -eq 1 ]]; then
  if [[ -f /usr/share/libdrm/amdgpu.ids ]]; then
    sudo mkdir -p /opt/amdgpu/share/libdrm
    sudo ln -sf /usr/share/libdrm/amdgpu.ids /opt/amdgpu/share/libdrm/amdgpu.ids
    echo "Created compatibility link: /opt/amdgpu/share/libdrm/amdgpu.ids -> /usr/share/libdrm/amdgpu.ids"
  else
    echo "amdgpu.ids not found at /usr/share/libdrm/amdgpu.ids; skipping workaround."
  fi
else
  echo "Skipping Ollama amdgpu.ids workaround (DO_OLLAMA_AMDGPU_IDS_WORKAROUND=0)."
fi

# Best effort: power/control=on
if [[ "${DO_GPU_POWER_CONTROL_ON}" -eq 1 ]]; then
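  if [[ -e /sys/class/drm/card0/device/power/control ]]; then
    # Test for existence, not writability: the write itself goes through sudo.
    if echo on | sudo tee /sys/class/drm/card0/device/power/control >/dev/null; then
      echo "Set /sys/class/drm/card0/device/power/control = on"
    else
      echo "Could not write /sys/class/drm/card0/device/power/control; skipping."
    fi
  else
    echo "/sys/class/drm/card0/device/power/control not found; skipping."
  fi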
else
  echo "Skipping GPU power control (DO_GPU_POWER_CONTROL_ON=0)."
fi

# Final
echo "Installation finished."
if [[ "${KERNEL_INSTALLED}" -eq 1 ]]; then
  echo "Reboot required to activate the new kernel."
else
  echo "Reboot recommended to apply group membership changes."
fi

echo "After reboot, verify:"
echo "  rocminfo"
echo "  amd-smi (if installed)"
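
The optional GRUB step in the script appends each parameter only when it is not already present. The sketch below reproduces that idea on a scratch file instead of /etc/default/grub. The helper name add_grub_param is illustrative, GNU sed and grep are assumed, and parameters containing / or & would need escaping before being passed to sed.

```shell
#!/usr/bin/env bash
# Sketch: idempotent append of a kernel parameter to
# GRUB_CMDLINE_LINUX_DEFAULT, tried on a scratch file rather than
# /etc/default/grub. add_grub_param is an illustrative helper name.

add_grub_param() {
  local file="$1"
  local param="$2"
  # Fixed-string, whole-word match, so "." in the parameter is not a wildcard.
  if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qwF -- "$param"; then
    echo "already present: $param"
  else
    sed -i -E "s/^(GRUB_CMDLINE_LINUX_DEFAULT=\")([^\"]*)\"/\1\2 ${param}\"/" "$file"
    echo "added: $param"
  fi
}

scratch="$(mktemp)"
echo 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"' > "$scratch"
add_grub_param "$scratch" "amdgpu.runpm=0"
add_grub_param "$scratch" "amdgpu.runpm=0"   # second call changes nothing
cat "$scratch"
rm -f "$scratch"
```

On a real system the change takes effect only after sudo update-grub and a reboot, as in the script.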

Attention

After running the script, you must reboot the server with sudo reboot. The reboot is required to activate the new user group memberships and kernel modules.
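
After the reboot, it is worth confirming that the session really has the render and video groups, since a missing membership is a common reason for rocminfo to find no GPU. This is a minimal sketch; missing_groups is an illustrative helper name.

```shell
#!/usr/bin/env bash
# Sketch: confirm after the reboot that the current session has the render
# and video groups. missing_groups is an illustrative helper name.

# Print each required group that is absent from a space-separated group list.
missing_groups() {
  local have=" $1 "
  local g
  shift
  for g in "$@"; do
    case "$have" in
      *" $g "*) ;;
      *) echo "$g" ;;
    esac
  done
}

missing="$(missing_groups "$(id -nG)" render video)"
if [ -z "$missing" ]; then
  echo "Session has render and video group access"
else
  echo "Session is missing group(s):" $missing
fi
```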


Some parts of the content on this page were created or translated using AI.