NVIDIA Driver and CUDA Installation on Ubuntu Linux
This guide describes how to install NVIDIA graphics card drivers and CUDA on Ubuntu 22.04 and Ubuntu 24.04.
Attention
For proper operation of Tesla series graphics cards (e.g., NVIDIA Tesla T4), ensure that the server's BIOS has the parameter 'above 4G decoding' or 'large/64bit BARs' or 'Above 4G MMIO BIOS assignment' enabled.
System Preparation
- Update the system:
- For RTX4xxx series, A100, and H100 on Ubuntu 22.04, you need to update the kernel version. You can also update the kernel version for older graphics cards:
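The commands for these two steps are not shown above. A minimal sketch based on the one-click script at the end of this guide, which uses `apt` for the update and the `linux-generic-hwe-22.04` package for the kernel. The function names `needs_hwe_kernel` and `prepare_system` are hypothetical helpers introduced here for illustration. Unlike the script, this sketch does not check the GPU model, in line with the note that the kernel can also be updated for older cards.

```bash
# Succeeds only for Ubuntu 22.04, the release this guide singles out
# for a kernel update.
needs_hwe_kernel() {
    [ "$1" = "22.04" ]
}

# Update installed packages, then install the HWE kernel when it applies.
# Nothing runs until prepare_system is called explicitly.
prepare_system() {
    sudo apt update
    sudo apt upgrade -y
    if needs_hwe_kernel "$(lsb_release -rs)"; then
        sudo apt install -y linux-generic-hwe-22.04
    fi
}
```

Call `prepare_system`, then reboot so that the system starts with the newly installed kernel.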
Installing CUDA and Nvidia Drivers
CUDA is a parallel computing platform and programming model developed by NVIDIA that allows developers to use the capabilities of modern GPUs for general computing, data analysis, and machine learning applications.
- Install the gcc compiler, which is required for compiling CUDA: `sudo apt install gcc -y`
- Download and install CUDA. For Ubuntu 24.04, replace `ubuntu2204` with `ubuntu2404` in the path passed to `wget`. The one-click script at the end of this guide derives this part of the path from the release number automatically.
- Set environment variables in your `.bashrc` so that your frameworks and applications can detect CUDA:
    echo 'export PATH="/sbin:/bin:/usr/sbin:/usr/bin:${PATH}:/usr/local/cuda/bin"' >> ~/.bashrc
    echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
    source ~/.bashrc
Attention
You must run these commands for all users who need to use CUDA.
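Because the `echo` commands append to `.bashrc` every time they run, repeating them for the same user adds duplicate lines. A minimal sketch of an idempotent variant follows; `append_once` is a hypothetical helper name, not part of CUDA or Ubuntu.

```bash
# Append a line to a file only if that exact line is not already present.
# $1 = line to add, $2 = target file
append_once() {
    touch "$2"
    grep -qxF -- "$1" "$2" || printf '%s\n' "$1" >> "$2"
}

# Example calls for the current user (uncomment to apply):
# append_once 'export PATH="/sbin:/bin:/usr/sbin:/usr/bin:${PATH}:/usr/local/cuda/bin"' "$HOME/.bashrc"
# append_once 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' "$HOME/.bashrc"
```

To cover another user, pass the path of that user's `.bashrc` as the second argument and run the call with sufficient permissions.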
- Check that the driver is installed for your graphics card by running `nvidia-smi`. You should get output similar to this:
    user@48567:~$ nvidia-smi
    Fri May 10 15:58:17 2024
    +-----------------------------------------------------------------------------------------+
    | NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
    |-----------------------------------------+------------------------+----------------------+
    | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
    |                                         |                        |               MIG M. |
    |=========================================+========================+======================|
    |   0  NVIDIA RTX A4000               Off |   00000000:07:00.0 Off |                  Off |
    | 41%   31C    P8             15W /  140W |       3MiB /  16376MiB |      0%      Default |
    |                                         |                        |                  N/A |
    +-----------------------------------------+------------------------+----------------------+

    +-----------------------------------------------------------------------------------------+
    | Processes:                                                                              |
    |  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
    |        ID   ID                                                               Usage      |
    |=========================================================================================|
    |  No running processes found                                                             |
    +-----------------------------------------------------------------------------------------+
Attention
If you received a message like `modprobe: ERROR: could not insert 'nvidia': Device or resource busy` during installation, you need to remove the `nouveau` kernel module and enable the use of the `nvidia` modules.
Note
You can find the latest instructions for installing Nvidia GPU drivers on Ubuntu here.
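One way to act on the `modprobe` error, sketched under the assumption that `nouveau` is the module holding the device: check whether it is loaded, unload it, and then run `nvidia-smi` to initialize the NVIDIA modules, as the one-click script at the end of this guide does. The helper names `module_loaded` and `switch_to_nvidia` are hypothetical, and the optional file argument exists only so that the check can be tried against sample data.

```bash
# Succeeds if the named module is listed in /proc/modules
# (or in the file given as $2).
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Unload nouveau if it is present, then run nvidia-smi so that the
# NVIDIA modules are initialized. Nothing runs until this is called.
switch_to_nvidia() {
    if module_loaded nouveau; then
        sudo rmmod -f nouveau
    fi
    sudo nvidia-smi
}
```

If `rmmod` reports that the module is still in use, a reboot may be the simpler route.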
- Install `nvidia-cuda-toolkit`: `sudo apt install nvidia-cuda-toolkit -y`
- Check the CUDA installation by running `nvcc -V`. After a successful installation, the command prints the version of the installed CUDA compiler.
Attention
If you encounter an error like `Failed to initialize NVML: Driver/library version mismatch` after installation, you need to re-initialize the Nvidia kernel modules by removing them and running `nvidia-smi` again.
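A minimal sketch of that re-initialization. The module names and their unload order (`nvidia_uvm`, `nvidia_drm`, `nvidia_modeset`, then `nvidia`) are an assumption based on a typical driver installation, not something this guide specifies. `nvidia_unload_order` and `reload_nvidia` are hypothetical helper names.

```bash
# Print the NVIDIA modules that are currently loaded, dependents first.
# Reads /proc/modules unless another file is given as $1.
# ASSUMPTION: the module list below reflects a typical installation.
nvidia_unload_order() {
    for mod in nvidia_uvm nvidia_drm nvidia_modeset nvidia; do
        if awk -v name="$mod" '$1 == name { found = 1 } END { exit !found }' "${1:-/proc/modules}"; then
            echo "$mod"
        fi
    done
}

# Remove the loaded modules in that order, then run nvidia-smi again.
# Nothing runs until this is called.
reload_nvidia() {
    for mod in $(nvidia_unload_order); do
        sudo rmmod "$mod" || return 1
    done
    sudo nvidia-smi
}
```

`rmmod` fails while a process is still using the GPU. In that case, stop the process first or reboot the machine.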
Installing NVIDIA modules for Docker
If you're using Docker containers, don't forget to install the `nvidia-docker2` package:
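The install command is not shown above. A minimal sketch that mirrors the Docker step of the one-click script at the end of this guide; `have_cmd` and `install_nvidia_docker` are hypothetical helper names.

```bash
# Succeeds if the given command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Install the package only when Docker is present, then restart the daemon.
# Nothing runs until this is called.
install_nvidia_docker() {
    if have_cmd docker; then
        sudo apt install -y nvidia-docker2
        sudo systemctl restart docker
    else
        echo "Docker is not installed." >&2
        return 1
    fi
}
```

The restart is there so that the Docker daemon picks up the configuration installed by the package.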
One-Click Installation of Drivers and CUDA 12
You can use this script for automatic installation of drivers and CUDA 12:
```bash
#!/bin/bash

# Update and upgrade the system using apt
sudo apt update
sudo apt upgrade -y

# Check Ubuntu 22.04 and update kernel
lsb_release=$(lsb_release -a | grep "22.04")
if [[ -n "$lsb_release" ]]; then
    # Check if there's a video card with Nvidia (10de) H100 model (23xx)
    lspci_output=$(lspci -nnk | awk '/\[10de:23[0-9a-f]{2}\]/ {print $0}')
    if [[ -n "$lspci_output" ]]; then
        echo "H100 detected"
        # If yes install the necessary kernel package
        sudo apt install -y linux-generic-hwe-22.04
    fi
    # Check if there's a video card with Nvidia (10de) A100 model (20xx)
    lspci_output=$(lspci -nnk | awk '/\[10de:20[0-9a-f]{2}\]/ {print $0}')
    if [[ -n "$lspci_output" ]]; then
        echo "A100 detected"
        # If yes install the necessary kernel package
        sudo apt install -y linux-generic-hwe-22.04
    fi
fi

# Install GCC compiler for CUDA install
sudo apt install gcc -y

# Get the release version of Ubuntu
RELEASE_VERSION=$(lsb_release -rs | sed 's/\([0-9]\+\).\([0-9]\+\)/\1\2/')

# Download and install CUDA package for Ubuntu and Nvidia drivers
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu${RELEASE_VERSION}/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb

# Update and upgrade the system again to ensure all packages are installed correctly
sudo apt update
sudo apt install cuda -y
sudo apt install nvidia-cuda-toolkit -y

# Add PATH and LD_LIBRARY_PATH environment variables for CUDA in .bashrc file
echo 'export PATH="/sbin:/bin:/usr/sbin:/usr/bin:${PATH}:/usr/local/cuda/bin"' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
source ~/.bashrc

# Initialize kernel modules without reboot
sudo rmmod -f nouveau
sudo nvidia-smi
nvcc -V

# Installing Docker binding for Nvidia
if command -v docker &> /dev/null; then
    echo "Docker is installed."
    sudo apt install -y nvidia-docker2
    sudo systemctl restart docker
else
    echo "Docker is not installed."
fi
```
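The script builds the repository URL from the Ubuntu release number by dropping the dot, so `22.04` becomes `2204`. A minimal sketch of that step as a standalone function, which can be used to inspect the URL before downloading. The function name `cuda_keyring_url` is hypothetical; the host, path layout, and keyring file name are taken from the script above.

```bash
# Print the cuda-keyring download URL for a release number such as 22.04.
cuda_keyring_url() {
    # Remove the dot from the release number: 22.04 -> 2204
    release=$(printf '%s' "$1" | tr -d '.')
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu%s/x86_64/cuda-keyring_1.1-1_all.deb\n' "$release"
}
```

For example, `wget "$(cuda_keyring_url "$(lsb_release -rs)")"` downloads the keyring package for the running release.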