Deployment Overview of PyTorch on Server¶
Prerequisites and Basic Requirements¶
The deployment process requires a server running the Ubuntu operating system. The system must have access to the internet to download packages and dependencies. The following conditions must be met:
- Operating System: Ubuntu 22.04 (implied by the HWE kernel package `linux-generic-hwe-22.04`).
- Privileges: Root access is required to install system packages, configure drivers, and create user accounts.
- Hardware: The server may include an NVIDIA H100 GPU (PCI ID `10de:2330`). If present, specific kernel packages are installed automatically.
- Ports: Standard ports for package management and SSH are required, though no specific application ports are defined in the configuration.
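The source does not say how the installer detects the GPU. As a minimal sketch, assuming detection is done by scanning `lspci -nn` output for the PCI ID quoted above, a check could look like the following. The function names `has_pci_device` and `detect_h100` are hypothetical and not part of the deployment scripts.

```python
import re
import subprocess

H100_PCI_ID = "10de:2330"  # vendor:device ID quoted in the prerequisites


def has_pci_device(lspci_output: str, pci_id: str = H100_PCI_ID) -> bool:
    """Return True if any line of `lspci -nn` output contains [pci_id]."""
    pattern = re.compile(r"\[" + re.escape(pci_id) + r"\]", re.IGNORECASE)
    return any(pattern.search(line) for line in lspci_output.splitlines())


def detect_h100() -> bool:
    """Run `lspci -nn` and look for the H100 ID; False if lspci cannot be run."""
    try:
        result = subprocess.run(
            ["lspci", "-nn"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return False
    return has_pci_device(result.stdout)


if __name__ == "__main__":
    print("H100 detected:", detect_h100())
```

Keeping the parsing separate from the `lspci` call lets the matching logic be exercised on captured output from another machine.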
File and Directory Structure¶
The application and its supporting files are organized within the home directory of the user account. The following paths are utilized:
- `/home/user/`: The home directory for the `user` account where the application resides.
- `/home/user/venv/`: The directory containing the Python virtual environment.
- `/home/user/pytorch_install.sh`: The executable script used to initialize the virtual environment and install PyTorch.
- `/home/user/pytorch.sh`: The executable script used to activate the virtual environment.
- `/root/user_credentials`: A file containing the generated password for the `user` account.
- `/usr/local/cuda/`: The installation directory for the CUDA toolkit.
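To confirm that a server matches this layout, the paths can be checked programmatically. The following is a rough sketch only: the helper `missing_paths` is hypothetical, and its `root` parameter exists so the check can be pointed at a test directory instead of `/`.

```python
from pathlib import Path

# Paths listed in this section, written relative to the filesystem root.
EXPECTED_PATHS = [
    "home/user",
    "home/user/venv",
    "home/user/pytorch_install.sh",
    "home/user/pytorch.sh",
    "root/user_credentials",
    "usr/local/cuda",
]


def missing_paths(root: str = "/") -> list[str]:
    """Return the expected paths that cannot be found under `root`."""
    base = Path(root)
    missing = []
    for rel in EXPECTED_PATHS:
        try:
            exists = (base / rel).exists()
        except PermissionError:
            # An unreadable parent (e.g. /root for a non-root user) counts as missing.
            exists = False
        if not exists:
            missing.append(str(base / rel))
    return missing


if __name__ == "__main__":
    for path in missing_paths():
        print("missing:", path)
```

Two caveats apply: `/home/user/venv` is expected to be reported as missing until `pytorch_install.sh` has been run, and `/root/user_credentials` is only visible when the check runs as root.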
Application Installation Process¶
The installation is performed via a shell script that configures the system, installs drivers, and sets up the Python environment. The process includes the following steps:
- System Updates: All packages are updated to their latest versions, and unused packages are purged.
- Driver Installation:
  - If an NVIDIA H100 GPU is detected, the `linux-generic-hwe-22.04` kernel package is installed.
  - The `ubuntu-drivers-common` package is installed to detect the recommended NVIDIA driver.
  - The recommended NVIDIA driver package is installed automatically.
  - The `gcc` compiler is installed to support CUDA.
- CUDA Toolkit Installation:
  - The CUDA keyring is downloaded and installed for the specific Ubuntu release version.
  - The `cuda` package is installed via the APT repository.
- User Account Creation:
  - A new user named `user` is created with the home directory `/home/user`.
  - A random 8-character password is generated and stored in `/root/user_credentials`.
  - The `user` account is added to the `sudo` group.
- Library Installation:
  - The `libnvinfer5-dev` package is installed.
  - Python 3.10, `python3-pip`, and `python3-venv` are installed.
- Environment Configuration:
  - Environment variables for CUDA (`PATH` and `LD_LIBRARY_PATH`) are appended to the `~/.bashrc` file of the `user` account.
  - The `pytorch_install.sh` script is created to handle the virtual environment setup.
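The environment configuration step can be illustrated with a short sketch. The exact lines the installer writes are not given in the source. The example below assumes the conventional CUDA layout (`bin/` and `lib64/` under `/usr/local/cuda`), and `append_cuda_exports` is a hypothetical helper that appends the exports only if they are not already present.

```python
from pathlib import Path

CUDA_HOME = "/usr/local/cuda"  # CUDA toolkit directory named in this document

# Assumed layout: binaries in bin/, libraries in lib64/ (not stated in the source).
# The ${VAR:+:...} form avoids a trailing colon when LD_LIBRARY_PATH is unset.
CUDA_EXPORTS = [
    "export PATH=" + CUDA_HOME + "/bin:$PATH",
    "export LD_LIBRARY_PATH=" + CUDA_HOME + "/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}",
]


def append_cuda_exports(bashrc: Path) -> int:
    """Append any missing CUDA export lines to `bashrc`; return how many were added."""
    text = bashrc.read_text() if bashrc.exists() else ""
    existing = text.splitlines()
    to_add = [line for line in CUDA_EXPORTS if line not in existing]
    if to_add:
        # Start on a fresh line if the file does not end with a newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with bashrc.open("a") as handle:
            handle.write(prefix + "\n".join(to_add) + "\n")
    return len(to_add)


if __name__ == "__main__":
    added = append_cuda_exports(Path("/home/user/.bashrc"))
    print("export lines added:", added)
```

Checking for existing lines first keeps the step idempotent, so re-running an installer written this way would not pile up duplicate exports in `~/.bashrc`.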
Access Rights and Security¶
Security and access controls are configured as follows:
- User Account: A dedicated `user` account is created with a randomly generated password.
- Sudo Access: The `user` account is added to the `sudo` group, granting administrative privileges.
- File Permissions:
  - The `install_script.sh` is created with permissions `u=rwx,g=r,o=r`.
  - The `pytorch_install.sh` and `pytorch.sh` scripts are made executable using `chmod +x`.
- Password Storage: The user password is stored in the `/root/user_credentials` file, which is readable only by the root user.
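The password handling described above can be sketched as follows. This is an illustration, not the installer's actual method: the character set, the `username:password` file format, and the helper names `generate_password` and `store_credentials` are all assumptions.

```python
import os
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # assumed character set


def generate_password(length: int = 8) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def store_credentials(path: str, username: str, password: str) -> None:
    """Write `username:password` to `path`, accessible to the file owner only."""
    # Create the file with mode 0o600 so it never exists with wider permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{username}:{password}\n")
    os.chmod(path, 0o600)  # also tighten a file that already existed


if __name__ == "__main__":
    # Writes to the current directory here; the document's target is /root/user_credentials.
    store_credentials("user_credentials", "user", generate_password())
```

When run as root against `/root/user_credentials`, the resulting file is owned by root and unreadable to other accounts, which matches the storage rule stated above.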
Databases¶
No database configuration, connection settings, or storage locations are defined in the provided deployment scripts.
Docker Containers and Their Deployment¶
The deployment does not utilize Docker containers, docker run, docker compose, or container-related scripts. The application is installed directly on the host operating system.
Proxy Servers¶
No proxy server configuration (such as Nginx, Traefik, or Certbot) is included in the provided deployment data.
Permission Settings¶
File and directory permissions are set during the installation process:
- The `install_script.sh` located in `/root` is set to `u=rwx,g=r,o=r`.
- The `pytorch_install.sh` and `pytorch.sh` scripts in `/home/user` are set to be executable.
- The `user` account owns the files within `/home/user`.
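For readers more used to octal modes, the symbolic permissions above translate as shown in this sketch. The helper `add_execute_bits` is hypothetical. It approximates `chmod +x` under a typical umask by adding the execute bit for user, group, and other.

```python
import os
import stat

# u=rwx,g=r,o=r expressed with the stat module's permission bits (octal 744).
INSTALL_SCRIPT_MODE = stat.S_IRWXU | stat.S_IRGRP | stat.S_IROTH


def add_execute_bits(path: str) -> int:
    """Add execute permission for user, group and other; return the new mode."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    new_mode = current | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    os.chmod(path, new_mode)
    return new_mode


if __name__ == "__main__":
    print("install_script.sh mode:", oct(INSTALL_SCRIPT_MODE))
```

A script that starts at mode 644 ends up at 755 after `add_execute_bits`, while `install_script.sh` at 744 stays executable by its owner only.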
Starting, Stopping, and Updating¶
The application is managed through shell scripts rather than a system service manager.
- Initial Setup: To install PyTorch and set up the virtual environment, run the `pytorch_install.sh` script as the `user` account. This script creates a virtual environment, activates it, installs `torch`, `torchvision`, and `torchaudio`, and runs a test script.
- Activating the Environment: To activate the virtual environment for subsequent use, run the `pytorch.sh` script. This script sources the `venv/bin/activate` file.
- Verification: After activation, verify the installation and confirm that CUDA is available.
- Updating: To update the application, re-run the `pytorch_install.sh` script to reinstall dependencies within the virtual environment. System-level updates are handled via standard `apt` commands.
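The exact verification command is not preserved in this text. A minimal sketch of such a check, using only the standard `torch.__version__` attribute and `torch.cuda.is_available()`, might look like this. The function name `check_pytorch` is hypothetical.

```python
import importlib.util


def check_pytorch() -> dict:
    """Report whether torch is importable and whether it can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "version": None, "cuda_available": False}
    import torch  # imported lazily so the check also runs outside the venv

    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(check_pytorch())
```

Run inside the activated virtual environment, a healthy GPU installation would be expected to report `installed` and `cuda_available` as true. If `pytorch.sh` only sources the activate file, it likely needs to be sourced itself (for example with `source /home/user/pytorch.sh`) for the activation to persist in the current shell.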