Servers powered by the AMD Radeon RX 7900 XTX (RDNA 3 architecture) and the AMD Radeon AI PRO R9700 (RDNA 4 architecture) deliver high performance for artificial intelligence, 3D rendering, and big data processing. This powerful and cost-effective solution suits a wide range of business workloads.
Order a GPU server with pre-installed software and get a ready-to-use environment in minutes.
The AMD Radeon RX 7900 XTX combines cutting-edge technology with high performance. It offers excellent power efficiency, ample video memory, and support for advanced features such as second-generation ray tracing and Infinity Cache. This makes it a strong choice for professional workloads, including 3D rendering, artificial intelligence (AI), and scientific computing.
Powered by the RDNA 3 architecture and equipped with 24 GB of GDDR6 memory, the card handles demanding tasks such as 4K gaming and complex 3D graphics.
The graphics card features 6,144 stream processors, delivering exceptional performance for both gaming and professional applications.
Ample capacity for handling complex calculations.
Enhanced memory bandwidth for improved performance.
Delivers realistic graphics and enhanced visualization.
Optimized for various professional applications.
Lower power consumption reduces operating costs.
Perfect for multi-GPU server configurations.
High performance at a more competitive price than similar NVIDIA solutions. Using multiple AMD GPUs in a single server offers a cost-effective alternative to high-end NVIDIA cards.
The AMD Radeon AI PRO R9700 is designed for artificial intelligence workloads, local inference, 3D rendering, and large-scale machine learning models. Built on the RDNA 4 architecture, the GPU features 32 GB of GDDR6 memory and supports modern AI frameworks through the AMD ROCm platform. With high compute performance, multi-GPU scalability, and PCIe 5.0 support, it is well suited for professional workstations and AI servers with demanding computational workloads.
Modern AMD architecture with improved performance for AI workloads and professional computing.
Large VRAM capacity suitable for local LLM deployment, generative AI, and large dataset processing.
Hardware AI accelerators deliver high performance for inference and machine learning workloads.
Compatibility with popular AI frameworks including PyTorch, TensorFlow, and ONNX Runtime.
High-speed data transfer for modern server and workstation platforms.
Reduced memory latency and improved overall GPU performance.
Suitable for servers and workstations with multiple GPUs for AI and HPC workloads.
Up to 1531 TOPS INT4 and up to 95.7 TFLOPS FP16 for AI workloads and accelerated computing.
The AMD Radeon AI PRO R9700 offers large VRAM capacity and strong AI performance at a lower price than several professional NVIDIA solutions.
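To make the VRAM claims above more concrete, here is a rough sketch of how one might estimate whether a model's weights fit on a single card. The function names and the 20% overhead allowance (for KV cache and runtime buffers) are illustrative assumptions, not figures from AMD or from any inference framework; real quantization formats often use slightly more than their nominal bit width.

```python
def estimate_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float, overhead_fraction: float = 0.2) -> bool:
    """Rough fit check; overhead_fraction is an assumed allowance, not a measured value."""
    needed_gb = estimate_weights_gb(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed_gb <= vram_gb


# Example: a 32B-parameter model on a 32 GB card at 4-bit and 16-bit precision
for bits in (4, 16):
    size = estimate_weights_gb(32, bits)
    verdict = "fits" if fits_in_vram(32, bits, vram_gb=32) else "does not fit"
    print(f"32B model at {bits}-bit: ~{size:.0f} GB of weights, {verdict} in 32 GB")
```

Under these assumptions, a 32B model quantized to 4 bits leaves headroom on a 32 GB card, while the same model in FP16 needs several GPUs. Actual memory use depends on context length, batch size, and the inference framework.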
| Benchmark | AMD Radeon RX 7900 XTX | NVIDIA RTX 4090 |
| --- | --- | --- |
| Llama 3.3 70B, Q4 in Ollama (2K context, 54 GB VRAM): response rate | 12 tokens/s | 17 tokens/s |
| Gemma 2 27B, Q4 in Ollama (2K context, 28 GB VRAM): response rate | 32 tokens/s | 40 tokens/s |
| Gemma 2 27B, Q4 in Ollama (8K context, 41 GB VRAM): response rate | 33 tokens/s | 42 tokens/s |
| Phi4 14B, Q4 in Ollama (2K context, 12 GB VRAM): response rate | 48 tokens/s | 76 tokens/s |
| Qwen2.5-32B-Instruct, FP16 in vLLM: end-to-end request latency (30 workers) | 10 s | 10 s |
| Qwen2.5-32B-Instruct, FP16 in vLLM: combined token throughput (30 workers) | 710 tokens/s | 750 tokens/s |
| Qwen2.5-32B-Instruct, FP16 in vLLM: time to first token (30 workers) | 1.5 s | 2.3 s |
| Qwen2.5-32B-Instruct, FP16 in vLLM: inter-token latency (30 workers) | 0.037 s | 0.037 s |
| Qwen2.5-32B-Instruct, FP16 in vLLM: requests per second (30 workers) | 2.1 requests/s | 2.3 requests/s |
| Qwen2.5-32B-Instruct, FP16 in vLLM: tokens per second (30 workers) | 27 tokens/s | 27.5 tokens/s |
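The Ollama figures in the table above are easier to compare as ratios. The short sketch below computes the RX 7900 XTX response rate as a share of the RTX 4090 response rate, using only the numbers from the table; the dictionary and function names are illustrative.

```python
# (RX 7900 XTX tokens/s, RTX 4090 tokens/s), copied from the Ollama rows of the table
ollama_tokens_per_s = {
    "Llama 3.3 70B (2K context)": (12, 17),
    "Gemma 2 27B (2K context)": (32, 40),
    "Gemma 2 27B (8K context)": (33, 42),
    "Phi4 14B (2K context)": (48, 76),
}


def relative_throughput(amd: float, nvidia: float) -> float:
    """AMD throughput expressed as a fraction of the NVIDIA throughput."""
    return amd / nvidia


ratios = {
    name: relative_throughput(amd, nvidia)
    for name, (amd, nvidia) in ollama_tokens_per_s.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0%} of RTX 4090 response rate")
```

On these four tests the ratio ranges from roughly 63% to 80%. The vLLM rows are much closer, which is consistent with the FP16 comparison made elsewhere on this page.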
The results below are based on publicly available benchmarks and reviews using different hardware configurations, drivers, and software environments.
| Metric | AMD Radeon AI PRO R9700 | NVIDIA RTX 4090 |
| --- | --- | --- |
| DeepSeek-R1 14B | 53.5 tokens/s | ~60-75 tokens/s* |
| DeepSeek-R1 32B | 26.3 tokens/s | ~40-65 tokens/s* |
| GPT-OSS 20B | 102.4 tokens/s | ~110-140 tokens/s* |
| VRAM Capacity | 32 GB | 24 GB |
| Power Consumption (TDP) | 300 W | 450 W |
| Software Platform | ROCm | CUDA |
* Results may vary depending on CPU, RAM capacity, drivers, batch size, quantization, and inference framework.
The RX 7900 XTX does not use CUDA cores, because CUDA is NVIDIA's proprietary technology. Instead, it features 6,144 Stream Processors, which serve a similar function in AMD's GPU architecture.
The RX 7900 XTX supports OpenCL and ROCm and can run inference engines such as vLLM, so it can be used for training, inference, chatbots, and video recognition. The card is also compatible with popular machine learning frameworks such as PyTorch and TensorFlow. Its performance with FP16 models is comparable to the NVIDIA RTX 4090, though it does not yet support FP8 models. For certain workloads, especially in multi-GPU configurations, the RX 7900 XTX is an excellent option, offering strong performance at a lower cost than NVIDIA alternatives.
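As a quick way to confirm that PyTorch can see an AMD GPU, the sketch below reports which backend the installed build uses. It relies on the fact that ROCm builds of PyTorch expose AMD GPUs through the usual `torch.cuda` interface and set `torch.version.hip`. The helper name `describe_backend` is hypothetical, and the snippet returns placeholder values if PyTorch is not installed.

```python
def describe_backend() -> dict:
    """Report whether PyTorch is installed, sees a GPU, and was built for ROCm (HIP)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment
        return {"torch_installed": False, "gpu_available": False,
                "hip_version": None, "device_name": None}

    gpu_available = torch.cuda.is_available()  # also True for AMD GPUs on ROCm builds
    return {
        "torch_installed": True,
        "gpu_available": gpu_available,
        # A version string on ROCm builds, None on CUDA or CPU-only builds
        "hip_version": getattr(torch.version, "hip", None),
        "device_name": torch.cuda.get_device_name(0) if gpu_available else None,
    }


print(describe_backend())
```

On a server with a ROCm build of PyTorch, `gpu_available` should be true and `hip_version` should be a version string; the exact device name reported depends on the driver and ROCm release.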
The primary limitation is the lack of native CUDA support, which can make some AI frameworks and software less compatible out of the box. Running CUDA-based applications on AMD hardware may require software adaptation or a compatibility layer such as ZLUDA.