    03.06.2022

    NVIDIA A5500: real power or just a facelift

    One of the new products featured at the GTC 2022 conference was the RTX A5500 graphics card, which expands the range of NVIDIA professional graphics accelerators. It is based on Ampere architecture with second-generation RT cores and third-generation tensor cores. It features 24 GB of GDDR6 memory with ECC error correction and 768 GB/s peak bandwidth.

    The 8 nm RTX A5500 graphics chip includes 10,240 CUDA cores, 80 RT cores and 320 tensor cores. NVIDIA quotes 34.1 TFLOPS for single-precision (FP32) operations and 272.8 TFLOPS for half-precision (FP16) operations.
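
    The quoted FP32 figure can be sanity-checked with simple arithmetic. The sketch below assumes that each CUDA core performs one fused multiply-add (two floating-point operations) per clock, which is the usual convention behind peak figures; the function name is ours and is for illustration only.

```python
# Assumption: peak FP32 = CUDA cores * 2 FLOPs (one fused multiply-add) * clock.
CUDA_CORES = 10_240
FP32_TFLOPS = 34.1
FP16_TFLOPS = 272.8


def implied_clock_mhz(tflops: float, cores: int, flops_per_clock: int = 2) -> float:
    """Return the clock (MHz) that yields `tflops` under the assumption above."""
    return tflops * 1e12 / (cores * flops_per_clock) / 1e6


clock = implied_clock_mhz(FP32_TFLOPS, CUDA_CORES)
ratio = FP16_TFLOPS / FP32_TFLOPS

print(f"Implied clock for 34.1 TFLOPS: {clock:.0f} MHz")
print(f"FP16 / FP32 ratio: {ratio:.1f}x")
```

    Under that assumption, 34.1 TFLOPS corresponds to a clock of roughly 1665 MHz, and the quoted half-precision figure is eight times the single-precision one.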

    All this, as they say, is on paper. Let's see what the card is really capable of: HOSTKEY recently built a machine equipped with it.

    HOSTKEY
    Rent GPU servers with instant deployment, or a server with a custom configuration, with professional-grade NVIDIA RTX A5500 / A5000 / A4000 cards. VPS with dedicated GPU cards are also available. The GPU card is dedicated to the VM and cannot be used by other clients. GPU performance in virtual machines matches GPU performance in dedicated servers.

    Encoding

    When we compared the RTX A5000 and RTX A4000 earlier, we found that neither a higher CPU frequency nor a larger amount of video memory had much effect on the performance of the cards' encoding block. Readers also rightly pointed out that we used an automatic quantization setting (and therefore automatic quality for the resulting video) instead of a ready-made h264 codec preset, and that we skipped the encoding needed for 60 fps streaming.

    Let's repeat the same tests on the RTX A5500, starting with 1080p stream encoding at 30 fps. For comparison, the A5000 (just like the A4000) managed only 14 streams.

    The A5500 performed better: at 14 streams it clearly had a safety margin (NVIDIA promises up to 16 streams). At the same time, the card consumed 5 W less power and had a lower GPU core temperature (+35 °C vs. +47 °C for the A5000), though it used 500 MB more video memory.

    Output from nvidia-smi dmon -s pucm

    gpu  pwr  gtemp  mtemp    sm   mem   enc   dec   mclk   pclk     fb   bar1
    Idx    W      C      C     %     %     %     %    MHz    MHz     MB     MB
      0   92     35      -    13     3   100     0   7600   1890   4141     32
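
    Readings like the ones above are easier to compare across runs once they are parsed into named fields. The following is a minimal sketch, assuming the header and data rows are whitespace-separated as printed by nvidia-smi dmon; the function name is hypothetical.

```python
def parse_dmon(header: str, row: str) -> dict:
    """Map one `nvidia-smi dmon` data row onto the column names of its header.

    A '-' value (metric not reported, such as mtemp here) becomes None.
    """
    names = header.lstrip("#").split()
    values = row.lstrip("#").split()
    if len(names) != len(values):
        raise ValueError("header and row have different column counts")
    return {n: (None if v == "-" else int(v)) for n, v in zip(names, values)}


header = "gpu pwr gtemp mtemp sm mem enc dec mclk pclk fb bar1"
row = "0 92 35 - 13 3 100 0 7600 1890 4141 32"

sample = parse_dmon(header, row)
print(sample["enc"], sample["sm"], sample["pwr"])
```

    In this sample the encoder (enc) is fully loaded while the streaming multiprocessors (sm) are at 13%, which shows that the encoding block, not the CUDA cores, is the bottleneck in this test.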

    Ffmpeg output gives us the following:

    frame = 1051 fps = 32 q = 33.0 size = 9472 kB time = 00:00:34.93 bitrate = 2221.2 kbits/s speed = 1.07x

    The adapter obviously cannot handle 16 video streams:

    gpu  pwr  gtemp  mtemp    sm   mem   enc   dec   mclk   pclk     fb   bar1
    Idx    W      C      C     %     %     %     %    MHz    MHz     MB     MB
      0   96     44      -    13     4   100     0   7600   1905   4732     32

    frame = 901 fps = 28 q = 26.0 size = 7680 kB time = 00:00:29.93 bitrate = 2101.8 kbits/s speed = 0.917x

    The encoder started dropping frames and the picture filled with artifacts, as the codec could not keep up and automatically degraded the quality (the q parameter jumped from 26 to 50).
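
    Whether a stream keeps up can be read directly from the ffmpeg progress line: the speed value must stay at or above 1x and the frame rate at or above the target. Below is a minimal sketch that parses progress lines such as the two quoted above; the helper names are ours, and the regular expression simply assumes key = value pairs with optional spaces around the equals sign.

```python
import re

# Matches "key = value" or "key=value" pairs in an ffmpeg progress line.
_PAIR = re.compile(r"(\w+)\s*=\s*(\S+)")


def parse_progress(line: str) -> dict:
    """Return the key/value pairs of one ffmpeg progress line as strings."""
    return dict(_PAIR.findall(line))


def keeps_up(line: str, target_fps: float) -> bool:
    """True if the encoder runs at real time or faster at the target frame rate."""
    stats = parse_progress(line)
    speed = float(stats["speed"].rstrip("x"))
    return speed >= 1.0 and float(stats["fps"]) >= target_fps


line_14 = ("frame = 1051 fps = 32 q = 33.0 size = 9472 kB "
           "time = 00:00:34.93 bitrate = 2221.2 kbits/s speed = 1.07x")
line_16 = ("frame = 901 fps =28 q= 26.0 size = 7680 kB "
           "time = 00:00:29.93 bitrate = 2101.8 kbits/s speed = 0.917x")

print(keeps_up(line_14, 30))  # 14 streams
print(keeps_up(line_16, 30))  # 16 streams
```

    With the two lines from the tests, the check passes for 14 streams and fails for 16.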

    Let's try encoding video in high quality. We set the parameters corresponding to the High profile of the h264 codec, which is the standard profile for digital broadcasting and video on optical media, especially high-definition television (it is used for Blu-ray video discs and DVB HDTV broadcasting).
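
    The exact commands used in the test are not given here, so the following is only a sketch of how a set of parallel NVENC encodes in the High profile could be prepared. It assumes an ffmpeg build with the h264_nvenc encoder; the input and output file names are hypothetical, and the commands are only built, not executed.

```python
def nvenc_command(src: str, dst: str, fps: int = 30, profile: str = "high") -> list:
    """Build one ffmpeg command line that encodes `src` with NVENC h264."""
    return [
        "ffmpeg", "-y",
        "-hwaccel", "cuda",        # decode on the GPU as well
        "-i", src,
        "-c:v", "h264_nvenc",      # hardware h264 encoder
        "-profile:v", profile,     # High profile, as in the test description
        "-r", str(fps),            # output frame rate
        "-an",                     # drop audio, only video encoding is measured
        dst,
    ]


def parallel_commands(src: str, n_streams: int, fps: int = 30) -> list:
    """One command per stream; each writes to its own (hypothetical) output file."""
    return [nvenc_command(src, f"out_{i}.mp4", fps) for i in range(n_streams)]


cmds = parallel_commands("input.mp4", 14)
print(" ".join(cmds[0]))
```

    Each list can then be started with subprocess.Popen so that all streams run at the same time, while nvidia-smi dmon is watched in a separate terminal.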

    Once again running 14 streams, the load on the video card increases, but the card can handle it:

    gpu  pwr  gtemp  mtemp    sm   mem   enc   dec   mclk   pclk     fb   bar1
    Idx    W      C      C     %     %     %     %    MHz    MHz     MB     MB
      0   95     43      -    13     4   100     0   7600   1890   4141     32

    Ffmpeg output:

    frame = 968 fps = 32 q = 23.0 size = 7680 kB time = 00:00:32.16 bitrate = 1955.9 kbits/s speed = 1.07x

    Let’s try 4K at 30 fps. The card can handle three streams in high profile without any problems:

    frame = 257 fps = 37 q = 33.0 size = 2304 kB time = 00:00:08.46 bitrate = 2229.3 kbits/s speed = 1.2x

    With four streams, the speed drops slightly (as you may remember, the A5000 with four streams and the automatic quality setting managed only 25-26 frames per second, with artifacts):

    frame = 985 fps = 30 q = 37.0 size = 7424 kB time = 00:00:32.73 bitrate = 1858.0 kbits/s speed = 0.995x

    The GPU readings were as follows:

    gpu  pwr  gtemp  mtemp    sm   mem   enc   dec   mclk   pclk     fb   bar1
    Idx    W      C      C     %     %     %     %    MHz    MHz     MB     MB
      0   89     32      -     9     4   100     0   7600   1920   1659     11

    In fact, the video card runs at a higher frequency than when encoding Full HD video, but its main cores are not stressed by the load (the chip stays cool, as does the video memory).

    As expected, 4K streaming at 60 frames per second was limited to two streams. This time the source was not a cartoon but a gameplay recording from Doom Eternal, which caused some problems for the hardware decoder. The A5500 handled it, but it was pushed to the limit. Worse, hardware AV1 encoding is not available, and when broadcasting via VLC on Ubuntu 20.04 we could not get 60 fps: the stream dropped to 30 fps. We built a workaround using ffmpeg and a broadcast server:

    frame = 240 fps = 61 q = 32.0 size = 2304 kB time = 00:00:09.48 bitrate = 3991.0 kbits/s speed = 1.03x
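
    The stream counts in these tests line up if the encoder load is expressed as a total pixel rate. The sketch below makes two assumptions that are ours, not measured facts: "4K" means 3840x2160, and encoder load scales roughly with width x height x fps x streams.

```python
# Assumed frame sizes; "4K" is taken to mean UHD (3840x2160).
RESOLUTIONS = {"1080p": (1920, 1080), "4K": (3840, 2160)}


def pixel_rate(resolution: str, fps: int, streams: int) -> int:
    """Total pixels per second the encoder has to process."""
    width, height = RESOLUTIONS[resolution]
    return width * height * fps * streams


cases = [
    ("1080p", 30, 14),
    ("1080p", 30, 16),
    ("4K", 30, 3),
    ("4K", 30, 4),
    ("4K", 60, 2),
]

for resolution, fps, streams in cases:
    rate = pixel_rate(resolution, fps, streams) / 1e6
    print(f"{streams:2d} x {resolution} @ {fps} fps: {rate:7.1f} Mpx/s")
```

    Under these assumptions, 16 streams of 1080p at 30 fps, four streams of 4K at 30 fps and two streams of 4K at 60 fps all come to the same rate of about 995 million pixels per second, which is where the card was at or near its limit in the tests above. 14 streams of 1080p and three streams of 4K at 30 fps stay below that level.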

    Conclusion: the encoders in the RTX A5500 have been improved, and under equal conditions it outperforms the A5000, rendering a subjectively better picture and working at lower frequencies.

    CUDA/RT/Tensor Cores

    And what about the rest of the units? We compared the RTX A5500 with the A5000 in several other tests (you can read more about the exact methods in a previous article):

    1. A test of mining capabilities (with PhoenixMiner).
    2. A test of machine learning capabilities. Here, we trained a neural network on each of the cards to determine whether a cat or a dog is depicted in the photograph, using 100 epochs for this purpose.
    3. V-Ray 5 Benchmark test for rendering both on CPU + GPU (CUDA test) and purely on a GPU (RTX test).
    4. LuxMark test in three different scenes, checking the speed in OpenCL on the GPU.
    5. Blender test in different scenes in OptiX mode, using the full capabilities of the RTX hardware.

    Summary Table

    Test                          RTX A5000     RTX A5500
    Mining speed, MH/s            86.66         87.319
    ML test, 100 epochs           9 min 9 s     8 min 59 s
    V-Ray 5 GPU CUDA, vpaths      1381          1594
    V-Ray 5 GPU RTX, vrays        2288          2613
    LuxMark, Lux ball             74 795        78 554
    LuxMark, Hotel                15 794        16 219
    LuxMark, Mic                  45 640        48 832
    Blender, Monster              2312          2468
    Blender, Junkshop             1331          1388
    Blender, Classroom            1148          1223

    The RTX A5500 shows better performance in rendering, but here everything depends on optimization: in V-Ray 5 the gap is 14-15%, in LuxMark 3-7%, and in Blender 4-7%. Taking into account a margin of error of 1-2% from run to run, the final performance gain is not very impressive.

    By the figures in the summary table, the A5500 completed the machine learning test only about 2% faster (8 min 59 s versus 9 min 9 s), and miners will be unpleasantly surprised to find almost the same hash rate on both cards. Note, however, that the manufacturer positions this solution for professionals in graphics and neural networks.
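
    The relative gains can be recomputed from the summary table. The sketch below uses the table values as they are; for the score-based tests a higher number is better, while for the ML test the result is a training time, so the speedup is the ratio of old time to new time. The function names are ours.

```python
# (RTX A5000, RTX A5500) scores from the summary table; higher is better.
SCORES = {
    "Mining speed": (86.66, 87.319),
    "V-Ray GPU CUDA": (1381, 1594),
    "V-Ray GPU RTX": (2288, 2613),
    "LuxMark Lux ball": (74795, 78554),
    "LuxMark Hotel": (15794, 16219),
    "LuxMark Mic": (45640, 48832),
    "Blender Monster": (2312, 2468),
    "Blender Junkshop": (1331, 1388),
    "Blender Classroom": (1148, 1223),
}

# ML test, 100 epochs: training time in seconds (lower is better).
ML_SECONDS = (9 * 60 + 9, 8 * 60 + 59)


def gain_percent(old: float, new: float) -> float:
    """Relative gain of a higher-is-better score, in percent."""
    return (new / old - 1) * 100


def speedup_percent(old_seconds: float, new_seconds: float) -> float:
    """Relative speedup when a task takes less time, in percent."""
    return (old_seconds / new_seconds - 1) * 100


for name, (a5000, a5500) in SCORES.items():
    print(f"{name:18s} {gain_percent(a5000, a5500):5.1f}%")
print(f"{'ML test':18s} {speedup_percent(*ML_SECONDS):5.1f}%")
```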

    Conclusions

    Alas, no miracle happened: the actual performance increase is 5-10% depending on the task, and in mining and encoding there was hardly any increase at all. On the plus side, we saw lower power consumption, better cooling thanks to the lower heat output of the GPU, and a larger amount of video memory, which should have a positive effect on demanding tasks. Whether it is worth the money is up to the buyer; if you want to check it out yourself, you can order a dedicated server with an NVIDIA RTX A5500 from us.
