Using eGPUs with Metal

For those (like me) who need raw GPU power but own only a laptop and do not want to buy a beefy desktop machine, an external GPU (eGPU) seems to be the solution. While macOS does not support Nvidia eGPUs at the moment, there are plenty of GPU options from AMD.

While looking for the perfect GPU, I settled on none other than the AMD Radeon RX Vega 64, which is currently AMD's high-end GPU and the second most performant among all GPUs on the consumer market. Only one Nvidia GPU tops it in terms of TFLOPS performance.

A chip rated at one teraflops (TFLOPS) can perform one trillion floating-point operations per second. TFLOPS is a key performance metric in scientific computing, machine learning and any other area that needs compute-intensive work.

The AMD Radeon RX Vega 64 has 4096 cores that provide 12 TFLOPS (single precision), which places it right between the Nvidia GeForce GTX 1080 Ti with 3584 cores (11 TFLOPS) and the new Nvidia GeForce RTX 2080 Ti with 4352 cores (13 TFLOPS).
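These TFLOPS figures can be approximated with a back-of-the-envelope formula: if every core retires one fused multiply-add (two floating-point operations) per clock cycle, peak single-precision throughput is roughly cores × 2 × clock. Here is a minimal sketch of that arithmetic. The boost clocks are my assumption, taken from the vendors' reference specifications rather than from anything measured here, and `peakTFLOPS` is just a helper name for illustration.

```swift
import Foundation

/// Rough peak single-precision throughput in TFLOPS.
/// Assumes every core retires one fused multiply-add (2 FLOPs) per cycle.
func peakTFLOPS(cores: Int, clockGHz: Double) -> Double {
    // cores * 2 FLOPs * GHz gives GFLOPS; divide by 1000 for TFLOPS.
    return Double(cores) * 2.0 * clockGHz / 1000.0
}

// Boost clocks below are assumed reference values, not measurements.
let gpus: [(name: String, cores: Int, clockGHz: Double)] = [
    ("GeForce GTX 1080 Ti", 3584, 1.582),
    ("Radeon RX Vega 64", 4096, 1.546),
    ("GeForce RTX 2080 Ti", 4352, 1.545),
]

for gpu in gpus {
    let tflops = peakTFLOPS(cores: gpu.cores, clockGHz: gpu.clockGHz)
    print("\(gpu.name): \(String(format: "%.1f", tflops)) TFLOPS")
}
```

With these assumed clocks the sketch gives roughly 11.3, 12.7 and 13.4 TFLOPS, in line with the rounded-down figures quoted above. Real-world throughput depends on sustained clocks and on the workload.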

Of course, eGPUs also accelerate graphics applications and games, and let you connect additional monitors and VR headsets. Keep in mind that eGPUs only work with Thunderbolt 3-equipped Macs running macOS High Sierra 10.13.4 or later. For more information about compatibility, read Apple's “Use an external graphics processor with your Mac” webpage.

I went ahead and purchased a Razer Core X enclosure because a power supply of at least 600W is recommended for the Vega 64, and there aren’t many such boxes available. The Sonnet eGFX Breakaway Box 650 is lighter but more expensive. The Razer Core X is as wide and tall as a 15” MacBook Pro, as you can see below:

[Image: the Razer Core X next to a 15” MacBook Pro]

The enclosure's 650W are not used entirely by the GPU, by the way. The Vega 64 actually requires only 295W. The additional power gives you headroom if you overclock the GPU so that it runs even faster, and it also charges your Mac via the Thunderbolt 3 cable, so you do not need the Mac charger while connected to the eGPU. Also, make no mistake, the Vega 64 is almost as wide and heavy as the enclosure. A real monster!

[Image: the Vega 64 and the enclosure]

As soon as you connect the eGPU to your Mac via a Thunderbolt 3 cable and power up the enclosure, you will notice the new eGPU icon in the menu bar:

[Image: the eGPU icon in the menu bar]

In Activity Monitor, if you open the GPU History view, you will see all your GPUs listed, whether integrated, discrete or external:

[Image: the GPU History view in Activity Monitor]

In the System Information app, under Graphics/Displays, you will see all your GPUs listed as well, along with some basic information:

[Image: the Graphics/Displays section in System Information]

If you right-click a game or application that needs a GPU and click Get Info, you will notice the “Prefer external GPU” option:

[Image: the “Prefer external GPU” option in the Get Info window]

I installed Geekbench 4 so I could run a few benchmarks. The trial version only runs the OpenCL tests and only stores results online. The full version adds Dropbox integration, saves the results locally and runs the Metal tests as well.

A test takes only a minute to complete:

[Image: Geekbench 4 running a test]

As expected, the highest score came from the Metal test running on the Vega 64. Here are my scores in decreasing order:

– Metal on Radeon RX Vega 64 – 137651 
– OpenCL on Radeon RX Vega 64 – 135711 
– Metal on Radeon Pro 450 – 41602 
– OpenCL on Radeon Pro 450 – 41578 
– Metal on Intel HD 530 – 21888 
– OpenCL on Intel HD 530 – 20878 
– OpenCL on quad-core CPU – 13867
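To put these scores in perspective, it helps to express them as ratios against the best result. The sketch below does only that arithmetic on the numbers listed above; `speedup` is a helper name I made up for illustration.

```swift
// Geekbench 4 compute scores reported above, best first.
let scores: [(test: String, score: Double)] = [
    ("Metal on Radeon RX Vega 64", 137651),
    ("OpenCL on Radeon RX Vega 64", 135711),
    ("Metal on Radeon Pro 450", 41602),
    ("OpenCL on Radeon Pro 450", 41578),
    ("Metal on Intel HD 530", 21888),
    ("OpenCL on Intel HD 530", 20878),
    ("OpenCL on quad-core CPU", 13867),
]

/// How many times higher `score` is than `baseline`.
func speedup(_ score: Double, over baseline: Double) -> Double {
    return score / baseline
}

let best = scores[0].score
for entry in scores.dropFirst() {
    // Round to one decimal place for readability.
    let ratio = (speedup(best, over: entry.score) * 10).rounded() / 10
    print("Vega 64 (Metal) vs \(entry.test): \(ratio)x")
}
```

In these runs the Vega 64 scores about 3.3 times higher than the discrete Radeon Pro 450 and about 6.3 times higher than the integrated Intel HD 530 on the Metal tests.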

The next obvious step is to run some Metal code on these GPUs. Finally! In a macOS playground, add this code snippet:

import Metal

// Get all the Metal devices in the system.
let devices = MTLCopyAllDevices()

for device in devices {
    print(device.name)
    // Integrated GPUs report true here.
    print("Is device low power? \(device.isLowPower).")
    // External GPUs (eGPUs) are the removable ones.
    print("Is device external? \(device.isRemovable).")
    print("Maximum threads per group: \(device.maxThreadsPerThreadgroup).")
    // Convert bytes to gigabytes.
    print("Maximum buffer length: \(Float(device.maxBufferLength) / 1024 / 1024 / 1024) GB.")
}

Run the playground and you should see output similar to this:

AMD Radeon RX Vega 64
Is device low power? false.
Is device external? true.
Maximum threads per group: MTLSize(width: 1024, height: 1024, depth: 1024).
Maximum buffer length: 4.5 GB.

AMD Radeon Pro 450
Is device low power? false.
Is device external? false.
Maximum threads per group: MTLSize(width: 1024, height: 1024, depth: 1024).
Maximum buffer length: 1.5 GB.

Intel(R) HD Graphics 530
Is device low power? true.
Is device external? false.
Maximum threads per group: MTLSize(width: 256, height: 256, depth: 256).
Maximum buffer length: 2.0 GB.
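Once you know which devices are available, a natural next step is to pick one of them for your work. Here is a minimal sketch that prefers the eGPU, then the discrete GPU, then whatever is left. The `preferredDevice(from:)` helper and its priority order are my own assumptions for illustration, not something Metal prescribes.

```swift
import Metal

/// Returns the first removable (external) GPU if present,
/// otherwise the first GPU that is not low power, otherwise any GPU.
func preferredDevice(from devices: [MTLDevice]) -> MTLDevice? {
    if let external = devices.first(where: { $0.isRemovable }) {
        return external
    }
    if let discrete = devices.first(where: { !$0.isLowPower }) {
        return discrete
    }
    return devices.first
}

if let device = preferredDevice(from: MTLCopyAllDevices()),
   let queue = device.makeCommandQueue() {
    // Commands encoded through this queue run on the chosen GPU.
    print("Command queue created on \(queue.device.name).")
} else {
    print("No Metal device available.")
}
```

On a setup like mine, this should pick the Vega 64 while the enclosure is connected and fall back to the Radeon Pro 450 otherwise.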

You can query your devices for many more attributes and features such as memory availability, programmable sample positions support, raster order groups support and so on. For more information, see the MTLDevice webpage.
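As a quick illustration, the sketch below prints a few of these additional attributes for every device. The `gigabytes` helper is a name I introduced for the unit conversion, and the properties used here assume macOS 10.13 or later.

```swift
import Metal

/// Converts a byte count to gigabytes (1 GB = 1024^3 bytes here).
func gigabytes(_ bytes: UInt64) -> Double {
    return Double(bytes) / 1_073_741_824
}

for device in MTLCopyAllDevices() {
    print(device.name)
    // Approximate amount of memory the device can use with good performance.
    print("Recommended max working set: \(gigabytes(device.recommendedMaxWorkingSetSize)) GB.")
    // Size of all resources currently allocated on this device.
    print("Currently allocated: \(device.currentAllocatedSize) bytes.")
    print("Programmable sample positions? \(device.areProgrammableSamplePositionsSupported).")
    print("Raster order groups? \(device.areRasterOrderGroupsSupported).")
    // Stable identifier that can be used to match a device across APIs.
    print("Registry ID: \(device.registryID).")
}
```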

Apple provides two sample code projects to help you with GPU management in both rendering and compute pipelines.

Apple's documentation also has a few webpages with useful information about resource storage modes, managing multiple displays and GPUs, GPU bandwidth, adding and removing external GPUs, and so on.
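To give a taste of the adding/removing part, here is a minimal sketch that registers an observer for device notifications using `MTLCopyAllDevicesWithObserver`. The `describe` helper and its messages are my own invention for illustration. In a playground you would probably also need to keep the page running for the handler to fire.

```swift
import Metal

/// Human-readable description of a device notification.
func describe(_ name: MTLDeviceNotificationName, deviceName: String) -> String {
    switch name {
    case .wasAdded:
        return "\(deviceName) was connected."
    case .removalRequested:
        return "\(deviceName) is about to be ejected; move work to another GPU."
    case .wasRemoved:
        return "\(deviceName) was disconnected."
    default:
        return "\(deviceName): unknown notification \(name.rawValue)."
    }
}

// Returns the current devices and registers the handler in one call.
let (devices, observer) = MTLCopyAllDevicesWithObserver { device, notification in
    print(describe(notification, deviceName: device.name))
}
print("Currently attached: \(devices.map { $0.name })")

// Later, when notifications are no longer needed:
// MTLRemoveDeviceObserver(observer)
```

When a removal is requested, the app is expected to stop using that device and release its resources, which is why the sketch treats that case separately.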

Until next time!
