News Graphics card flaw enables data theft in AMD, Apple, and Qualcomm chips by exploiting GPU memory

Well... At least they still have to get control of your system before this exploit is possible.

Is the exploit, in action, detectable?
 
The article actually fails to touch on the most important part. I think this particular attack is more significant for LLMs and ML models, as it underlines the "overlooked" security risks in ML development stacks.

Basically, LeftoverLocals can be used to attack any app that uses the GPU's local memory, such as image processing or drawing, but data leakage from large language models (LLMs) is what the researchers single out as the particular and pressing concern here.

As you can read in the blog, the researchers particularly highlight the impact on large language models and machine learning applications. The vulnerability basically allows hackers to access an AI model's output by 'eavesdropping' on the kernels it uses to process user queries.

In a PoC, Trail of Bits showed that an LLM's output can be reconstructed with high accuracy: they were able to steal 181MB of data from an LLM running on an AMD Radeon RX 7900 XT GPU, enough to fully reproduce the response of a 7B (7 billion parameter) model.

So basically this data leakage permits eavesdropping on LLM sessions, and it affects ML models and applications on the impacted GPU platforms.

And since most deep neural network (DNN) computations rely heavily on local memory, the implications could be vast, potentially affecting ML implementations across embedded and data-center domains.

But the good thing is that for this vulnerability to be exploited, the attacker needs access to a target device with a vulnerable GPU. So for an average user/consumer this attack vector isn't something to worry about IMO, as any attacker would need to have already established some amount of operating system access on the target's device first.

Escalated privileges are not required though.

However, Apple hasn't clarified the situation with other impacted devices yet, like the Apple MacBook Air 3rd Generation with its A12 processor.

You meant to say the A12-based iPad Air?
 
But the good thing is that for this vulnerability to be exploited, the attacker needs access to a target device with a vulnerable GPU. So for an average user/consumer this attack vector isn't something to worry about IMO, as any attacker would need to have already established some amount of operating system access on the target's device first.
If all you need to do is look at the contents of the GPU's local memory, you should be able to do that using APIs like WebGL and WebGPU, I think. Although that could probably be mitigated by the browser, it would otherwise enable code on a website to scrape data from your GPU.

I might be wrong, but I think it's probably worth looking into.
 
If all you need to do is look at the contents of the GPU's local memory, you should be able to do that using APIs like WebGL and WebGPU, I think. Although that could probably be mitigated by the browser, it would otherwise enable code on a website to scrape data from your GPU.

I might be wrong, but I think it's probably worth looking into.
They do discuss that in the original Trail of Bits article:
"We note that it appears that browser GPU frameworks (e.g., WebGPU) are not currently impacted, as they insert dynamic memory checks into GPU kernels."
 
Yeah, carrying out an attack from a browser via WebGPU is difficult because this API adds dynamic array-bounds checks to GPU kernels when they access local memory.

Graphics card flaw enables data theft in AMD, Apple, and Qualcomm chips by exploiting GPU memory


I wouldn't call it a flaw per se, actually; the vulnerability stems from insufficient isolation of local GPU memory and from the failure to clear local memory after processes on the GPU finish executing.

This allows an attacker's process to recover data left behind in local memory by a process that has finished executing, or to read data from a process that is currently running.

Unlike CPUs, which typically isolate memory in a way that prevents exploits like this, GPUs sometimes do not.

I mean, as we all know, local memory in a GPU is a separate, faster memory area tied to a compute unit, acting as the analogue of a CPU's cache. That's why local memory is used instead of global memory to store intermediate computations.
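To illustrate that role, here's a minimal CUDA sketch (CUDA's shared memory is the analogue of OpenCL's local memory; the kernel and buffer names are my own, not from the article): a block-wide sum keeps its intermediate values in the on-chip scratchpad and only writes the final result out to global memory.

// Assumes a fixed block size of 256 threads; names are illustrative only.
__global__ void block_sum(const float *in, float *out, int n)
{
    __shared__ float scratch[256];               // on-chip "local" memory

    int tid = threadIdx.x;
    int idx = blockIdx.x * blockDim.x + tid;
    scratch[tid] = (idx < n) ? in[idx] : 0.0f;   // intermediates stay on-chip
    __syncthreads();

    // Tree reduction entirely within the shared-memory scratchpad.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (tid < stride)
            scratch[tid] += scratch[tid + stride];
        __syncthreads();
    }

    if (tid == 0)
        out[blockIdx.x] = scratch[0];            // only the result reaches global memory
}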

So it all boils down to launching a handler (kernel) on the GPU that periodically copies the contents of the available local memory into global memory (VRAM).

And of course, since local memory is not cleared when switching between processes running on the GPU, and is shared between different processes within the same GPU compute unit, residual data from other processes can be found in it. Hence the attack vector.

It's possible to use a GPU's local memory to connect two GPU kernels together, even if the two kernels don't belong to the same application or the same user. The attacker can then use GPU compute frameworks such as OpenCL, Vulkan or Metal to write a GPU kernel that dumps uninitialized local memory on the target device.
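To make that "listener" pattern concrete, here is a purely illustrative sketch in CUDA syntax (shared memory again standing in for local memory; the real PoCs used OpenCL, Vulkan and Metal, and NVIDIA hardware was reportedly not affected, so treat this as a sketch of the idea rather than a working exploit):

// Hypothetical "listener": declare a local-memory buffer, never write to it,
// and copy whatever happens to already be in it out to a global buffer that
// the host can read back. On a vulnerable platform the buffer would contain
// another process's leftover data. Sizes and names are made up.
#define DUMP_WORDS 1024    // assumed size of the local-memory region to dump

__global__ void listener(unsigned int *dump)    // dump: gridDim.x * DUMP_WORDS words
{
    __shared__ unsigned int leftover[DUMP_WORDS];   // deliberately left uninitialized

    // Each thread copies its slice of the uninitialized region to global memory.
    for (int i = threadIdx.x; i < DUMP_WORDS; i += blockDim.x)
        dump[blockIdx.x * DUMP_WORDS + i] = leftover[i];
}

A real attack would launch something like this repeatedly and scan the dumped contents for recognizable data, which matches the "periodically copies" behaviour described above.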
 
Yeah, carrying out an attack from a browser via WebGPU is difficult because this API adds dynamic array-bounds checks to GPU kernels when they access local memory.
It's not only bounds-checking, though. It would also need to ensure the local memory structures are initialized, either with 0s or some other data, before they can be read.
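A sketch of what that kernel-side scrubbing could look like, again in illustrative CUDA syntax (this is an assumed prologue that a framework might insert, not how WebGPU actually implements its checks): the local buffer is zeroed before any thread reads it, so leftovers from the compute unit's previous occupant can never be observed.

__global__ void kernel_with_scrubbed_local(unsigned int *out)
{
    __shared__ unsigned int buf[1024];   // assumes blockDim.x <= 1024

    // Framework-inserted prologue: wipe local memory before first use, then
    // barrier so no thread can observe another process's leftover data.
    for (int i = threadIdx.x; i < 1024; i += blockDim.x)
        buf[i] = 0u;
    __syncthreads();

    // ... normal kernel body using buf goes here; placeholder use below.
    out[blockIdx.x * blockDim.x + threadIdx.x] = buf[threadIdx.x];
}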

I wouldn't call it a flaw per se, actually; the vulnerability stems from insufficient isolation of local GPU memory and from the failure to clear local memory after processes on the GPU finish executing.
Eh, it's kind of a flaw not to wipe local memory when swapping in wavefronts or warps from another process to use a compute unit or SM. It seems like the sort of thing that might be fixable via firmware, however.

Unlike CPUs, which typically isolate memory in a way that prevents exploits like this, GPUs sometimes do not.
They didn't use to, but now they have MMUs, so GPU threads (normally) cannot spy on each other or on arbitrary addresses in system memory.
 