News Intel's New AVX10 Brings AVX-512 Capabilities to E-Cores

Admin · Jul 24, 2023

Intel announced its new AVX10 ISA that will bring support for AVX-512 capabilities to both p-cores and e-cores on its future processors.

Kamen Rider Blade · Jul 24, 2023

Is it me, or should this have been called AVX 4 instead of AVX 10?

There was AVX or
AVX 1
AVX 2 (Mostly 256b)
AVX 3 (512b Extensions)

This update should've been called AVX 4?

cyrusfox · Jul 24, 2023

I am just glad they have a strategy to unify ISA with the new hybrid chips. Seems really myopic that this wasn't solved ages ago. So horrible to have to fuse off capable silicon features due to not working out the details of how small and large cores would work together.

Deleted member 2731765 · Jul 24, 2023

Kamen Rider Blade said:
Is it me, or should this have been called AVX 4 instead of AVX 10?

There was AVX or
AVX 1
AVX 2 (Mostly 256b)
AVX 3 (512b Extensions)

This update should've been called AVX 4?

There is no such thing as AVX3. We only have AVX, AVX2 and AVX-512 x86 ISAs. AVX10 is just a superset of AVX-512. AVX10 will enable AVX-512 capabilities across both Performance and Efficient core designs with hybrid processors.

AVX10 contains all the richness of AVX-512, plus additional features and capabilities, while being able to work on both P-cores and E-cores.

Edit:

AVX10 actually has 2 subsets, AVX10/256, similar to AVX2, and AVX10/512 which is similar to AVX-512.

thestryker · Jul 25, 2023

cyrusfox said:
I am just glad they have a strategy to unify ISA with the new hybrid chips. Seems really myopic that this wasn't solved ages ago. So horrible to have to fuse off capable silicon features due to not working out the details of how small and large cores would work together.

I think this is mostly the manufacturing delays coming into play. RPL was never originally supposed to exist, and now we're getting a refresh of it.

I'm curious if MTL has any implementation (like ADL did with AVX 512) since the cores in it and Granite Rapids are the same.

hotaru251 · Jul 25, 2023

Didn't people find workarounds to run AVX-512 on their hybrid chips, and then Intel said "no" and shut that method down?

TerryLaze · Jul 25, 2023

hotaru251 said:
Didn't people find workarounds to run AVX-512 on their hybrid chips, and then Intel said "no" and shut that method down?

Well, not really. You had to turn the hybrid into a classic CPU by turning off the e-cores, and for Intel it was more important for people to get used to the hybrid approach than for them to get AVX-512.

cyrusfox said:
I am just glad they have a strategy to unify ISA with the new hybrid chips. Seems really myopic that this wasn't solved ages ago. So horrible to have to fuse off capable silicon features due to not working out the details of how small and large cores would work together.

It won't be unified, the e-cores will still only be able to do avx-256 and they will have the thread director or whatever make it work.
They could have done this on older CPUs, they could do this now on all hybrid CPUs.

Diving deeper, the AVX10 (Advanced Vector Extensions 10) ISA is a superset of AVX-512 and comes with all of the features of the AVX-512 ISA for processors with both 256-bit and 512-bit vector register sizes.

Findecanor · Jul 25, 2023

Kamen Rider Blade said:
AVX 3 (512b Extensions)

This update should've been called AVX 4?

This is not really an extension of AVX-512, but rather a step back and a start-over.

Many in the programmer community have been asking for CPUs with a short-vector version of AVX-512 for a while, often using the provisional name "AVX-256".
The set isn't just about extending the width and number of registers. For instance, the use of boolean vectors for conditional load/store per lane is a big thing.
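To make the per-lane predication point concrete, here is a rough scalar emulation in Python of what a masked load and a masked store do. It is a sketch of the semantics only, not real intrinsics; the helper names `masked_load`, `masked_store` and `tail_mask` are made up for this example.

```python
def masked_load(memory, base, mask, lanes, passthrough=None):
    """Emulate a per-lane masked vector load.

    Lane i is read from memory[base + i] only if bit i of `mask` is set.
    Inactive lanes take the passthrough value (merge-masking) or 0
    (zero-masking). Their memory is never touched, so an inactive lane
    that points past the end of `memory` cannot raise an error.
    """
    out = []
    for i in range(lanes):
        if (mask >> i) & 1:
            out.append(memory[base + i])
        elif passthrough is not None:
            out.append(passthrough[i])
        else:
            out.append(0)
    return out


def masked_store(memory, base, mask, values):
    """Emulate a per-lane masked vector store: only active lanes are written."""
    for i, v in enumerate(values):
        if (mask >> i) & 1:
            memory[base + i] = v


def tail_mask(remaining, lanes):
    """Mask with the low min(remaining, lanes) bits set, for loop tails."""
    return (1 << min(remaining, lanes)) - 1


if __name__ == "__main__":
    data = [10, 20, 30, 40, 50]        # 5 elements, but the "vector" has 8 lanes
    m = tail_mask(len(data), 8)        # only the low 5 bits are set
    v = masked_load(data, 0, m, 8)     # lanes 5..7 are zeroed and never read
    masked_store(data, 0, m, [x * 2 for x in v])  # lanes 5..7 are not written
    print(data)
```

The practical payoff is that a loop over an array whose length is not a multiple of the vector width needs no separate scalar tail: the last iteration simply runs with a partial mask.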

bit_user · Jul 25, 2023

Kamen Rider Blade said:
Is it me, or should this have been called AVX 4 instead of AVX 10?

There was AVX or
AVX 1
AVX 2 (Mostly 256b)
AVX 3 (512b Extensions)

This update should've been called AVX 4?

The problem is that you think too logically. These things are decided by marketing people, and they probably feel like "AVX-512" sounds a lot like AVX5. So, they want to call it AVX10 to make it sound way better (even though it's not).

I mean, why did Nvidia go from 700, 900, 1000, 2000, 3000, 4000? It's because "one better" doesn't sound like much, once you get above 10. You want each generation to sound a lot better, even if it's not (as in this case).

Oh, and by the way, I wouldn't even call it AVX4. If we're being logical, then 10.1 is basically just a way to tell software whether the AVX registers are 256 bits or 512 bits, apart from whether or not the AVX-512 instruction set is itself supported. 10.1 really doesn't add any real functionality that doesn't already exist in AVX-512.

bit_user · Jul 25, 2023

Metal Messiah. said:
AVX10 actually has 2 subsets, AVX10/256, similar to AVX2, and AVX10/512 which is similar to AVX-512.

No, you had it right the first time. As it stands today, AVX-512 instructions can operate on 128 bit, 256 bit, or 512 bit operands. AVX10.1 is just rebranding AVX-512, while adding an additional variable to indicate whether the implementation supports all 3 operand sizes, or whether it supports only the first two.
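As a rough illustration of that "additional variable": my understanding of the initial AVX10 architecture specification is that the version and the supported vector lengths are reported in CPUID leaf 0x24 (sub-leaf 0), register EBX. The bit positions below are my reading of that document and are an assumption to be checked against the current spec; the function name and the sample register values are made up.

```python
# Assumed layout of CPUID.(EAX=0x24, ECX=0):EBX, per my reading of the
# initial AVX10 architecture specification. Verify before relying on it.
AVX10_VERSION_MASK = 0xFF                              # bits 7:0 = AVX10 version
AVX10_LENGTH_BITS = ((128, 16), (256, 17), (512, 18))  # (vector width, bit position)


def decode_avx10_ebx(ebx):
    """Decode a raw EBX value into the AVX10 version and supported lengths."""
    lengths = [width for width, bit in AVX10_LENGTH_BITS if (ebx >> bit) & 1]
    return {
        "version": ebx & AVX10_VERSION_MASK,
        "vector_lengths": lengths,
        "max_vector_length": max(lengths) if lengths else 0,
    }


if __name__ == "__main__":
    # Hypothetical register values, not read from real hardware.
    print(decode_avx10_ebx(0x00030001))  # 128 and 256 only
    print(decode_avx10_ebx(0x00070001))  # 128, 256 and 512
```

Under that assumed layout, software would make one check for the AVX10 version and one for the maximum vector length, instead of testing a long list of individual AVX-512 feature flags.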

bit_user · Jul 25, 2023

thestryker said:
RPL was never originally supposed to exist,

Really??? Source?

thestryker said:
I'm curious if MTL has any implementation (like ADL did with AVX 512) since the cores in it and Granite Rapids are the same.

Yeah, good question... except that Meteor Lake's CPU tile is slated for the Intel 4 process node, while Granite Rapids is slated for Intel 3. I know the nodes are similar, but I don't know if they're layout-compatible.

bit_user · Jul 25, 2023

TerryLaze said:
It won't be unified, the e-cores will still only be able to do avx-256 and they will have the thread director or whatever make it work.

I'm betting that their hybrid CPUs with AVX10 will implement at 256-bits on both the P-cores and E-cores, unless you know differently. Going hybrid-ISA creates more headaches than it's worth.

bit_user · Jul 25, 2023

Findecanor said:
This is not really an extension of AVX-512, but rather a step back and a start-over.

Many in the programmer community have been asking for CPUs with a short-vector version of AVX-512 for a while, often using the provisional name "AVX-256".

The sad part is that they didn't make it truly variable-length, like ARM's SVE. I believe each instruction still has an explicit operand length indicator, which means you need two versions of your code, to handle both the 256-bit and 512-bit cases.
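A toy sketch of what "two versions of your code" looks like in practice: since the vector width is fixed per instruction, a program that wants 512-bit vectors where available typically ships one kernel per width and picks one at startup. The Python below only mimics that structure with fixed chunk sizes; the kernel names and the `max_vector_bits` parameter are hypothetical.

```python
def _add_chunked(a, b, lanes):
    """Add two equal-length lists, `lanes` elements per step."""
    out = []
    for start in range(0, len(a), lanes):
        # One loop iteration stands in for one vector instruction.
        out.extend(x + y for x, y in zip(a[start:start + lanes],
                                         b[start:start + lanes]))
    return out


def add_f32_vl256(a, b):
    """Stand-in for a kernel built from 256-bit instructions (8 float32 lanes)."""
    return _add_chunked(a, b, lanes=256 // 32)


def add_f32_vl512(a, b):
    """Stand-in for a kernel built from 512-bit instructions (16 float32 lanes)."""
    return _add_chunked(a, b, lanes=512 // 32)


# One entry per width-specific build of the same routine.
KERNELS = {256: add_f32_vl256, 512: add_f32_vl512}


def select_kernel(max_vector_bits):
    """Pick the widest kernel the (hypothetical) CPU reports support for."""
    usable = [width for width in KERNELS if width <= max_vector_bits]
    if not usable:
        raise ValueError("no kernel for vector length %d" % max_vector_bits)
    return KERNELS[max(usable)]


if __name__ == "__main__":
    a = [float(i) for i in range(20)]
    b = [1.0] * 20
    print(select_kernel(256)(a, b) == select_kernel(512)(a, b))
```

With a vector-length-agnostic ISA, the dispatch table would collapse to a single kernel that asks the hardware for its lane count at run time.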

bit_user · Jul 25, 2023

Wow, I sure wasn't expecting APX to be tacked on to the end of the article, like that. IMO, that's a much more consequential change than AVX10. That probably should've gotten its own article.

TerryLaze · Jul 25, 2023

bit_user said:
I'm betting that their hybrid CPUs with AVX10 will implement at 256-bits on both the P-cores and E-cores, unless you know differently. Going hybrid-ISA creates more headaches than it's worth.

That IS what I said.
E-cores will only be able to do 256, so it will not be unified. If you run 512 it will only run on the p-cores; if you run 256 or below it will probably run on all cores.
Unified would mean that all cores can do all the same things.

bit_user · Jul 25, 2023

TerryLaze said:
That IS what I said.
E-cores will only be able to do 256, so it will not be unified. If you run 512 it will only run on the p-cores; if you run 256 or below it will probably run on all cores.
Unified would mean that all cores can do all the same things.

Okay, well that's not consistent with the official Intel Technical paper, which says:

"The converged version of the Intel AVX10 vector ISA will include Intel AVX-512 vector instructions with an AVX512VL feature flag, a maximum vector register length of 256 bits, as well as eight 32-bit mask registers and new versions of 256-bit instructions supporting embedded rounding. This converged version will be supported on both P-cores and E-cores. While the converged version is limited to a maximum 256-bit vector length, Intel AVX10 itself is not limited to 256 bits, and optional 512-bit vector use is possible on supporting P-cores. Thus, Intel AVX10 carries forward all the benefits of Intel AVX-512 from the Intel® Xeon® with P-core product lines"
Source: https://cdrdv2.intel.com/v1/dl/getContent/784343

The key phrase is "converged version", which seems to be shorthand for AVX10/256. They are very clear about hybrid CPUs supporting this converged version, meaning even their P-cores will support only 256-bit.

It's the Xeon P-cores which they're saying will support 512-bit.
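The mask sizes in that quote follow from simple arithmetic, assuming one mask bit per lane and byte-sized elements as the smallest lane: a 256-bit vector has at most 256 / 8 = 32 lanes, so 32 mask bits are enough, while a 512-bit vector needs 64. A quick check, with a function name of my own choosing:

```python
def mask_bits_needed(vector_bits, element_bits=8):
    """Number of mask bits needed to predicate every lane of a vector.

    Assumes one mask bit per lane. The default of 8-bit elements is the
    worst case, since smaller elements mean more lanes per vector.
    """
    if vector_bits % element_bits != 0:
        raise ValueError("vector width must be a multiple of the element width")
    return vector_bits // element_bits


if __name__ == "__main__":
    for width in (128, 256, 512):
        print(width, mask_bits_needed(width))
```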

Kamen Rider Blade · Jul 25, 2023

bit_user said:
The problem is that you think too logically. These things are decided by marketing people, and they probably feel like "AVX-512" sounds a lot like AVX5. So, they want to call it AVX10 to make it sound way better (even though it's not).

I mean, why did Nvidia go from 700, 900, 1000, 2000, 3000, 4000? It's because "one better" doesn't sound like much, once you get above 10. You want each generation to sound a lot better, even if it's not (as in this case).

Oh, and by the way, I wouldn't even call it AVX4. If we're being logical, then 10.1 is basically just a way to tell software whether the AVX registers are 256 bits or 512 bits, apart from whether or not the AVX-512 instruction set is itself supported. 10.1 really doesn't add any real functionality that doesn't already exist in AVX-512.

This is why I "HATE" when marketing gets to make the naming decisions.

"Marketing" shouldn't be allowed to make names on the technical side of a product.

Leave that to the engineers and let them name it.

"Marketing" should only be about their job, selling it to the expected audience base.

TerryLaze · Jul 25, 2023

bit_user said:
Okay, well that's not consistent with the official Intel Technical paper, which says:

"The converged version of the Intel AVX10 vector ISA will include Intel AVX-512 vector instructions with an AVX512VL feature flag, a maximum vector register length of 256 bits, as well as eight 32-bit mask registers and new versions of 256-bit instructions supporting embedded rounding. This converged version will be supported on both P-cores and E-cores. While the converged version is limited to a maximum 256-bit vector length, Intel AVX10 itself is not limited to 256 bits, and optional 512-bit vector use is possible on supporting P-cores. Thus, Intel AVX10 carries forward all the benefits of Intel AVX-512 from the Intel® Xeon® with P-core product lines"
Source: https://cdrdv2.intel.com/v1/dl/getContent/784343

The key phrase is "converged version", which seems to be shorthand for AVX10/256. They are very clear about hybrid CPUs supporting this converged version, meaning even their P-cores will support only 256-bit.

It's the Xeon P-cores which they're saying will support 512-bit.

Did you like post the quote without reading all of it?!?!?!
This is the end of it, which says that the p-cores, and only those, will have full 512, unless you think that "supporting p-cores" won't be in the future desktop CPUs. They laser-fused AVX-512 off in previous versions, but was there any talk about having designed it completely out of them??? Because I didn't hear anything of the sort.

"Intel AVX10 itself is not limited to 256 bits, and optional 512-bit vector use is possible on supporting P-cores. Thus, Intel AVX10 carries forward all the benefits of Intel AVX-512 from the Intel® Xeon® with P-core product lines"

bit_user · Jul 25, 2023

TerryLaze said:
Did you like post the quote without reading all of it?!?!?!

Yes, I read it. I think it's clear enough, but here's an excerpt from the Architecture Specification, providing further insight into their plans for 512-bit support:

A “converged” version of Intel AVX10 with maximum vector lengths of 256 bits and 32-bit opmask registers will be supported across all Intel processors, while 512-bit vector registers and 64-bit opmasks will continue to be supported on some P-core processors.

Source: https://cdrdv2.intel.com/v1/dl/getContent/784267

I think "some P-core processors" means their P-core-only Xeons.

TerryLaze said:
This is the end of it, which says that the p-cores, and only those, will have full 512, unless you think that "supporting p-cores" won't be in the future desktop CPUs

You seem to be overlooking this part:

"This converged version will be supported on both P-cores and E-cores. While the converged version is limited to a maximum 256-bit vector length"

There really doesn't seem to be any ambiguity, at this point. I think Intel just nailed shut the coffin on having 512-bit in their client processors.

thestryker · Jul 25, 2023

bit_user said:
Yeah, good question... except that Meteor Lake's CPU tile is slated for the Intel 4 process node, while Granite Rapids is slated for Intel 3. I know the nodes are similar, but I don't know if they're layout-compatible.

Seeing as it doesn't sound like it'll be ready for e-cores yet I'm sure it would probably be disabled if it's there, but still curious.

bit_user said:
Really??? Source?

https://twitter.com/IanCutress/status/1569233082212818944?t=8h9BOCT0QX5Y-OE9EnCY-g&s=19

It has popped up elsewhere, but I don't think it ever got any of its own articles.

bit_user · Jul 25, 2023

thestryker said:
https://twitter.com/IanCutress/status/1569233082212818944?t=8h9BOCT0QX5Y-OE9EnCY-g&s=19

It has popped up elsewhere, but I don't think it ever got any of its own articles.

Since I don't have a Twitter account, can you please copy-and-paste the quote?

TerryLaze · Jul 25, 2023

bit_user said:
I think "some P-core processors" means their P-core-only Xeons.

The only reason they only mention the xeon CPUs here is that at the moment those are the only ones with active avx-512 support.
If desktop CPUs will have avx10 .1 .2 whatever then they will have avx-512, that's what carries forward means, it will carry over to anything with avx10 support.

"Thus, Intel AVX10 carries forward all the benefits of Intel AVX-512 from the Intel® Xeon® with P-core product lines"

Also here no distinction is being made, they don't say only on xeon p-cores.

Apart from a few special cases, those instructions will be supported at all vector lengths, with 128-bit and 256-bit vector lengths being supported across all processors, and 512-bit vector lengths additionally supported on P-core processors.

thestryker · Jul 25, 2023

bit_user said:
Since I don't have a Twitter account, can you please copy-and-paste the quote?

You should be able to view direct links without one, but (and PRL is just RPL mistype obviously):

OK sorry reconfirmed this.
#IntelTechTour : PRL only exists because MTL wasn't going to be ready on time. RPL dev started 2 yr ago. GPU RTL and IO RTL hasn't changed from ADL. 41% improved MT perf of RPL over ADL, 15% ST, based on SPECint2017. - Isic Silas, Intel Corp VP of CCG.
Sep 12, 2022

Ogotai · Jul 26, 2023

Seems AnandTech also has a post about this, and Gavin posted this:

"To clarify the two different quotes. AVX-512 will still be there as it's a superset, hence the backward compatibility that AVX10 offers. Having x86 backward compatibility is important.

AVX10 will replace AVX-512 going forward, and developers, where applicable, can recompile to ensure compatibility and leverage the efficiency and performance bonuses.

Intel has alluded to divulging whether or not 512-bit wide vectors will be supported on chips and cores going forward, but they have committed to support 256-bit at the very least."

bit_user · Jul 26, 2023

thestryker said:
You should be able to view direct links without one, but (and PRL is just RPL mistype obviously):

That used to be the case, but didn't Elon start requiring a sign-in just to read tweets?

Anyway, I've been blocking Twitter in my routing tables for ages, since you can use their ad network without even visiting Twitter.com and I wanted to ensure they got no ad revenue from me. That goes back to the pre-Elon era, when they took a stance on targeted political ads I didn't like. That practice is poisonous for democracy.
