AMD's Future Chips & SoCs: News, Info & Rumours.

I see your point, they COULD add more cores in that space.
However, they would have to slow the clocks down, making it uncompetitive with 3rd-gen Ryzen.

As we saw with the 9900K, Intel's already pushing that architecture to its limits. The CPU draws so much power and gets so hot that adding more cores would only worsen this.
 
It's more than just that.

The iGPU in their CPUs at this point is a liability on the overall design, something they did not see coming thanks to how lackluster AMD's FX line was. What I mean here is the iGPU is not just "pasted" into the design. There are pathways eating up silicon and I/O that could be better used for more cache, wider data paths, or even just freed-up space for even higher speeds. They've also been sitting on the EMIB tech for far too long. It's time they actually used it on something interesting.

Cheers!
 

InvalidError

Titan
Moderator
Lol, what? No. That's completely wrong. Intel doesn't completely turn off the iGPU; it's always running, albeit at very low power, but it still IS there.
When the IGP is unused, most of it is in the powered-down state where it consumes no power whatsoever. All you have left is the supervisor stuff needed to wake the IGP up if needed, which should be less than 0.5W.
 
It's more than just that.

The iGPU in their CPUs at this point is a liability on the overall design, something they did not see coming thanks to how lackluster AMD's FX line was. What I mean here is the iGPU is not just "pasted" into the design. There are pathways eating up silicon and I/O that could be better used for more cache, wider data paths, or even just freed-up space for even higher speeds. They've also been sitting on the EMIB tech for far too long. It's time they actually used it on something interesting.

Cheers!
Didn't really think of that.
Either way, removing the GPU and slightly redesigning the chip isn't going to make up for Intel's core deficit, since they can't easily add fast-clocked cores on their existing architecture.
 
Didn't really think of that.
Either way, removing the GPU and slightly redesigning the chip isn't going to make up for Intel's core deficit, since they can't easily add fast-clocked cores on their existing architecture.
And I didn't say they would revolutionize the market if they did it. It would just give them some breathing room until they can get proper new stuff out the door.

Cheers!
 
https://www.overclock3d.net/news/so..._2019_update_includes_amd_zen_optumisations/1

So every single Ryzen owner should update to 1903, as long as you aren't using RAID drivers (something is wrong with those at the moment). Microsoft finally fixed their scheduler for Ryzen after two years!
Oh, that's nice. Ironically enough, I would imagine there's going to be a plus for FX users as well. Not that it matters, lel.

I wonder if it's noticeable though... Let's wait for benchmarks :p

Cheers!
 

InvalidError

Titan
Moderator
I hope this helps Threadripper. Will this help 4C/4T Ryzen CPUs?
I forget how the CCX is laid out in those.
2200G/2400G/3200G/3400G have only a single CCX so core grouping should have no effect whatsoever on those.
The other CPUs have cores evenly distributed between CCXes, which means 2+2, 3+3 and 4+4. We'll see when reviewers get around to re-testing with the May update installed.
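For anyone curious about the grouping on their own machine, here's a minimal Linux sketch that prints which cores share each L3 slice; it assumes the usual sysfs layout and that cache index3 is the L3. On Zen/Zen+ parts, cores sharing an L3 belong to the same CCX:

```c
#include <stdio.h>

int main(void) {
    char path[128], buf[128];
    /* Walk CPUs until sysfs runs out of entries. */
    for (int cpu = 0; cpu < 256; cpu++) {
        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/cache/index3/shared_cpu_list",
                 cpu);
        FILE *f = fopen(path, "r");
        if (!f)
            break;  /* no such CPU (or no L3 info exposed): stop */
        if (fgets(buf, sizeof(buf), f))
            printf("cpu%-3d shares L3 with cores: %s", cpu, buf);
        fclose(f);
    }
    return 0;
}
```

On a single-CCX APU every core should report the same list; on a 2+2 or 4+4 part you'd see two distinct groups.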
 

jaymc

Distinguished
Dec 7, 2007
614
9
18,985
Hey guys, how's things :)

Has anyone got a good website with a decent comparison between node sizes, naming/marketing terms, and actual pitch sizes, etc.? I know Intel's 10nm has now been downgraded to around 12nm. I can't remember, is TSMC's 7nm more like 9.8nm or so?

Cheers,
Jay
 

goldstone77

Distinguished
Aug 22, 2012
2,245
14
19,965
Hey guys, how's things :)

Has anyone got a good website with a decent comparison between node sizes, naming/marketing terms, and actual pitch sizes, etc.? I know Intel's 10nm has now been downgraded to around 12nm. I can't remember, is TSMC's 7nm more like 9.8nm or so?

Cheers,
Jay
Intel, TSMC, and Samsung are all claiming around 100 MTx/mm² for their newest processes at high density. TSMC's 7nm is 67 MTx/mm² for the process they are using for Ryzen 3000, and Intel's UHP (Ultra High Performance, for desktop CPUs) will be close to the same metrics as TSMC's.
Here is a link showing what Samsung says their metrics are: https://semiwiki.com/semiconductor/259664-samsung-foundry-update-2019/
[Image: intel-10nm-cells-density.png]


Edit: Intel might be using just HP for Desktop, I can't remember off the top of my head.
2nd Edit: I was right, UHP for desktop.
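Density is just transistors per area, so for perspective here's a quick back-of-envelope sketch using the figures above (the 75 mm² die size is a made-up number for illustration, not any actual product):

```c
#include <stdio.h>

int main(void) {
    /* density (million transistors per mm^2) x area (mm^2)
       = transistor budget, in millions */
    const double tsmc_7nm_hp = 67.0;   /* MTx/mm^2, quoted above         */
    const double hd_class    = 100.0;  /* ~HD figure all three claim     */
    const double die_mm2     = 75.0;   /* hypothetical small die         */

    printf("75 mm^2 at  67 MTx/mm^2: %.1f billion transistors\n",
           tsmc_7nm_hp * die_mm2 / 1000.0);
    printf("75 mm^2 at 100 MTx/mm^2: %.1f billion transistors\n",
           hd_class * die_mm2 / 1000.0);
    return 0;
}
```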
 
Last edited:
  • Like
Reactions: jaymc
Tying threads to specific cores causes all sorts of issues the minute you run into another program that does the same exact thing. It's very bad practice to do so on a preemptive OS, though this is generally the preferred way to schedule threads on an embedded system.

The problem (focusing on Windows here) is that at some point, your thread is getting bumped for a thread that has higher priority. And when it gets rescheduled, it displaces whatever thread has the lowest priority at that instant; it is not guaranteed (and statistically unlikely) that the bumped thread ends up back on the same core it started on.

Linux handles this better by using thread pools, but runs into the same problems if the overall system workload changes over a short period, as thread allocation would have to be reshuffled.

Under Windows XP, a thread would get thrown about between different cores. This meant the cache was worthless, as each core had its own cache and coherency wasn't a thing... yet.

With Windows 7, the scheduler fixed this issue and favored returning the thread to the core it was last running on. Thus retrieval from the cache was more likely.

A thread isn't hard-coded to a core per se. The core is picked when the thread is scheduled to run, based on which core is least busy. I noticed that physical cores are even preferred first, and then the hyperthreaded (logical) cores get filled (unless the thread is labeled as a low-priority background task). However, after a thread is assigned to a core, it tends to stick to it to avoid cache thrash.

That said, this all happened when Intel was clearly dominant, and Microsoft designed the thread assignments around Intel's NUMA layout. Crossing NUMA nodes (CCXes) on AMD kills performance.

Linux did it right, and it was clearly demonstrated when the Linux version of the same threaded program practically decimated the Windows equivalent.
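For illustration, hard pinning is just one call on Linux, which is exactly why two programs doing it can fight over the same core. A minimal sketch (core 2 is an arbitrary pick):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(2, &set);  /* restrict this thread to core 2 only */

    /* The scheduler can no longer migrate us; if another program
       pins its hot thread to the same core, they fight over it. */
    int err = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (err != 0) {
        fprintf(stderr, "pthread_setaffinity_np failed: %d\n", err);
        return 1;
    }
    printf("pinned to core 2; the scheduler cannot move this thread\n");
    return 0;
}
```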
 

jaymc

Distinguished
Dec 7, 2007
614
9
18,985
Intel, TSMC, and Samsung are all claiming around 100 MTx/mm² for their newest processes at high density. TSMC's 7nm is 67 MTx/mm² for the process they are using for Ryzen 3000, and Intel's UHP (Ultra High Performance, for desktop CPUs) will be close to the same metrics as TSMC's.
Here is a link showing what Samsung says their metrics are: https://semiwiki.com/semiconductor/259664-samsung-foundry-update-2019/
[Image: intel-10nm-cells-density.png]


Edit: Intel might be using just HP for Desktop, I can't remember off the top of my head.
2nd Edit: I was right, UHP for desktop.

Thanks Mate.
 
Hey guys, how's things :)

Has anyone got a good website with a decent comparison between node sizes, naming/marketing terms, and actual pitch sizes, etc.? I know Intel's 10nm has now been downgraded to around 12nm. I can't remember, is TSMC's 7nm more like 9.8nm or so?

Cheers,
Jay
Intel's 12nm is restricted to the bottom cobalt layer, from what I read.

Intel used to rate their nodes based on the largest required feature, like a gate or pitch.

Samsung, GloFo, and TSMC have been rating their nodes based on the SMALLEST feature (the opposite of Intel).

That said, 7nm is indeed a node lead over Intel's best. Even their 10nm will have issues competing. At this point 10nm is a lost cause, but because they invested so much, they really can't stop, especially with the 14nm shortages. So using 10nm on laptops makes sense; it alleviates some of the shortage.

The 400-series and older chipsets are physically lacking the bits required to support PCIe 4.0, so 4.0 on chipsets is simply impossible and there is absolutely nothing that can be certified about them for 4.0.

If anything gets upgraded to 4.0 on those boards, it will be the lanes connected directly to the CPU, if the socket and board have the necessary signal integrity to make that work once a Zen 2 or newer CPU (which does have PCIe 4.0 capability on its lanes) is installed. It is up to individual board manufacturers to decide which of their boards have a good enough shot at working PCIe 4.0 to enable it in the BIOS. You can expect even some lowly A320 boards to get PCIe 4.0 on the x16 slot. The x4 NVMe slot will depend heavily on where it is located.

Yes, it should be up to the manufacturers to support PCIe 4.0 on older boards. But with all the new boards coming out, manufacturers need an excuse for people to buy the new stock.
 

InvalidError

Titan
Moderator
Yes, it should be up to the manufacturers to support PCIe 4.0 on older boards. But with all the new boards coming out, manufacturers need an excuse for people to buy the new stock.
Which is partly why I don't believe in the value of long-life sockets. Two or three years in, you end up with too many reasons to ditch older platforms anyway. I imagine AMD would have been able to make Ryzen 3000 work with far fewer than 15 substrate layers if it had used a new socket designed from the ground up with chiplets and their physical layout on the substrate in mind.
 
Under Windows XP, a thread would get thrown about between different cores. This meant the cache was worthless, as each core had its own cache and coherency wasn't a thing... yet.

With Windows 7, the scheduler fixed this issue and favored returning the thread to the core it was last running on. Thus retrieval from the cache was more likely.

A thread isn't hard-coded to a core per se. The core is picked when the thread is scheduled to run, based on which core is least busy. I noticed that physical cores are even preferred first, and then the hyperthreaded (logical) cores get filled (unless the thread is labeled as a low-priority background task). However, after a thread is assigned to a core, it tends to stick to it to avoid cache thrash.

That said, this all happened when Intel was clearly dominant, and Microsoft designed the thread assignments around Intel's NUMA layout. Crossing NUMA nodes (CCXes) on AMD kills performance.

Linux did it right, and it was clearly demonstrated when the Linux version of the same threaded program practically decimated the Windows equivalent.

As of Vista (plus enhancements in later OS releases), Windows does bias threads to their most recent core if that core is available for use. As thread count increases, this becomes less likely to occur. Windows will never bump a running thread that has higher priority than thread(s) that are waiting to run.

Linux's approach does have downsides; thread pools can break down in situations where thread workload is not flat, or in other words, when threads do a lot of work over a short period. I've seen Linux do some really weird things in that situation (generally re-assigning threads to try and load-balance all CPU cores). That being said, it tends to do much better when workloads are stable and in high-thread systems.
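That recent-core bias is also exposed to applications: a thread can hint a preferred core without hard-pinning itself. A minimal Windows sketch (core 2 is an arbitrary pick):

```c
#include <windows.h>
#include <stdio.h>

int main(void) {
    /* Soft hint: the scheduler will *prefer* core 2 for this thread
       but can still run it elsewhere, unlike a hard affinity mask. */
    DWORD prev = SetThreadIdealProcessor(GetCurrentThread(), 2);
    if (prev == (DWORD)-1) {
        fprintf(stderr, "SetThreadIdealProcessor failed: %lu\n",
                GetLastError());
        return 1;
    }
    printf("ideal processor set to 2 (was %lu)\n", prev);
    return 0;
}
```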
 
  • Like
Reactions: digitalgriffin
As of Vista (plus enhancements in later OS releases), Windows does bias threads to their most recent core if that core is available for use. As thread count increases, this becomes less likely to occur. Windows will never bump a running thread that has higher priority than thread(s) that are waiting to run.

Linux's approach does have downsides; thread pools can break down in situations where thread workload is not flat, or in other words, when threads do a lot of work over a short period. I've seen Linux do some really weird things in that situation (generally re-assigning threads to try and load-balance all CPU cores). That being said, it tends to do much better when workloads are stable and in high-thread systems.

I love these kinds of CS problems. They are fun to think about.
 
  • Like
Reactions: NightHawkRMX
The Linux kernel allows you to select how "sensitive" it is to the priority shuffling as well. You need to re-compile and use a non-gen version, but for the important stuff you'd do it anyway :p

Less sensitive means it won't bump threads to the side when more work comes; more sensitive means it will be bouncing them around quite a lot. I can't remember the name of the setting off the top of my head, but I do remember playing with it when testing load on my old Athlon64 X2 under Gentoo.

Cheers!
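For what it's worth, one migration-related knob on CFS kernels is kernel.sched_migration_cost_ns; I can't say whether that's the setting meant above, so treat this as a guess. Higher values make the scheduler more reluctant to pull a task off its current core. A minimal sketch that just reads it (path assumed from the usual procfs layout):

```c
#include <stdio.h>

int main(void) {
    /* Not exposed on every kernel/config; bail out gracefully. */
    FILE *f = fopen("/proc/sys/kernel/sched_migration_cost_ns", "r");
    if (!f) {
        perror("sched_migration_cost_ns not exposed here");
        return 1;
    }
    long ns;
    if (fscanf(f, "%ld", &ns) == 1)
        printf("sched_migration_cost_ns = %ld ns\n", ns);
    fclose(f);
    return 0;
}
```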
 
Just out of curiosity, how many of you guys/gals have actually done any serious circuit design?
I am not at all familiar with on-die, silicon-level CPU design other than on a theoretical level, but I have some experience with practical circuit design, and I am thinking most of the people judging the AMD and Intel engineers who are actually doing the design work on modern CPUs at the silicon level do not.
I have not read all the posts in this thread, so if you posted and do indeed have experience designing CPUs and I missed your post, please forgive me.
But to the rest of you I can only say: if you can design a better CPU for us, please get off the net and do it.
We really would appreciate it. 😁

I used to debug CPUs for NASA when I was designing satellites.

But I've also taken high-level EE classes and circuit design theory classes, as well as architecture. So all in all, several thousand pages of text on CPU design, examining the internals from basic SRAM to cache design topologies and superscalar execution.

I also program down at the register level on Arduino.

To be honest, I reverse engineer things for fun. I like figuring out what makes things work and how to solve old problems in new ways. I'm working in AI now, just to solve these kinds of new problems.
 
Last edited:
  • Like
Reactions: DMAN999

goldstone77

Distinguished
Aug 22, 2012
2,245
14
19,965
Wikichips just posted detailed information today about TSMC's 7nm, the process AMD's Ryzen 3000 chips are built on. TSMC's 7nm HP is slightly less dense than previously calculated by David Schor: 64.98 MTr/mm² vs. 67 MTr/mm². Intel's 10nm UHP is 67.18 MTr/mm². Also, TSMC's 7nm HD is 91.2 MTr/mm² vs. Intel's 10nm HD at 100.76 MTr/mm². It's also worth mentioning that SRAM cell size is becoming a real advantage at these nodes, since caches are taking up a large portion of these smaller chips: TSMC's HD SRAM is 0.027 µm² vs. Intel's HD SRAM at 0.0312 µm².
[Image: tsmc-7nm-density.png]

[Image: vlsi-2018-tsmc-7nm-hd.png]
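Those SRAM cell sizes translate directly into cache area. A rough back-of-envelope sketch for a hypothetical 32 MiB L3, counting raw bit cells only (real arrays add decoders, sense amps, and redundancy, so actual area would be larger):

```c
#include <stdio.h>

int main(void) {
    const double bits = 32.0 * 1024 * 1024 * 8;  /* 32 MiB of cache in bits */
    const double tsmc_hd_um2  = 0.027;   /* TSMC 7nm HD SRAM cell, um^2  */
    const double intel_hd_um2 = 0.0312;  /* Intel 10nm HD SRAM cell, um^2 */

    /* 1 mm^2 = 1e6 um^2 */
    printf("TSMC 7nm HD  : %.2f mm^2 of raw cells\n",
           bits * tsmc_hd_um2 / 1e6);
    printf("Intel 10nm HD: %.2f mm^2 of raw cells\n",
           bits * intel_hd_um2 / 1e6);
    return 0;
}
```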