Are Intel's CPUs with integrated graphics and hyperthreading really a big deal? AMD equivalent?


nickspc4116

Distinguished
Aug 12, 2013
I have been reading up on hyper threading and integrated graphics on Intel CPUs and am wondering if it's really all it's cracked up to be. I know I'm late to the party on this subject, but some clarification would be excellent. Currently using an AMD FX 6300.

Is it worth the money to upgrade to an Intel Core (whichever) processor? Or is there an AMD equivalent to the hyper threading / integrated GPU that Intel has? I am willing to spend the money on a good CPU, be it Intel or AMD. I just want to run next-gen games flawlessly.
 

Yeah, but the thing is that in older games there is no point in having 250 fps instead of 190. A lot of people still have a 60 Hz monitor (me too), and anything over 60 fps is wasted. So for old games: i5. For the newest tech: i7/FX-83x0. In 2-3 years games will most likely use more than 6 cores, and then the i5 will fall off.
 
Hyper Threading basically has no effect on game performance since games do not use Hyper Threading. However... (there always seems to be a "however")... When comparing a dual core Pentium CPU to a dual core Core i3 of the same CPU generation, the Core i3 does in fact provide better performance.

The main difference between the Pentium and the Core i3 is Hyper Threading. It would seem that Windows uses Hyper Threading for the background processes while games use the actual cores. Many different game benchmarks have shown a decent enough performance difference between the two CPUs. However, when there are 4 cores, Hyper Threading more or less does not make a difference. That's why, when people are looking at an Intel CPU for a gaming rig, the recommended CPU is a Core i5, not the more expensive Core i7.
 



That's incorrect. Games can and do use Hyperthreading.

The benefit from HT on a dual core is apparent if the game is designed for 4 threads. It's just less apparent on quads with HT because most games haven't been designed to utilize 8 threads effectively.
 


You'll need to provide me with game benchmarks to back up your statement.

All the benchmarks I have seen have not really shown any difference. The only exception was Battlefield 4 Beta multiplayer mode where a core i7 did in fact perform better than a core i5. However, upon retail release (and much more optimization) the end product showed hardly any difference between the core i5 and core i7.
 


You said it yourself about the i3 performing better. I just disagree with the assumption that the performance comes from Windows background processes utilizing the extra threads (there is no way that would make a significant difference).

Look at the difference between the i3 and the Pentium. I know the i3 has a 300 MHz clock advantage, but there is no way it would get that much of a boost just from that. Crysis 3 is clearly benefiting from the HT on the i3.
Crysis-3-Medium-FPS.png



You can see the same here: HT makes a big difference on the i3 and practically no difference on the i7 (probably because the game uses 4 threads).
FarCry3_CPU.PNG

 
Huh... okay, I suppose I have not been "looking at the right game benchmarks", since I don't recall there being anything greater than a 10% performance difference between a Pentium and a Core i3.

However, for a quad core Intel i5 and i7, Hyper Threading really does not make a difference at all. I think most people not looking to build a low priced gaming rig with an Intel CPU would be looking at the Core i5 / i7 CPUs.
 


Yeah I would agree with that. I have no doubt in the near future we will see more games that utilize 8 threads efficiently, but at the moment a strong quad is the sweet spot.

No doubt before then the i5 will get HT (or just get 6 cores with no HT) and the standard desktop i7 will maybe grow to 6, as socket 2011 should be growing to 8 with Haswell-E. This is purely speculation, though.
 


I don't think the things you're saying are accurate. The software, games or whatnot, does not need to do anything special to use hyperthreading. The hardware is seen by the OS as twice as many physical cores, even though they share the underlying ALU and FPU. Hyperthreading is simply thread switching done in hardware. If you have two threads sharing one core without hyperthreading, then the operating system has to switch them constantly on that core. This cost of thread switching in software is not high, but it's still a cost. Hyperthreading simply allows two different processes or threads to run on two separate logical "cores" without the OS having to do the switching in software. The switching is done in hardware, which gains a little bit of performance. It's a neat idea. I think every CPU should come with it. Having said that, it's hard to think of a lot of desktop applications needing that many threads. In fact, I doubt there are many games that can load even four physical cores at once. The benchmarks show the Core i5 gets better FPS than the Core i3, but you have to keep in mind that some of it is due to the Core i5's better single-thread performance thanks to turbo.



 


Software applications can detect the machine's hardware configuration, and can easily configure their own process/thread count and process/thread affinity accordingly. This does not however mean that they actually do so. It's entirely up to the application designer to perform such due diligence. Many applications take either a fixed approach that runs a predetermined number of threads concurrently, or take a naive approach that scales concurrency to the number of logical processors. The latter approach can cause problems in some real-time workloads such as games as a single microprocessor with SMT (Hyperthreading) will appear to be logically equivalent to a pair of similar microprocessors without SMT.
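
As a rough illustration of the detection step described above, here is a minimal Python sketch that counts logical processors and physical cores from text in the x86 Linux /proc/cpuinfo layout. The function name `count_processors` and the `SAMPLE` text (a made-up dual core with Hyperthreading) are assumptions for the example, and other platforms or architectures report this information differently.

```python
def count_processors(cpuinfo_text):
    """Return (logical_processors, physical_cores) from /proc/cpuinfo-style text.

    Assumes the x86 Linux layout: one blank-line-separated block per logical
    processor, each carrying "processor", "physical id" and "core id" fields.
    """
    logical = 0
    cores = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        # Two logical processors with the same (socket, core) pair share a core.
        socket_id = fields.get("physical id", "0")
        core_id = fields.get("core id", fields["processor"])
        cores.add((socket_id, core_id))
    return logical, len(cores)


# Hypothetical sample: one socket, two cores, two hardware threads per core.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 0

processor : 2
physical id : 0
core id : 1

processor : 3
physical id : 0
core id : 1
"""

if __name__ == "__main__":
    print(count_processors(SAMPLE))  # (4, 2)
```

An application that only looked at the first number would size itself for four processors here, which is the naive approach mentioned above.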
 


There's no such thing as a "hyper threaded thread". If Hyperthreading is disabled, each physical core is exposed as a single logical processor. If Hyperthreading is enabled, each physical core is exposed as a pair of logical processors. Each thread is reflective of a single frontend. They are not arranged in a superior/subordinate order; they are both fully equivalent. It is entirely up to the kernel scheduler to make optimal use of the exposed logical processors in order to optimize a desired parameter. This parameter may be real time constraints, power consumption, execution throughput, or most likely some combination thereof. When optimal real time performance is desired, idling one of the threads is the best approach so that the complementary thread can be dominant. When optimal throughput is desired, breaking the workload down into at least one chunk per logical processor ensures that the backend of each core stays as busy as possible.

There is nothing whatsoever preventing software from obtaining a clear view of the logical processor arrangement if the OS provides such a mechanism, and most operating systems do expose key configuration parameters including the number of installed CPUs (sockets), number of physical cores per CPU, and the number of threads per core. The FPGA design software that I use insists on telling me this every time I compile a design. There is no need for time-constrained software to know exactly which hardware thread belongs to which physical core (although there is nothing preventing an OS from exposing just that); the kernel will schedule threads in the most efficient fashion. It is however up to the application to ensure that the optimal number of threads is created.
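
The "one chunk per logical processor" idea for throughput work can be sketched roughly as below. This is only an illustration of the shape of the approach: `split_into_chunks` and `parallel_sum` are hypothetical helper names, and in CPython the global interpreter lock means threads running pure Python code will not actually execute in parallel, so a real CPU-bound program would use processes or native code instead.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into contiguous slices whose sizes differ by at most one."""
    n_chunks = max(1, min(n_chunks, len(items) or 1))
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def parallel_sum(items, n_workers=None):
    """Sum items using one chunk per worker.

    By default the worker count is the number of logical processors
    reported by os.cpu_count(), which includes SMT siblings.
    """
    n_workers = n_workers or os.cpu_count() or 1
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        return sum(pool.map(sum, chunks))


if __name__ == "__main__":
    print(split_into_chunks(list(range(10)), 4))  # [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
    print(parallel_sum(list(range(100))))         # 4950
```

A latency-sensitive program might pass the physical core count as `n_workers` instead, which is the trade-off the post describes.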
 
Hyper-threading simply (simple version, deal with it) allows the fetch to deal with 2 threads.
These 2 threads share all resources of that core.

My statement still stands true: there is no software that can detect which "thread" is a "hyper-thread".

 


Your "simple" version of Hyperthreading is simply wrong. Go do some research on Simultaneous Multi Threading and come back here because you clearly have no idea what you are talking about. SMT duplicates most, if not all, of the front end including all hardware necessary to track two separate logical states. The back end execution pipes and associated execution units, which do the gruntwork, are what is shared.

Here's a protip for communicating on the internet, stating that "my statement still stands true" does not make one's statement true. Repeating it after being called out by someone very highly educated on the subject matter (I'm a computer engineer, I deal with this stuff for a living) just makes you look like a fool.
 
It's true, if you knew anything about hyper-threading.
As you are not putting double the data through the decoder, you are also not increasing the cycles for the FPU or ALU.
Hyper-threading allows the fetch to "have 2 lists (one for each thread)", so to speak.
Each instruction from each "list" will be executed as normal. They still share every resource.

And yes, my statement still stands true, as there are no hyper-threaded threads; both threads will be handled equally.

I'm an educated data technician, have studied computing, and am currently working with Bash scripting.
 


None of what you just said makes any sense, and your analysis of SMT is still horribly wrong. It's pretty clear that you're just talking out of your ass hoping that someone who doesn't know better will believe you. Stop it.
 
This from Wikipedia makes sense also:
Hyper-threading works by duplicating certain sections of the processor—those that store the architectural state—but not duplicating the main execution resources. This allows a hyper-threading processor to appear as the usual "physical" processor and an extra "logical" processor to the host operating system (HTT-unaware operating systems see two "physical" processors), allowing the operating system to schedule two threads or processes simultaneously and appropriately. When execution resources would not be used by the current task in a processor without hyper-threading, and especially when the processor is stalled, a hyper-threading equipped processor can use those execution resources to execute another scheduled task.
 


Ah yes, the good old Wikipedia reference. There's a good reason why any sensible school teacher will caution students against using it, even when it's not full of crap it can still smell a bit.

That quotation gets the gist of the technology correct, but I take issue with this line:

"This allows a hyper-threading processor to appear as the usual "physical" processor and an extra "logical" processor to the host operating system"

All modern platforms expose both their physical and logical configurations. This includes the number of systems per cluster (for more complex machines that span multiple devices), number of sockets per system, number of populated sockets (installed CPU packages), number of microprocessors per CPU package (for those lovely multichip-modules such as the Core 2 Quads and newer AMD Opterons), number of ISA cores per microprocessor, and the number of hardware threads per ISA core. The logical processors concept is simply an abstract way of linearizing and simplifying interaction with the various hardware threads without worrying about the total layout of the system. A more correct statement would be that a single-socket system with a quad core microprocessor with SMT/HT can be viewed as either a single CPU, four physical cores, or eight logical processors. It should not be viewed as "four physical cores with four extra logical processors" as that is somewhat misleading.

When this information is pulled together this forms a tree, and the information within this tree can be used by the kernel scheduler to make sane scheduling decisions about active threads (threads that are either running, or ready to run). The same information can be used by user applications to make sane threading decisions and how it would like the scheduler to treat its threads in order to optimize performance. Professional, wide scale applications make liberal use of these mechanisms to optimize performance. Contrary to what vmN keeps repeating, all of this information is available to user mode applications in most modern operating systems (see cpuset(7), among other things) but since games are generally targeted at commercial users who almost always have a single node single socket system there's very little reason to optimize inter-node or inter-socket latency. They simply detect the number of physical cores, or spawn a fixed number of threads, without giving a damn as to whether or not the platform has SMT/HT. Occasionally a game developer will do something stupid and detect the number of logical processors, which can result in decreased performance on Hyperthreaded machines.
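
To make the "tree" concrete, here is a small Python sketch that builds a socket → core → hardware thread hierarchy from flat records and reads the three views of the machine off it. The record layout, the function names, and the sibling numbering in the example (logical processor i on core i % 4) are assumptions chosen for illustration; real enumeration order depends on the platform and OS.

```python
from collections import defaultdict


def build_topology(records):
    """Build {socket_id: {core_id: [logical_ids]}} from flat records.

    records: iterable of (logical_id, socket_id, core_id) tuples.
    """
    tree = defaultdict(lambda: defaultdict(list))
    for logical_id, socket_id, core_id in records:
        tree[socket_id][core_id].append(logical_id)
    return {s: {c: sorted(t) for c, t in cores.items()} for s, cores in tree.items()}


def summarize(tree):
    """Return (sockets, physical_cores, logical_processors)."""
    sockets = len(tree)
    cores = sum(len(cores) for cores in tree.values())
    logical = sum(len(t) for cores in tree.values() for t in cores.values())
    return sockets, cores, logical


def one_thread_per_core(tree):
    """Pick the lowest-numbered logical processor of every physical core."""
    return sorted(t[0] for cores in tree.values() for t in cores.values())


if __name__ == "__main__":
    # Hypothetical single-socket quad core with two-way SMT.
    records = [(i, 0, i % 4) for i in range(8)]
    tree = build_topology(records)
    print(tree)                       # {0: {0: [0, 4], 1: [1, 5], 2: [2, 6], 3: [3, 7]}}
    print(summarize(tree))            # (1, 4, 8)
    print(one_thread_per_core(tree))  # [0, 1, 2, 3]
```

On Linux, a list like the one returned by `one_thread_per_core` is the kind of set an application could hand to `os.sched_setaffinity` if it wanted to keep its threads off sibling hardware threads.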
 
I think you misunderstood, or I didn't explain my quote underneath clearly.

What I meant was that there is no software that can detect which threads are supposed to be the "hyper-threads".
 


I don't think that you understand what those are then. Hyperthreading is simply Intel's implementation of Simultaneous Multi Threading. SMT couples more than one front end (the part that tracks the machine state, including the CPU registers, program counter, fetch and decode logic) to a single backend (which performs execution, reordering, and retiring), allowing for the backend to execute instructions from two or more running threads during the same cycle just as two or more entirely separate cores would, but with the limitation of only a single core's worth of execution resources. As long as the threads do not contend for the same resources, the backend will see greater utilization. From the perspective of a system looking at the logical processors (which is what the schedulers do), the presentation is the same for a dual-core two-way SMT microprocessor and a quad-core non-SMT microprocessor. Both expose 4 completely independent logical processors, but all else being equal the quad core will win out due to the extra execution resources. This is why older operating systems had trouble with Hyperthreading when it was introduced: they looked at only the logical organization of the machine, which was fine from an SMP perspective (SMP = Symmetric Multi Processing, AKA multi-core/multi-socket) but not sufficient for an SMT perspective. The physical organization of each microprocessor needed to be considered as well in order to make the best scheduling decisions. This was repeated again a couple of years ago when AMD released the FX series microprocessors, which do not use SMT and expose as many threads as there are physical cores but do have some shared resources between cores on the same module, which impacts performance when the load is not balanced across modules.
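
The point about contention can be illustrated with a deliberately crude toy model in Python. It is not a description of any real pipeline: each thread is a hypothetical string where "X" is a cycle that needs the shared backend and "-" is a stall cycle (for example waiting on memory), and the backend can accept `width` instructions per cycle.

```python
def cycles_back_to_back(a, b):
    """Cycles needed when the two threads simply run one after the other."""
    return len(a) + len(b)


def cycles_with_smt(a, b, width=1):
    """Cycles needed when both threads share one backend of the given width.

    Stall cycles ("-") pass without using the backend. Execute cycles ("X")
    need a free backend slot; a thread that finds none retries next cycle.
    """
    threads = [a, b]
    pos = [0, 0]
    cycles = 0
    priority = 0  # alternate which thread gets first pick each cycle
    while pos[0] < len(a) or pos[1] < len(b):
        issued = 0
        for t in (priority, 1 - priority):
            if pos[t] >= len(threads[t]):
                continue  # this thread has finished
            if threads[t][pos[t]] == "-":
                pos[t] += 1  # stall overlaps with the other thread's work
            elif issued < width:
                pos[t] += 1
                issued += 1
        priority = 1 - priority
        cycles += 1
    return cycles


if __name__ == "__main__":
    # Complementary threads: one executes while the other stalls.
    print(cycles_back_to_back("X-X-X-", "-X-X-X"), cycles_with_smt("X-X-X-", "-X-X-X"))  # 12 6
    # Threads that always want the backend: sharing gains nothing.
    print(cycles_back_to_back("XXXX", "XXXX"), cycles_with_smt("XXXX", "XXXX"))          # 8 8
```

In this model the shared core only helps when one thread's stalls line up with the other thread's work, which is the "do not contend for the same resources" condition above.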

Intel's implementation of SMT is known as Hyperthreading and couples two front ends to the same backend. It has changed a bit since it was first introduced with the second generation of Pentium 4 microprocessors in the early 2000s, but the idea is still the same. IBM also uses SMT in their POWER microprocessors. POWER7 based microprocessors have 4 frontends per core, and POWER8 based microprocessors have 8 frontends per core. This means that a 12 core POWER8 microprocessor will appear as 96 logical processors.

As for the second part, about there not being any software capable of detecting which threads are "supposed to be the hyper-threads", that's provably false. Each logical processor in a system is identified by a unique APIC ID (APIC stands for Advanced Programmable Interrupt Controller) and the APIC ID of each logical processor contains within it the physical processor ID. If the physical processor ID is the same for two logical processors, then they share the same core. Accessing this is done through the CPUID instruction which is unprivileged and can thus be performed by any application regardless of operating system.
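
A minimal Python sketch of that sibling check follows. Python cannot issue CPUID itself, so the APIC IDs and the number of low bits that identify the hardware thread (`smt_bits`) are passed in as assumed inputs; on real hardware that width is reported by the CPUID topology leaves. The function names are hypothetical.

```python
def core_key(apic_id, smt_bits):
    """Strip the hardware-thread bits, leaving the part that identifies the core."""
    return apic_id >> smt_bits


def are_siblings(apic_a, apic_b, smt_bits):
    """True if two distinct logical processors share the same physical core."""
    return apic_a != apic_b and core_key(apic_a, smt_bits) == core_key(apic_b, smt_bits)


def group_by_core(apic_ids, smt_bits):
    """Map each core key to the list of APIC IDs that live on that core."""
    cores = {}
    for apic_id in apic_ids:
        cores.setdefault(core_key(apic_id, smt_bits), []).append(apic_id)
    return cores


if __name__ == "__main__":
    # Hypothetical quad core with two-way SMT: one thread bit per core.
    ids = [0, 1, 2, 3, 4, 5, 6, 7]
    print(group_by_core(ids, smt_bits=1))  # {0: [0, 1], 1: [2, 3], 2: [4, 5], 3: [6, 7]}
    print(are_siblings(2, 3, smt_bits=1))  # True
    print(are_siblings(1, 2, smt_bits=1))  # False
```

Note that neither member of a sibling pair is marked as the "extra" one; the grouping only says which logical processors share a core.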
 


I'm pretty sure there is a language barrier between us.

"If the physical processor ID is the same for two logical processors, then they share the same core."
This very line is what I meant: you will end up with 2 identical IDs, not 1 ID for the physical core and 1 ID for the "hyper-thread".

My statement was about people's general assumption that games will utilize the 8t/8c of, let's say, an FX 8320, but won't utilize more than 4t/4c of an 8t/4c Intel i7 processor because of hyper-threading.