Thread is actually a virtual processor?

RikTelner

Honorable
Feb 28, 2014
135
0
10,680
The Intel Core i7-4500U has 2 cores and 4 threads. Does the computer see 8 cores then?

Another question

Processor A has: 2GHz, with 6 cores, 1 thread. Total of 12GHz.
Processor B has: 2GHz, with 4 cores, 2 threads. Total of 16GHz.

So processor B is better? Is that how it works?
 
Solution


Errr.

A CPU may process one thread (a line of information being fed to the core) at one time. If you are doing multiple tasks/handling multiple applications, you are attempting to run multiple threads of information through your CPU. Hyper-threading is a way for the CPU to handle (but not process) multiple threads at a time. Multiple, in this case, means two. I doubt we will see a time when a single physical core will have hyper-threading that handles more than two threads at a time.

So, a CPU with two physical cores, that has hyper-threading, will register as having four cores. Two of these cores are physical and exist. The other two are referred to as "logical" cores; they exist so that the OS knows how to handle the multiple threads that you are feeding your CPU.

These logical cores do not make the CPU faster; they just help with multi-tasking and with programs optimized for multi-threaded work. (So, in that case, they will help it process information more efficiently.)

On a side note, one may think of AMD's modules as physical hyper-threading. One module is one floating point unit feeding two integer cores. So an FX-8350 may be thought of as a four core processor with physical hyper-threading. This is technically a half-truth, though.
 

rgd1101

Don't
Moderator


Processor A has: 2GHz, with 6 cores, 1 thread. Total of 12GHz.
Processor B has: 2GHz, with 4 cores, 2 threads. Total of 16GHz.
So you really mean 4 cores with 8 threads. I still have no idea what "total of 16GHz" means; I guess you know more than I do.
 

RikTelner



Where do you f******g see 4 cores with 8 threads?! Where?!?!
 

rgd1101



http://ark.intel.com/products/75123/Intel-Core-i7-4770K-Processor-8M-Cache-up-to-3_90-GHz
 


This is incorrect.

Hyperthreading is Intel's proprietary implementation of SMT, or Simultaneous Multi-Threading. SMT multiplies the front-end portion of a CPU core, allowing more than one logical context to be tracked at once (also called a logical processor). This suppresses variances in instruction type and flow which decrease execution efficiency. The backend can execute any mixture of instructions from any of the frontends attached to it, even in the same cycle. The instructions are selected dynamically to optimize usage of the microprocessor's execution ports to which the execution units are connected.

The cores are not divided into "real and fake" threads or "physical and logical" threads; each core with SMT enabled exposes two or more logical processors which are equal in capability. It is the job of the operating system and application designer to properly assign threads to these logical processors in order to optimize for their particular design parameters.

Intel's implementation of SMT only duplicates the frontend (two logical processors per core), as this is optimal for real-time applications, but IBM's brand new POWER8 architecture has 8 logical processors per core and 12 cores per CPU, for a total of 96 contexts per CPU.
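To put numbers on the counts mentioned in this thread, here is a tiny Python sketch. The function name is made up for illustration; the core and thread counts are the ones quoted above. The point is that the operating system sees cores times threads-per-core logical processors, and that clock speed is not something you add up across them.

```python
def logical_processors(cores, threads_per_core):
    """Number of logical processors the OS sees (hypothetical helper)."""
    return cores * threads_per_core

# Figures quoted in this thread.
print(logical_processors(2, 2))   # i7-4500U: 2 cores, 2 threads each -> 4
print(logical_processors(4, 2))   # i7-4770K: 4 cores, 2 threads each -> 8
print(logical_processors(12, 8))  # POWER8: 12 cores, 8 threads each -> 96
```

So "2 cores and 4 threads" means 4 logical processors in total, not 8.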

 

RikTelner



Okay, so if a core supports Hyper-Threading, I need to assume that a thread is a core. And when it doesn't support Hyper-Threading, cores are cores, yes? :)
 

rgd1101



Oh, you mean 4 cores with 2 threads each, not 4 cores, 2 threads. I guess my grammar wasn't as good as yours.
 


Barring the number of threads, this sounds similar to what I described, but worded differently. Perhaps I am lost in the semantics?

Either way - you learn something new every day!
 


What you describe is closer to coarse-grained multithreading, not simultaneous multithreading. I'll cover the big three concepts behind hardware threading:

CGMT, or Coarse-Grained MultiThreading

FGMT, or Fine-Grained MultiThreading

SMT, or Simultaneous MultiThreading

From the perspective of user-mode software (I won't discuss kernel stuff as that can get a bit tricky) all three appear the same: as two or more logical processors. The relationship between a logical processor and its physical hardware can usually be divined by looking at various identifiers such as the APIC ID, and very high-quality software will usually calibrate itself based on the arrangement of sockets, cores, and logical processors. However, there is no need to do this unless one wants to optimize performance.
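As a rough illustration of working out that arrangement from user mode, the Python sketch below groups logical processors by socket and core. It assumes the x86 Linux /proc/cpuinfo text format, where each logical processor has "processor", "physical id", and "core id" fields. The sample text and the function name are invented for the example; other platforms expose the topology differently.

```python
from collections import defaultdict

# Hypothetical sample in the style of /proc/cpuinfo on x86 Linux:
# one socket, two cores, two logical processors per core.
SAMPLE_CPUINFO = """\
processor   : 0
physical id : 0
core id     : 0

processor   : 1
physical id : 0
core id     : 1

processor   : 2
physical id : 0
core id     : 0

processor   : 3
physical id : 0
core id     : 1
"""

def group_by_core(cpuinfo_text):
    """Map (socket, core) -> list of logical processor numbers."""
    cores = defaultdict(list)
    entry = {}
    # The trailing "" flushes the last entry even without a final blank line.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            if "processor" in entry:
                lp = int(entry["processor"])
                socket = int(entry.get("physical id", 0))
                # Without a "core id" field, treat each processor as its own core.
                core = int(entry.get("core id", lp))
                cores[(socket, core)].append(lp)
            entry = {}
            continue
        name, _, value = line.partition(":")
        entry[name.strip()] = value.strip()
    return dict(cores)

print(group_by_core(SAMPLE_CPUINFO))
```

On a Linux machine the same function could be fed the contents of /proc/cpuinfo instead of the sample string.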

Under the CGMT scheme the microprocessor tracks two or more thread contexts (one on each logical processor), and works on each thread in chunks. The microprocessor switches logical processors whenever a logical processor stalls out (such as on a cache miss), blocks (such as on a lengthy IO operation), or exhausts an allotted number of cycles (to prevent starvation).

Under a non-multithreaded scheme a cache miss or a block can be resolved in two ways. Either the microprocessor inserts stall cycles until the thread can continue, or the operating system performs a context switch, which replaces the running thread on the logical processor with a new one. In most cases a cache miss will be resolved faster than the operating system can load a new context onto the logical processor (which in itself is highly likely to result in a cache miss), so a cache miss or an uncompensatable hazard in a non-multithreaded environment almost always results in stall cycles. However, a CGMT microprocessor can switch logical processors very, very quickly, and do so in a fashion that is completely transparent to the operating system. Thus, instead of stalling, the microprocessor switches to another context that is already loaded on another logical processor.

Under CGMT, whenever a certain stall threshold is met, the microprocessor switches execution to another logical processor so that it has something to do. When that logical processor stalls or exceeds its cycle allocation it switches again, either back to the first thread or to the next one. Even though the microprocessor is executing instructions from a separate logical processor, the memory manager will continue to resolve the miss for the processor that stalled.
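A toy Python simulation of the switch-on-stall idea, for anyone who wants to see it in numbers. This is only a sketch under simplifying assumptions of my own: each thread is a string where "c" is a one-cycle instruction and "m" is a load that misses the cache, a miss keeps that logical processor from issuing for miss_latency cycles, switching is free, and there is no cycle allotment. None of this models a real microprocessor.

```python
def run_cgmt(threads, miss_latency=3):
    """Return (total_cycles, instructions_issued) for a toy CGMT core."""
    n = len(threads)
    pc = [0] * n          # next instruction index per logical processor
    ready_at = [0] * n    # first cycle each logical processor may issue again
    current = 0           # logical processor being executed right now
    cycle = 0
    issued = 0
    while any(pc[i] < len(threads[i]) for i in range(n)):
        runnable = [i for i in range(n)
                    if pc[i] < len(threads[i]) and ready_at[i] <= cycle]
        if not runnable:
            cycle += 1    # nothing can issue: a genuine stall cycle
            continue
        if current not in runnable:
            # Switch only when the current context cannot continue,
            # taking the next runnable one in cyclic order.
            current = min(runnable, key=lambda i: (i - current) % n)
        op = threads[current][pc[current]]
        pc[current] += 1
        issued += 1
        if op == "m":
            # The miss is resolved in the background while others run.
            ready_at[current] = cycle + 1 + miss_latency
        cycle += 1
    return cycle, issued

print(run_cgmt(["cmcc"]))          # one thread alone: (7, 4)
print(run_cgmt(["cmcc", "cmcc"]))  # two contexts on one core: (9, 8)
```

Running the two threads one after the other would take 14 cycles in this model, while the two-context core finishes both in 9, because part of each miss is hidden behind the other thread's work.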

FGMT takes this concept a little bit further and alternates execution between logical processors on every cycle. Ideally a stall would be resolved by the time that the microprocessor returns to the thread that stalled, but if it is not, it can be skipped. This is advantageous when execution resources are much cheaper and more plentiful than fast memory. Processors of this style are known as barrel processors because they rotate execution over the logical processors in a cyclic fashion. This style of execution was very popular in the 1990s and early 2000s.
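The same kind of toy model can be written for the rotate-every-cycle scheme. Again this is a hedged sketch with invented names and the same made-up encoding ("c" is a one-cycle instruction, "m" is a cache miss that blocks that logical processor for miss_latency cycles); a logical processor that is not ready is simply skipped.

```python
def run_fgmt(threads, miss_latency=3):
    """Return (total_cycles, instructions_issued) for a toy barrel core."""
    n = len(threads)
    pc = [0] * n          # next instruction index per logical processor
    ready_at = [0] * n    # first cycle each logical processor may issue again
    turn = 0              # logical processor whose turn comes first
    cycle = 0
    issued = 0
    while any(pc[i] < len(threads[i]) for i in range(n)):
        # Walk the rotation starting at `turn`, skipping stalled or finished ones.
        order = [(turn + k) % n for k in range(n)]
        pick = next((i for i in order
                     if pc[i] < len(threads[i]) and ready_at[i] <= cycle), None)
        if pick is not None:
            op = threads[pick][pc[pick]]
            pc[pick] += 1
            issued += 1
            if op == "m":
                ready_at[pick] = cycle + 1 + miss_latency
            turn = (pick + 1) % n   # rotate on every issued instruction
        cycle += 1
    return cycle, issued

print(run_fgmt(["cmcc"] * 2))  # (10, 8): two contexts cannot hide the miss
print(run_fgmt(["cmcc"] * 4))  # (16, 16): four contexts hide it completely
```

With four contexts, each miss has resolved by the time the rotation comes back around, so no cycle is wasted in this model.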

SMT is the apex of multithreading, and is exclusive to superscalar microarchitectures (although FGMT/CGMT can also operate on superscalar architectures). The microprocessor issues instructions to the execution pipes (a feature of superscalar microarchitectures) from all logical processors in a dynamic fashion. If one logical processor stalls, blocks, or is idled for performance reasons, the microprocessor will dedicate all resources to the remaining logical processors. As the number of logical processors per core grows, it becomes easier and easier to keep the execution pipes on that core busy 100% of the time. This is very desirable in throughput sensitive environments such as application servers and databases, but can be detrimental to real-time sensitive environments such as gaming. This is why many game developers configure thread affinity on Hyperthreaded microprocessors to avoid fighting for resources.
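For comparison, here is a toy SMT model in Python. It is a sketch under assumptions that are mine, not a description of any real core: the core can issue up to `width` instructions per cycle, each logical processor contributes at most one instruction per cycle, "c" is a one-cycle instruction, "m" is a cache miss that blocks its logical processor for miss_latency cycles, and lower-numbered logical processors get first pick of the slots.

```python
def run_smt(threads, width=2, miss_latency=3):
    """Return (total_cycles, instructions_issued) for a toy SMT core."""
    n = len(threads)
    pc = [0] * n          # next instruction index per logical processor
    ready_at = [0] * n    # first cycle each logical processor may issue again
    cycle = 0
    issued = 0
    while any(pc[i] < len(threads[i]) for i in range(n)):
        slots = width     # issue slots available this cycle
        for i in range(n):
            if slots == 0:
                break
            if pc[i] < len(threads[i]) and ready_at[i] <= cycle:
                op = threads[i][pc[i]]
                pc[i] += 1
                issued += 1
                slots -= 1
                if op == "m":
                    ready_at[i] = cycle + 1 + miss_latency
        cycle += 1
    return cycle, issued

print(run_smt(["cmcc"]))          # one logical processor: (7, 4)
print(run_smt(["cmcc", "cmcc"]))  # two logical processors: (7, 8)
```

In this model the second thread rides along in issue slots the first one leaves empty, so twice the work finishes in the same 7 cycles. Real workloads rarely line up this neatly.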

I hope that this was informative.
 

ChillaxedUpgrader

Honorable
Nov 13, 2013
133
0
10,690
Hi everyone. Just wanted to say thanks for this thread. I dropped in here because the title intrigued me. It's wonderful to read down a thread which is not only highly informative in many places, but also hugely entertaining in others. Keep up the good work! ;)
Chillx
 

Powerbolt

Honorable
Oct 21, 2013
413
0
10,960


This thread =
bomb.jpg


It is kind of funny to see the OP get his jimmies rustled though.
 

RikTelner



It's funny to you. It's butthurt to me. It's like trying to convince a wall that it's not a marshmallow.

Back on topic:
So, a CPU with two physical cores, that has hyper-threading, will register as having four cores. Two of these cores are physical and exist. The other two are referred to as "logical" cores; they exist so that the OS knows how to handle the multiple threads that you are feeding your CPU.
So, an 8-core processor with hyper-threading will be registered as 16 cores? If so, does it also perform as fast as an actual 16 cores? Or is it only a way for the system to utilize it?
 


A single socket platform with an 8 core microprocessor that has Hyperthreading enabled will appear to the operating system as a single 8 core CPU with two simultaneous threads per core, for a total of 16 logical processors. Most modern operating systems expose the arrangement to applications so that applications may manage their own logical processor affinity for performance reasons (see my posts above). However, any application that does not manage its own affinity will simply be scheduled on any of the logical processors according to the platform's scheduling and power management policy. A policy that emphasizes low-power usage may attempt to keep as many of the cores in a low power state as possible (which requires idling all of the logical processors on each core), while a policy that emphasizes performance may attempt to maximize instruction throughput by keeping as few cores idle as possible.
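As a sketch of what managing your own affinity can look like, the Python below picks one logical processor per core and, where the platform supports it, restricts the current process to that set. The helper names and the numbering of logical processors (core k owning logical processors k and k+8) are assumptions made up for the example; real numbering varies by platform. os.sched_setaffinity is a real call, but it is only available on some systems (Linux, for instance), hence the guard.

```python
import os

def one_per_core(cores):
    """Pick the lowest-numbered logical processor from each core.

    `cores` maps a core key, e.g. (socket, core), to its logical processors.
    """
    return {min(logical_ids) for logical_ids in cores.values()}

def pin_current_process(logical_ids):
    """Restrict this process to `logical_ids` if the OS offers the call."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, logical_ids)  # 0 means the calling process
        return True
    return False

# Hypothetical layout for the 8-core, 16-logical-processor case above.
layout = {(0, k): [k, k + 8] for k in range(8)}
chosen = one_per_core(layout)
print(sorted(chosen))  # [0, 1, 2, 3, 4, 5, 6, 7]
# pin_current_process(chosen)  # uncomment on a machine that has these CPUs
```

Pinning one thread per core like this avoids two busy threads fighting over the same core's execution resources, at the cost of leaving the sibling logical processors unused.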

To be clear, a microprocessor that has Hyperthreading enabled has the exact same theoretical peak instruction throughput as the same microprocessor with Hyperthreading disabled. The execution capabilities do not change, but with Hyperthreading enabled the execution resources will go unused less often.
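A small Python model of that last point, with invented numbers. Assume a core that can issue `width` instructions per cycle, and for each thread a list saying how many instructions it has ready in each cycle. The peak is width times cycles no matter how many threads feed the core; what changes is how many of those slots actually get used. (Instructions that do not fit in a cycle are simply dropped in this sketch, which a real core would not do.)

```python
def slot_usage(width, ready_counts):
    """Return (slots_used, peak_slots) for a toy issue-slot model.

    `ready_counts[t][c]` is how many instructions thread t has ready in cycle c.
    """
    cycles = len(ready_counts[0])
    peak = width * cycles
    used = sum(min(width, sum(thread[c] for thread in ready_counts))
               for c in range(cycles))
    return used, peak

thread_a = [4, 1, 0, 2]   # made-up ready counts per cycle
thread_b = [1, 3, 4, 0]

print(slot_usage(4, [thread_a]))            # (7, 16)
print(slot_usage(4, [thread_a, thread_b]))  # (14, 16)
```

The peak of 16 slots is identical in both runs; the second thread only fills slots that would otherwise sit idle.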
 