[citation][nom]antilycus[/nom]I completely understand that 4 cores on a VM is not 4 cores/4 threads on the host. You are right, it doesn't work like that. The largest problem with multi-core setups is memory management/register management. Who has access to what, and when, is always the limitation (and will be for as long as I can see going forward). However, based on real world results, putting 4 VMs (all Windows Server, on 1 core vs 2 cores) generates NO PERFORMANCE increase. How is that possible? Because the 8 cores on the host are managing the threads. For this instance, IMVIRT/libvirt creates the threads, sends them to the CPU and the task is performed (obviously within milliseconds). Switching to 4 cores per machine makes no difference either. So tell me, how is this NOT the fault of the O/S kernel on Windows? It sees more cores yet you get the same results. Crappy kernel thread management is the answer. Plus there are different types of hypervisors and virtualization/para-virtualization. Somewhere there is a thread (or 4 or 20) and it's getting sent down to the HOST CPU, and on Linux it flies through it, on Windows it doesn't. If it's bad info, explain why real world experience is trumping what you are saying? (It's the 'net so please don't take the reply as harsh, I am just debating the real world outputs to the on paper "facts")[/citation]
The exact specification of virtual SMP and virtual SMT varies by VMM vendor. However, in general:
When you assign multiple logical CPUs to a VM, each virtual CPU exists as its own process on the host, and all of them are co-scheduled. That is, when the host scheduler wants to enter a VM, as many host logical processors must be available as there are guest logical processors assigned to that VM. All of those processors then enter guest mode and restore the guest state at the same time. This is critical, because processors changing state in a way that isn't apparent to the guest can cause all sorts of concurrency issues. It is also why it is a very bad idea to assign more than half of the available logical processors to a single VM: every co-scheduled processor has to pause host execution while it performs the guest-restore and guest-save operations. You can overcommit logical processors all you want across a variety of VMs, just don't do a total commit to a single VM. Hyperthreading creates a bit of an optical illusion here as well, but no performance degradation.
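To make the co-scheduling point concrete, here is a toy sketch in C. It is not any real VMM's scheduler; the struct names, the 8-CPU host, and the counting logic are all assumptions purely for illustration. It just shows why a guest that is assigned every host logical CPU cannot be dispatched until the entire host is idle, while a smaller guest fits in alongside other work:

[code]
#include <stdbool.h>
#include <stdio.h>

#define HOST_LOGICAL_CPUS 8   /* assumed host size, matches the 8-core example above */

/* Hypothetical gang (co-)scheduling check: a VM's vCPUs only enter
 * guest mode when enough host logical CPUs are idle to run all of
 * them at once. */
struct vm {
    const char *name;
    int vcpus;                /* guest logical processors assigned to this VM */
};

static int idle_host_cpus = HOST_LOGICAL_CPUS;

static bool try_coschedule(const struct vm *vm)
{
    if (vm->vcpus > idle_host_cpus)
        return false;             /* must wait for a full gang of idle host CPUs */
    idle_host_cpus -= vm->vcpus;  /* all vCPUs enter guest mode together */
    printf("%s: %d vCPUs dispatched, %d host CPUs still idle\n",
           vm->name, vm->vcpus, idle_host_cpus);
    return true;
}

int main(void)
{
    struct vm small = { "small-guest", 2 };
    struct vm huge  = { "huge-guest",  8 };  /* total commit to one VM */

    try_coschedule(&small);                  /* fits alongside other host work */
    if (!try_coschedule(&huge))
        puts("huge-guest: stalls until every host CPU is free at once");
    return 0;
}
[/code]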
Once the guest state is restored, the VMM loses control until a privileged instruction is executed, at which point the CPU traps back to the VMM, which either translates that instruction or executes it within the guest-state context. This mechanism is what allows a guest OS to handle its own threads via hardware context switches, hardware interrupts, and hardware address translation without, say, triggering a power cycle on the host. Guest threads do not get passed through to the host at all; they are completely decoupled by design. Were this not the case, it would not be possible to run a 64-bit guest (long mode) on a 32-bit VMM (protected mode) that is running on a CPU with 64-bit and VT-x extensions.
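This trap-and-exit flow is easiest to see in the Linux KVM API, the same mechanism libvirt sits on top of. Below is a stripped-down sketch (Linux only, all error handling omitted) that runs a few bytes of 16-bit guest code: the guest executes at full speed in guest mode, and the host process only regains control when the guest touches an I/O port or halts.

[code]
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    /* Guest code: write 0x42 to port 0x3f8, then halt. */
    const uint8_t code[] = {
        0xba, 0xf8, 0x03,   /* mov dx, 0x3f8 */
        0xb0, 0x42,         /* mov al, 0x42  */
        0xee,               /* out dx, al    */
        0xf4,               /* hlt           */
    };

    int kvm  = open("/dev/kvm", O_RDWR | O_CLOEXEC);
    int vmfd = ioctl(kvm, KVM_CREATE_VM, 0);

    /* One page of guest "RAM" mapped at guest-physical address 0x1000. */
    uint8_t *mem = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    memcpy(mem, code, sizeof(code));
    struct kvm_userspace_memory_region region = {
        .slot = 0, .guest_phys_addr = 0x1000,
        .memory_size = 0x1000, .userspace_addr = (uint64_t)mem,
    };
    ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &region);

    int vcpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0);
    int mmap_size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, NULL);
    struct kvm_run *run = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, vcpufd, 0);

    /* Point the vCPU at the code (16-bit real mode, cs:ip = 0:0x1000). */
    struct kvm_sregs sregs;
    ioctl(vcpufd, KVM_GET_SREGS, &sregs);
    sregs.cs.base = 0; sregs.cs.selector = 0;
    ioctl(vcpufd, KVM_SET_SREGS, &sregs);
    struct kvm_regs regs = { .rip = 0x1000, .rflags = 0x2 };
    ioctl(vcpufd, KVM_SET_REGS, &regs);

    /* The run loop: the host only gets control back on a VM exit. */
    for (;;) {
        ioctl(vcpufd, KVM_RUN, NULL);   /* CPU enters guest mode here */
        switch (run->exit_reason) {
        case KVM_EXIT_IO:               /* guest touched an I/O port */
            printf("guest wrote 0x%x to port 0x%x\n",
                   *((uint8_t *)run + run->io.data_offset), run->io.port);
            break;
        case KVM_EXIT_HLT:              /* guest executed hlt */
            puts("guest halted");
            return 0;
        default:
            printf("unhandled exit reason %d\n", run->exit_reason);
            return 1;
        }
    }
}
[/code]

Note that the guest's own context switches, interrupts, and page-table walks never show up in this loop at all; the host only sees the exits it has to emulate.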
It is the CPU that tells the host when a guest needs the host to intervene, not the host kernel. The only things the host scheduler has to do are give the VMM the CPU time it needs and co-schedule the assigned virtual processors; it does not have to handle the guest's threads at all. You can picture a virtual processor as a process on the host that runs multiple hardware- and software-isolated guest processes using its own internal scheduler (the guest scheduler).
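That "a virtual processor is just a process on the host" picture is roughly how KVM-based VMMs such as QEMU are structured: one ordinary host thread per vCPU, each looping on KVM_RUN. The sketch below uses placeholder file descriptors (the real vCPU setup is in the previous example), since the thread structure is the point, not the KVM plumbing:

[code]
#include <linux/kvm.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/ioctl.h>

#define NR_VCPUS 4

/* Placeholder vCPU fds; a real VMM would fill these in via KVM_CREATE_VCPU. */
static int vcpu_fds[NR_VCPUS] = { -1, -1, -1, -1 };

static void *vcpu_thread(void *arg)
{
    int fd = *(int *)arg;
    if (fd < 0) {
        puts("placeholder vCPU: no real fd, exiting");
        return NULL;
    }
    for (;;) {
        /* The host scheduler only ever sees this one thread; the guest
         * kernel schedules its own threads inside the guest state. */
        ioctl(fd, KVM_RUN, NULL);
        /* ... decode run->exit_reason, emulate, then re-enter the guest ... */
    }
    return NULL;
}

int main(void)
{
    pthread_t tid[NR_VCPUS];
    /* The host kernel schedules these threads like any other process. */
    for (int i = 0; i < NR_VCPUS; i++)
        pthread_create(&tid[i], NULL, vcpu_thread, &vcpu_fds[i]);
    for (int i = 0; i < NR_VCPUS; i++)
        pthread_join(tid[i], NULL);
    return 0;
}
[/code]

So whether a guest is assigned 1, 2, or 4 vCPUs, the guest OS still does its own scheduling inside those threads, which is why changing the vCPU count alone often does nothing for the workload in the quoted test.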