Pentium 4 L1 cache?

cchampio

Jun 30, 2009
Hello all! I am currently studying for the A+ 2006 objectives and I am trying to find out how much L1 cache Pentium 4 CPUs have. I am a bit confused because I have read contradictory answers on Wikipedia, in the books I have, and on this website. I have read that the Pentium 4 has 8kB L1 cache, 20kB L1 cache, and one book I have says they all have 128kB L1 cache. Based on all I have been reading, I am starting to suspect there is something in the Pentium 4 called Execution Trace Cache (ETC) that is not the same thing as L1 cache, and that the ETC is 20kB (8kB data + 12kB) and the L1 is 128kB, but that is just a theory I came up with from my reading and it may be completely wrong. I have been reading on this topic for about 4 hours now and I need to move on, so could someone please help clear this up for me? Thanks!

I need to know the L1 cache size for the Willamette, Northwood, Prescott, Cedar Mill, and the P4EE (Gallatin), and I need an explanation of the Execution Trace Cache.

 


Look up each different core. I know there is a large variance in L2 + L3 between them, so I would assume the L1 varies as well. Tom's did a detailed analysis of each core when it came out.

as to the Execution Trace Cache this link may help
http://lmgtfy.com/?q=Execution+Trace+Cache
 


Did P4 even have L3 cache?
 
Willamette
L1 : ?
L2 : 256kB //Advanced Transfer; integrated; full clock speed.
L3 : none


Northwood
L1 : ?
L2 : 512kB //Adv Trans; integrated; full clock speed.
L3 : none

Gallatin (P4EE)
L1 : ?
L2 : 512kB - 2MB //Adv Trans; integrated; full clock speed.
L3 : 2MB

Prescott
L1 : ?
L2 : 1MB or 2MB //Adv Trans; integrated; full clock speed.
L3 : none

Cedar Mill
L1 : ?
L2 : 2MB //Adv Trans; integrated; full speed
L3 : none


None of the links we've mentioned says specifically how much L1 cache there is (unless I'm overlooking it).
I understand what the ETC is now. I just need to know whether it is separate from the L1 cache or a section of the L1 that is designated for the ETC. For example, I know the Pentium 3 has 32kB of L1 cache, and it is divided into 16kB data + 16kB instruction.
 
I'm pretty sure now that the ETC is 8kB of the whole L1 cache, but I'm not sure what the L1 is as a whole. Is it 128kB (120kB + 8kB) or is it 20kB (8kB + 12kB)?

A book I have says 128kB, but some info I found using Google said 20kB :\
 


Well, first off, the Extreme Edition is a separate core, which is why I suggested looking each one up individually. And the Prescott has a 12KB+16KB L1.
 
From Intel pdf


Intel® Pentium® 4 Processor with 512-KB L2 Cache on 0.13 Micron Process and Intel® Pentium® 4 Processor Extreme Edition Supporting Hyper-Threading Technology
Datasheet
2 GHz – 3.40 GHz Frequencies Supporting Hyper-Threading Technology at 3.06 GHz with 533 MHz System Bus and All Frequencies with 800 MHz System Bus

The Intel® Pentium® 4 processor family supporting Hyper-Threading Technology (HT Technology) delivers Intel's most advanced, most powerful processors for desktop PCs and entry-level workstations, which are based on the Intel NetBurst® microarchitecture. The Pentium 4 processor is designed to deliver performance across applications and usages where end-users can truly appreciate and experience the performance. These applications include Internet audio and streaming video, image processing, video content creation, speech, 3D, CAD, games, multimedia, and multitasking user environments. The Intel® Pentium® 4 processor Extreme Edition supporting HT Technology features 2 MB of L3 cache and offers high levels of performance targeted specifically for high-end gamers and computing power users.
Available at 2 GHz, 2.20 GHz, 2.26 GHz, 2.40 GHz, 2.50 GHz, 2.53 GHz, 2.60 GHz, 2.66 GHz, 2.80 GHz, 3 GHz, 3.06 GHz, 3.20 GHz, and 3.40 GHz
Supports Hyper-Threading Technology (HT Technology) at 3.06 GHz with 533 MHz system bus and all frequencies with 800 MHz system bus
Binary compatible with applications running on previous members of the Intel microprocessor line
Intel NetBurst® microarchitecture
System bus frequency at 400 MHz, 533 MHz, and 800 MHz
Rapid Execution Engine: Arithmetic Logic Units (ALUs) run at twice the processor core frequency
Hyper-Pipelined Technology
  - Advance Dynamic Execution
  - Very deep out-of-order execution
Enhanced branch prediction
Optimized for 32-bit applications running on advanced 32-bit operating systems
8-KB Level 1 data cache
Level 1 Execution Trace Cache stores 12-K micro-ops and removes decoder latency from main execution loops
512-KB Advanced Transfer Cache (on-die, full-speed Level 2 (L2) cache) with 8-way associativity and Error Correcting Code (ECC)
2-MB Integrated Level 3 (L3) cache with 8-way associativity that is supported by Intel® Pentium® 4 Processor Extreme Edition Supporting Hyper-Threading Technology
144 Streaming SIMD Extensions 2 (SSE2) instructions
Enhanced floating point and multimedia unit for enhanced video, audio, encryption, and 3D performance
Power Management capabilities
  - System Management mode
  - Multiple low-power states
8-way cache associativity provides improved cache hit rate on load/store operations
478-Pin Package
 
Read the article. Heaven forbid you learn something. The L1 trace cache plays a huge role in the P4 architecture, and it is 150KB in size. The L1 data cache increased to 16KB for the Prescott core. Prior to that it was 8KB.

So, total L1 cache is 158KB (150KB + 8KB) for Northwoods and Willamettes, and 166KB (150KB + 16KB) thereafter.

Thanks for playing.
 
lol i did read it

Per your article there is no L1 instruction cache as such; it has been relocated and renamed. There is only a data cache in the conventional L1.

How does getting on a soapbox with "thanks for playing" help??

Believe it or not, some of us are here to learn and share our knowledge, not to win an online pissing contest.
 
I think I'm understanding it now, finally! One mistake I was making was reading too fast; I mistakenly read 12K as 12KB, and that makes a huge difference. The L1 trace cache stores 12K (12 kilo = 12 * 1024 = 12288) decoded micro-instructions. Intel's micro-instructions are 100 bits long. So that means (((12288 * 100) / 8) / 1024) = 150KB. Then add 8KB or 16KB for the L1 data cache and you have L1 = 158KB or 166KB!

I also learned how to use Intel's website to view the datasheets and confirm the L1 cache! Thanks everyone, I learned quite a bit!
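For anyone who wants to check the arithmetic above, here is a minimal Python sketch. The 100 bits per micro-op figure is the one quoted in this thread, not something the Intel datasheet states, so treat it as an assumption; the function names are just made up for illustration.

```python
# Rough estimate of the Pentium 4 trace cache size in KB.
# ASSUMPTION (from this thread, not the Intel datasheet): ~100 bits per micro-op.
UOPS = 12 * 1024       # "12-K micro-ops" as listed in the datasheet
BITS_PER_UOP = 100     # assumed micro-op width in bits


def trace_cache_kb(uops=UOPS, bits_per_uop=BITS_PER_UOP):
    """Convert a micro-op count to kilobytes: bits -> bytes -> KB."""
    return uops * bits_per_uop / 8 / 1024


def total_l1_kb(data_cache_kb):
    """Trace cache estimate plus the L1 data cache size (both in KB)."""
    return trace_cache_kb() + data_cache_kb


print(trace_cache_kb())   # 150.0
print(total_l1_kb(8))     # 158.0 (8KB data cache, pre-Prescott)
print(total_l1_kb(16))    # 166.0 (16KB data cache, Prescott)
```

If the real micro-op width differs from 100 bits, the byte figure changes with it, which is one reason the "12K micro-ops" number is the safer one to quote.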

http://www.intel.com/design/Pentium4/documentation.htm
 
Take your time there; there are many low-level changes between cores. Intel's PDFs are great for the details. Tom's articles will help to point out what changed from the previous generation.
 

Then you fail at reading comprehension. 🙁 I cannot help you with that.
 



I appreciate the help, but no need to be a smart ass.
 
Do you feel better about yourself when you insult others?


"The L1 instruction cache was relocated. Instead of being before the fetch unit, the L1 instruction cache is now after the decode unit, with a new name, “Trace Cache”"

"most common mistakes people make when commenting Pentium 4 architecture is saying that Pentium 4 doesn’t have any instruction cache at all. That’s absolutely not true. It is there, but with a different name and a different location"

the L1 instruction cache was replaced with the trace cache
 
Not necessarily replaced by, but IS the trace cache. As your copy and paste PDF so clearly states below that lovely red highlighted area you used to debate with me the size of L1 cache, the trace cache is considered Level 1 cache, and is essentially the instruction portion of the L1 cache, placed after the decoder, which increases the speed of processing looped instruction sets.

And if you must know, what brings me great pleasure is displaying for all to see the lack of knowledge possessed by people that *THINK* they know what they're talking about, but obviously do not. It's even more delightful when they try and debate over a fact once proven wrong (how ludicrous), instead of admitting they were wrong or at least thanking me for the correction. Not that this paragraph directly applies to you. You just asked if I feel better about myself from insulting others, to which I would say no. Simply insulting someone is something any 6 year old can do. It doesn't take skill or knowledge or even much thought. However, to know you are full of it requires that I actually know a thing or two about the subject at hand. That requires due diligence with many hours of research.
And you are right. This is a place to share knowledge and information. I shared with you a very nice article that clearly pointed out the size and character of the trace cache, along with its role and how it is different from the conventional L1 data/instruction cache design. Your reply to this was a copy and paste of an Intel PDF, with no written message of your own, just a highlighted area suggesting to me that you think only 8KB lies in the L1 cache. This showed me that you either disregarded the article I gave you entirely, or obviously failed to understand what it was saying, and instead defaulted to your previous (incorrect) conclusion. This is what really angers me: I give you the key to the answer, only requiring a little bit of reading on your part, and I STILL end up having to spoon-feed this information in the end. I don't get paid to hand out this information, and I don't think I'm asking too much of you to read a little bit to find your answer. Of all people, I would think one who has an avatar slogan of "Google it before you ask." would understand and appreciate a little footwork, especially with the bravo "LMGTFY" cop-out you performed further up the thread. Is the issue now crystal clear to you? If not, feel free to PM me. I would now like to move on with any further questions Mr. cchampio has about this topic so we can continue this discussion in a more positive manner if possible.
 
I did read your article, or the relevant section about the cache anyway, as I have read many other articles. And it was enlightening, so thank you for that. The PDF was to see how pissed you would get, seeing as while you are here to share your knowledge, you feel the way to do that is to have Dr. Cox say wrong wrong wrong... As for my first post, I pointed out that each core was different so he wouldn't assume all P4s are the same, and pointed him to where he could do his homework and learn.
 
I have still been reading on this, and it sounds like the safest and most correct way to state the size of the L1 ETC is to simply say it holds ~12K micro-ops. Read these articles:

http://www.tomshardware.com/reviews/intel,264-6.html
http://www.anandtech.com/printarticle.aspx?i=1301

Anyway, I'm starting to highly doubt this will even be on my A+ exam, so I'll move on to other topics. Thanks again for the help everyone!
 
