Cache Disabling...How?


1Tanker

Splendid
Apr 28, 2006
4,645
1
22,780
Yeah, I ran across the article with some simple Google searching. It did a decent job describing how cache works, but I am not certain about the laser method; it seems labor- and cost-intensive.

Though I would not necessarily rule it out until I have had more time to read up on the current literature.

Jack
I could have Googled it too, but I figured... hey, this is a CPU forum, and it could generate some interesting posts. I felt it was worthwhile to create a thread. It's not because I was lazy. :wink: :)
 

1Tanker

Splendid
Apr 28, 2006
4,645
1
22,780
Yeah, I ran across the article with some simple Google searching. It did a decent job describing how cache works, but I am not certain about the laser method; it seems labor- and cost-intensive.

Though I would not necessarily rule it out until I have had more time to read up on the current literature.

Jack
I could have Googled it too, but I figured... hey, this is a CPU forum, and it could generate some interesting posts. I felt it was worthwhile to create a thread. It's not because I was lazy. :wink: :)

No, it was a good topic... While I know it is possible, and it is common practice, I cannot honestly say exactly how they do it. My curiosity is now piqued, and when I find out I can post it back, unless someone beats me to the punch :) ...

Jack
Thanks. I have to wonder now whether a "native" 2MB cache Allendale will show any difference with the L2 being increased to 16-way associativity. Somehow, I doubt it will make much difference, seeing how a Conroe clocked at Allendale speeds isn't showing huge increases (and I would think doubling the cache would make a much bigger difference than doubling the associativity). Sorry for steering the thread off course a little. :?
 

joset

Distinguished
Dec 18, 2005
890
0
18,980
Here is another paper that discusses cache faults and how they are handled. If you read AMD's BIOS programming guide, you can also see where the cache size is set by special Model-Specific Registers (MSRs). Redundant cache is likely implemented: say one targets a 2 MB cache design; you physically build 2.5 MB, and then at startup faulty blocks can be marked as bad and not used.

AMD's BIOS Guide: http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF

ftp://ftp.cs.wisc.edu/markhill/Papers/toc93_faults_original.pdf

A manufacturing defect causes a fault in a cache if it impairs the correct operation of the cache. We will study those faults that make a bit in the cache unable to retain the value written to it, but that do not otherwise perturb the operation of the cache (e.g., do not cause an electrical short circuit). A fault causes an error if it causes the system to enter a logical state other than the one intended. We can prevent faults in an on-chip cache from causing errors by (1) discarding chips with such faults, (2) using redundant memory, or (3) disabling cache blocks that contain faults. The advantage of discarding chips, method (1), is that it works for any defect. Its disadvantage, however, is that by reducing yield it increases chip cost.

Those who are older may remember that hard drives initially had to have a low-level format, usually accomplished by a routine programmed in the BIOS. Before partitioning and high-level formatting, one would low-level format the HD, during which bad sectors would be found and marked in the allocation table for the HD. As HD quality got better, low-level formatting became a non-issue and is likely done at the factory. Nonetheless, I am postulating, with limited information, that cache initializes and functions similarly.

That is, upon startup the BIOS bootstrap code initializes the cache memory and tests the caches up to the actual size needed or specified by the MSRs. As the processor initializes, if a block is found bad or faulty, it is simply marked as such and never used. The cache continues its memory check through all the physically available SRAM until the specified amount of good cache is found (specified by the configuration set in the MSRs).

So say Intel has a 4 MB part, but sets up the MSRs to identify it as a 2 MB part. The BIOS initialization then tests and validates good cache up to 2 MB, then quits once 2 MB of good cache are allocated. This would make the most sense. Of the 2 megabits of data that could be bad, there is no way to physically pin out hard logic in the package to account for every word line and bit line.
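To make the idea concrete, here is a minimal Python sketch of the scan described above: walk the physical blocks in order, skip the faulty ones, and stop as soon as the configured amount of good cache has been found. This only models the poster's guess; it is not confirmed vendor behavior. The function name, the block granularity, and the list of pass/fail results are all hypothetical.

```python
def allocate_good_blocks(block_ok, target_blocks):
    """Scan physical blocks in order until `target_blocks` good ones are found.

    block_ok      -- list of booleans, one per physical block (True = passed test)
    target_blocks -- number of good blocks the part is configured to expose
    Returns (usable, disabled): indices put into service and indices marked bad.
    """
    usable, disabled = [], []
    for index, ok in enumerate(block_ok):
        if len(usable) == target_blocks:
            break  # enough good cache found; remaining blocks stay unused
        if ok:
            usable.append(index)
        else:
            disabled.append(index)  # marked bad and never used
    if len(usable) < target_blocks:
        raise ValueError("not enough good blocks to reach the target size")
    return usable, disabled


# Hypothetical part: 8 physical blocks sold as a 4-block part; blocks 1 and 4 are faulty.
physical = [True, False, True, True, False, True, True, True]
usable, disabled = allocate_good_blocks(physical, target_blocks=4)
print("usable:", usable, "disabled:", disabled)
```

In this toy run the scan never looks at the last two blocks, which matches the "quits once enough good cache is allocated" behavior guessed at above.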

I am still reading and learning so I can confidently bring back to the forum the methods employed to accomplish such a task. This is a detail that I do not have a great deal of knowledge on.

Jack

The way I see it, this is perhaps one of the most important issues in chip manufacturing, next to process & litho; it influences all aspects of manufacturing, from sand-to-chip/yield, literally.
I'm still reading & searching (good AMD paper, btw) in order to get a glimpse of a sequential order for logic/cache error detection/correction techniques, at both the software & hardware levels. So far, I've only found out that everything, from wafer quality grade through precision defect-detection tools to Design For Testing (DFT) procedures & [on-chip] Logic/Memory Built-In Self Test (BIST http://www.mentor.com/products/dft/memorytest/index.cfm), contributes to initial, mid-process & final binning decisions, i.e., to the means by which wafer batches are selected to provide the best end results: higher yields.
For instance, even the kinds of defects are taken into account at the process level & graded according to their "severity", some being considered "normal" (sort of inoffensive). I've also found out that cache redundancy & ECC are just some of the ways of dealing with defective patterning, and that Model-Specific Registers (MSRs) can "store" defective cache bit pointers within the core logic (not in the cache itself). Also interesting is the use of small portions of flash memory (see my previous post above), for a reason (of course!): MSRs behave as ROM as long as only warm resets are performed; once a cold reset of the processor is made, MSRs are erased and so are the defective cache bit pointers (perhaps this is not the most adequate wording; anyway...). As is known, flash memory retains its content even when powered off: a 'perfect' ROM for storing/maintaining the relevant info on the defective cache coordinates. (I wonder how this might work with shared cache, since both logic cores can address the whole cache blocks... maybe defect management has just evolved enough... but not that much, in order to link both L1 caches... I'm just guessing.)

I'll try to collect, understand & post in some order, all the relevant data I can find on the subject, as it seems so darn interesting from every angle I look at it.

@1Tanker: I know you've addressed this issue before; I just missed it. Well, seems that you've got me hooked, this time around! :wink:

(Just a few illustrative links):

http://www.freescale.com/files/technology_publications/doc/Papers/Eintell5170ARTICLE.pdf - On typical litho defectivity

http://www.nikonprecision.com/immersion/media/Immersion_Defectivity_D6996.pdf - on Nikon's Immersion Litho

http://www.sematech.org/meetings/archives/other/20001030/08_Test_Screen_Shirley.pdf - elements of defect management

http://userpages.umbc.edu/~abhishek/link_docs/defect_based_test.pdf - Intel Automatic Test Pattern Generation (ATPG)


Cheers
 

joset

Distinguished
Dec 18, 2005
890
0
18,980
...or not as I already do have a fair amount of clue how Intel does it and it's pretty extreme.

Could you elaborate on that, please? I know it's pretty extreme, but it's also darn decisive at all process levels. After all, it's not what goes right that defines success (in general) but, rather, what goes wrong.
Thanks.


Cheers!
 

joset

Distinguished
Dec 18, 2005
890
0
18,980
o_O wouldn't that necessitate a clean room?

It's done through all the chip manufacturing stages (hard & soft error checking, testing, debugging, etc.); very high quality Si wafers are a prime requirement, in order to reduce cumulative defects throughout the process. So, yes, you're right: a high-level (JKflipflop?!) clean room is a must.


Cheers!
 

m25

Distinguished
May 23, 2006
2,363
0
19,780
:lol: :lol: :lol:
Talking about clear rooms, I'd like to share the secret of keeping mine clean:
about 6 days/week the whole surface of it is THE trash bin; I can throw everything on the floor. When I see the stuff I am walking on is getting too much, I wipe it off :wink:
 
:lol: :lol: :lol:
Talking about clear rooms, I'd like to share the secret of keeping mine clean:
about 6 days/week the whole surface of it is THE trash bin; I can throw everything on the floor. When I see the stuff I am walking on is getting too much, I wipe it off :wink:

How do you equate layers of dust, trash, and mold and god knows what else, with a "clear room"?

Are you saying you live outside and your room is not really clear so much as it's invisible?
 

shinigamiX

Distinguished
Jan 8, 2006
1,107
0
19,280
Cover the walls in your basement with maggots... They clean wounds very well, maybe they will clean a room very well too.....

:lol:

Ok. Away with the vacuum cleaners...


Cheers!
No, no, you'll need them for the maggots!
 

joset

Distinguished
Dec 18, 2005
890
0
18,980
:lol: :lol: :lol:
Talking about clear rooms, I'd like to share the secret of keeping mine clean:
about 6 days/week the whole surface of it is THE trash bin; I can throw everything on the floor. When I see the stuff I am walking on is getting too much, I wipe it off :wink:

:lol:

That's why Intel developed Copy Exactly!

As for AMD, they decided to do it twice a day (APM).


Cheers!
 

joset

Distinguished
Dec 18, 2005
890
0
18,980
Cover the walls in your basement with maggots... They clean wounds very well, maybe they will clean a room very well too.....

:lol:

Ok. Away with the vacuum cleaners...


Cheers!
No, no, you'll need them for the maggots!

:lol:

Ok then: Bring back the vacuum cleaners, on the double!


Cheers!
 

joset

Distinguished
Dec 18, 2005
890
0
18,980
Did you know the first clean room was created in Germany and the rooms were filled with non-oxygen gases? Most of the workers died...




They were first developed in 1941, I think.

:lol:

Heinrich Himmler & Adolf Eichmann (I do not know if the spelling is correct, though; and, it doesn't really matter...), sooner than that, I believe.
Speaking about clean rooms, huh?! :wink:


Cheers!
 

m25

Distinguished
May 23, 2006
2,363
0
19,780
:lol: :lol: :lol:
Talking about clear rooms, I'd like to share the secret of keeping mine clean:
about 6 days/week the whole surface of it is THE trash bin; I can throw everything on the floor. When I see the stuff I am walking on is getting too much, I wipe it off :wink:

How do you equate layers of dust, trash, and mold and god knows what else, with a "clear room"?

Are you saying you live outside and your room is not really clear so much as it's invisible?
I live inside it, only that most of the garbage is paper and pencil waste so it looks like you're walking in an autumn park :lol:
 

TabrisDarkPeace

Distinguished
Jan 11, 2006
1,378
0
19,280
They make the chips with more cache than they need, so if, or when, some of it fails (small amounts, due to minor faults nanometers in scale), it can just be disabled automatically by the chip, if not laser-cut to disable it.

It would work very much like RAID-5, except that it 'expects' a failure and just electronically uses the 256/260ths of the cache that work.
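A rough back-of-the-envelope check of why those few spare units matter, as a Python sketch. It assumes each unit fails independently with the same probability, which real defect distributions do not strictly follow, and the 0.5% per-unit fault rate is a made-up number for illustration only. The 256-of-260 split is taken from the figure above.

```python
from math import comb


def usable_probability(total_units, needed_units, p_fault):
    """Probability that at least `needed_units` of `total_units` are fault-free,
    assuming independent faults with per-unit probability `p_fault` (binomial model)."""
    max_faults = total_units - needed_units
    return sum(
        comb(total_units, k) * p_fault**k * (1 - p_fault) ** (total_units - k)
        for k in range(max_faults + 1)
    )


P_FAULT = 0.005  # hypothetical per-unit fault probability

no_spares = usable_probability(256, 256, P_FAULT)    # every unit must be good
with_spares = usable_probability(260, 256, P_FAULT)  # up to 4 faulty units tolerated

print(f"sellable without spares: {no_spares:.1%}")
print(f"sellable with 4 spares:  {with_spares:.1%}")
```

Under these assumptions, building only about 1.6% more cache turns a part that usually fails into one that usually passes, which is the whole point of 'expecting' a failure.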

This is one reason why transistor counts sometimes go down after a refresh, as production matures (e.g., 7800 GTX to 7900 GTX). It is not the 'sole' reason though, obviously.

Much like the IBM Cell processor, they disable entire SPEs.
IBM does not mass-manufacture CPUs with low yields; that is a total BS rumour and Sony is riding it like a pony.

 

TabrisDarkPeace

Distinguished
Jan 11, 2006
1,378
0
19,280
I have to wonder now whether a "native" 2MB cache Allendale will show any difference with the L2 being increased to 16-way associativity. Somehow, I doubt it will make much difference, seeing how a Conroe clocked at Allendale speeds isn't showing huge increases (and I would think doubling the cache would make a much bigger difference than doubling the associativity). Sorry for steering the thread off course a little. :?

Doubling set-associativity can, and often does, make more difference than just doubling the size of the cache while leaving the set-associativity unchanged.

Look at the differences in L1 caches in both size and set-associativity over the last 6 years.

It explains 'much', that most forum readers simply ignore.
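For anyone who wants to see the effect, here is a toy set-associative cache simulator in Python with LRU replacement. It is a sketch, not a model of any real CPU: the cache sizes, the modulo set-indexing, and the access pattern are all hypothetical. The trace is deliberately contrived, with three blocks at a power-of-two stride that all land in the same set.

```python
from collections import OrderedDict


def count_misses(trace, num_lines, ways):
    """Count misses for a trace of block addresses on a set-associative LRU cache."""
    num_sets = num_lines // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in trace:
        lines = sets[block % num_sets]     # simple modulo set indexing
        if block in lines:
            lines.move_to_end(block)       # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) == ways:
                lines.popitem(last=False)  # set full: evict least recently used
            lines[block] = True
    return misses


# Three blocks at stride 16, touched round-robin 100 times (300 accesses).
trace = [0, 16, 32] * 100

base = count_misses(trace, num_lines=8, ways=2)       # baseline: 4 sets, 2-way
bigger = count_misses(trace, num_lines=16, ways=2)    # double the size, same ways
more_ways = count_misses(trace, num_lines=8, ways=4)  # same size, double the ways

print("baseline:", base, "| 2x size:", bigger, "| 2x ways:", more_ways)
```

On this trace the bigger cache thrashes exactly like the baseline because all three blocks still collide in one 2-way set, while the 4-way cache only takes the three cold misses. A workload whose working set is simply too large, with no such conflicts, would favor the bigger cache instead, so which change wins depends on the access pattern.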
 

TabrisDarkPeace

Distinguished
Jan 11, 2006
1,378
0
19,280
I'm guessing disabling the cache is not a BIOS tweak?

Traditionally you could disable the entire L1, L2 (and maybe L3?) caches via the BIOS.

eg: For purposes of hard RAM pattern testing back in the good old days. (Wasn't that long ago either).

In some mainboard BIOSes you still can, if you use [Ctrl+F1] and [Ctrl+F6] to access extra options under various menus.

eg: Disable L1 (+L2) cache and you can reach 6+ GHz on various processors quite easily in CPU-Z / CPU-ID apps, but it'll take 15+ minutes to get to Windows XP desktop from power on. :roll:

It was not an option to perform 'partial' disables on the L1/L2 cache though (well not without a very custom BIOS, and I doubt anyone has ever bothered).