Intel and AMD L3 Cache Gaming Benchmarks – Does L3 Matter for Gaming?
I have something pretty interesting for today’s writeup. In fact, I
don’t think anyone has attempted to quantify this particular aspect of
processors before, so we will be treading on largely uncharted
territory. As with most of the unorthodox hardware content we publish,
this one was sourced from DG Lee, someone that pretty much everyone in the PC hardware community knows by now.
Level 3 cache on modern Intel and AMD CPUs boosts gaming performance by up to ~10%
Modern processors have three main cache levels: Cache level 1, Cache level 2 and Cache level 3 (there is an L4 cache too, but let’s not get into that just now). The short forms of these (as you will undoubtedly know) are L1, L2 and L3. However, while L1 and L2 caches are dedicated per core and are somewhat closed off in nature, the L3 cache is the general pool of memory that all cores share. Every core inside a modern multi-core processor has its own L1 and L2 cache, but there is only one L3 per (entire) die. In terms of speed, L1 is conventionally the fastest, with L2 slower and L3 slower still. However, in recent times, the speed difference between the levels has narrowed as the industry shifts to a more unified-style architecture. In some cases, the L3 cache can even be utilized by an integrated GPU (case in point: Intel). An illustration of Haswell’s die layout is attached below:
The question then arises: why don’t we simply use a big enough L1 cache for all cores in the first place? Or only a fast enough L3 cache for all cores? The answer to that question lies in the delicate balance that the cache levels implement. The more tech savvy of our readers will realize that I am of course talking about the tradeoff between cache latency and hit rate. If you create a very large L1 cache, then firstly, you would be wasting precious die space, since very few applications need that kind of speed, and secondly, the size itself would raise access latency, because a bigger cache takes longer to search. The L3 cache is one example of size affecting latency: specialized algorithms make sure that cores use the portion of the L3 closest to them to optimize performance. This is why modern processors implement a very small but very fast L1 cache, a slightly bigger but slower L2 cache and a big but slow L3 cache. Some processors now include eDRAM, which is basically an L4 cache of an even larger size. Anyway, enough of that, let’s get down to the nitty-gritty of the benchmarks themselves.
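As a brief aside before the slides, the latency versus hit rate tradeoff described above can be put into rough numbers with the textbook "average memory access time" formula. The sketch below is mine, not DG Lee's, and every latency and hit rate in it is a made-up illustrative value rather than a measurement of any real Intel or AMD chip. The function name `average_access_time` is likewise just a name I picked.

```python
def average_access_time(levels, memory_latency):
    """Average cycles per memory access for a cache hierarchy.

    levels: list of (lookup_latency_cycles, hit_rate) tuples, ordered from
            the cache closest to the core outward (L1, L2, L3, ...).
    memory_latency: cycles to fetch from main memory after every level misses.
    """
    total = 0.0
    reach_probability = 1.0  # fraction of accesses that get as far as this level
    for latency, hit_rate in levels:
        # Every access that reaches this level pays its lookup latency.
        total += reach_probability * latency
        # Only the misses continue on to the next level.
        reach_probability *= 1.0 - hit_rate
    # Whatever missed everywhere pays the trip to main memory.
    total += reach_probability * memory_latency
    return total


# Hypothetical numbers, chosen only to illustrate the shape of the tradeoff.
MEMORY_LATENCY = 200  # cycles, assumed

# Small-and-fast L1, medium L2, big-and-slow L3.
three_level = [(4, 0.90), (12, 0.70), (40, 0.60)]

# One huge cache: excellent hit rate, but every single access pays L3-like latency.
one_big_cache = [(40, 0.99)]

hierarchy_cycles = average_access_time(three_level, MEMORY_LATENCY)
flat_cycles = average_access_time(one_big_cache, MEMORY_LATENCY)

print(f"three-level hierarchy: {hierarchy_cycles:.1f} cycles per access")
print(f"single large cache:    {flat_cycles:.1f} cycles per access")
```

With these assumed values the layered design comes out well ahead, because most accesses are served by the tiny, fast L1 and never pay the slower lookup. Change the numbers and the gap changes, but the reasoning stays the same.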
[The slides are courtesy of DG Lee.] As you can see, going up from “2MB L3” to “8MB L3” results in a boost of almost 10%, depending on how CPU-bound the scenario is.
In the first slide, where the resolution is low and the primary
bottleneck is the processor, going up the L3 sizes raises performance by
~10%, whereas at 1080p it raises performance by ~8%. This allows us to
predict a trend: I would be willing to bet that this margin would be very low at 4K resolution and quite high on multi-GPU configurations. Up next we have the AMD slides:
Once again we see a similar trend going up from “No L3” to “8MB L3”.
The scaling is pretty similar, with the only exception being that the
scale here starts from No L3 instead of 2MB L3. It is worth pointing out
at this stage that AMD’s Steamroller architecture has a significant
difference in cache layout. Where each Intel core has its own private L1 and L2, the two AMD cores in one module share the L1 instruction cache and the L2 cache between themselves.
This accounts for why the scale is slightly different, relatively
speaking, amongst different AMD CPUs. To those of you who are wondering,
yes, DG Lee accounted for the difference in processor cores, clock
speeds, etc. and mentions them in great detail in his original piece (which I would suggest reading if you can stomach the linguistic mess that is Google Translate).
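My earlier bet about 4K and multi-GPU setups comes from a very simple bottleneck picture, which can be sketched in a few lines. To be clear, this is a toy model of my own and not something from DG Lee's data: it assumes a frame takes as long as the slower of the CPU and the GPU, and that a bigger L3 shaves an assumed 10% off the CPU's share of the work. The frame times are hypothetical, and real games overlap and blend these costs far less cleanly.

```python
def fps(cpu_ms, gpu_ms):
    """Frames per second if each frame waits on the slower of CPU and GPU."""
    return 1000.0 / max(cpu_ms, gpu_ms)


def l3_gain_percent(cpu_ms, gpu_ms, cpu_time_saved=0.10):
    """Percent FPS gain when a larger L3 cuts CPU frame time by cpu_time_saved.

    cpu_time_saved is an assumed fraction (0.10 means 10% less CPU time per
    frame); it is not a measured figure.
    """
    before = fps(cpu_ms, gpu_ms)
    after = fps(cpu_ms * (1.0 - cpu_time_saved), gpu_ms)
    return (after / before - 1.0) * 100.0


# Hypothetical per-frame costs in milliseconds: (CPU, GPU).
scenarios = {
    "low resolution (CPU-bound)": (10.0, 5.0),
    "1080p (close to balanced)": (10.0, 9.5),
    "4K (GPU-bound)": (10.0, 20.0),
}

for name, (cpu_ms, gpu_ms) in scenarios.items():
    print(f"{name}: {l3_gain_percent(cpu_ms, gpu_ms):.1f}% FPS gain")
```

In this model the extra cache only shows up while the CPU is the slower component. Once the GPU becomes the limit, as I would expect at 4K, the gain drops to nothing, while adding more GPU power pushes the bottleneck back onto the CPU.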