Originally Posted By: JHZR2
Right, but it all propagates, right? If my processing system is capable of some multiplier of the FSB and ends up having to idle for that many cycles, then data throughput cuts computing speed down to 20% of the nameplate figure, due to waiting cycles. It's not how fast you can perform a calculation, it's how long you have to wait to be fed...
But that's my limited insight on this. May be way off...
Remember, down at the level of electrons it is all about physics. You can have really, really fast RAM, but it has to be small, because if you make it too big, the long wiring will slow it down. It is always a trade-off between size and speed, and between latency and bandwidth. These design decisions are all made based on the manufacturing process, not the other way around (because if your design can't be manufactured, it is useless).
To make you feel better, most of the work a CPU does involves data that was 1) just recently used, or 2) sitting near what you are currently processing. So caching what you just used, along with a chunk of what's nearby, buys you a lot of speed at very low cost. Also, the reason the small, fast memory used for cache is so much faster than the large, slow memory used for main memory (DDR3 in our case), or the super slow and super large memory used for storage (NAND in an SSD or SD card), comes down to their manufacturing processes and design differences.
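Here's a quick sketch of what that "nearby chunk" buys you (my own illustration, not anything from the post above, and the 4096x4096 size is just picked to be way bigger than any cache): the same 128 MB matrix gets summed twice, once walking along rows so consecutive addresses keep hitting the cache line that was just fetched, and once walking down columns so almost every access is a fresh miss. Exact times depend on your machine, but the row order is typically several times faster.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N 4096   /* 4096 x 4096 doubles = 128 MB, far bigger than any cache */

int main(void)
{
    double *m = malloc((size_t)N * N * sizeof *m);
    if (!m)
        return 1;
    for (size_t i = 0; i < (size_t)N * N; i++)
        m[i] = 1.0;

    double sum = 0.0;
    clock_t t0;

    /* Row order: consecutive addresses, so every byte of each cache line
     * the hardware pulls in actually gets used before moving on. */
    t0 = clock();
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            sum += m[i * N + j];
    printf("row order:    %.2f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

    /* Column order: each access jumps N * 8 bytes ahead, so nearly every
     * access is a fresh miss that has to wait on main memory. */
    t0 = clock();
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            sum += m[i * N + j];
    printf("column order: %.2f s\n", (double)(clock() - t0) / CLOCKS_PER_SEC);

    free(m);
    return sum > 0 ? 0 : 1;   /* use sum so the loops aren't optimized away */
}
```

Compile it with something like gcc -O2 and compare the two times yourself.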
Fast memory like cache is usually SRAM: transistors wired up in a feedback loop, so each cell is bulky but fast. Slower memory like DRAM stores each bit as charge on a capacitor that shares its charge onto a bit line for a sense circuit to read, so every read and write has to charge or discharge capacitors. But you only need one transistor and a capacitor per bit instead of the six transistors of a typical SRAM cell (the feedback loop itself takes four, plus two access transistors), so you save a lot of space. This is why you cannot just use one type alone, and tuning the ratio between them is what makes the difference in today's systems.
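To put rough numbers on that SRAM-vs-DRAM gap, here's another little sketch (again my own, with assumed working-set sizes; actual numbers vary by machine): it chases pointers through a randomly ordered list so the prefetcher can't help, and each load has to wait for the previous one. While the list fits in cache, each hop is a few nanoseconds; once it spills out to DRAM, each hop waits on a full memory access and the time per hop jumps by an order of magnitude or more.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Average time per dependent load when chasing a random single-cycle
 * permutation of n indices. Random order defeats the prefetcher, and
 * each load depends on the previous one, so this measures raw latency. */
static double ns_per_hop(size_t n)
{
    size_t *next = malloc(n * sizeof *next);
    if (!next)
        return 0.0;
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    /* Sattolo's shuffle: guarantees one big cycle through all n slots. */
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;
        size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
    }

    const size_t hops = 10 * 1000 * 1000;
    size_t p = 0;
    clock_t t0 = clock();
    for (size_t i = 0; i < hops; i++)
        p = next[p];                    /* each hop waits on the last load */
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    free(next);
    return p > n ? 0.0 : secs * 1e9 / hops;   /* p > n never happens; using
                                                 p just keeps the loop alive */
}

int main(void)
{
    srand(1);
    /* Working sets from 32 KB (fits in L1 cache) up to 128 MB (DRAM only). */
    for (size_t kb = 32; kb <= 128 * 1024; kb *= 8)
        printf("%8zu KB working set: %6.1f ns per access\n",
               kb, ns_per_hop(kb * 1024 / sizeof(size_t)));
    return 0;
}
```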