CACHE WORKING AND ITS PERFORMANCE
What is Cache?
Cache is a small, fast memory, usually static RAM (SRAM), that is either incorporated inside the CPU or placed on a separate chip. It stores data that running programs use frequently, which speeds up data access and improves overall system performance.
The benefit of using cache is that the CPU does not need to use the motherboard's system bus to transfer the data. Whenever the CPU has to go over the system bus to access data, the transfer is limited to the bus speed, which slows execution down; using cache avoids this bottleneck.
Cache is so effective that a fast CPU with little cache support can deliver lower system performance than a slower CPU with more cache support. Cache can be L1 (incorporated inside the CPU) or L2 (placed on a separate chip); some systems contain both L1 and L2 cache.
How does Cache increase the CPU Performance?
The improvement in CPU performance from cache can be explained by the concept of "Locality of Reference": at any given time, the processor tends to access data from a particular region of memory. That block of memory is kept in the cache for high-speed access, which increases performance.
I can further explain this by using the book analogy:
- Lots of books on my shelf: this is like the main memory.
- A few books on my desk: this is like the L2 cache.
- One book that I'm reading: this is like the L1 cache.
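To make the idea concrete, here is a minimal C sketch (illustrative only; the array size is an assumption): the outer loop repeatedly walks the same small array, so after the first pass the entire working set sits in the cache and later passes are served at cache speed.

#include <stdio.h>

#define N 1024  /* working set: 1024 ints = 4 KB, small enough for a typical L1 cache */

int main(void) {
    static int data[N];
    long sum = 0;

    /* Pass 0: compulsory misses load the array into the cache.         */
    /* Later passes reuse the same block of memory (temporal locality), */
    /* so the CPU reads it from cache instead of main memory.           */
    for (int pass = 0; pass < 1000; pass++) {
        for (int i = 0; i < N; i++) {
            sum += data[i];   /* sequential walk: spatial locality too */
        }
    }
    printf("sum = %ld\n", sum);
    return 0;
}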
Hits and Misses in the Cache
When the data the CPU tries to fetch is found in the cache, it is called a cache hit; when it is not found, it is called a cache miss. A cache hit delivers the data to the CPU at high speed. On a cache miss, however, the CPU has to access the main memory to fetch the required data and load it into the cache, which delays instruction execution and lowers performance.
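A rough model of this lookup (a sketch under assumed structure, not the paper's design): each cache line stores a tag identifying which memory block it holds; the requested block's tag is compared against the stored one, and a mismatch is a miss that triggers a load from main memory.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* One cache line: the tag of the block it holds, plus a valid bit. */
struct line { uint32_t tag; bool valid; };

/* Returns true on a hit; on a miss, "fetches" the block from main
   memory by recording its tag in the line. */
static bool access_line(struct line *l, uint32_t tag) {
    if (l->valid && l->tag == tag)
        return true;        /* cache hit: data served at cache speed */
    l->tag = tag;           /* cache miss: load block from main memory */
    l->valid = true;
    return false;
}

int main(void) {
    struct line l = { 0, false };
    printf("%s\n", access_line(&l, 7) ? "hit" : "miss");  /* miss (cold cache) */
    printf("%s\n", access_line(&l, 7) ? "hit" : "miss");  /* hit */
    return 0;
}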
Cache Mapping
There are various ways in which main memory is mapped onto the cache:
- Direct Mapping
- Fully Associative Mapping
- Set Associative Mapping
Direct Mapping:
Each memory block is mapped to exactly one cache location. The cache location is decided by the following formula:
Cache Location = (block address) MOD (# of blocks in the cache)
Direct mapping is the fastest mapping scheme, since each block can live in only one cache line and no searching is needed; however, it utilizes the cache least effectively, as some cache lines may be left unused. There may be a cache miss even when the cache is not full of active lines, because two blocks that map to the same line evict each other.
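The formula can be sketched directly in C (the 8-block cache size is an assumption for illustration). Note how blocks 3 and 11 collide on the same line, which is exactly why a direct-mapped cache can miss while other lines sit empty.

#include <stdio.h>

#define CACHE_BLOCKS 8   /* assumed cache size for illustration */

/* Direct mapping: each block address maps to exactly one cache line. */
static unsigned direct_map(unsigned block_address) {
    return block_address % CACHE_BLOCKS;  /* (block address) MOD (# of blocks) */
}

int main(void) {
    printf("block 3  -> line %u\n", direct_map(3));   /* line 3 */
    printf("block 11 -> line %u\n", direct_map(11));  /* line 3 again: collision */
    return 0;
}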
Fully Associative Mapping:
Each memory block can be placed in any cache location. Here, the cache is fully utilized, as no cache line is left unused, but at the expense of speed: finding the line that contains a particular memory block is time consuming, because the whole cache must be searched for that block.
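A minimal sketch of that search cost (the line count is an assumption; the linear scan stands in for the parallel tag comparison real hardware performs):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LINES 8   /* assumed number of cache lines */

struct line { uint32_t tag; bool valid; };

/* Fully associative lookup: the block may sit in any line, so every
   stored tag must be compared against the requested one. */
static int find_line(const struct line cache[], uint32_t tag) {
    for (int i = 0; i < LINES; i++)
        if (cache[i].valid && cache[i].tag == tag)
            return i;   /* hit: block found in line i */
    return -1;          /* miss: the whole cache was scanned */
}

int main(void) {
    struct line cache[LINES] = { { 0, false } };
    cache[5] = (struct line){ .tag = 42, .valid = true };
    printf("tag 42 -> line %d\n", find_line(cache, 42));  /* 5 */
    printf("tag 7  -> line %d\n", find_line(cache, 7));   /* -1 (miss) */
    return 0;
}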
Set Associative Mapping:
Each memory block is mapped to a subset of cache locations, called a set. The set is decided by the following formula:
Set Selection = (block address) MOD (# of sets in the cache)
This is a compromise between direct mapping and fully associative mapping. The cache is divided into sets of lines: the set number is direct mapped from the memory address, and the block may occupy any line within that set, located by comparing tags.
In set-associative mapping, when the number of lines per set is n, the mapping is called n-way set associative.
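As a sketch (the 4-set, 2-way geometry is an assumption for illustration), a 2-way set-associative lookup selects a set with the MOD formula above and then searches only the n lines of that set:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SETS 4   /* assumed: 8 lines organized as 4 sets... */
#define WAYS 2   /* ...of 2 lines each (2-way set associative) */

struct line { uint32_t tag; bool valid; };

static bool lookup(struct line cache[SETS][WAYS], uint32_t block_address) {
    unsigned set = block_address % SETS;   /* Set Selection = (block address) MOD (# of sets) */
    uint32_t tag = block_address / SETS;   /* remaining address bits form the tag */
    for (int w = 0; w < WAYS; w++)         /* search only this set's n lines */
        if (cache[set][w].valid && cache[set][w].tag == tag)
            return true;
    return false;
}

int main(void) {
    struct line cache[SETS][WAYS] = { { { 0, false } } };
    cache[3][1] = (struct line){ .tag = 11 / SETS, .valid = true };  /* block 11 lives in set 3 */
    printf("block 11 -> %s\n", lookup(cache, 11) ? "hit" : "miss");  /* hit */
    printf("block 7  -> %s\n", lookup(cache, 7)  ? "hit" : "miss");  /* miss */
    return 0;
}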
How do we use the memory address to find the block?
We take the following example to explain:
- Main memory: 16-bit addresses, so 2^16 bytes = 64 KB
- Cache memory: 8 blocks
- Block size: 4 words = 16 bytes, so the cache holds 8 × 16 = 128 bytes
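With these figures (and assuming direct mapping for concreteness), the 16-bit address splits into three fields: a 4-bit byte offset (16-byte block = 2^4), a 3-bit block index (8 blocks = 2^3), and a 9-bit tag (16 - 4 - 3). The sketch below extracts the fields with shifts and masks; the example address is arbitrary.

#include <stdint.h>
#include <stdio.h>

/* Field widths derived from the example above:
   16-byte blocks -> 4 offset bits; 8 cache blocks -> 3 index bits;
   the remaining 16 - 3 - 4 = 9 bits are the tag. */
#define OFFSET_BITS 4
#define INDEX_BITS  3

int main(void) {
    uint16_t addr = 0xABCD;  /* arbitrary 16-bit example address */
    unsigned offset = addr & ((1u << OFFSET_BITS) - 1);                  /* byte within the block */
    unsigned index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1);  /* which cache line */
    unsigned tag    = addr >> (OFFSET_BITS + INDEX_BITS);                /* compared on lookup */
    printf("addr 0x%04X -> tag 0x%03X, index %u, offset %u\n",
           addr, tag, index, offset);
    return 0;
}

On a lookup, the index selects the cache line, the stored tag is compared with the address tag to decide hit or miss, and the offset picks the byte within the block.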