Locality principles
Remember the two locality principles in Memoria. Temporal locality and spatial locality.
Temporal Locality
Some elements just are accessed many times in time. This is an example of a temporal locality.
Spatial locality
Some elements are accessed close to each other, this is an idea of spatial locality. In modern architectures, the a line of cache is usually 64 bytes.
For example consider this snippet:
sum = 0;
for (i = 0; i < n; i++) sum += a[i];
return sum;
Sum is an example of temporal locality as the same memory location (or register) is accessed many times, and the access of the array a
is an example of spatial locality.
loops cycle through the same instructions, this is an example of temporal locality.
Other characteristics of cache
Cache structure
The structure of the cache can be described by mainly three parameters (usually powers of two).
- The numbers of sets (cache lines)
- The size of a single line.
- An associativity index. (The least recently used block is removed between the associated caches). Then parts of the double is used to map into the cache structure (row, column). And part of it used for the tag, and you need to store the tag with the values. Also a valid bit is stored usually.
Types of cache misses
- Cold (Compulsory) Miss: where there is no data in the cache, so it is impossible to have a hit.
- Capacity: The cache is just not big enough to fit the data of the program.
- Conflict: The data is badly organized, you evict some parts of the cache that you need to access again.
Usually a Hit costs about 4 cycles for L1, 12 for L2 while a miss on skylake L3 costs 200 cycles).
Writing policies

Image from the slides.
Roofline model
See paper. The roofline model is a way to represent the performance of a system. It is a model that represents the performance of a system in terms of the arithmetic intensity and the peak performance of the system.

Cache-Aware Roofline Model
The cache-aware roofline model is a way to represent the performance of a system in terms of the arithmetic intensity and the peak performance of the system, but also considering the cache hierarchy. Intel has chosen to implement this version of the Roofline. It is computed by a tool called Intel Advisor.