Content Delivery Networks

CDNs are intermediary servers that replicate read-intensive data so that user requests for it are served with better performance. A close relative of CDNs is edge computing (e.g. gaming stations), where much of the computation is done close to the user.

Types of CDNs

There are mainly three types of CDNs:

Highly distributed ones (e.g. Akamai).
Database-based ones.
Ad-hoc CDNs.

Advantages and disadvantages

The main reason we use CDNs is to lower latency: we bring the data closer to the user, so it has a much shorter distance to travel. Yet there are some disadvantages too:

They usually are not able to provide much storage.
They are difficult to manage and coordinate.

Cache Design

If you want to optimize for computer caches, see Cache Optimization.

The Reuse Distance

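The reuse distance of an element in an access sequence is the number of distinct other elements accessed between two consecutive accesses to that element. For classical LRU caches, this value is enough to compute the miss rate: an access hits only if its reuse distance is smaller than the cache size.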
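As an illustration, the following is a minimal sketch that computes reuse distances over a trace and derives the LRU miss rate from them. The function names `reuse_distances` and `lru_miss_rate` are hypothetical, and the list-based stack is chosen for readability, not speed.

```python
def reuse_distances(trace):
    """For each access, count the distinct other elements seen since the
    previous access to the same element (None on a first access)."""
    stack = []  # least recently used first, most recently used last
    distances = []
    for item in trace:
        if item in stack:
            idx = stack.index(item)
            # Everything after idx was accessed more recently than item.
            distances.append(len(stack) - 1 - idx)
            stack.pop(idx)
        else:
            distances.append(None)
        stack.append(item)
    return distances


def lru_miss_rate(trace, cache_size):
    """An access misses in an LRU cache of cache_size entries when it is a
    first access or its reuse distance is at least cache_size."""
    distances = reuse_distances(trace)
    misses = sum(1 for d in distances if d is None or d >= cache_size)
    return misses / len(trace)


if __name__ == "__main__":
    trace = list("abcabda")
    print(reuse_distances(trace))
    print(lru_miss_rate(trace, cache_size=3))
```

The same distance list can be reused to evaluate several cache sizes without replaying the trace.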

Increase Hit Ratio

If we increase the cache hit ratio, we get better performance. There are some standard ways to do so that are known in the literature:

Increase cache size
Design better algorithms

Lazy promotion and Quick Demotion

These are two design principles that have proven to work when designing efficient caches.

Lazy promotion: retain popular objects with minimal effort. The idea is to wait before upgrading an element to VIP member of the main queue, until a certain number of accesses, or an access pattern, has been recorded that allows it to move up. This has several benefits:

Improves throughput due to less computation.
Improves efficiency due to more information at eviction.
Reduces the frequency of recency updates, potentially improving performance.

Quick demotion: remove unpopular objects quickly, such as one-hit wonders.
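The two principles can be sketched with a small probationary queue in front of a main queue. This is a hypothetical toy design for illustration only, not a specific published algorithm; the class name `TwoQueueCache` and its parameters are assumptions.

```python
from collections import OrderedDict


class TwoQueueCache:
    """Toy cache: new keys enter a small probationary FIFO queue and are
    promoted to the main FIFO queue only if they were hit while on probation."""

    def __init__(self, probation_size, main_size):
        self.probation_size = probation_size
        self.main_size = main_size
        self.probation = OrderedDict()  # key -> hits while on probation
        self.main = OrderedDict()       # key -> None, in insertion order

    def access(self, key):
        """Return True on a hit, False on a miss."""
        if key in self.main:
            return True  # lazy promotion: a hit does not reorder anything
        if key in self.probation:
            self.probation[key] += 1  # only record the access
            return True
        if len(self.probation) >= self.probation_size:
            self._evict_from_probation()
        self.probation[key] = 0
        return False

    def _evict_from_probation(self):
        old_key, hits = self.probation.popitem(last=False)
        if hits > 0:
            # The promotion decision is made at eviction time.
            if len(self.main) >= self.main_size:
                self.main.popitem(last=False)
            self.main[old_key] = None
        # Otherwise the key is dropped: quick demotion of a one-hit wonder.


if __name__ == "__main__":
    cache = TwoQueueCache(probation_size=2, main_size=2)
    print([cache.access(k) for k in "aabcdab"])
```

Keeping the probationary queue small means a one-hit wonder occupies space only briefly, while hits cost a single counter update.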

Sieve Caching

SIEVE considers both how frequently an item has been accessed (frequency) and how recently it was accessed (recency). However, it adds an interesting layer: it tries to actively identify and remove items that are likely to be accessed only once or a very limited number of times. These are often referred to as “one-hit wonders” or “transient” items.

The following figure represents the algorithm in short:

[Figure: Example of the algorithm]
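A minimal sketch of SIEVE as it is commonly described is given here: a FIFO queue, one visited bit per object, and a hand that moves from the oldest entry toward the newest. The class and method names are hypothetical, and values are omitted so that only the eviction logic is shown.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.visited = False
        self.prev = None  # neighbor toward the head (newer)
        self.next = None  # neighbor toward the tail (older)


class SieveCache:
    def __init__(self, capacity):
        self.capacity = capacity  # assumed to be at least 1
        self.nodes = {}
        self.head = None  # newest object
        self.tail = None  # oldest object
        self.hand = None  # where the next eviction scan resumes

    def access(self, key):
        """Return True on a hit, False on a miss."""
        node = self.nodes.get(key)
        if node is not None:
            node.visited = True  # a hit only sets a bit, nothing moves
            return True
        if len(self.nodes) >= self.capacity:
            self._evict()
        self._insert_at_head(Node(key))
        return False

    def _insert_at_head(self, node):
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        if self.tail is None:
            self.tail = node
        self.nodes[node.key] = node

    def _evict(self):
        node = self.hand if self.hand is not None else self.tail
        while node.visited:
            node.visited = False  # give the object a second chance
            node = node.prev if node.prev is not None else self.tail
        self.hand = node.prev  # None means: restart from the tail next time
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        del self.nodes[node.key]


if __name__ == "__main__":
    cache = SieveCache(capacity=3)
    print([cache.access(k) for k in "abcadb"])
    print(sorted(cache.nodes))
```

Because surviving objects stay in place and new objects enter at the head, recently inserted objects that are never visited again are the first ones the hand removes.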

Load Balancing

When we have lots of requests, we want to find an optimal way to split them among the available resources. Key-value stores are another example of automatic load balancing (see Cloud Storage#Key-value stores).

Global Load Balancing

Avoid TCP setup delay (latency).
Use of DNS caching. Akamai uses a kind of iterative DNS discovery (see Livello applicazione e socket#Domain Name System). In this kind of resolution, you make a request to a DNS server, and it tells you which other name server to contact to get the IP. Once you have the IP, you can make the HTTP request to the Akamai server for the resource. This server contacts the content provider if it has no copy available; otherwise it serves the copy it already has.
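The flow above can be simulated in a few lines. This is a toy sketch: the name servers, the hostname, the IP address (taken from a documentation-only range) and the `EdgeServer` class are all hypothetical, and no real DNS or HTTP traffic is involved.

```python
# Hypothetical name servers: each answers with either a referral to another
# name server ("NS") or a final address ("A").
NAME_SERVERS = {
    "root": {"www.example.com": ("NS", "cdn-top")},
    "cdn-top": {"www.example.com": ("NS", "cdn-regional")},
    "cdn-regional": {"www.example.com": ("A", "203.0.113.10")},
}


def resolve_iteratively(hostname, start="root", max_hops=10):
    """Follow referrals from one name server to the next until an address is
    returned. Returns the address and the list of servers contacted."""
    server = start
    contacted = []
    for _ in range(max_hops):
        contacted.append(server)
        kind, value = NAME_SERVERS[server][hostname]
        if kind == "A":
            return value, contacted
        server = value  # referral: ask the indicated name server next
    raise RuntimeError("too many referrals")


class EdgeServer:
    """Serves a resource from its cache, fetching it from the content
    provider (here a plain dict) only when no copy is available."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}
        self.origin_fetches = 0

    def get(self, path):
        if path not in self.cache:
            self.cache[path] = self.origin[path]
            self.origin_fetches += 1
        return self.cache[path]


if __name__ == "__main__":
    ip, contacted = resolve_iteratively("www.example.com")
    print(ip, contacted)
    edge = EdgeServer(origin={"/logo.png": b"image-bytes"})
    edge.get("/logo.png")
    edge.get("/logo.png")
    print(edge.origin_fetches)
```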

Anycast

With Anycast, we have a group of servers that share the same IP address. When a user makes a request, the routers deliver it to the closest server (closest in routing terms, which is usually also geographically close).

Advantages

No extra round trips
Route to nearby server

Disadvantages

Does not consider network or server load (e.g. a server in one region could have a higher load than another server; this method does no balancing based on load)
Different packets may go to different servers
Used only for simple request-response apps

It uses consistent hashing (see Cloud Storage#Key-value stores) to know exactly which server to contact.
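A minimal sketch of a consistent hashing ring follows, to show how a key maps to a server. The class name `HashRing`, the choice of SHA-256 and the `replicas` parameter (virtual nodes per server) are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, replicas=100):
        # Each server is placed on the ring at several points (virtual nodes)
        # so that keys spread more evenly.
        self.ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self.points = [point for point, _ in self.ring]

    def server_for(self, key):
        """Return the server owning the first ring point after the key's hash,
        wrapping around at the end of the ring."""
        idx = bisect.bisect_right(self.points, _hash(key)) % len(self.points)
        return self.ring[idx][1]


if __name__ == "__main__":
    ring = HashRing(["server-a", "server-b", "server-c"])
    print(ring.server_for("/videos/cat.mp4"))
```

When a server is removed, only the keys it owned move to another server; all other keys keep their assignment, which is what makes the scheme attractive for caches.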
