Introduction to cluster management
How can we allocate the resources in a cluster in an efficient manner? How can we allocate resources fairly?
Two step allocations 🟨++
Allocation happens in two steps: first you allocate resources to a process, then you place the process physically on machines in the cluster.
Private and public cluster management 🟥++
Cluster management could be private or public.
Private means every app manages its own sub-cluster: each app receives a private, static set of resources. Here it is easier to manage hardware for various needs. Public means there is a big cluster, like standard third party
Desiderata for Cluster Management 🟨–
- Fairness
- Users should be granted resources proportional to how much they are paying
- Flexibility to express priorities
- Efficient resource usage
- Work conservation: Resources should not idle while there are users whose demand is not fully satisfied
- Performance isolation
- Guarantee that misbehaving co-located applications cannot affect my application “too much”
Max-Min Scheduling
The method 🟩
The algorithm is quite easy:
- Allocate resources in order of increasing demand
- No user gets a resource share larger than its demand
- Users with unsatisfied demands get an equal share of the resource
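The three rules above can be sketched in a few lines of Python. This is a minimal illustration for a single divisible resource; the function name `max_min_fair` and the example numbers are assumptions made for this sketch, not part of the original notes.

```python
def max_min_fair(capacity, demands):
    """Split `capacity` of one divisible resource among users with `demands`."""
    allocation = [0.0] * len(demands)
    remaining = float(capacity)
    # Rule 1: visit users in order of increasing demand.
    order = sorted(range(len(demands)), key=lambda i: demands[i])
    for position, user in enumerate(order):
        users_left = len(order) - position
        # Rule 3: whoever is still unsatisfied splits the rest equally.
        fair_share = remaining / users_left
        # Rule 2: nobody receives more than they asked for.
        allocation[user] = min(float(demands[user]), fair_share)
        remaining -= allocation[user]
    return allocation


# Hypothetical example: 10 units, one small user and two large ones.
print(max_min_fair(10, [2, 8, 8]))
```

The small user is fully satisfied first, and what it leaves unused is shared equally by the two users whose demands cannot be met.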
The problem arises when you have different kinds of resources to share. Another problem is that a user cannot easily predict how much it is going to get, since this mostly depends on what the other users require.

Weighted fair scheduling
In weighted fair scheduling, you assign a weight to each user and then allocate resources in proportion to the weights. This is a little different from weighted max-min fairness.
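A minimal sketch of the proportional split, assuming a single divisible resource; the function name `weighted_shares` and the users are hypothetical. Note that this sketch ignores demands entirely: capping each share at the user's demand and redistributing the surplus would move it toward weighted max-min fairness.

```python
def weighted_shares(capacity, weights):
    """Give each user a slice of `capacity` proportional to its weight."""
    total_weight = sum(weights.values())
    return {user: capacity * weight / total_weight
            for user, weight in weights.items()}


# Hypothetical example: Bob has twice Alice's weight.
print(weighted_shares(12, {"alice": 1, "bob": 2}))
```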
Dominant Resource Fairness
The Idea 🟨
The idea is quite simple: consider the dominant resource of each user, i.e. the resource of which the user's containers take the largest fraction of the cluster. Then allocate containers so that the dominant share (per-container fraction of the dominant resource times number of containers) is the same for every user. For example, if each of Alice's containers uses 6% of the available CPU and each of Bob's uses 3% of the available RAM, and these are their dominant resources, then Bob should get twice as many containers as Alice.
Worked out Example
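Suppose you have 9 CPUs and 18 GB of RAM.
- Alice wants to run processes that each need 1 CPU and 4 GB.
- Bob wants to run processes that each need 3 CPUs and 1 GB.
Alice's dominant resource is memory (4/18 of the cluster per process, against 1/9 of the CPUs), while Bob's is CPU (3/9 per process, against 1/18 of the memory). You want to equalize the shares of the dominant resources, subject to the capacity constraints.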
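The example can be solved by progressive filling: repeatedly hand one more process to the user whose dominant share is currently lowest. The sketch below is one possible way to write this; the function name `drf`, the dictionary layout, and the choice to skip users whose next process no longer fits are assumptions made for illustration.

```python
def drf(capacity, demands):
    """Progressive filling for Dominant Resource Fairness (sketch)."""
    used = {resource: 0 for resource in capacity}
    containers = {user: 0 for user in demands}

    def dominant_share(user):
        # Largest fraction of any single resource held by this user.
        return max(containers[user] * demands[user][r] / capacity[r]
                   for r in capacity)

    while True:
        # Users whose next container still fits in the cluster.
        candidates = [u for u in demands
                      if all(used[r] + demands[u][r] <= capacity[r]
                             for r in capacity)]
        if not candidates:
            return containers
        # Serve the user with the lowest dominant share.
        user = min(candidates, key=dominant_share)
        for r in capacity:
            used[r] += demands[user][r]
        containers[user] += 1


cluster = {"cpu": 9, "mem_gb": 18}
requests = {"alice": {"cpu": 1, "mem_gb": 4},
            "bob": {"cpu": 3, "mem_gb": 1}}
print(drf(cluster, requests))  # {'alice': 3, 'bob': 2}
```

Alice ends with 3 processes (12 of 18 GB) and Bob with 2 processes (6 of 9 CPUs), so both hold a dominant share of 2/3.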
Token Bucket
- Has a size (the maximum number of tokens the bucket can hold)
- Has a fill rate (how quickly it is filled up)
Commonly used for controlling network and storage traffic. Basically, the bucket is filled at a constant rate, and tokens are spent whenever a transmission is needed.
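A small sketch of a token bucket with a size and a fill rate. The class and method names are hypothetical, the bucket is assumed to start full, and time is passed in explicitly (in seconds) so that the behaviour is easy to follow.

```python
class TokenBucket:
    """Token bucket with a maximum size and a constant fill rate (sketch)."""

    def __init__(self, size, fill_rate, now=0.0):
        self.size = size            # maximum number of tokens
        self.fill_rate = fill_rate  # tokens added per second
        self.tokens = size          # assumption: the bucket starts full
        self.last = now             # time of the last refill

    def _refill(self, now):
        elapsed = now - self.last
        # Tokens accumulate at a constant rate but never exceed the size.
        self.tokens = min(self.size, self.tokens + elapsed * self.fill_rate)
        self.last = now

    def try_consume(self, amount, now):
        """Spend `amount` tokens if available; otherwise refuse."""
        self._refill(now)
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False


bucket = TokenBucket(size=10, fill_rate=2)
print(bucket.try_consume(8, now=0.0))  # True: the bucket starts full
print(bucket.try_consume(5, now=1.0))  # False: only 4 tokens available
```

The size bounds how large a burst can be, while the fill rate bounds the long-term average rate.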
Resource Assignment
The algorithm is easy: filter the candidate machines by hard constraints, then order the remaining ones by soft constraints. Users usually do not know what their application needs. They often ask for more than they need; this is called overprovisioning.
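The filter-then-order step could look like the sketch below. Everything specific here is an assumption for illustration: the node fields, the hard constraint (enough free CPU and memory), and the soft constraint (prefer SSD nodes, then nodes with more free CPU).

```python
def rank_nodes(nodes, request):
    """Filter nodes by hard constraints, then order them by soft ones."""
    # Hard constraints: the node must have enough free CPU and memory.
    feasible = [node for node in nodes
                if node["free_cpu"] >= request["cpu"]
                and node["free_mem_gb"] >= request["mem_gb"]]

    def score(node):
        # Soft constraints: an SSD match first, then more free CPU.
        ssd_match = request.get("prefer_ssd", False) and node.get("ssd", False)
        return (ssd_match, node["free_cpu"])

    return sorted(feasible, key=score, reverse=True)


nodes = [
    {"name": "a", "free_cpu": 2, "free_mem_gb": 8, "ssd": True},
    {"name": "b", "free_cpu": 8, "free_mem_gb": 32, "ssd": False},
    {"name": "c", "free_cpu": 4, "free_mem_gb": 16, "ssd": True},
]
request = {"cpu": 4, "mem_gb": 8, "prefer_ssd": True}
print([node["name"] for node in rank_nodes(nodes, request)])  # ['c', 'b']
```

Node `a` is dropped because it violates a hard constraint, while `b` is kept but ranked below `c` because it only misses a soft preference.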
Problem of low resource usage
From the provider's point of view this is not optimal, since it leads to low resource utilization:
- Hardware consumption and costs. This problem has improved (the Google Borg cluster manager reaches around 50% utilization by allocating almost double the CPU!).
Performance Goal assignments
See (Delimitrou & Kozyrakis 2014). We want to allocate resources based on performance goals, not predicted resource usages.
- For each new application, we need to recommend a resource allocation and assignment
- We need to base this recommendation on the previous needs of similar applications.
- Quasar uses collaborative filtering to decide the resource allocation.
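To give a feel for collaborative filtering in this setting, here is a toy neighbourhood-based sketch. It is not the method from the paper: the function name, the data layout (one performance value per configuration, assumed non-negative), and the use of cosine similarity are all assumptions for illustration. A new application is profiled on a few configurations, and the missing values are estimated from previously seen applications that behaved similarly.

```python
import math


def predict_missing(history, partial):
    """Fill the `None` entries of `partial` from similar past applications."""
    known = [i for i, value in enumerate(partial) if value is not None]

    def similarity(row):
        # Cosine similarity over the configurations we actually measured.
        dot = sum(row[i] * partial[i] for i in known)
        norm = (math.sqrt(sum(row[i] ** 2 for i in known))
                * math.sqrt(sum(partial[i] ** 2 for i in known)))
        return dot / norm if norm else 0.0

    weights = {app: similarity(row) for app, row in history.items()}
    total = sum(weights.values())
    if total <= 0:
        return list(partial)  # no similar application to learn from
    # Unknown entries become a similarity-weighted average of past values.
    return [value if value is not None
            else sum(weights[app] * history[app][i] for app in history) / total
            for i, value in enumerate(partial)]


# Hypothetical performance of two past apps on three configurations.
history = {"web": [10.0, 20.0, 40.0], "batch": [10.0, 11.0, 12.0]}
new_app = [10.0, 19.0, None]  # third configuration not profiled yet
print(predict_missing(history, new_app))
```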
References
[1] Delimitrou & Kozyrakis, “Quasar: Resource-Efficient and QoS-Aware Cluster Management”, Association for Computing Machinery, 2014.