Cloud Storage

Object Stores Characteristics of Cloud Systems Object storage design principles We don’t want the hierarchy that is common in Filesystems, so we need to simplify that and have these four principles: Black-box objects Flat and global key-value model (trivial model, easy to access, without the need to trasverse a file hierarchy). Flexible metadata Commodity hardware (the battery idea of Tesla until 2017). Object storage usages Object storage are useful to store things that are usually read-intensive. Some examples are ...

19 min · Xuanqiang 'Angelo' Huang

Notions of Security

CIAA principles of security We have already outlined these principles in Sicurezza delle reti and talked about the concepts of authentication and integrity. Here we try to deepen these concepts and delve a little bit more on the attack vectors. This note mainly focuses on the principles summarized by the acronyms CIA and AAA. Confidentiality This is one concerns about the secrecy of the sent message. We do not want others to be able to access and read what we are doing. ...

7 min · Xuanqiang 'Angelo' Huang

Queueing Theory

Queueing theory is the theory behind what happens when you have lots of jobs, scarce resources, and subsequently long queues and delays. It is literally the “theory of queues”: what makes queues appear and how to make them go away. This is basically what happens in clusters, where you have a limited number of workers that need to execute a number of jobs. We need some little maths to model the stochastic process of request arrivals. ...

8 min · Xuanqiang 'Angelo' Huang

Advanced 3D Representations

3D representations In this section, we present some of the most common 3D representations used in computer graphics and computer vision. Each representation has its own advantages and disadvantages, and the choice of representation often depends on the specific application. Voxels With voxels we discretize 3D space into a 3d grid, it is an intuitive manner to represent the data, but it has limited resolution. It needs $\mathcal{O}(n^{3})$ memory. Points and Volumetric primitives We can discretize surfaces into 3D points. Yet, this does not model connectivity, and might vary from frame to frame if it is a video. ...

12 min · Xuanqiang 'Angelo' Huang

Autoencoders

In questa serie di appunti proviamo a descrivere tutto quello che sappiamo al meglio riguardanti gli autoencoders Blog di riferimento Blog secondario che sembra buono Introduzione agli autoencoders L’idea degli autoencoders è rappresentare la stessa cosa attraverso uno spazio minore, in un certo senso è la compressione con loss. Per cosa intendiamo qualunque tipologia di dato, che può spaziare fra immagini, video, testi, musica e simili. Qualunque cosa che noi possiamo rappresentare in modo digitale possiamo costruirci un autoencoder. Una volta scelta una tipologia di dato, come per gli algoritmi di compressione, valutiamo come buono il modello che riesce a comprimere in modo efficiente e decomprimere in modo fedele rispetto all’originale. Abbiamo quindi un trade-off fra spazio latente, che è lo spazio in cui sono presenti gli elementi compressi, e la qualità della ricostruzione. Possiamo infatti osservare che se spazio latente = spazio originale, loss di ricostruzione = 0 perché basta imparare l’identità. In questo senso si può dire che diventa sensato solo quando lo spazio originale sia minore di qualche fattore rispetto all’originale. Quando si ha questo, abbiamo più difficoltà di ricostruzione, e c’è una leggera perdita in questo senso. ...

9 min · Xuanqiang 'Angelo' Huang

Autoregressive Modelling

On Autoregressivity The main idea of autoregressivity is to use previous prediction to predict the next state. The Autoregressive property Autoregressive models model a joint distribution of aleatoric variables by assuming a chain rule like decomposition: $$ p(x) = \prod_{i=1}^{n} p(x_i | x_{1:i-1}) $$ If we assume independence between the variables, we don’t need many variables to model it $2T$, but this assumption is too strong. If we just use a tabular approach, we’ll have a combinatorial explosion: we will have about $2^{T - 1}$ possible states (if we assume the aleatoric variables are binary, and we are creating a table for each intermediate variable). ...

2 min · Xuanqiang 'Angelo' Huang

Backpropagation

Backpropagation is perhaps the most important algorithm of the 21st century. It is used everywhere in machine learning and is also connected to computing marginal distributions. This is why all machine learning scientists and data scientists should understand this algorithm very well. An important observation is that this algorithm is linear: the time complexity is the same as the forward pass. Derivatives are unexpectedly cheap to calculate. This took a lot of time to discover. See colah’s blog. Karpathy has a nice resource for this topic too! Stanford lecture on backpropagation is another resource. ...

8 min · Xuanqiang 'Angelo' Huang

Cache Optimization

Locality principles Remember the two locality principles in Memoria. Temporal locality and spatial locality. Temporal Locality Some elements just are accessed many times in time. This is an example of a temporal locality. Spatial locality Some elements are accessed close to each other, this is an idea of spatial locality. In modern architectures, the a line of cache is usually 64 bytes. For example consider this snippet: sum = 0; for (i = 0; i < n; i++) sum += a[i]; return sum; Sum is an example of temporal locality as the same memory location (or register) is accessed many times, and the access of the array a is an example of spatial locality. loops cycle through the same instructions, this is an example of temporal locality. ...

2 min · Xuanqiang 'Angelo' Huang

Compiler Limitations

On Compiler Adding compilation flags to gcc not always makes it faster, it just enables a specific set of optimization methods. It’s also good to turn on platform specific flags to turn on some specific optimization methods to that architecture. Remember that compilers are conservative, meaning they do not apply that optimization if they think it does not always apply. What are they good at Compilers are good at: mapping program to machine ▪ register allocation ▪ instruction scheduling ▪ dead code elimination ▪ eliminating minor inefficiencies ...

3 min · Xuanqiang 'Angelo' Huang

Convolutional Neural Network

Introduction to Convolutional NN Design Goals We want to be invariant to some transformations but also at the same time to be specific to some thing. Convolutional Neural Networks (CNNs) are a class of deep neural networks that are particularly effective for image processing tasks. They are designed to automatically and adaptively learn spatial hierarchies of features from images. Compared to standard Fully connected Neural Networks, they reuse weights, making their number of parameter much fewer. ...

13 min · Xuanqiang 'Angelo' Huang