Systems for Artificial Intelligence

At the time of writing, the compute requirements for machine learning models and artificial intelligence are growing at a staggering rate of 200% every 3.5 months. Interest in the area is being quantified as 10k papers per month on the topic, while dollar investments on compute (energy, cooling, sustainability of compute in general) have had a hard time keeping up with the continuous new requests. From https://ucbrise.github.io/cs294-ai-sys-fa19/assets/lectures/lec03/03_ml-lifecycle.pdf ...

8 min · Xuanqiang 'Angelo' Huang

Fast Fourier Transforms

The algorithm has been the same, some ideas are in Fourier Series, but architectures change, which means there are new ways to make this algorithm even faster. Example of transforms We have learned in Algebra lineare numerica, Cambio di Base that linear transforms are usually a change of basis. They are matrix vector multiplications (additions and multiplications by constants). The optimizations are based on what sorts of transforms we have (e.g. Sparse Matrix Vector Multiplication, or dense versions). The same idea applies also for Fourier transforms. ...

5 min · Xuanqiang 'Angelo' Huang

Fast Linear Algebra

Many problems in scientific computing include: Solving linear equations Eigenvalue computations Singular value decomposition LU/Cholesky/QR decompositions etc… And the userbase is quite large for this types of computation (number of scientists in the world is growing exponentially ) Quick History of Performance Computing Early seventies it was EISPACK and LINPACK. Then In similar years Matlab was invented, which simplified a lot compared to previous systems. LAPACK redesigned the algorithms in previous libraries to have better block-based locality. BLAS are kernel functions for each computer, while LAPACK are the higher level functions build on top of BLAS (1, 2,3). Then another innovation was ATLAS, which automatically generates the code for BLAS for each architecture. This is called autotuning because it does a search of possible enumerations and chooses the fastest one. Now autotuning has been done a lot for NN systems. ...

5 min · Xuanqiang 'Angelo' Huang

Cloud Computing Services

Cloud Computing: An Overview Cloud shifted the paradigm from owning hardware to renting computing resources on-demand. Hardware became a service. Key Players in the Cloud Industry 🟨 The cloud computing market is dominated by several major providers, often referred to as the “Big Seven”, also called hyper-scalers. They are usually not interested in making it interoperable (they prefer the lock-in). Amazon Web Services (AWS): The largest provider, offering a comprehensive suite of cloud services. Microsoft Azure: Known for deep integration with enterprise systems and hybrid cloud solutions. Google Cloud Platform (GCP): Excels in data analytics, AI/ML, and Kubernetes-based solutions. IBM Cloud: Focuses on hybrid cloud and enterprise-grade AI. Oracle Cloud: Specializes in database solutions and enterprise applications. Alibaba Cloud: The leading provider in Asia, offering services similar to AWS. Salesforce: A major player in SaaS, particularly for CRM and business applications. These providers collectively control the majority of the global cloud infrastructure market, enabling scalable and on-demand computing resources for businesses worldwide. Capital and Operational Expenses in the Cloud Definition for CapEx and OpEx 🟥 Cloud computing transforms traditional IT cost structures by shifting expenses from capital expenditures (CapEx), such as purchasing servers and data centers, to operational expenditures (OpEx), where users pay only for the resources they consume. ...

13 min · Xuanqiang 'Angelo' Huang

Cloud Reliability

Reliability is the ability of a system to remain operational over time, i.e., to offer the service it was designed for. Cloud Hardware and software fails. In this note, we will try to find methods to analyze and predict when components fail, and how we can prevent this problem. Defining the vocabulary Availability 🟨++ $$ \text{Availability} = \frac{\text{Uptime}}{\text{Uptime} + \text{Downtime}} $$MTTF: Mean Time To Failure 🟩– $$ \text{MTTF} = \frac{1}{r} $$ This definition does not include repair time, and assumes the failures are independent with each other. ...

6 min · Xuanqiang 'Angelo' Huang

Communication in the Cloud

How can we coordinate services to actually understand what they are doing, or what the user wants them to do? How to manage networks errors? This note will mainly focus on high level communication protocols to coordinate this kind of communication. Remote Procedure Calls History and Basic Idea This has been the main idea, introduced in 1984, using the idea of stubs, see (Birrell & Nelson 1984). The system basically calls the remote procedure as if it was local on the high level, but on a lower level a network request is sent. The architecture has remained the same in these years. It hides all the complexity in the stub (marshaling, binding and sending, without caring about the sockets and communication matters). One problem is that it might be hiding the complexity too well. The programmer has surely an ease of programming, but design consideration should consider overloads generated by the network communication. ...

7 min · Xuanqiang 'Angelo' Huang

Queueing Theory

Queueing theory is the theory behind what happens when you have lots of jobs, scarce resources, and subsequently long queues and delays. It is literally the “theory of queues”: what makes queues appear and how to make them go away. This is basically what happens in clusters, where you have a limited number of workers that need to execute a number of jobs. We need some little maths to model the stochastic process of request arrivals. ...

7 min · Xuanqiang 'Angelo' Huang

Virtual Machines

The fundamental idea behind a virtual machine is to abstract the hardware of a single computer (the CPU, memory, disk drives, network interface cards, and so forth) into several different execution environments, thereby creating the illusion that each separate environment is running on its own private computer. (Silberschatz et al. 2018). Virtualization allows a single computer to host multiple virtual machines, each potentially running a completely different operating system. È virtuale nel senso che la macchina virtuale ha la stessa percezione della realtà di una macchina reale. Qualcosa che non è la realtà ma appare molto simile ad essa. ...

11 min · Xuanqiang 'Angelo' Huang

Conditioning Theory

Associative Conditioning Classical Conditioning Pavlov’s experiment He was interested in digestive systems of dogs. Then he notices that if we show food to dog, they start to salivate. If paired with sound (tuning fork) they start to salivate even if they just hear the sound. He defines two states: Before conditioning During conditioning After conditioning state. Important words are conditioned stimulus, conditioned response. And their oppose (unconditioned). It is important that it is quite consistent. Associate unconditioned stimulus with conditioned stimulus. ...

3 min · Xuanqiang 'Angelo' Huang

Neuroeconomics

Humans are not rational. See (Dostoevsky 1864), the parable of the destroying free will. In this note we will describe some behavioural economics ideas that could be interesting in this side. Economical Games The rules of the Ultimate game The proposer makes an offer as to how this money should be split between the two. The second player (the risponder) can either accept or reject this offer. If it is accepted, the money is split as proposed, but if the risponder rejects the offer, then neither player receives anything. ...

4 min · Xuanqiang 'Angelo' Huang