This post was last edited by HEU-liukai on 2024-8-5 09:50
I just graduated from college, and I am honored to take part in the review of this book. I would like to share my personal impressions here.
When I first read this book, I found it easy to understand. During my undergraduate studies I worked on a project on intelligent underwater robot detection, and that was my first encounter with terms such as "large model" and "neural network"; I knew little about them and found them hard to grasp. However, in Section 1.3 of this book (Analysis of the Univariate Linear Regression Algorithm), the author walks through the algorithm using basic machine learning terminology. I feel that this example explains the ideas behind large models in a form beginners like me can follow, and it made me, a newbie, want to keep reading.
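To show what a univariate linear regression like the one analyzed in Section 1.3 looks like in practice, here is a small sketch of my own (not code from the book): it fits a model y = w*x + b to training samples by gradient descent, touching the same terms the book uses (samples, weight parameters, training).

```python
# Univariate linear regression trained by gradient descent.
# My own illustrative sketch, not code from the book.

def train(xs, ys, lr=0.05, epochs=1000):
    """Fit y = w*x + b to the training samples by gradient descent."""
    w, b = 0.0, 0.0  # model weight parameters, start from zero
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w  # training = repeatedly adjusting the weights
        b -= lr * grad_b
    return w, b

# Training samples drawn from the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train(xs, ys)
print(w, b)  # should recover roughly w = 2, b = 1
```

Once w and b are learned, inference is just evaluating w*x + b on a new x, which is the "reasoning based on model weights" idea in miniature.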
From the book's introduction, you can quickly see that the first chapter explains the demands that AI and large models place on infrastructure. The AI discussed in this chapter is machine learning algorithms, and the reasoning is presented in a simple way to help readers follow along. Machine learning algorithms, of course, cannot do without the support of computer hardware. So after a brief explanation of machine learning concepts, the first chapter lays the groundwork for the introduction of hardware such as the CPU and GPU in later chapters.
At the same time, through studying Chapter 1, I learned or reviewed many concepts, as follows:
Machine learning terms: training samples, models, model weight parameters, training, inference based on model weights, etc.;
Vector convolution, in computer science, refers to an operation that multiplies corresponding elements and accumulates the results, also known as the "vector scalar (dot) product";
The most common computing unit in a computer is the CPU's ALU (Arithmetic Logic Unit);
Turing complete: if a computing system can simulate a Turing machine, it can carry out any computation a Turing machine can, and is therefore called Turing complete;
The structure of a Turing machine consists of the following parts: a sufficiently long tape, an alphabet, a read/write head, a state register, and a finite instruction set.
GPGPU (General-Purpose Graphics Processing Unit) is currently (as of July 2024) the hardware computing unit that mainstream machine learning algorithms in the industry rely on;
New engines for machine learning algorithms: the TPU (Tensor Processing Unit, which specializes in matrix and vector computation and strips away other computing functions) and the NPU (Neural Processing Unit). "Tensor" is a mathematical term.
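The "multiply then accumulate" pattern behind the vector scalar product in the list above is easy to show directly; this is my own sketch, and the MAC (multiply-accumulate) step it marks is exactly what hardware like the GPU and TPU is built to do in bulk:

```python
def dot(a, b):
    """Vector scalar (dot) product: multiply element-wise, accumulate the results."""
    assert len(a) == len(b), "vectors must have the same length"
    acc = 0
    for x, y in zip(a, b):
        acc += x * y  # one multiply-accumulate (MAC) step
    return acc

print(dot([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```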
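To make the Turing machine parts listed above concrete, here is a minimal simulator of my own: a tape, a read/write head, a state register, and a finite rule table. The example machine that inverts a binary string is made up for illustration, not taken from the book.

```python
def run_turing_machine(tape, rules, state="start", halt="halt"):
    """Minimal Turing machine: tape + read/write head + state register + finite rules."""
    tape = list(tape)  # a finite slice of the (in principle unbounded) tape
    head = 0           # position of the read/write head
    while state != halt:
        # Read the symbol under the head; cells off the end count as blank "_".
        symbol = tape[head] if 0 <= head < len(tape) else "_"
        # The finite instruction set maps (state, symbol) -> (write, move, next state).
        write, move, state = rules[(state, symbol)]
        if 0 <= head < len(tape):
            tape[head] = write
        head += 1 if move == "R" else -1
    return "".join(tape)

# Made-up example machine: invert every bit, halt on reaching a blank cell.
rules = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}
print(run_turing_machine("1011", rules))  # prints "0100"
```

The point of the "Turing complete" entry above is that any system able to emulate this read/write/move loop can, in principle, perform any computation.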