Thanks to EEWORLD for hosting this event. I am honored to have received the excellent book "Hands-On Deep Learning" (PyTorch Edition).
It deserves to be called a masterpiece: its authors are machine learning experts from China and the United States.
The book is thick, 572 pages, much like a dictionary, and rich in concepts and content, so it works both as an introduction and as an in-depth reference.
This time, we focus on Chapters 1 to 4 of the book.
These cover the basic concepts of deep learning, linear regression, multilayer perceptrons, and so on. This part is an introduction that brings readers to the door of machine learning to see what is inside.
First, the basic concepts:
Machine learning (ML) is a discipline that studies theories and methods for using computers to simulate human learning, acquire knowledge and skills, and improve system performance.
Supervised learning trains a model on a set of labeled data (data with known outputs) and then uses the trained model to make predictions about unseen data.
Unsupervised learning solves pattern recognition problems using training samples whose categories are unknown (unlabeled).
Reinforcement learning (RL), also known as evaluative learning, is a machine learning method that improves behavior through reward signals, loosely modeled on reward signaling in the brain's nerve cells.
The main branches of machine learning can be organized as follows:
- Supervised learning
  - Regression
  - Classification
  - Ranking
- Unsupervised learning
  - Clustering
  - Dimensionality reduction
  - Density estimation
  - Generative model
    - Generative adversarial network (GAN)
  - Association rule learning
    - Apriori
    - Eclat
- Semi-supervised learning
- Reinforcement learning
  - Model-free
    - Policy Optimization
    - Q-Learning
  - Model-based
    - AlphaZero
For now, this introductory part mainly focuses on supervised learning:
1. Regression
2. Classification
3. Ranking
Tensor:
Multidimensional NumPy arrays are also called tensors. Generally speaking, all current machine learning systems use tensors as their basic data structure.
At its core, a tensor is a container for data. The data is almost always numerical, so a tensor is a container for numbers. You may be familiar with matrices, which are two-dimensional tensors. Tensors generalize matrices to an arbitrary number of dimensions.
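As a rough illustration of the "container for numbers" idea, the sketch below builds tensors with zero to three dimensions in NumPy and inspects their `ndim` and `shape`. The variable names and values are made up for this example.

```python
import numpy as np

# Illustrative tensors of increasing dimensionality (values are arbitrary).
scalar = np.array(3.5)                      # 0-D tensor: a single number
vector = np.array([1.0, 2.0, 3.0])          # 1-D tensor
matrix = np.array([[1, 2, 3], [4, 5, 6]])   # 2-D tensor: 2 rows, 3 columns
cube = np.zeros((2, 3, 4))                  # 3-D tensor filled with zeros

for name, tensor in [("scalar", scalar), ("vector", vector),
                     ("matrix", matrix), ("cube", cube)]:
    # ndim is the number of axes; shape gives the size along each axis.
    print(name, tensor.ndim, tensor.shape)
```

The same `ndim` and `shape` vocabulary carries over to PyTorch tensors, which is why NumPy is a convenient place to build intuition first.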
Regression: divided into linear regression and nonlinear regression.
Linearity: the relationship between two variables is a linear function, so its graph is a straight line.
Note: "linear" is used here in a broad sense, describing the relationship between the data.
Nonlinearity: the relationship between two variables is not a linear function, so its graph is not a straight line.
So when can linear regression be used? The statistician Francis Anscombe constructed four data sets, known as Anscombe's quartet, that share nearly identical summary statistics yet look very different when plotted.
From the distribution of these four data sets, we can see that not every data set can be modeled with univariate linear regression. Real-world problems are often more complex, and variables almost never meet the requirements of a linear model perfectly. Therefore, linear regression relies on the following assumptions:
- The task is a regression problem, that is, the value to predict is continuous.
- The relationship between the variable y to be predicted and the independent variable x is linear (Figure 2 is a nonlinear one).
- The errors follow a normal distribution with a mean of 0 and a variance that stays constant across values of x (Figure 4 shows that the errors are not normally distributed).
- The variable x should actually vary across the samples.
- In multiple linear regression, the features should be independent of each other, so that they are not linearly correlated (multicollinearity).
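One informal way to check the linearity assumption is to fit a straight line and look at the residuals. The sketch below uses made-up data (a noise-free line and a parabola, both assumptions of this example) and `np.polyfit`; it is not Anscombe's data, only an illustration of the pattern to look for.

```python
import numpy as np

def line_residuals(x, y):
    """Fit y = slope * x + intercept by least squares and return the residuals."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return y - (slope * x + intercept)

# Hypothetical data: one truly linear relationship, one curved relationship.
x = np.linspace(-3.0, 3.0, 51)
y_linear = 2.0 * x + 1.0
y_curved = x ** 2

res_linear = line_residuals(x, y_linear)
res_curved = line_residuals(x, y_curved)

# Linear data: residuals are essentially zero everywhere.
print("max |residual|, linear data:", np.abs(res_linear).max())
# Curved data: residuals are positive at both ends and negative in the middle,
# a systematic pattern that signals a straight line is the wrong model.
print("residuals at left, middle, right:",
      res_curved[0], res_curved[len(x) // 2], res_curved[-1])
```

A structured residual pattern like this is the kind of signal that summary statistics alone can hide, which is the point Anscombe's quartet makes.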
A Python example of linear regression.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression  # linear regression
# Sample data set: first column is x, second column is y; we fit a regression model between x and y
data=[
[0.067732,3.176513],[0.427810,3.816464],[0.995731,4.550095],[0.738336,4.256571],[0.981083,4.560815],
[0.526171,3.929515],[0.378887,3.526170],[0.033859,3.156393],[0.132791,3.110301],[0.138306,3.149813],
[0.247809,3.476346],[0.648270,4.119688],[0.731209,4.282233],[0.236833,3.486582],[0.969788,4.655492],
[0.607492,3.965162],[0.358622,3.514900],[0.147846,3.125947],[0.637820,4.094115],[0.230372,3.476039],
[0.070237,3.210610],[0.067154,3.190612],[0.925577,4.631504],[0.717733,4.295890],[0.015371,3.085028],
[0.335070,3.448080],[0.040486,3.167440],[0.212575,3.364266],[0.617218,3.993482],[0.541196,3.891471]
]
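# Build the X and y arrays
dataMat = np.array(data)
X = dataMat[:, 0:1]  # feature x, kept two-dimensional as scikit-learn expects
y = dataMat[:, 1]  # target y
# ======== Linear regression ========
model = LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1)
model.fit(X, y)  # fit the linear regression model
print('Coefficients:\n', model.coef_)
print('Intercept:\n', model.intercept_)
print('Linear regression model:\n', model)
# Predict with the fitted model
predicted = model.predict(X)
plt.scatter(X, y, marker='x')
plt.plot(X, predicted, c='r')
plt.xlabel("x")
plt.ylabel("y")
plt.show()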
The results are:
Multilayer Perceptron:
A multilayer perceptron (MLP) is a type of artificial neural network (ANN). In addition to the input and output layers, it can have multiple hidden layers. The simplest MLP contains only one hidden layer, that is, a three-layer structure, as shown below:
Since there are multilayer perceptrons, there must also be single-layer perceptrons.
The perceptron was proposed by the American scholar Frank Rosenblatt in 1957. It is the algorithm from which neural networks (and deep learning) originated, so understanding the structure of the perceptron is an important step toward understanding neural networks and deep learning.
The perceptron receives multiple input signals and outputs one signal. The "signal" here can be pictured as something that flows, like electric current or a river. Just as current flows through a wire and carries electrons forward, the signal of a perceptron also forms a flow and carries information forward. Unlike real current, however, the signal of a perceptron takes only two values: "flow" or "no flow" (1 or 0). Here, 0 corresponds to "no signal transmitted" and 1 corresponds to "signal transmitted".
The figure below is an example of a perceptron that receives several input signals.
A Python example of a perceptron is as follows:
>>> import numpy as np
>>> x = np.array([0, 1]) # inputs
>>> w = np.array([0.5, 0.5]) # weights
>>> b = -0.7 # bias
>>> w*x
array([ 0. , 0.5])
>>> np.sum(w*x)
0.5
>>> np.sum(w*x) + b
-0.19999999999999996 # about -0.2 (floating-point rounding error)
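The session above stops at the weighted sum. A minimal sketch of the remaining step, assuming the usual rule that the perceptron outputs 1 when the weighted sum plus bias is greater than 0 and outputs 0 otherwise, could look like this. The function name `perceptron` is chosen for this example only.

```python
import numpy as np

def perceptron(x, w, b):
    """Output 1 if the weighted sum of inputs plus the bias exceeds 0, else 0."""
    return int(np.sum(w * x) + b > 0)

# Same weights and bias as in the session above.
w = np.array([0.5, 0.5])
b = -0.7

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [perceptron(np.array(x), w, b) for x in inputs]

for x, out in zip(inputs, outputs):
    print(x, "->", out)
```

With these particular weights and bias the perceptron fires only when both inputs are 1, so it behaves like a logical AND gate.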
Back to the multilayer perceptron.
The training and learning process of the multilayer perceptron includes the following steps:
- Prepare the dataset: Divide the dataset into training and test sets, and perform necessary preprocessing (such as normalization or standardization).
- Build the model: Select the appropriate network structure, including the number of hidden layers and the number of neurons, and determine the activation function and loss function.
- Compile and train the model: Specify parameters such as optimization algorithm, learning rate, and number of iterations, and train the model using the training set.
- Model evaluation and prediction: Use the test set to evaluate the performance of the model and make predictions on new data.
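The first two steps above can be sketched in plain NumPy before reaching for a framework. The example below uses randomly generated data and arbitrary layer sizes (4 features, 8 hidden units, 3 classes), all of which are assumptions of this sketch; the weights are untrained, so it only shows the data flow, not a useful model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 4 features, 3 classes.
X = rng.normal(loc=5.0, scale=2.0, size=(200, 4))
y = rng.integers(0, 3, size=200)

# Step 1: shuffle, then hold out 20% of the samples as a test set.
indices = rng.permutation(len(X))
split = int(0.8 * len(X))
train_idx, test_idx = indices[:split], indices[split:]
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

# Standardize with statistics computed on the training set only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
X_train_std = (X_train - mean) / std
X_test_std = (X_test - mean) / std

# Step 2: one hidden layer with ReLU, followed by a softmax output layer.
W1 = rng.normal(scale=0.1, size=(4, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.1, size=(8, 3))
b2 = np.zeros(3)

def forward(inputs):
    """Forward pass: returns one row of class probabilities per sample."""
    hidden = np.maximum(0.0, inputs @ W1 + b1)           # ReLU activation
    logits = hidden @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)          # softmax

probs = forward(X_test_std)
print(probs.shape)
```

Reusing the training-set mean and standard deviation for the test set keeps information from the test data out of preprocessing.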
Below is a code example of a multi-layer perceptron (MLP) implemented using the Keras library:
# Import the required libraries
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
# Prepare the data set (placeholders: supply your own arrays)
X_train = ...
y_train = ...
X_test = ...
y_test = ...
# Build the model
model = Sequential()
model.add(Dense(units=64, activation='relu', input_dim=100))
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=10, activation='softmax'))
# Compile and train the model
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, batch_size=32)
# Evaluate the model and make predictions
loss, accuracy = model.evaluate(X_test, y_test)
predictions = model.predict(X_test)
In the code above, we first import the required libraries and then prepare the training and test data. Next, we build a Sequential model and define the network structure with Dense layers: two hidden layers of 64 ReLU units and a 10-unit softmax output layer. When compiling the model, we specify the loss function, the optimization algorithm, and the evaluation metric. We then train the model on the training set, specifying the number of epochs and the batch size. Finally, we evaluate the model on the test set and generate predictions for the test inputs.
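The data arrays in the Keras example are left as placeholders. As a hedged illustration of the shapes that model expects, the sketch below generates random stand-in data with 100 features and 10 one-hot encoded classes, matching `input_dim=100`, the 10-unit softmax layer, and the `categorical_crossentropy` loss. The sample counts and the helper name `make_split` are assumptions of this example, and random data will not train a meaningful model.

```python
import numpy as np

rng = np.random.default_rng(42)
num_features, num_classes = 100, 10

def make_split(num_samples):
    """Return random features and one-hot labels (stand-in data only)."""
    features = rng.random((num_samples, num_features)).astype("float32")
    labels = rng.integers(0, num_classes, size=num_samples)
    # Row i of the identity matrix is the one-hot vector for class i.
    one_hot = np.eye(num_classes, dtype="float32")[labels]
    return features, one_hot

X_train, y_train = make_split(1000)
X_test, y_test = make_split(200)

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
```

If the labels were kept as integer class indices instead of one-hot vectors, the matching Keras loss would be `sparse_categorical_crossentropy`.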
That concludes this study report.