Hands-on deep learning (PyTorch version) - [Reading activity-experience sharing] Implementation of linear regression
Introduction
This chapter implements linear regression following the tutorial in the book: we train the model on generated data, save the model parameters, and later load the model to make predictions.
Linear regression is typically used to predict an output y from an input X when the relationship between them is linear; the noise in the data is usually assumed to be Gaussian. Plotted, such data falls roughly along a straight line.
The key formula in linear regression is

y = Xw + b

where y is the value to be predicted, X is the input, w is the weight applied to X, and b is the bias, i.e. the value of y when X is 0.
The implementation in the book also adds a noise term, so the data is actually generated as

y = Xw + b + ε

where ε is Gaussian noise with a small standard deviation (0.01 in the code below).
Since we need a dataset to provide training data, and we will evaluate the model's accuracy with a squared-error loss, the first step is to generate that data.
import torch

def synthetic_data(w, b, num_examples):  #@save
    """Generate y = Xw + b + noise."""
    X = torch.normal(0, 1, (num_examples, len(w)))  # features drawn from N(0, 1)
    y = torch.matmul(X, w) + b                      # the linear model
    y += torch.normal(0, 0.01, y.shape)             # add Gaussian noise
    return X, y.reshape((-1, 1))
The code above generates the input features X (each example is a vector) and the corresponding labels y (each label is a scalar). Next we define the true weights w and bias b used to generate the data.
true_w = torch.tensor([2, -3.4])  # true weights
true_b = 4.2                      # true bias
features, labels = synthetic_data(true_w, true_b, 1000)  # generate 1000 examples from the weights and bias
Following the code in the book, 1000 examples are generated in total. The original post shows a screenshot of the generated data, where it can be observed that each X is a vector and each y is a scalar.
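Since the screenshot is not reproduced here, a quick way to inspect the data is to print the first example (this print statement also appears in the book) and draw a scatter plot; this sketch uses plain matplotlib instead of the book's d2l plotting helpers:

import matplotlib.pyplot as plt

print('features:', features[0], '\nlabel:', labels[0])

# Plotting the second feature against the labels shows the linear trend
plt.scatter(features[:, 1].detach().numpy(), labels.detach().numpy(), s=1)
plt.xlabel('features[:, 1]')
plt.ylabel('labels')
plt.show()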
So far, our raw training data is ready. The next step in the book is not a train/test split but a data iterator: a function that serves the training data in small random batches, which is exactly what minibatch stochastic gradient descent needs.
import random

def data_iter(batch_size, features, labels):
    num_examples = len(features)
    # The examples are read in random order
    indices = list(range(num_examples))
    random.shuffle(indices)
    for i in range(0, num_examples, batch_size):
        batch_indices = torch.tensor(
            indices[i: min(i + batch_size, num_examples)])
        yield features[batch_indices], labels[batch_indices]
When I was learning machine learning I always held out 20% of the data as a test set, but this function does something different: it produces small random batches of the training data for minibatch training. The part before the for loop shuffles the example indices. The for loop then walks through the shuffled indices in steps of batch_size; min(i + batch_size, num_examples) keeps the last batch from running past the end of the data; and yield makes the function a generator that hands back one (features, labels) minibatch at a time.
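As a quick sanity check (not in the original post), we can pull one batch and look at its shapes; batch_size = 10 matches the value used in the book:

batch_size = 10
X_batch, y_batch = next(data_iter(batch_size, features, labels))
print(X_batch.shape, y_batch.shape)  # torch.Size([10, 2]) torch.Size([10, 1])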
After that, we initialize the weights w and bias b that the model will start from, define the squared error as the loss function used to evaluate the model, and define the model itself, linreg().
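The post does not show these definitions; following the book, they look like this (the initial w is small random noise, b starts at zero, and both require gradients so PyTorch can differentiate through them):

w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)

def linreg(X, w, b):  #@save
    """The linear regression model."""
    return torch.matmul(X, w) + b

def squared_loss(y_hat, y):  #@save
    """Squared loss."""
    return (y_hat - y.reshape(y_hat.shape)) ** 2 / 2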
def sgd(params, lr, batch_size):  #@save
    """Minibatch stochastic gradient descent."""
    with torch.no_grad():
        for param in params:
            param -= lr * param.grad / batch_size  # step against the gradient
            param.grad.zero_()                     # reset the gradient for the next batch
This defines minibatch stochastic gradient descent, which repeatedly updates w and b to optimize the model. Dividing by batch_size turns the summed gradient into an average, so the step size does not depend on how large each batch is.
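To make the update rule concrete, here is a tiny illustration with made-up numbers (not from the post), setting a gradient by hand instead of calling backward():

p = torch.tensor([1.0], requires_grad=True)
p.grad = torch.tensor([0.5])  # pretend backward() accumulated this gradient
sgd([p], lr=0.03, batch_size=10)
print(p)  # p is now 0.9985 = 1.0 - 0.03 * 0.5 / 10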
lr = 0.03          # learning rate
num_epochs = 1000  # number of passes over the data
batch_size = 10
net = linreg
loss = squared_loss

for epoch in range(num_epochs):
    for X, y in data_iter(batch_size, features, labels):
        l = loss(net(X, w, b), y)  # minibatch loss on X and y
        # l has shape (batch_size, 1) rather than being a scalar, so sum its
        # elements to get a scalar before computing the gradients w.r.t. [w, b]
        l.sum().backward()
        sgd([w, b], lr, batch_size)  # update the parameters using their gradients
    with torch.no_grad():
        train_l = loss(net(features, w, b), labels)
        print(f'epoch {epoch + 1}, loss {float(train_l.mean()):f}')

print(f'estimation error of w: {true_w - w.reshape(true_w.shape)}')
print(f'estimation error of b: {true_b - b}')
torch.save({'weights': w, 'bias': b}, 'linear_regression_model.pth')  # save the trained parameters
The code above runs gradient descent for 1000 epochs, optimizing w and b to find the best-fitting line for our data. We can observe that as training progresses the loss keeps shrinking and the estimates of w and b get closer and closer to the true values. Finally, we save the trained parameters so the model can be called from other programs.
We create a new Jupyter notebook to load the saved model and make predictions on new data.
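The post shows this notebook only as a screenshot; a minimal sketch of what the loading and prediction code might look like (linreg is redefined here because the new notebook does not share the training code, and the example input is the one from the screenshot):

import torch

checkpoint = torch.load('linear_regression_model.pth')
w, b = checkpoint['weights'], checkpoint['bias']

def linreg(X, w, b):
    return torch.matmul(X, w) + b

X_new = torch.tensor([[1.4627, 0.1506]])  # example input from the screenshot
with torch.no_grad():
    y_pred = linreg(X_new, w, b)
print(y_pred)  # roughly 6.6109 with the parameters trained above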
We can observe that for the input vector X = [1.4627, 0.1506], the predicted output y is 6.6109. Comparing this prediction with the original data shows that the result is essentially what we expect: the model meets our needs.