#AI Challenge Camp First Stop# Handwritten digit MNIST recognition experiment record


Because we use the MNIST handwritten-digit dataset, in which each image is 28x28 pixels, the input dimension is set to 28 × 28 = 784. The digits to recognize are 0~9, so the output layer has 10 neurons. As the assignment requires, the hidden layer is set to 15 neurons.

The MNIST dataset is imported directly from torchvision in the PyTorch package and split into two parts, training and testing. During training, the BP (backpropagation) algorithm adjusts the network weights; during testing, backpropagation is no longer required.
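In PyTorch this split is usually expressed by toggling the model between train() and eval() modes and, optionally, disabling gradient tracking during testing. A minimal sketch (the stand-in model and dummy batch here are placeholders, not the network from this experiment; the full listing below keeps the same train/eval structure but omits torch.no_grad(), which only saves memory and time and does not change the results):

import torch
from torch import nn

model = nn.Linear(784, 10)   # stand-in for the real network defined below
img = torch.randn(70, 784)   # dummy batch of 70 flattened 28x28 images

model.train()                # training mode: gradients flow and weights get updated
model.eval()                 # evaluation mode for testing
with torch.no_grad():        # no gradient tracking, so no backpropagation happens
    out = model(img)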

The PyTorch package is used to build a three-layer neural network with the ReLU activation function and the cross-entropy loss function.

Given the configuration of my personal computer, training runs for 10 epochs, by which point the accuracy has converged well.

  • Code

from torchvision.datasets import mnist
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torch import nn
import torch.optim as optim

meta_size = 70 # Batch size: number of samples drawn from the dataset per iteration
epoches = 10
lr = 0.01 # Set the hyperparameter learning rate to 0.01
momentum = 0.5

# Data preprocessing: Compose chains the two transforms together (ToTensor, then Normalize)
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize([0.1307], [0.3081])]) # 0.1307 and 0.3081 are the standard MNIST mean and std
# Download the dataset
train_dataset = mnist.MNIST('mnist', train=True, transform=transform, download=True)
# Test set: the train parameter is set to False
test_dataset = mnist.MNIST('mnist', train=False, transform=transform, download=True)

# DataLoader is an iterable that yields batches of meta_size samples
train_loader = DataLoader(train_dataset, batch_size=meta_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=meta_size, shuffle=False)


# Define a three-layer neural network.
# Each 28x28 image is flattened to 784 inputs; the hidden layer required by the
# assignment has 15 neurons; the output layer has 10 neurons for the digits 0~9.
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.layer1 = nn.Sequential(nn.Linear(784, 784), nn.ReLU(True))  # ReLU activation function
        self.layer2 = nn.Sequential(nn.Linear(784, 15), nn.ReLU(True))
        self.layer3 = nn.Linear(15, 10)

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        return x


model = Net()

# Define the loss function and optimizer used in model training
criterion = nn.CrossEntropyLoss() # Cross entropy loss function
optimizer = optim.SGD(model.parameters(), lr=lr, momentum=momentum)

train_losses = []  # Record training-set loss per epoch
train_acces = []   # Record training-set accuracy per epoch
test_losses = []   # Record test-set loss per epoch
test_acces = []    # Record test-set accuracy per epoch

for epoch in range(epoches):
    train_loss = 0
    train_acc = 0
    model.train()  # Model training mode
    for img, label in train_loader:
        img = img.view(img.size(0), -1)  # Flatten each 28x28 image to a 784-dim vector
        # Forward propagation
        out = model(img)
        loss = criterion(out, label)
        # Backward propagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        train_loss += loss.item()  # Sum of all batch losses
        # Calculate the classification accuracy of this batch
        _, pred = out.max(1)
        num_correct = (pred == label).sum().item()
        acc = num_correct / img.shape[0]  # Accuracy on this batch of samples
        train_acc += acc

    train_losses.append(train_loss / len(train_loader))  # Average loss over all batches
    train_acces.append(train_acc / len(train_loader))    # Average accuracy over all batches

    # Test
    test_loss = 0
    test_acc = 0
    # Switch the model to prediction mode: during training the BP algorithm adjusts
    # the network weights, but during testing backpropagation is no longer needed.
    model.eval()
    for img, label in test_loader:
        img = img.view(img.size(0), -1)
        out = model(img)
        loss = criterion(out, label)
        test_loss += loss.item()
        _, pred = out.max(1)
        num_correct = (pred == label).sum().item()
        acc = num_correct / img.shape[0]
        test_acc += acc

    test_losses.append(test_loss / len(test_loader))
    test_acces.append(test_acc / len(test_loader))

    print('Epoch: {}, Training Loss: {:.4f}, Training Accuracy: {:.4f}, Test Loss: {:.4f}, Test Accuracy: {:.4f}'
          .format(epoch, train_loss / len(train_loader), train_acc / len(train_loader),
                  test_loss / len(test_loader), test_acc / len(test_loader)))
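The listing above stops at evaluation, but the attachments below include my_model.pth and my_model.onnx, so an export step must have followed. A minimal sketch that would produce files with those names (the dummy input shape is an assumption based on the 784-dimensional input used above, not something shown in the original listing):

import torch

# Save the trained weights in PyTorch's native format
torch.save(model.state_dict(), 'my_model.pth')

# Export an ONNX graph; ONNX export traces the model with a dummy input
dummy_input = torch.randn(1, 784)  # one flattened 28x28 image (assumed shape)
torch.onnx.export(model, dummy_input, 'my_model.onnx')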

  • Experimental Results

Export files:

my_model.onnx (2.39 MB)

my_model.pth (2.4 MB)
