
Hands-on Deep Learning (PyTorch Edition) - [Reading Activity: Sharing Experience] Implementing a Multilayer Perceptron

 

Introduction

Although the reading activity has ended, I still want to work through everything in this book, so I will keep posting updates later. In this chapter we learned about multilayer perceptrons (MLPs). Unlike a single-layer perceptron, an MLP can handle problems that are not linearly separable, such as XOR. Each layer has its own weights and bias: the input data is fed into the hidden layer, the hidden layer extracts features from the input (its size can be specified freely), those features then serve as the input to the next layer, and the final layer produces the classification.
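To make the XOR point concrete, here is a minimal sketch of my own (it is not from the book) of a tiny MLP learning XOR; the hidden width, learning rate, and iteration count are arbitrary choices, and an unlucky random initialization can occasionally fail to converge:

import torch
from torch import nn

# The four XOR input/label pairs.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# A 2-8-1 MLP; a single linear layer cannot represent this mapping.
net = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.5)

for _ in range(2000):
    optimizer.zero_grad()
    loss = loss_fn(net(X), y)
    loss.backward()
    optimizer.step()

print((torch.sigmoid(net(X)) > 0.5).float())  # usually [[0.], [1.], [1.], [0.]]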

I ran the perceptron code from the book (listed below) and obtained training curves for different models by adjusting the hidden layer size; a sketch of the sweep I used appears after the code.

I captured runs with hidden layer sizes of 10, 64, 512, and 1024 and compared the training curves of the different models (the screenshots of the plots are not reproduced here).

A few things stand out from these curves. As the hidden layer size grows, the training and test accuracy curves become smoother, though not without limit: there is basically no difference between 512 and 1024. Comparing 10 against 64, or 10 against 512, the final accuracies are actually quite similar; the clearest difference is in the loss. The loss curves for 512 and 1024 nearly coincide, while those for 10, 64, and 512 differ substantially from one another.

This suggests a rule: once the hidden layer size exceeds a certain threshold, meaning the layer already has enough units to capture every feature by which the images can be distinguished, adding more units changes nothing. For example, if a picture can be classified using at most 10 features, then hidden layer sizes of 10 and 20 will behave the same. Below that threshold, however, the loss curve changes noticeably with the hidden layer size. A quick way to see how much capacity each setting adds is to count parameters, as in the snippet below.
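This is my own quick check (not from the book) of the parameter count of the 784 -> h -> 10 network for each hidden size tried above:

# Parameters of a 784 -> h -> 10 MLP: W1 (784*h) + b1 (h) + W2 (h*10) + b2 (10).
for h in [10, 64, 512, 1024]:
    n_params = 784 * h + h + h * 10 + 10
    print(f"hidden size {h:5d}: {n_params:9d} parameters")

Going from 512 to 1024 roughly doubles the parameter count without changing the curves, which is consistent with the threshold reading above.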

import torch
from torch import nn
from d2l import torch as d2l

batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)

# Fashion-MNIST: 28*28 = 784 inputs, 10 classes; hidden size 1024 for this run.
num_inputs, num_outputs, num_hiddens = 784, 10, 1024

# Weights start from a small random normal; biases start at zero.
W1 = nn.Parameter(torch.randn(
    num_inputs, num_hiddens, requires_grad=True) * 0.01)
b1 = nn.Parameter(torch.zeros(num_hiddens, requires_grad=True))
W2 = nn.Parameter(torch.randn(
    num_hiddens, num_outputs, requires_grad=True) * 0.01)
b2 = nn.Parameter(torch.zeros(num_outputs, requires_grad=True))

params = [W1, b1, W2, b2]

def relu(X):
    # Element-wise max(X, 0), written by hand instead of torch.relu.
    a = torch.zeros_like(X)
    return torch.max(X, a)

def net(X):
    X = X.reshape((-1, num_inputs))  # flatten each image into a 784-vector
    H = relu(X@W1 + b1)  # "@" stands for matrix multiplication here
    return (H@W2 + b2)

# Per-example cross-entropy; d2l.train_ch3 averages it itself.
loss = nn.CrossEntropyLoss(reduction='none')

num_epochs, lr = 10, 0.1
updater = torch.optim.SGD(params, lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)

# Show predictions on a few test images.
d2l.predict_ch3(net, test_iter)
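To produce the comparison above, I wrapped the training in a small helper of my own (the book does not include this loop) and iterated over the hidden sizes. It reuses the same d2l helpers; note that nn.Sequential is used here purely for brevity, and its default initialization differs from the scaled-normal initialization above:

import torch
from torch import nn
from d2l import torch as d2l

def train_with_hidden_size(num_hiddens, num_epochs=10, lr=0.1, batch_size=256):
    # Train a 784 -> num_hiddens -> 10 MLP on Fashion-MNIST and plot its curves.
    train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
    net = nn.Sequential(nn.Flatten(),
                        nn.Linear(784, num_hiddens), nn.ReLU(),
                        nn.Linear(num_hiddens, 10))
    loss = nn.CrossEntropyLoss(reduction='none')
    updater = torch.optim.SGD(net.parameters(), lr=lr)
    d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)

for size in [10, 64, 512, 1024]:
    train_with_hidden_size(size)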

 
 
