Neural networks have been a mainstay of artificial intelligence since its earliest days. Now, exciting new techniques such as deep learning and convolutional networks are taking neural networks in a whole new direction. In this book, we demonstrate neural networks on a variety of real-world tasks, such as image recognition and data science. We examine current neural network techniques, including ReLU activations, stochastic gradient descent, cross-entropy, regularization, dropout, visualization, and more. (A minimal sketch showing several of these techniques working together appears after the table of contents below.)

Table of Contents

Chapter 1 Neural Network Basics
  1.1 Neurons and Layers
  1.2 Types of Neurons
    1.2.1 Input and Output Neurons
    1.2.2 Hidden Neurons
    1.2.3 Bias Neurons
    1.2.4 Context Neurons
    1.2.5 Other Neuron Types
  1.3 Activation Functions
    1.3.1 Linear Activation Function
    1.3.2 Step Activation Function
    1.3.3 Sigmoid Activation Function
    1.3.4 Hyperbolic Tangent Activation Function
  1.4 Rectified Linear Unit (ReLU)
    1.4.1 Softmax Activation Function
    1.4.2 What Role Does the Bias Play?
  1.5 Neural Network Logic
  1.6 Chapter Summary
Chapter 2 Self-Organizing Maps
  2.1 Self-Organizing Maps
    2.1.1 Understanding Neighborhood Functions
    2.1.2 Mexican Hat Neighborhood Functions
    2.1.3 Computing SOM Errors
  2.2 Chapter Summary
Chapter 3 Hopfield Networks and Boltzmann Machines
  3.1 Hopfield Neural Networks
    3.1.1 Training Hopfield Networks
  3.2 Hopfield-Tank Networks
  3.3 Boltzmann Machines
    3.3.1 Boltzmann Machine Probability
  3.4 Applying Boltzmann Machines
    3.4.1 Traveling Salesman Problem
    3.4.2 Optimization Problems
    3.4.3 Boltzmann Machine Training
  3.5 Chapter Summary
Chapter 4 Feedforward Neural Networks
  4.1 Feedforward Neural Network Architecture
    4.1.1 Single-Output Neural Network for Regression
  4.2 Computing the Output
  4.3 Initializing Weights
  4.4 Radial Basis Function Networks
    4.4.1 Radial Basis Functions
    4.4.2 Radial Basis Function Networks
  4.5 Normalizing Data
    4.5.1 1-of-N Encoding
    4.5.2 Range Normalization
    4.5.3 Z-Score Normalization
    4.5.4 Complex Normalization
  4.6 Chapter Summary
Chapter 5 Training and Evaluation
  5.1 Evaluating Classification
    5.1.1 Binary Classification
    5.1.2 Multiclass Classification
    5.1.3 Logarithmic Loss
    5.1.4 Multiclass Logarithmic Loss
  5.2 Evaluating Regression
  5.3 Simulated Annealing Training
  5.4 Chapter Summary
Chapter 6 Backpropagation Training
  6.1 Understanding Gradients
    6.1.1 What Is a Gradient?
    6.1.2 Computing Gradients
  6.2 Computing Output Node Increments
    6.2.1 Quadratic Error Function
    6.2.2 Cross-Entropy Error Function
  6.3 Computing Remaining Node Increments
  6.4 Derivatives of Activation Functions
    6.4.1 Derivatives of Linear Activation Functions
    6.4.2 Derivatives of Softmax Activation Functions
    6.4.3 Derivatives of Sigmoid Activation Functions
    6.4.4 Derivatives of Hyperbolic Tangent Activation Functions
    6.4.5 Derivatives of ReLU Activation Functions
  6.5 Applying Backpropagation
    6.5.1 Batch Training and Online Training
    6.5.2 Stochastic Gradient Descent
    6.5.3 Backpropagating Weight Updates
    6.5.4 Choosing Learning Rate and Momentum
    6.5.5 Nesterov Momentum
  6.6 Chapter Summary
Chapter 7 Other Propagation Training
  7.1 Elastic Propagation
  7.2 RPROP Parameters
  7.3 Data Structures
  7.4 Understanding RPROP
    7.4.1 Determining the Sign Change of Gradients
    7.4.2 Computing Weight Changes
    7.4.3 Modifying Update Values
  7.5 Levenberg-Marquardt Algorithm
  7.6 Computation of the Hessian Matrix
  7.7 LMA with Multiple Outputs
  7.8 Overview of the LMA Process
  7.9 Chapter Summary
Chapter 8 NEAT, CPPN, and HyperNEAT
  8.1 NEAT Networks
    8.1.1 NEAT Mutations
    8.1.2 NEAT Crossover
    8.1.3 NEAT Speciation
  8.2 CPPN Networks
    8.2.1 CPPN Phenotypes
  8.3 HyperNEAT Networks
    8.3.1 HyperNEAT Substrates
    8.3.2 HyperNEAT Computer Vision
  8.4 Chapter Summary
Chapter 9 Deep Learning
  9.1 Deep Learning Components
  9.2 Partially Labeled Data
  9.3 Rectified Linear Units
  9.4 Convolutional Neural Networks
  9.5 Neuronal Dropout
  9.6 GPU Training
  9.7 Deep Learning Tools
    9.7.1 H2O
    9.7.2 Theano
    9.7.3 Lasagne and NoLearn
    9.7.4 ConvNetJS
  9.8 Deep Belief Neural Networks
    9.8.1 Restricted Boltzmann Machines
    9.8.2 Training DBNNs
    9.8.3 Sampling Layer by Layer
    9.8.4 Computing Positive Gradients
    9.8.5 Gibbs Sampling
    9.8.6 Updating Weights and Biases
    9.8.7 Backpropagation in DBNNs
    9.8.8 Applications of Deep Belief Networks
  9.9 Chapter Summary
Chapter 10 Convolutional Neural Networks
  10.1 LeNet-5
  10.2 Convolutional Layers
  10.3 Pooling Layers
  10.4 Dense Layers
  10.5 ConvNets for the MNIST Dataset
  10.6 Chapter Summary
Chapter 11 Pruning and Model Selection
  11.1 Understanding Pruning
    11.1.1 Pruning Connections
    11.1.2 Pruning Neurons
    11.1.3 Improving or Reducing Performance
  11.2 Pruning Algorithms
  11.3 Model Selection
    11.3.1 Grid Search Model Selection
    11.3.2 Random Search Model Selection
    11.3.3 Other Model Selection Techniques
  11.4 Chapter Summary
Chapter 12 Dropout and Regularization
  12.1 L1 and L2 Regularization
    12.1.1 Understanding L1 Regularization
    12.1.2 Understanding L2 Regularization
  12.2 Dropout Layers
    12.2.1 Dropout Layers
    12.2.2 Implementing a Dropout Layer
  12.3 Using Dropout
  12.4 Chapter Summary
Chapter 13 Time Series and Recurrent Networks
  13.1 Time Series Encoding
    13.1.1 Encoding Data for Input and Output Neurons
    13.1.2 Predicting Sine Waves
  13.2 Simple Recurrent Neural Networks
    13.2.1 Elman Neural Networks
    13.2.2 Jordan Neural Networks
    13.2.3 Backpropagation Through Time
    13.2.4 Gated Recurrent Units
  13.3 Chapter Summary
Chapter 14 Architecting Neural Networks
  14.1 Evaluating Neural Networks
  14.2 Training Parameters
    14.2.1 Learning Rate
    14.2.2 Momentum
    14.2.3 Batch Size
  14.3 General Hyperparameters
    14.3.1 Activation Function
    14.3.2 Configuration of Hidden Neurons
  14.4 LeNet-5 Hyperparameters
  14.5 Chapter Summary
Chapter 15 Visualization
  15.1 Confusion Matrix
    15.1.1 Reading a Confusion Matrix
    15.1.2 Generating a Confusion Matrix
  15.2 t-SNE Dimensionality Reduction
    15.2.1 t-SNE Visualization
    15.2.2 t-SNE Beyond Visualization
  15.3 Chapter Summary
Chapter 16 Modeling with Neural Networks
  16.0.1
  16.0.2 How We Won the Challenge
  16.0.3 Our Approach to the Challenge
  16.1 Modeling with Deep Learning
    16.1.1 Neural Network Architecture
    16.1.2 Bagging Multiple Neural Networks
  16.2 Chapter Summary
Appendix A: Example Code Usage
  A.1 Introduction to the Series
  A.2 Stay Updated
  A.3 Get the Example Code
    A.3.1 Download the ZIP File
    A.3.2 Clone the Git Repository
  A.4 Contents of the Example Code
  A.5 How to Contribute to the Project
References
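The techniques named in the description above are only listed by name here. The following is a minimal, illustrative sketch of how ReLU activations, dropout, L2 regularization, stochastic gradient descent, and a cross-entropy loss fit together in a small classifier. It assumes TensorFlow/Keras and synthetic data purely for illustration; this is not the book's own example code, which may use different frameworks and datasets.

# Minimal sketch (assumption: TensorFlow/Keras is installed; not the book's companion code).
# Shows ReLU, dropout, L2 regularization, SGD with momentum, and cross-entropy together.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, regularizers

# Synthetic 3-class data stands in for a real dataset.
rng = np.random.default_rng(42)
x_train = rng.normal(size=(600, 20)).astype("float32")
y_train = rng.integers(0, 3, size=600)

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu",                      # ReLU hidden layer
                 kernel_regularizer=regularizers.l2(1e-4)),  # L2 weight regularization
    layers.Dropout(0.5),                                     # dropout layer
    layers.Dense(3, activation="softmax"),                   # softmax output layer
])

# Stochastic gradient descent with momentum, minimizing cross-entropy loss.
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=32, epochs=5, verbose=0)

Each of these pieces corresponds to a topic covered in the chapters above: activations and softmax in Chapter 1, stochastic gradient descent and momentum in Chapter 6, and dropout and L1/L2 regularization in Chapter 12.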