Use 3D demos to understand various optimization algorithms, and good news for C++ programmers

Latest update time:2019-03-28
Xiaocha from Aofei Temple
Quantum Bit Report | Public Account QbitAI

There are many optimization algorithms in machine learning, such as SGD, Adam, AdaGrad, and AdaDelta. Their iterative formulas alone are enough to give people a headache.
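For orientation (this formula is not from the original article, just the common baseline), all of these methods are variations on the plain gradient-descent step for parameters θ with step size α and loss L:

\theta_{t+1} = \theta_t - \alpha \, \nabla L(\theta_t)

SGD estimates the gradient from a minibatch, while Adam, AdaGrad, and AdaDelta additionally rescale the step for each coordinate using running statistics of past gradients.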

Fortunately, these optimizers come built into TensorFlow, Keras, and PyTorch; but do you really know how they "slide" down to the minimum, iteration by iteration?

Now there is a demo of machine learning optimization algorithms that lets you see directly from the plots how tuning the hyperparameters affects each algorithm's result, as well as each method's strengths and weaknesses.

It is ensmallen! Its developers not only provide a demo but also ship the whole thing as a C++ library for programmers. Let's try it out first.

Try the Demo

Trying it out is simple, and you don't even need to install any software. Just go to the ensmallen website, open the Demo tab, and you will see a 3D visualization of a set of optimization algorithms.

The page defaults to the widely used Adam algorithm. You can watch the parameters follow the red path and eventually settle at the lowest point of the loss function:

On the left is the initial position of the parameters, shown as the red dot in the figure; it can be dragged anywhere with the mouse.

The figures in the middle and on the right are the "contours" of the loss function. The middle one marks different heights with different colors, while the right one shows the gradient field directly, with each arrow's direction giving the direction of the gradient and its length giving the magnitude. You can see that the denser the contours, the larger the gradient.

If you think the loss function graph above is not clear and intuitive enough, there is also a high-definition 3D graph:

As the number of iterations increases, the value of the loss function continues to decrease:

The hyperparameters that can be adjusted for the Adam algorithm are: step size, number of iterations, tolerance, β1, β2, the fuzz factor ϵ, and batch size.
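For reference, here is the standard Adam update from Kingma & Ba's paper (reproduced here for context; it is not shown on the demo page), which is where β1, β2, and ϵ enter, with g_t the minibatch gradient and α the step size:

m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t
v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2
\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}
\theta_t = \theta_{t-1} - \alpha \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}

β1 and β2 control how quickly the first- and second-moment estimates forget old gradients, while ϵ only guards against division by zero.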

Drag the slider to adjust the hyperparameters, and the shape and end point of the "red line" will change accordingly. Let's adjust the step size to see how this parameter affects the results.

Increasing the step size will make the learning curve oscillate more, and too small a step size will make the loss function converge too slowly:

The step sizes are 0.3, 0.03 and 0.003 respectively

The above is just the simplest case. The Demo interface also offers other, more exotic loss functions:

And almost all common optimization algorithms:

These optimization algorithms have their own advantages and disadvantages in loss functions of different shapes.

If the loss function's contours are ellipses, Adam converges very quickly, reaching the minimum in only about 100 iterations, while AdaGrad needs nearly 300.

But Adam does not always have the advantage. On surfaces with multiple saddle points and local minima, Adam drops quickly at first but oscillates more severely toward the end, ultimately converging more slowly than AdaGrad.

The walkthrough above covers only a small part of the demo. If you want to explore further, see the link at the end of the article.

Good news for C++ programmers

Don't think ensmallen is just a fun demo; it is also an efficient C++ optimization library. For programmers writing AI in C++, it can numerically optimize arbitrary functions, easing the long-standing shortage of C++ machine learning tools.

In addition to packaging the standard optimization algorithms, ensmallen lets users easily add new optimizers through a simple API. Implementing a new optimizer requires only one method, and a new objective function can usually be expressed with just one or two C++ functions, as sketched below.
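To make that concrete, here is a minimal sketch of the objective-function side (not code from the article; SquaredDistance is a hypothetical example, and the Evaluate/Gradient signatures follow ensmallen's documented convention for differentiable objectives), minimized with the library's L-BFGS optimizer:

#include <ensmallen.hpp>   // ensmallen is header-only; compile with C++11 and link against Armadillo

// Hypothetical objective f(x) = ||x - target||^2; a differentiable function
// only needs these two member functions with these signatures.
class SquaredDistance
{
 public:
  explicit SquaredDistance(const arma::mat& target) : target(target) { }

  // Return the objective value at x.
  double Evaluate(const arma::mat& x) { return arma::accu(arma::square(x - target)); }

  // Write the gradient at x into 'gradient'.
  void Gradient(const arma::mat& x, arma::mat& gradient) { gradient = 2.0 * (x - target); }

 private:
  arma::mat target;
};

int main()
{
  SquaredDistance f(arma::ones<arma::mat>(3, 1));
  arma::mat x(3, 1, arma::fill::zeros);   // starting point

  ens::L_BFGS lbfgs;        // any full-batch optimizer accepting this API would do
  lbfgs.Optimize(f, x);     // x is overwritten with the solution (all ones here)

  x.print("solution:");
  return 0;
}

Separable optimizers such as Adam or SGD expect a slightly richer interface (batch-wise Evaluate and Gradient plus NumFunctions and Shuffle); the details are in the documentation on the website.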

The following requirements need to be met to install ensmallen:

  • A compiler with C++11 support

  • The C++ linear algebra library Armadillo

  • A BLAS/LAPACK implementation such as OpenBLAS or Intel MKL

Everything in ensmallen is in the ens namespace, so it is often useful to put a using directive in your code:

using namespace ens;

Taking Adam as an example, the code is as follows:

RosenbrockFunction f;
arma::mat coordinates = f.GetInitialPoint();

Adam optimizer(0.001, 32, 0.9, 0.999, 1e-8, 100000, 1e-5, true);
optimizer.Optimize(f, coordinates);

Here, the Adam constructor's parameters are, in order: step size, batch size, β1, β2, ϵ, maximum number of iterations, tolerance, and whether to visit the individual functions (data points) in random order.
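As a usage sketch (not from the article), you could roughly reproduce the demo's step-size experiment with the same snippet by looping over the three step sizes, with the built-in Rosenbrock test problem standing in for the demo's loss surface; this assumes the same include and using directive as above, plus <iostream>:

const double stepSizes[] = {0.3, 0.03, 0.003};
for (const double stepSize : stepSizes)
{
  RosenbrockFunction f;
  arma::mat coordinates = f.GetInitialPoint();

  // Same settings as above; only the step size changes.
  Adam optimizer(stepSize, 32, 0.9, 0.999, 1e-8, 100000, 1e-5, true);
  const double finalLoss = optimizer.Optimize(f, coordinates);

  std::cout << "step size " << stepSize << " -> final loss " << finalLoss << "\n";
}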

As for other optimization algorithms, you can go to the website to view detailed documentation.

Finally, here are all the resources:

Ensmallen compressed package download address:
https://ensmallen.org/files/ensmallen-1.14.2.tar.gz

Demo address:
https://vis.ensmallen.org/

- End -
