A Chinese researcher's CVPR 2018 paper is called into question: peers could not reproduce it, and the author says he will retract it if the results do not hold

Last updated: 2018-09-29

Annie from Aofei Temple
Produced by Quantum Bit | Public Account QbitAI

The academic storm that broke just before the National Day holiday was fiercer than usual.

Yesterday, a Reddit user with the ID p1esk pointed out problems with Perturbative Neural Networks, a paper accepted at the top international conference CVPR 2018.

The paper claims to propose a lightweight, efficient model that can replace convolutional neural networks, with an accuracy of up to 90.53%. After repeated hands-on experiments, p1esk found that the highest accuracy he could obtain was only 85.91%, and that the reported result could not be reproduced.

Irreproducible papers are a common and serious problem across academia, and one the whole field resents. They waste other people's time and dangle results that nobody else can actually obtain, and countless young researchers have been tripped up by them.

In p1esk's view, a paper whose results are all but invalid should not waste anyone's time any longer and should be withdrawn immediately.

Like a depth charge dropped into the forum, the controversy instantly set off a wave of discussion.

“Discounted” accuracy

The crux of the problem is how the accuracy reported in the paper was calculated.

In Perturbative Neural Networks, the researchers propose an alternative to the CNN, the perturbative neural network (PNN). It does away with convolution in the traditional sense and instead computes each response as a weighted linear combination of nonlinearly activated inputs that have been perturbed with additive noise.
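The perturbation layer described above can be illustrated with a minimal sketch. It assumes a one-dimensional input, ReLU as the nonlinearity, and uniformly distributed noise masks that are drawn once and then held fixed; the function names and the noise distribution are hypothetical choices made for illustration, not details taken from the authors' implementation.

```python
import random


def relu(value):
    """Nonlinear activation applied after the noise is added."""
    return value if value > 0.0 else 0.0


def make_noise_masks(num_masks, length, noise_level, seed=0):
    """Draw additive noise masks once (assumption: uniform noise, kept fixed)."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-noise_level, noise_level) for _ in range(length)]
        for _ in range(num_masks)
    ]


def perturbation_layer(x, masks, weights):
    """Compute y[j] = sum_i weights[i] * relu(x[j] + masks[i][j]).

    Each mask perturbs the whole input additively; the perturbed,
    activated copies are then mixed by a weighted linear combination.
    """
    if len(masks) != len(weights):
        raise ValueError("need exactly one weight per noise mask")
    return [
        sum(w * relu(xj + mask[j]) for w, mask in zip(weights, masks))
        for j, xj in enumerate(x)
    ]
```

In this sketch only `weights` would be learned during training, so the layer has no spatial filter at all. With a single mask per input, it corresponds to the fan-out of 1 discussed later in the article.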

Perturbative Neural Networks

Address: https://arxiv.org/abs/1806.01817

Through theoretical analysis and experiments, the authors argue that the perturbation layer can effectively replace the standard convolutional layer. On visual datasets such as MNIST, CIFAR-10, PASCAL and ImageNet, PNNs with fewer parameters reportedly performed comparably to standard CNNs.

The PNN approach was novel, the results looked excellent, and the code was available, so p1esk found it interesting and tried to reproduce it following the method described. He shared his reproduction results on GitHub.

Before reproducing the results, p1esk analyzed the authors' original implementation. He found that the first layer of the network applied a regular convolution, while the remaining layers used a fan-out of 1, meaning that each input channel was perturbed with a single noise mask.

p1esk then found the biggest problem in the original implementation: the accuracy was calculated incorrectly. The authors did not compute accuracy over all examples in the test set. Instead, accuracy was computed separately for each batch and then passed through a smoothing function, so the value the authors reported was effectively 0.7 × the accuracy of the previous batch + 0.3 × the accuracy of the current batch.
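The difference between the two calculations can be illustrated with a minimal sketch. The function names and the per-batch counts are hypothetical, and the sketch assumes the 0.7 factor is applied to a running smoothed value; it is not the authors' code or their visualization tool.

```python
def full_test_accuracy(correct_per_batch, batch_sizes):
    """Accuracy over every test example: total correct / total examples."""
    return sum(correct_per_batch) / sum(batch_sizes)


def smoothed_batch_accuracy(correct_per_batch, batch_sizes, keep=0.7):
    """Exponentially smoothed per-batch accuracy.

    new_value = keep * previous_value + (1 - keep) * current_batch_accuracy
    The result depends on batch order and is dominated by the last batches.
    """
    smoothed = None
    for correct, size in zip(correct_per_batch, batch_sizes):
        batch_accuracy = correct / size
        if smoothed is None:
            smoothed = batch_accuracy  # first batch seeds the running value
        else:
            smoothed = keep * smoothed + (1.0 - keep) * batch_accuracy
    return smoothed


# Hypothetical counts: three test batches of 100 examples each.
sizes = [100, 100, 100]
correct = [90, 70, 100]
true_accuracy = full_test_accuracy(correct, sizes)
reported_accuracy = smoothed_batch_accuracy(correct, sizes)
```

Reordering the same batches leaves `full_test_accuracy` unchanged but changes the smoothed value, which is why the smoothed number is not a valid test-set accuracy.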

As a result, p1esk's numbers differ from the original authors':

For noiseresnet18 on the CIFAR-10 dataset, the original paper reports an accuracy of 90.53%, while the best p1esk achieved with the corrected calculation was 85.91%.

So is the method itself, whose accuracy was miscalculated from the start, actually useful? p1esk ran a large number of experiments to test whether perturbing the input with noise masks produces better results.

To this end, p1esk built three models: a baseline model with fewer filters, so that its parameter count was similar to that of the PNN; a model that used noise-free 1×1 convolutions in all layers except the first; and a model that used perturbed 1×1 convolutions in all layers except the first.

After these experiments, p1esk found that adding noise masks improved the equivalently "crippled" ResNet by no more than 1% over the noise-free version, and that using 1×1 filters reduced accuracy no matter how the noise masks were applied.
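To see why the baseline needs fewer filters to match the parameter count, a rough weight count helps. The sketch below is arithmetic only, under stated assumptions: it ignores biases and batch normalization, treats a layer with 1×1 filters as having `c_in * c_out` weights, and uses a width of 64 channels purely as an example. The widths p1esk actually used are not given in this article.

```python
import math


def conv_weights(c_in, c_out, kernel_size):
    """Number of weights in a conv layer (biases ignored)."""
    return kernel_size * kernel_size * c_in * c_out


def matched_width(width, kernel_size=3):
    """Width at which a k x k layer has about as many weights as a
    1x1 layer with `width` input and output channels."""
    target = conv_weights(width, width, 1)
    return round(math.sqrt(target / (kernel_size * kernel_size)))


full_3x3 = conv_weights(64, 64, 3)   # standard 3x3 layer
full_1x1 = conv_weights(64, 64, 1)   # 1x1 layer at the same width
narrow = matched_width(64)           # reduced width for the 3x3 baseline
narrow_3x3 = conv_weights(narrow, narrow, 3)
```

Under these assumptions a 3×3 layer carries nine times the weights of a 1×1 layer at the same width, so a parameter-matched 3×3 baseline needs roughly one third as many filters per layer.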

p1esk's final conclusion was that the accuracy calculation in the paper is incorrect and the proposed method does not work, so the paper is meaningless.

For now, however, this is only p1esk's side of the story, and no final conclusion has been reached.

A heated discussion

p1esk's reproduction attracted a lot of attention, and opinions on the controversy were divided.

The first wave of comments addressed the incident itself. Many users lamented that irreproducible papers are a major problem in research today, and that they had suffered from it themselves.

Others questioned p1esk's own work. After studying p1esk's reproduction, Reddit user alexmlamb felt that the conclusion that PNN "is invalid" was hard to sustain, noting that after 100 iterations the accuracy obtained in the reproduction was not far from the reported accuracy.

The good news is that the team under question did not stay silent and quickly gave a constructive response to p1esk's concerns.

The first author of the paper, Felix Juefei-Xu (Reddit ID: katanaxu, hereinafter Xu), first thanked p1esk for the effort of implementing PNN and for raising the issue. He said the team is analyzing the matter thoroughly and wants to be fully sure of its own results before responding further.

On the flaw in the accuracy calculation, Xu admitted the oversight and said the team will retract the paper if the corrected results turn out to be significantly different:

“The default smoothing function in our visualization tool was an oversight, and we have fixed it and are rerunning the entire experiment. We will update the arXiv paper and Github with updated results. If experimental results show that our results are indeed significantly worse than reported in the CVPR version, we will retract the paper.”

Xu also gave his view of p1esk's reproduction: "To sum up, based on my preliminary assessment, his implementation can reach 90~91% on CIFAR-10 with a suitable choice of # filters, noise level, and optimization method, whereas the settings he chose give 85~86%. However, I will not say more until I have looked at more of his work."

△ The first author's original reply

Chinese student

Xu's response was sincere and measured, and it won praise from many users.

User toadlion said that although the incorrect result is disappointing, the author's response makes sense and is the right way to handle the situation.

User kugkfokj also agreed with the author's response, but felt that the paper should not be retracted even if the results were wrong. "Science not only includes what is correct and useful, but also what does not work. Both are equally important," he said.

"Everyone makes mistakes, and if it can save other people's time, then the mistake is valuable," said user mikolchon.

Even p1esk, who raised the issue in the first place, praised Xu and his co-authors. In his view, researchers sharing their code is something the academic community should encourage, and the error in the accuracy calculation looks more like an "honest mistake."

The first author of this team, which did not shy away from scrutiny, is a young Chinese researcher.

The paper is from Felix Juefei-Xu, Vishnu Naresh Boddeti and Marios Savvides of Carnegie Mellon University and Michigan State University.

Xu earned a bachelor's degree in electrical engineering from Shanghai Jiao Tong University. He then continued his studies at CMU, where he pursued a PhD in electrical and computer engineering under Professor Marios Savvides. He is currently continuing his research at the CMU CyLab Biometrics Center.

△ Professor Marios Savvides

Along the way, Xu has been the kind of model student that other parents hold up as an example.

When he was in high school, Xu participated in the popular high school quiz show "SK Champions" and won the weekly championship. The following year, he was awarded the honorary title of Shanghai Outstanding High School Graduate.

△ The TV quiz show "SK Champions", hosted by Chunni

From there, whether winning prizes in the National College English Contest or collecting best paper awards at IEEE conferences while studying abroad, Xu's progress has been steady.

One More Thing

On one side, a paper that is hard to reproduce; on the other, a sincere response from its author. What do you make of it?
