Performance surpasses He Kaiming's Mask R-CNN! Huazhong University of Science and Technology master's student open-sources new image segmentation method | CVPR19 Oral

Last updated: 2019-03-05

Annie and Qian Ming, reporting from Aofeisi
Reported by Qbit | WeChat official account QbitAI

The intern has made another contribution!

This time, the intern behind the result works at Horizon Robotics and is a master's student at Huazhong University of Science and Technology.

Mask Scoring R-CNN, the research he completed as first author, surpasses He Kaiming's Mask R-CNN on the COCO instance segmentation task and was accepted as an oral presentation at CVPR 2019, a top computer vision conference.

That is to say, it stood out from more than 5,000 submissions and landed in the top 5.6%.

Whichever backbone is used, the improvement holds steady: the method is consistently slightly better than Mask R-CNN.

You could say the student has surpassed the master.

Moreover, their algorithm has been open sourced (the link is at the end of the article).

Score the mask

Mask R-CNN is a concise and flexible instance segmentation framework and one of He Kaiming's best-known works. It has impressed researchers since its debut in 2017 and earned He Kaiming the ICCV 2017 Best Paper Award.

How does the newly released Mask Scoring R-CNN surpass its predecessors?

The key lies in the word "scoring" in its name. In this paper, the researchers propose a new way to score each "instance segmentation hypothesis" the algorithm produces. How accurate this score is affects the measured performance of the instance segmentation model.

However, the scoring methods used by predecessors such as Mask R-CNN are not well suited to this.

In instance segmentation, these models output a mask, yet the mask's score is shared with bounding-box object detection: both are computed from the classification confidence of the target region.

This score is not necessarily consistent with the quality of the segmentation mask, so using it to judge mask quality can introduce bias.
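The mismatch described above can be illustrated with a small sketch. It assumes mask quality is measured as the pixel-level intersection over union (IoU) between a predicted binary mask and its ground-truth mask; the helper name `mask_iou`, the toy masks, and the confidence value are made up for illustration and are not taken from the paper or its code.

```python
import numpy as np

def mask_iou(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Pixel-level IoU between two binary masks of the same shape."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0  # both masks empty: treat as no overlap
    return float(np.logical_and(pred, gt).sum() / union)

# Hypothetical ground truth: the object fills a 4x4 square in a 6x6 image.
gt = np.zeros((6, 6), dtype=np.uint8)
gt[1:5, 1:5] = 1

# Hypothetical prediction: only the left half of the object is segmented.
pred = np.zeros((6, 6), dtype=np.uint8)
pred[1:5, 1:3] = 1

cls_confidence = 0.95  # made-up classification confidence for this detection
iou = mask_iou(pred, gt)
print(f"classification confidence = {cls_confidence:.2f}, mask IoU = {iou:.2f}")
```

In this toy case the detection is classified with high confidence, yet its mask covers only half of the object, so ranking it by classification confidence alone would overstate the mask's quality.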

Therefore, this CVPR 2019 paper proposes a new scoring method: scoring the mask itself, which the authors call the mask score.

MS R-CNN architecture

The scoring method proposed in Mask Scoring R-CNN is simple: instead of relying only on the classification score obtained from detection, the model also learns a separate scoring rule for masks, the MaskIoU head.

The MaskIoU head is inspired by the classic evaluation metric AP (average precision), which judges instance segmentation quality by the pixel-level IoU between the predicted mask and its ground-truth mask. The MaskIoU head takes both the output of the mask head and the RoI (Region of Interest) features as input and is trained with a simple regression loss.
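As a rough sketch of what "trained with a simple regression loss" could look like, the snippet below builds a regression target for one RoI and computes a squared-error loss. It assumes the target is the IoU between the thresholded predicted mask and the ground-truth mask, and that the loss is a plain L2 loss; the 0.5 threshold, the function names, and the toy numbers are illustrative assumptions, not the authors' code.

```python
import numpy as np

def maskiou_target(mask_prob: np.ndarray, gt_mask: np.ndarray,
                   threshold: float = 0.5) -> float:
    """Regression target: IoU of the binarized predicted mask with the ground truth.

    The threshold value is an assumption made for this sketch.
    """
    pred = mask_prob > threshold
    gt = gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(pred, gt).sum() / union)

def l2_loss(predicted_iou: float, target_iou: float) -> float:
    """Squared error between the head's IoU estimate and the target IoU."""
    return (predicted_iou - target_iou) ** 2

# Hypothetical 4x4 RoI: the object occupies the top two rows.
gt = np.zeros((4, 4), dtype=np.uint8)
gt[0:2, :] = 1

# Hypothetical mask-head probabilities: most of the object plus a spill below it.
mask_prob = np.full((4, 4), 0.1)
mask_prob[0:2, 0:3] = 0.9   # six pixels inside the object
mask_prob[2, 0:2] = 0.7     # two pixels outside the object

target = maskiou_target(mask_prob, gt)
head_output = 0.8           # made-up IoU estimate from the MaskIoU head
loss = l2_loss(head_output, target)
print(f"target IoU = {target:.2f}, regression loss = {loss:.2f}")
```

During training, a loss like this would push the head's estimate toward the true overlap of each predicted mask, which is what lets it act as a mask quality score at inference time when no ground truth is available.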

Finally, the classification score and the mask quality score are considered together, so each predicted mask is judged both on what it is and on how well it is segmented.
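A minimal sketch of this rescoring step follows. It assumes the two scores are combined by multiplication; the article itself does not state the formula, and the `Instance` class and the example numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Instance:
    label: str
    cls_score: float           # classification confidence from the detector
    predicted_mask_iou: float  # mask quality estimated by the MaskIoU head

    @property
    def mask_score(self) -> float:
        # Assumption for this sketch: the two scores are multiplied.
        return self.cls_score * self.predicted_mask_iou

# Two made-up detections of the same class.
instances = [
    Instance("person", cls_score=0.95, predicted_mask_iou=0.50),
    Instance("person", cls_score=0.80, predicted_mask_iou=0.90),
]

by_cls = sorted(instances, key=lambda inst: inst.cls_score, reverse=True)
by_mask = sorted(instances, key=lambda inst: inst.mask_score, reverse=True)

print("ranked by classification score:", [inst.cls_score for inst in by_cls])
print("ranked by mask score:", [round(inst.mask_score, 3) for inst in by_mask])
```

With these toy numbers the ranking flips: the detection with the better mask moves to the top once mask quality is taken into account, which is the effect an AP-style evaluation rewards.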

With a fairer score for each mask, the measured performance of the instance segmentation model naturally improves.

Experiments on the challenging COCO benchmark show that, when evaluated with the mask score of MS R-CNN, AP consistently improves by nearly 1.5% across different backbone networks.

Better than Mask R-CNN

The table below compares MS R-CNN with other instance segmentation methods on the COCO 2017 test-dev set.

Whether the backbone is a plain ResNet-101 or adds DCN or FPN, the AP of MS R-CNN is consistently higher than that of Mask R-CNN.

On the COCO 2017 validation set, MS R-CNN also scores better than Mask R-CNN:

Who is the author?

The first author, Huang Zhaojin, is a master's student at Huazhong University of Science and Technology. His advisor is Wang Xinggang, an associate professor at the university's School of Telecommunications and also one of the authors of this paper.

The other authors are Chang Huang, Yongchao Gong and Lichao Huang from Horizon Robotics.

If you are interested in this study, here are the links:

Mask Scoring R-CNN paper:

https://arxiv.org/abs/1903.00241

GitHub address:
https://github.com/zjhuang22/maskscoring_rcnn

Other optimization ideas for Mask R-CNN

Before this work, others had already proposed ways to improve Mask R-CNN.

For example, a paper published by the Chinese University of Hong Kong, Peking University, SenseTime, and Tencent Youtu at CVPR 2018 proposed an instance segmentation framework called PANet.

PANet optimizes information propagation in Mask R-CNN: by accelerating the information flow and integrating features from different levels, it improves the quality of the predicted masks.

Without large-batch training, it won the instance segmentation task of the COCO 2017 challenge.

Paper address:

Path Aggregation Network for Instance Segmentation
https://arxiv.org/abs/1803.01534

Code address:
https://github.com/ShuLiu1993/PANet
