Tencent open-sources its old-photo restoration algorithm: detail down to individual hairs, with 3 pre-trained models available for download | GitHub Hot List

Last updated: 2022-03-13

Mingmin from Aofei Temple
Quantum Bit | Public Account QbitAI

Remember GFPGAN, the model that can restore old photos down to the finest detail?

Now, its code is officially open source!

The team has uploaded 3 pre-trained models to GitHub. The three versions compare as follows:

Among them, V1.3 is the most recent version. Its restorations look more natural, and it can produce high-quality results from low-quality inputs.

Since its release, GFPGAN has collected more than 17,000 stars on GitHub and has even topped the trending list.

It also set off a wave of people trying it out on Twitter:

The project comes from the ARC Lab of Tencent PCG, and the accompanying paper was accepted to CVPR 2021.

3 pre-trained models to choose from

The open-source code covers two parts: inference with the pre-trained models, and training.

For inference, the repository takes GFPGAN V1.3 as the example and gives the download command for the pre-trained model:

wget https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth -P experiments/pretrained_models

Then a single command runs inference with the pre-trained model:

python inference_gfpgan.py -i inputs/whole_imgs -o results -v 1.3 -s 2

The full set of options is as follows:

Usage: python inference_gfpgan.py -i inputs/whole_imgs -o results -v 1.3 -s 2 [options]...

  -h                   show this help
  -i input             Input image or folder. Default: inputs/whole_imgs
  -o output            Output folder. Default: results
  -v version           GFPGAN model version. Option: 1 | 1.2 | 1.3. Default: 1.3
  -s upscale           The final upsampling scale of the image. Default: 2
  -bg_upsampler        background upsampler. Default: realesrgan
  -bg_tile             Tile size for background sampler, 0 for no tile during testing. Default: 400
  -suffix              Suffix of the restored faces
  -only_center_face    Only restore the center face
  -aligned             Input are aligned faces
  -ext                 Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto
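For batch jobs it can be handy to assemble this command from a script. The following is a minimal sketch and not part of the GFPGAN repository: the helper name build_inference_cmd is made up for illustration, and it covers only the -i, -o, -v and -s options from the usage text above.

```python
import shlex
from typing import List

# Model versions listed in the usage text for the -v option.
SUPPORTED_VERSIONS = ("1", "1.2", "1.3")


def build_inference_cmd(
    input_path: str = "inputs/whole_imgs",
    output_dir: str = "results",
    version: str = "1.3",
    upscale: int = 2,
) -> List[str]:
    """Assemble the argument list for inference_gfpgan.py (hypothetical helper)."""
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unknown GFPGAN version: {version!r}")
    if upscale < 1:
        raise ValueError("upscale must be a positive integer")
    return [
        "python", "inference_gfpgan.py",
        "-i", input_path,
        "-o", output_dir,
        "-v", version,
        "-s", str(upscale),
    ]


if __name__ == "__main__":
    # Only prints the command. To execute it, pass the list to
    # subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(build_inference_cmd(version="1.2")))
```

Returning a list instead of a single string keeps paths with spaces intact when the command is later handed to subprocess.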

The team also describes how the three pre-trained models differ.

Compared with the initial version, the two later versions restore faces with noticeably higher accuracy.

V1.2 sharpens more aggressively and also applies some beautification, so its results can look fake in some cases.

V1.3 clearly addresses this problem: its output is more natural, and it can be run a second time on its own results. The downside is that facial features sometimes change (as in the Anne Hathaway example below).

In short, V1.3 is not strictly better than V1.2, and you can choose whichever model suits your needs.

Now comes the training part.

First, the training dataset is FFHQ.

Then, put the downloaded pre-trained model and the other required data in the experiments/pretrained_models folder.

The other data include:

A pre-trained StyleGAN2 model, the FFHQ face alignment model file, and an ArcFace model.

Next, modify the configuration file options/train_gfpgan_v1.yml accordingly.

You can also try a simpler version that does not require face alignment, options/train_gfpgan_v1_simple.yml.

Finally, you can start training.

python -m torch.distributed.launch --nproc_per_node=4 --master_port=22021 gfpgan/train.py -opt options/train_gfpgan_v1.yml --launcher pytorch
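Before launching a multi-GPU job, a quick check that the files from the steps above are in place can save a failed start. This is a minimal sketch rather than anything shipped with GFPGAN: the helper missing_training_files is hypothetical, and the weight file names in the usage part are placeholders, since the article does not list the actual file names.

```python
from pathlib import Path
from typing import Iterable, List


def missing_training_files(root: Path, required: Iterable[str]) -> List[str]:
    """Return the relative paths from `required` that are not files under `root`."""
    return [rel for rel in required if not (root / rel).is_file()]


if __name__ == "__main__":
    # Placeholder names: substitute the files you actually downloaded.
    required = [
        "options/train_gfpgan_v1.yml",
        "experiments/pretrained_models/stylegan2_weights.pth",  # placeholder
        "experiments/pretrained_models/ffhq_alignment.pth",     # placeholder
        "experiments/pretrained_models/arcface_weights.pth",    # placeholder
    ]
    missing = missing_training_files(Path("."), required)
    if missing:
        print("Missing before training can start:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All required files found.")
```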

The authors also offer two tips.

First, using more high-quality face images can improve restoration quality.

Second, some image pre-processing, such as beautification, may be needed for training.

If you want to train V1.2, the repository also provides fine-tuning guidance:

GFPGAN V1.2 uses a clean architecture, which is easier to deploy. It is converted from a bilinear model, so you need to fine-tune the original model first and then do the conversion.

Demo

Besides the open-source code, the team has also made several online demos available.

Here, we use the HuggingFace demo to show the results.

Let's first look at the restored Mona Lisa. Not only has the noise on her face been removed, but even the veil over her hair is clearly visible.

In the restored Einstein photo, the wrinkles on his smiling face are more pronounced, and his hair and stubble have been restored as well.

Finally, let's look at a restored photo of a young Ma Huateng. It is so sharp that it could have been taken yesterday.

Blind face restoration + a lot of prior information

GFPGAN can restore all kinds of face images quickly and accurately, mainly because it applies blind face restoration.

Traditional face restoration methods mostly focus on face images with one specific type of degradation in a fixed setting.

For example, an earlier face restoration method turned a photo of Obama into a white face. Besides bias in the dataset, this may also be because the algorithm did not model the characteristics of each individual face.

Blind face restoration handles this problem well. It refers to recovering a clear, high-quality face image from a low-quality input when the point spread function, that is, the way the image was degraded, is unknown or uncertain.

It is essentially a non-matching face restoration method.
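One way to see what an unknown degradation means in practice is the way training inputs can be synthesized. The formula below is written from the GFPGAN paper's description rather than from this article, so treat the notation as a sketch: y is a high-quality face, x its degraded version, k_sigma a Gaussian blur kernel, the down arrow a downsampling by factor r, n_delta additive Gaussian noise, and JPEG_q compression with quality q.

```latex
x = \left[ (y \circledast k_{\sigma}) \downarrow_{r} + \, n_{\delta} \right]_{\mathrm{JPEG}_{q}}
```

Because sigma, r, delta and q are sampled at random for each training image, the network cannot rely on one fixed, known degradation.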

However, some earlier blind face restoration methods handled fine details poorly, so the authors bring rich prior information into GFPGAN to ensure high-quality output.

Specifically, the GFP-GAN framework consists mainly of a degradation removal module and a pre-trained GAN that serves as the prior.

The two modules are connected through a latent code mapping and several channel-split spatial feature transform (CS-SFT) layers.
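As a rough illustration of the channel-split idea, following the paper's description of CS-SFT rather than the repository code: half of the GAN feature channels pass through unchanged, and the other half is scaled and shifted element-wise by maps (called alpha and beta here) predicted from the degraded input. The function name and the nested-list representation are assumptions made to keep the sketch dependency-free; the real model works on tensors.

```python
from typing import List

Channel = List[List[float]]  # one H x W feature map


def cs_sft(features: List[Channel], alpha: List[Channel], beta: List[Channel]) -> List[Channel]:
    """Channel-split spatial feature transform on a list of channels.

    The first half of the channels is kept as-is (identity branch);
    the second half is modulated element-wise as alpha * f + beta.
    """
    split = len(features) // 2
    identity_part = features[:split]
    to_modulate = features[split:]
    if not (len(alpha) == len(beta) == len(to_modulate)):
        raise ValueError("alpha and beta need one map per modulated channel")
    modulated = [
        [
            [a * f + b for f, a, b in zip(f_row, a_row, b_row)]
            for f_row, a_row, b_row in zip(f_ch, a_ch, b_ch)
        ]
        for f_ch, a_ch, b_ch in zip(to_modulate, alpha, beta)
    ]
    return identity_part + modulated


if __name__ == "__main__":
    # Two channels of size 1 x 2: the first is kept, the second is modulated.
    feats = [[[1.0, 2.0]], [[3.0, 4.0]]]
    print(cs_sft(feats, alpha=[[[2.0, 0.5]]], beta=[[[1.0, 0.0]]]))
```

Keeping one half untouched preserves what the pre-trained GAN already generates well, while the modulated half carries the information from the input face.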

During training, the low-quality face first goes through coarse processing such as denoising, while the facial information is preserved.

For fidelity, the researchers introduce a Facial Component Loss to decide which details should be enhanced and kept, and an Identity Preserving Loss to keep the restored face consistent with the identity of the original.
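The idea behind the Identity Preserving Loss can be sketched in a few lines. The assumption here, based on the paper rather than on this article, is that a face recognition network such as the ArcFace model listed among the training files turns each face into an embedding vector, and the loss penalizes the distance between the embeddings of the restored face and the ground-truth face. The function name, the weight parameter, and the use of a mean absolute difference are illustrative choices.

```python
from typing import Sequence


def identity_preserving_loss(
    restored_embedding: Sequence[float],
    target_embedding: Sequence[float],
    weight: float = 1.0,
) -> float:
    """Weighted mean absolute difference between two face embeddings."""
    if len(restored_embedding) != len(target_embedding):
        raise ValueError("embeddings must have the same length")
    if not restored_embedding:
        raise ValueError("embeddings must not be empty")
    total = sum(abs(r - t) for r, t in zip(restored_embedding, target_embedding))
    return weight * total / len(restored_embedding)


if __name__ == "__main__":
    # Identical embeddings give zero loss; the loss grows as they drift apart.
    print(identity_preserving_loss([1.0, 2.0], [1.0, 2.0]))
    print(identity_preserving_loss([1.0, 3.0], [0.0, 1.0]))
```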

Team introduction

The first author of this paper is Xintao Wang, a researcher at Tencent ARC Lab (Shenzhen Applied Research Center).

He received his bachelor's degree from Zhejiang University and his doctorate from the Chinese University of Hong Kong.

During his doctoral studies, he studied under Professor Tang Xiaoou and Professor Chen Change Loy.

His research interests include computer vision and deep learning, with a particular focus on image and video restoration.

GitHub address:
https://github.com/TencentARC/GFPGAN

Paper address:
https://arxiv.org/abs/2101.04061

Trial address:
https://huggingface.co/spaces/akhaliq/GFPGAN


This article is original content from Quantum Bit (QbitAI), a signed account in NetEase News' special content incentive program. Reproduction without the account's authorization is prohibited.
