3D Reconstruction for Motion Blurred Images Using Deep Learning-Based Intelligent Systems

2021-12-15 12:48
Computers, Materials & Continua, 2021, Issue 2

Jing Zhang, Keping Yu, Zheng Wen, Xin Qi and Anup Kumar Paul

1 Tamoritsusho Co., Ltd., Tokyo, 110-0005, Japan

2 Department of Computer Science and Technology, Xi’an University of Science and Technology, Xi’an, 710054, China

3 Global Information and Telecommunication Institute, Waseda University, Tokyo, 169-8050, Japan

4 School of Fundamental Science and Engineering, Waseda University, Tokyo, 169-8050, Japan

5 Department of Electronics and Communications Engineering, East West University, Dhaka, 1212, Bangladesh

Abstract: 3D reconstruction using deep learning-based intelligent systems can help measure an individual’s height and shape quickly and accurately from 2D motion-blurred images. During real-time image acquisition, motion blur caused by camera shaking or human motion commonly appears. Deep learning-based intelligent control applied to vision can help solve this problem. To this end, we propose a 3D reconstruction method for motion-blurred images using deep learning. First, we develop a BF-WGAN algorithm that combines the bilateral filtering (BF) denoising theory with a Wasserstein generative adversarial network (WGAN) to remove motion blur. The bilateral filter denoising algorithm removes the noise while retaining the details of the blurred image. Then, the blurred image and the corresponding sharp image are input into the WGAN. This algorithm distinguishes the motion-blurred image from the corresponding sharp image according to the WGAN loss and perceptual loss functions. Next, we use the deblurred images generated by the BF-WGAN algorithm for 3D reconstruction. We propose a threshold optimization random sample consensus (TO-RANSAC) algorithm that can remove the wrong relationships between two views in the 3D reconstructed model relatively accurately. Compared with the traditional RANSAC algorithm, the TO-RANSAC algorithm adjusts the threshold adaptively, which improves the accuracy of the 3D reconstruction results. The experimental results show that our BF-WGAN algorithm has a better deblurring effect and higher efficiency than other representative algorithms. In addition, the TO-RANSAC algorithm yields a calculation accuracy considerably higher than that of the traditional RANSAC algorithm.

Keywords: 3D reconstruction; motion blurring; deep learning; intelligent systems; bilateral filtering; random sample consensus

1 Introduction

Real-time images are easily blurred by factors such as camera shaking and human motion. For a good visual effect, it is very important to remove the blur and obtain a sharp image [1]. “Intelligent” solutions, which use effective critical-thinking procedures to restore the sharp image, are essential for solving the blurring problem. Most existing image deblurring methods are based on an image prior probability model. Krishnan et al. [2] assumed that the image gradient obeys the Laplace distribution, and Zoran et al. [3] simulated the distribution of the image gradient with a Gaussian mixture model. In these prior-based methods, image details overlap with noise in the frequency domain or transform domain, so texture structures are excessively smoothed, which greatly reduces the visual effect. In recent years, many scholars have applied deep learning to image deblurring. Xu et al. [4] proposed an image deblurring method based on a convolutional neural network (CNN) to overcome the ringing effect in saturated regions of images. Chakrabarti [5] predicted complex Fourier coefficients of motion kernels to perform non-blind deblurring in the Fourier space. Gong et al. [6] used a fully convolutional network for motion flow estimation. Nah et al. [7] adopted a kernel-free end-to-end approach that uses a multiscale CNN to directly deblur the image. However, CNN methods consider the prior features of the image only indirectly, and these features are easily affected by noise.

To solve the problems of the existing deep learning algorithms, we propose a BF-WGAN algorithm, which combines the bilateral filtering (BF) [8] denoising theory with the Wasserstein generative adversarial network (WGAN) [9], to remove motion blur from images. The BF-WGAN algorithm contains two parts. First, the bilateral filter denoising algorithm removes the noise while retaining the details of the blurred image. The advantage of the bilateral filter is that it considers not only the spatial distance between pixels but also the degree of similarity between them, which ensures that the pixel values near edges are preserved. Second, the blurred image and the corresponding sharp image are input into the WGAN. This algorithm distinguishes the motion-blurred image from the corresponding sharp image according to the WGAN loss and perceptual loss [10] functions, which allows the finer texture-related details to be restored and the high-precision contours of the image to be revealed. Further, the BF-WGAN has fewer parameters than the multiscale CNN, which considerably speeds up inference. Therefore, the BF-WGAN obtains state-of-the-art results in motion deblurring while being faster than its closest competitor, the CNN.

3D reconstruction of the human body is very useful for the rapid and accurate measurement of an individual’s height and body shape [11]. With 2D real-time images of the human body taken from different angles, 3D reconstruction technology can quickly and accurately provide information on the growth of children. At present, it is estimated that there are approximately 149 million children under the age of 6 with physical dysplasia worldwide. A child’s height and shape can directly reflect his or her magnitude of growth [12]. Because many children need to be evaluated, the traditional manual measurement methods for height and body shape require considerable manpower and time.

For the 3D reconstruction of motion-blurred images, we use the deblurred images generated with the BF-WGAN algorithm. The most important part of 3D reconstruction is the calculation of the camera parameters, which mainly include the global rotation matrix and global translation vector for multiview 3D reconstruction [13]. The global rotation matrix is used to remove the wrong relationships between two views in the 3D reconstructed model. A commonly used method to calculate the global rotation matrix is the RANSAC algorithm [14]. However, the traditional RANSAC algorithm uses a fixed threshold, which can affect the accuracy of the global rotation matrix. This paper proposes a threshold optimization random sample consensus (TO-RANSAC) algorithm that adjusts the threshold adaptively to improve the accuracy of the 3D reconstruction results.

The contributions of this paper are listed as follows:

a) We use deep learning-based intelligent systems to remove the motion blur in images. The BF-WGAN algorithm is proposed, which combines the BF denoising theory with the WGAN. The BF denoising algorithm removes the noise while retaining the details of the blurred image. The blurred image and the corresponding sharp image are then input into the WGAN. The BF-WGAN algorithm has a better deblurring effect and higher efficiency than other representative algorithms.

b) We adopt the deblurred images generated by the BF-WGAN algorithm to perform the 3D reconstruction. The TO-RANSAC algorithm is proposed, which can remove the wrong relationships between two views in the 3D reconstructed model relatively accurately. Compared with the traditional RANSAC algorithm, the TO-RANSAC algorithm adjusts the threshold adaptively, which improves the accuracy of the 3D reconstruction results.

The remainder of this paper is organized as follows: Section 2 consists of two parts. Part 2.1 presents the deep learning-based intelligent system that removes the motion blur of images through the BF-WGAN algorithm, and Part 2.2 explains the TO-RANSAC algorithm that we use to perform the 3D reconstruction. In Section 3, we design and evaluate experiments to test the performance of the BF-WGAN algorithm and the TO-RANSAC algorithm. In Section 4, we conclude our study and suggest directions for future work.

2 Our Approach

2.1 BF-WGAN Algorithm

Normally, the result of image processing depends on image quality, and a captured image of poor quality may lead to errors. Intelligent systems that use intelligent decision-making algorithms and techniques can help solve the image blurring problem.

In a mathematical model, image blurring can be described as a convolution process. The original sharp image x is convolved with the blurring kernel k, and noise n is added. Then, we obtain the blurred image y [15]:

$$y = x * k + n \tag{1}$$

where * is the convolution operator.
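The blur model described above (a sharp image convolved with a kernel, plus additive noise) can be sketched in a few lines. This is a minimal, dependency-free illustration on nested lists, not the authors' implementation; the function names `convolve2d_same` and `blur_with_noise`, the zero padding at the borders, and the Gaussian form of the noise are assumptions made for the example.

```python
import random


def convolve2d_same(image, kernel):
    """2D convolution with zero padding; the output has the shape of `image`."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ch, cw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    # Convolution reads the image at mirrored kernel offsets.
                    ii, jj = i + ch - a, j + cw - b
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += image[ii][jj] * kernel[a][b]
            out[i][j] = acc
    return out


def blur_with_noise(sharp, kernel, noise_std, rng):
    """Blurred image = sharp image convolved with the kernel, plus noise."""
    blurred = convolve2d_same(sharp, kernel)
    # Assumption: the additive noise is zero-mean Gaussian.
    return [[v + rng.gauss(0.0, noise_std) for v in row] for row in blurred]


# A single bright pixel smeared by a 3-pixel horizontal motion kernel.
sharp = [[0.0] * 5 for _ in range(5)]
sharp[2][2] = 1.0
kernel = [[1 / 3, 1 / 3, 1 / 3]]
noise_free = blur_with_noise(sharp, kernel, 0.0, random.Random(0))
```

With the noise level set to zero, the bright pixel is spread evenly over three horizontal neighbors, which is the smearing that a deblurring method has to undo.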

2.1.1 Bilateral Filter Denoising Algorithm

A bilateral filter is a nonlinear denoising algorithm that eliminates noise while preserving image details [16]. The general Gaussian filter mainly considers the spatial distance between pixels when sampling but does not consider the degree of similarity between pixels [17]. Compared with the Gaussian filter, the bilateral filter considers both the spatial distance and the degree of similarity, thereby suppressing the irrelevant details and enhancing the sharp edges of the image.

Step 1: Compute the Gaussian weight region filter based on the spatial distances:

$$h(x) = k_d^{-1}(x) \int f(\xi)\, c(\xi, x)\, d\xi \tag{2}$$

where f(ξ) and h(x) represent the input image and output image, respectively. ξ is a point in the neighborhood centered on x. c(ξ, x) is the Gaussian weight based on the spatial distance between the center x and the point ξ:

$$c(\xi, x) = \exp\left(-\frac{1}{2}\left(\frac{\lVert \xi - x \rVert}{\sigma_d}\right)^{2}\right) \tag{3}$$

where σ_d is the standard deviation. k_d(x) is the normalization factor:

$$k_d(x) = \int c(\xi, x)\, d\xi \tag{4}$$

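Step 2: Obtain the edge filter based on the degree of similarity:

$$h(x) = k_r^{-1}(x) \int f(\xi)\, s(f(\xi), f(x))\, d\xi \tag{5}$$

where s(f(ξ), f(x)) is the weight based on the degree of similarity between pixels:

$$s(f(\xi), f(x)) = \exp\left(-\frac{1}{2}\left(\frac{\gamma(f(\xi), f(x))}{\sigma_\gamma}\right)^{2}\right) \tag{6}$$

where γ(f(ξ), f(x)) = γ(f(ξ) − f(x)) = ‖f(ξ) − f(x)‖ and σ_γ is the standard deviation. k_r(x) is the normalization factor:

$$k_r(x) = \int s(f(\xi), f(x))\, d\xi \tag{7}$$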

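Step 3: Create the bilateral filter by combining the Gaussian weight region filter with the edge filter:

$$h(x) = k^{-1}(x) \int f(\xi)\, c(\xi, x)\, s(f(\xi), f(x))\, d\xi \tag{8}$$

where k(x) is the normalization factor:

$$k(x) = \int c(\xi, x)\, s(f(\xi), f(x))\, d\xi \tag{9}$$

After the local subregion Ω is defined, the discretized form of formula (8) can be expressed as follows:

$$h(x) = \frac{\sum_{\xi \in \Omega} f(\xi)\, c(\xi, x)\, s(f(\xi), f(x))}{\sum_{\xi \in \Omega} c(\xi, x)\, s(f(\xi), f(x))} \tag{10}$$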
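The three steps above can be sketched as a discrete bilateral filter over a square subregion Ω. This is a minimal illustration on a grayscale image stored as nested lists, assuming Gaussian weights for both the spatial and the similarity terms; the function name `bilateral_filter` and the parameters `radius`, `sigma_d` and `sigma_r` are hypothetical names chosen for the example, and neighbors outside the image are simply skipped.

```python
import math


def bilateral_filter(image, radius, sigma_d, sigma_r):
    """Discrete bilateral filter on a 2D grayscale image (list of rows)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            num = 0.0  # weighted sum of neighbor intensities
            den = 0.0  # normalization factor k(x)
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ii, jj = i + di, j + dj
                    if not (0 <= ii < h and 0 <= jj < w):
                        continue  # skip neighbors outside the image
                    # Spatial weight c: depends only on pixel distance.
                    c = math.exp(-0.5 * (di * di + dj * dj) / sigma_d ** 2)
                    # Similarity weight s: depends on intensity difference.
                    diff = image[ii][jj] - image[i][j]
                    s = math.exp(-0.5 * (diff / sigma_r) ** 2)
                    num += image[ii][jj] * c * s
                    den += c * s
            out[i][j] = num / den  # den >= 1 because the center has c = s = 1
    return out


# A vertical step edge: the filter should leave the edge intact.
step = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(4)]
filtered = bilateral_filter(step, radius=2, sigma_d=2.0, sigma_r=0.1)
```

Because the similarity weight is nearly zero across the step, pixels on one side of the edge barely contribute to pixels on the other side, which is why the edge survives while flat regions are averaged.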

2.1.2 WGAN Deblurring Algorithm

This paper proposes a WGAN deblurring algorithm that adopts both the WGAN loss and perceptual loss functions [18]. The WGAN loss function ensures that the generated samples are diverse, thereby allowing the fine texture-related details to be restored. The input and output results of the WGAN deblurring algorithm are shown in Fig. 1. The input is the motion-blurred image, and the output result is the deblurred image [19].

The WGAN objective between the generator G and the discriminator D is the minimax value obtained using the Kantorovich-Rubinstein duality [20]:

$$\min_{G} \max_{D \in \Gamma}\; \mathbb{E}_{x \sim P_r}\left[D(x)\right] - \mathbb{E}_{\tilde{x} \sim P_g}\left[D(\tilde{x})\right] \tag{11}$$

where x represents the original sharp image and E represents the expectation. Γ is the set of 1-Lipschitz functions. P_r is the data distribution, and P_g is the model distribution, defined by x̃ = G(z), where the input z represents the blurred image. D(x) is the discriminator's score for x; the higher the score, the more likely x is judged to be a real image.

Figure 1: Input and output results of the WGAN deblurring algorithm

① WGAN Framework

As shown in Fig. 2, the framework of the WGAN deblurring algorithm consists of a generator and a discriminator [21].

Figure 2: Framework of the WGAN deblurring algorithm

② Loss Function

The loss function in this paper consists of the WGAN loss and perceptual loss functions. The total loss function L is defined as follows:

where λ = 0.01 is set empirically.

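WGAN loss. The WGAN loss L_WGAN is calculated as follows:

$$L_{WGAN} = \sum_{n=1}^{N} -D_{\theta_D}\left(G_{\theta_G}\left(I^{B}\right)\right) \tag{13}$$

where I^B represents the blurred image. Deblurring is performed by the trained generator G_θG and discriminator D_θD. N represents the size of the training data [22].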

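Perceptual loss. The perceptual loss function is defined as follows:

$$L_{perceptual} = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left(\phi_{i,j}\left(I^{S}\right)_{x,y} - \phi_{i,j}\left(G_{\theta_G}\left(I^{B}\right)\right)_{x,y}\right)^{2} \tag{14}$$

where W_i,j and H_i,j are the dimensions of the feature maps. ϕ_i,j is the feature map obtained by the j-th convolution before the i-th maxpooling layer within the VGG19 network [23]. I^S represents the sharp image, and I^B represents the blurred image.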
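To make the two loss terms concrete, the sketch below evaluates them on plain numbers. It assumes the common formulation in which the WGAN loss is the negated critic score summed over the generated images and the perceptual loss is the mean squared difference between two feature maps. Placing the weight λ on the perceptual term is also an assumption made for this example, since the combined formula is not reproduced here. The function names are hypothetical, and the feature maps are passed in directly rather than computed with VGG19.

```python
def wgan_generator_loss(critic_scores_fake):
    """Sum of negated critic scores D(G(I_B)) over the training images."""
    return sum(-score for score in critic_scores_fake)


def perceptual_loss(feat_sharp, feat_restored):
    """Mean squared difference between two feature maps of equal size."""
    height = len(feat_sharp)
    width = len(feat_sharp[0])
    total = 0.0
    for row_s, row_r in zip(feat_sharp, feat_restored):
        for a, b in zip(row_s, row_r):
            total += (a - b) ** 2
    return total / (width * height)


def total_loss(critic_scores_fake, feat_sharp, feat_restored, lam=0.01):
    """Combined loss. ASSUMPTION: lam weights the perceptual term."""
    return (wgan_generator_loss(critic_scores_fake)
            + lam * perceptual_loss(feat_sharp, feat_restored))


# Toy values standing in for critic outputs and VGG19 feature maps.
scores = [0.5, -1.0]
phi_sharp = [[1.0, 2.0], [3.0, 4.0]]
phi_restored = [[1.0, 2.0], [3.0, 2.0]]
loss = total_loss(scores, phi_sharp, phi_restored)
```

In a real training loop the critic scores would come from the discriminator network and the feature maps from a fixed, pretrained VGG19 layer; only the arithmetic is shown here.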

2.2 Multi-view 3D Reconstruction Based on the TO-RANSAC Algorithm

Multiview 3D reconstruction is mainly composed of four parts: (1) feature extraction and matching; (2) camera parameter calculation; (3) 3D point cloud calculation; and (4) bundle adjustment. The camera parameter calculation mainly involves the global rotation matrix and global translation vector for multiview 3D reconstruction [24]. The global rotation matrix is used to remove the wrong relationships between two views in the 3D reconstructed model.

The most commonly used method to calculate the global rotation matrix is the RANSAC algorithm [25]. However, the traditional RANSAC algorithm adopts a fixed threshold, which can affect the accuracy of the global rotation matrix. To improve the calculation accuracy, a threshold optimization random sample consensus (TO-RANSAC) algorithm is proposed. The TO-RANSAC algorithm adjusts the threshold adaptively, which prevents errors caused by different thresholds in the 3D reconstruction results.

The global rotation matrices are calculated from the relative rotation matrices through a least-squares optimization algorithm. The formula is shown in (15):

where R_ij is a known relative rotation matrix, and R_i and R_j are the two global rotation matrices that need to be calculated. Before the global rotation matrices are calculated, the wrong relationships between two views need to be removed. Then, the global rotation matrices can be calculated with formula (15).

This paper proposes a TO-RANSAC algorithm to remove the wrong relationships between two views in the 3D reconstructed model. TO-RANSAC is a combination of the RANSAC algorithm and the threshold optimization concept. In the traditional RANSAC algorithm, different threshold parameters lead to different results. To avoid this problem, the TO-RANSAC algorithm determines whether the model is reliable on the basis of the NFA (number of false alarms) value [26]. Generally, the smaller the value of NFA, the more reliable the model is. The calculation formula is:

where M is the calculated model parameter, k is the number of assumed correct samples, n_0 is the number of possible models, n is the total number of samples, n_s is the minimum number of samples used to generate the model M, l_k(M) is the k-th smallest error for the model M, and α_0 is the probability that the random error is 1.

The flowchart of the TO-RANSAC algorithm is shown in Fig. 3. The TO-RANSAC algorithm consists of five steps, which are described below:

Figure 3: Flowchart of the TO-RANSAC algorithm

Step 1: Determine the number of sampling iterations N. We use formula (17) to determine N:

$$N = \frac{\log(1 - p)}{\log\left(1 - \varepsilon^{q}\right)} \tag{17}$$

where p is the confidence value, which was set to p = 0.99; q is the minimum number of samples required to calculate the model, which was set to q = 3; and ε is the inlier (interior point) rate, which was set to ε = 0.95.
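As a worked example of Step 1, the standard RANSAC iteration count can be evaluated for the stated settings p = 0.99, q = 3 and ε = 0.95. The function name `ransac_iterations` is hypothetical, and rounding up to the next integer is an assumption, since the text does not say how a fractional value is handled.

```python
import math


def ransac_iterations(p, eps, q):
    """Number of random samples needed so that, with confidence p, at least
    one sample of q items contains only inliers (inlier rate eps)."""
    raw = math.log(1.0 - p) / math.log(1.0 - eps ** q)
    return math.ceil(raw)  # assumption: round up to a whole iteration


n_iter = ransac_iterations(p=0.99, eps=0.95, q=3)
```

With an inlier rate this high, a sample of three edges is almost always clean, so only a handful of iterations is required; the same function gives 35 iterations when the inlier rate drops to 0.5.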

Step 2: Calculate the initial global rotation matrix. Formula (16) is used, where M represents the initial global rotation matrix, which is calculated from a random spanning tree; n is the number of all two-view relationships, and n_s is the number of edges on the random spanning tree.

Step 3: Calculate the errors for the remaining edges and sort the edges by the magnitude of the error. The error is calculated as the angle difference between the relative rotation matrix and the global rotation matrix, and the formula used is:

In formula (18), D(a, b) is the angle between the vectors a and b.

Step 4: Calculate the value of NFA(M, k) and update its minimum value. If N > 0, the sampling count N is reduced by 1 and the algorithm returns to Step 2; otherwise, the algorithm proceeds to Step 5.

Step 5: Select the edge set that minimizes the value of NFA(M, k) as the set of correct two-view relationships.
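The adaptive threshold selection in Steps 3 to 5 can be illustrated by scanning the sorted errors and keeping the k that minimizes NFA. Since the NFA formula itself is not reproduced here, the sketch assumes the a contrario form used in the AC-RANSAC literature: the number of tested models times binomial terms times (α_0 · error^d) raised to the power k − n_s, evaluated in log10. The error dimension `d`, the parameter `log10_alpha0`, the function names, and the requirement that errors are pre-normalized to (0, 1] are all assumptions for this example.

```python
import math


def log10_comb(n, k):
    """log10 of the binomial coefficient C(n, k), via log-gamma."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1)
            - math.lgamma(n - k + 1)) / math.log(10)


def select_inliers_by_nfa(errors, n_s, n_models=1, log10_alpha0=0.0, d=1):
    """Return (log10 NFA, k, threshold) for the k that minimizes NFA.

    ASSUMPTIONS: errors are normalized to (0, 1]; the n_s sample edges are
    included in `errors` with zero error; the NFA form is the a contrario
    one described in the lead-in, not necessarily the paper's formula.
    """
    n = len(errors)
    sorted_err = sorted(errors)
    best = None
    for k in range(n_s + 1, n + 1):
        e_k = max(sorted_err[k - 1], 1e-12)  # k-th smallest error, clamped
        log_nfa = (math.log10(n_models * (n - n_s))
                   + log10_comb(n, k) + log10_comb(k, n_s)
                   + (k - n_s) * (log10_alpha0 + d * math.log10(e_k)))
        if best is None or log_nfa < best[0]:
            best = (log_nfa, k, sorted_err[k - 1])
    return best


# Two spanning-tree edges, six consistent edges and two wrong edges.
edge_errors = [0.0, 0.0] + [0.001] * 6 + [0.5, 0.9]
best_log_nfa, best_k, threshold = select_inliers_by_nfa(edge_errors, n_s=2)
```

No threshold is supplied by the caller: the cut between consistent and wrong edges falls where adding the next edge would make the model less meaningful, which is the behavior the fixed-threshold RANSAC lacks.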

3 Experiments

To evaluate the performance of our approach, we collected 3000 real-time images of children from a kindergarten. There were 100 children aged 2 to 6 years, including 50 female students and 50 male students. A total of 30 real-time images in JPG format were collected for each student. To evaluate the 3D reconstruction method for motion-blurred images, we ran the simulation in three parts. First, simulated noise images and blurred images were generated on a ThinkPad S3-490 computer [27] using MATLAB 2018b. Second, the BF-WGAN algorithm was implemented in Python and run on a GeForce RTX 2080Ti GPU. Third, the TO-RANSAC algorithm was run on the deblurred images on a ThinkPad S3-490 computer using MATLAB 2018b.

The children were aged from 2 to 6 years, and one student of each age was selected as an example. Fig. 4 shows the original sharp images of five students from five different angles. The first child was a boy who was 2 years old, and his height was 80.3 cm. The second child was a girl who was 3 years old, and her height was 92.4 cm. The third child was a girl who was 4 years old, and her height was 101.7 cm. The fourth child was a girl who was 5 years old, and her height was 112.3 cm. The fifth child was a boy who was 6 years old, and his height was 123.1 cm. The size of the original sharp images was 512 × 512 pixels.

3.1 Generation of Simulated Noise and Blurred Images

We chose the images of a 2-year-old boy and a 4-year-old girl for the simulation experiment. For the generation of simulated noise and blurred images, we mainly considered two aspects: the image noise parameters and the motion blur parameters.

3.1.1 Image Noise Parameters

Gaussian noise is a common type of noise that occurs with camera shaking [28]. The MATLAB library includes a function that adds noise to an image, the imnoise function. We used the imnoise function to add Gaussian noise to the image. Fig. 5 shows the Gaussian noise images with variances V = 0.01, V = 0.008 and V = 0.04.
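For readers working outside MATLAB, the noise step can be approximated as follows. This is a rough Python stand-in rather than a reimplementation of imnoise: it assumes intensities scaled to [0, 1], zero-mean noise with variance V, and clipping of the result back to [0, 1]. The function name `add_gaussian_noise` is hypothetical.

```python
import random


def add_gaussian_noise(image, variance, rng=None):
    """Add zero-mean Gaussian noise to an image with values in [0, 1]."""
    rng = rng or random.Random()
    std = variance ** 0.5
    noisy = []
    for row in image:
        # Clip each noisy pixel back into the valid intensity range.
        noisy.append([min(1.0, max(0.0, v + rng.gauss(0.0, std))) for v in row])
    return noisy


# Mid-gray test image with the variance V = 0.01 used in the experiments.
gray = [[0.5] * 100 for _ in range(100)]
noisy_gray = add_gaussian_noise(gray, 0.01, random.Random(42))
```

A fixed seed is passed here only so that the example is repeatable; the variance values 0.01, 0.008 and 0.04 from the text can be substituted directly.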

3.1.2 Motion Blur Parameters

We used the MATLAB fspecial function to blur the image and mainly considered two aspects: the blur angle and the blur amplitude. To study the blur angle, the blur amplitude was set to 15 pixels, and the blur angles studied were 30°, 45°, and 60°. Fig. 6 shows the generated images of the two students with different blur angles.
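A linear motion kernel with a given amplitude and angle can be approximated by rasterizing a line segment onto a small grid. The sketch below is a simplified stand-in, assuming nearest-pixel rasterization and uniform weights; MATLAB's own motion kernel uses a different weighting, so the values will not match it exactly. The function name `motion_kernel` is hypothetical.

```python
import math


def motion_kernel(length, angle_deg):
    """Normalized line-shaped kernel of `length` pixels at `angle_deg`
    degrees, measured counterclockwise from the horizontal axis."""
    size = length if length % 2 == 1 else length + 1  # odd size, clear center
    center = size // 2
    kernel = [[0.0] * size for _ in range(size)]
    theta = math.radians(angle_deg)
    dx, dy = math.cos(theta), -math.sin(theta)  # image rows grow downward
    half = (length - 1) / 2.0
    steps = 4 * length  # oversample the segment so no pixel is skipped
    for s in range(steps + 1):
        t = -half + 2.0 * half * s / steps
        col = int(round(center + t * dx))
        row = int(round(center + t * dy))
        kernel[row][col] = 1.0
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


horizontal = motion_kernel(5, 0)
diagonal = motion_kernel(15, 45)  # amplitude 15 px, angle 45 degrees
```

Convolving a sharp image with such a kernel produces the directional smear shown in the blurred test images; the kernel sums to one so overall brightness is preserved.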

3.2 Experiment of BF-WGAN Algorithm

3.2.1 Qualitative Evaluation

Fig. 7 shows the image restoration results for a noisy, blurred image. The input image has Gaussian noise with variance V = 0.01, a blur amplitude of 15 pixels and a blur angle of 45°. Fig. 7 shows that the BF-WGAN algorithm effectively removes the noise and restores the fine texture-related details.

We compared our algorithm with other image deblurring algorithms, including Xu L’s algorithm [4], Chakrabarti A’s algorithm [5], Gong D’s algorithm [6] and Nah S’s algorithm [7]. Figs. 8 and 9 show the results of the comparison of the different algorithms on the noisy, blurred images of two students. (a) is the original sharp image. (b) is the image with Gaussian noise variance V = 0.008, a blur amplitude of 20 pixels and a blur angle of 60°. (c) is the image restored with our BF-WGAN algorithm. (d) is the image restored with Xu L’s algorithm. (e) is the image restored with Chakrabarti A’s algorithm. (f) is the image restored with Gong D’s algorithm. (g) is the image restored with Nah S’s algorithm. Compared with the other four algorithms, our algorithm yielded the largest degree of restoration of the edge blur of the image, and the resulting image was the most similar to the original sharp image.

Figure 4: Original sharp images of five students

3.2.2 Quantitative Evaluation

① Time Contrast Experiment

For the time contrast experiment of image deblurring, the images of a 2-year-old boy and a 4-year-old girl were selected. The experiment was repeated 3 times for each group, and the average value of the three measurements was used for analysis.

Figure 5: Gaussian noise images with different variances. (a) Original sharp image. (b) V = 0.01. (c) V = 0.008. (d) V = 0.04

Figure 6: Images of the two students with different blur angles. (a) Original sharp image. (b) Blur angle of 30°. (c) Blur angle of 45°. (d) Blur angle of 60°

Figure 7: Image restoration results with noise and blurred images. (a) Original sharp image. (b) Noise and blurred image. (c) Restored image

Figure 8: Image restoration results with various algorithms for a male student. (a) Original sharp image. (b) Noise and blurred image. (c) BF-WGAN approach. (d) Xu L’s algorithm. (e) Chakrabarti A’s algorithm. (f) Gong D’s algorithm. (g) Nah S’s algorithm

Figure 9: Image restoration results with various algorithms for a female student. (a) Original sharp image. (b) Noise and blurred image. (c) BF-WGAN approach. (d) Xu L’s algorithm. (e) Chakrabarti A’s algorithm. (f) Gong D’s algorithm. (g) Nah S’s algorithm

Tab. 1 shows the time contrast results of image deblurring using five algorithms. The BF-WGAN algorithm greatly reduces the time required for image deblurring because it has fewer parameters than the CNN, which greatly speeds up the inference process.

Table 1: Comparison of the time required for image deblurring using five algorithms

② Accuracy Contrast Experiment

We adopt the peak signal-to-noise ratio (PSNR) [29,30] to measure the accuracy of image deblurring. For the blurred image of the 2-year-old boy, the Gaussian noise variance was V = 0.008, the blur amplitude was 20 pixels, and the blur angle was 60°. Figs. 10 and 11 show the PSNR results of the blurred images for the five algorithms. The PSNR value of our BF-WGAN is higher than those of the other four representative algorithms, which indicates a better restoration effect.
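For reference, PSNR compares a restored image against the original sharp image through the mean squared error. The sketch below uses the standard definition, 10·log10(MAX² / MSE), and assumes 8-bit grayscale images with a peak value of 255; the function name `psnr` is hypothetical.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0.0
    count = 0
    for row_ref, row_res in zip(reference, restored):
        for a, b in zip(row_ref, row_res):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


sharp_patch = [[100.0, 120.0], [140.0, 160.0]]
restored_patch = [[125.5, 145.5], [165.5, 185.5]]  # every pixel off by 25.5
score = psnr(sharp_patch, restored_patch)
```

A higher PSNR means the restored image is closer to the original; each tenfold reduction in mean squared error adds 10 dB.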

Figure 10: Comparison of the PSNR value for a 2-year-old boy

Figure 11: Comparison of the PSNR value for a 4-year-old girl

3.3 Experiment of 3D Reconstruction

3.3.1 Results of 3D Reconstruction

Fig. 12 shows the 3D reconstruction results for the 2-year-old boy. According to the 3D reconstruction results, the height, shoulder width and head width of the 2-year-old boy were 79.1 cm, 25.3 cm and 14.2 cm, respectively. Compared with the actual measured data of the 2-year-old boy, the differences in the height, shoulder width and head width were 1.2 cm, 0.7 cm and 0.5 cm, respectively. Therefore, the TO-RANSAC 3D reconstruction algorithm presents a reasonable reconstruction effect.

Figure 12: 3D reconstruction results for the 2-year-old boy

Fig. 13 shows the 3D reconstruction results for the 4-year-old girl. According to the 3D reconstruction results, the height, shoulder width and head width of the 4-year-old girl were 102.5 cm, 28.7 cm and 17.2 cm, respectively. Compared with the actual measured data of the 4-year-old girl, the differences in the height, shoulder width and head width were 0.8 cm, 0.6 cm and 0.4 cm, respectively. Therefore, the TO-RANSAC 3D reconstruction algorithm also presents a reasonable reconstruction effect.

3.3.2 Performance of 3D Reconstruction

The TO-RANSAC and RANSAC algorithms were used to remove the wrong two-view relationships. For the 2-year-old boy and 4-year-old girl, Tabs. 2 and 3 compare the wrong edges removed with the TO-RANSAC and RANSAC algorithms. The threshold parameter of RANSAC was set to 1°, and the TO-RANSAC algorithm used adaptive threshold parameters. The second column in each table shows the number of wrong edges removed by the TO-RANSAC and RANSAC algorithms. The third column shows the wrong edges removed as a percentage of the total number of edges. Compared with the RANSAC algorithm, the TO-RANSAC algorithm preserves more relationships between two views in the 3D reconstructed model.

For the 2-year-old boy and 4-year-old girl, Tabs. 4 and 5 compare the 3D reconstruction results obtained with the TO-RANSAC and RANSAC algorithms. The 3D reconstruction results of the TO-RANSAC algorithm were better than those of the RANSAC algorithm. The RANSAC algorithm used a fixed threshold, which can affect the accuracy of the global rotation matrix, whereas the TO-RANSAC algorithm obtained more 3D points and exhibited higher accuracy. The two algorithms required almost the same amount of time to run, which indicates that the TO-RANSAC algorithm is stable.

Figure 13: 3D reconstruction results for the 4-year-old girl

Table 2: Comparison of the wrong edges removed with the TO-RANSAC and RANSAC algorithms for the images of a 2-year-old boy

Table 3: Comparison of the wrong edges removed with the TO-RANSAC and RANSAC algorithms for the images of a 4-year-old girl

Table 4: 3D reconstruction comparison of the TO-RANSAC and RANSAC algorithms

Table 5: 3D reconstruction comparison of the TO-RANSAC and RANSAC algorithms

4 Conclusion

“Intelligent” solutions, which use effective critical-thinking procedures to restore the sharp image, are essential for solving the blurring problem. First, we propose a BF-WGAN algorithm to remove motion blur from images, which combines the BF denoising theory with a WGAN. In this algorithm, the bilateral filter denoising algorithm removes the noise while retaining the details of the blurred image. Then, the blurred image and the corresponding sharp image are input into the WGAN. This algorithm distinguishes the motion-blurred image from the corresponding sharp image according to the WGAN loss and perceptual loss functions, which allows the fine texture-related details to be restored and the high-precision contours of the images to be revealed. Second, we use the deblurred images generated by the BF-WGAN algorithm to perform 3D reconstruction. The TO-RANSAC algorithm is proposed, which can remove the wrong relationships between two views in the 3D reconstructed models relatively accurately. Compared with the traditional RANSAC algorithm, the TO-RANSAC algorithm adjusts the threshold adaptively, which improves the accuracy of the 3D reconstruction results. The experimental results show that our BF-WGAN has a better deblurring effect and higher efficiency than other representative algorithms. In addition, the TO-RANSAC 3D reconstruction algorithm yields a calculation accuracy considerably higher than that of the traditional RANSAC algorithm.

In short, deep learning is significant for successfully executing image deblurring tasks. Effective deep learning algorithms can help yield more accurate 3D data, which can be used to measure individuals’ height and shape quickly and accurately. The wide use of these intelligent systems is due to their intelligent decision-making algorithms and techniques. However, deep learning in intelligent systems may slow down the entire computing process, and there may be significant performance pressure on the processing and evaluation of images. To overcome these limitations in accuracy and computational time, in future work we need to combine an effective deep learning image processing algorithm with an efficient data processing architecture.

Acknowledgement: The authors would like to thank the anonymous reviewers for their valuable comments and suggestions, which improved the presentation of this paper.

Funding Statement: This work was supported in part by the National Natural Science Foundation of China under Grant 61902311 and in part by the Japan Society for the Promotion of Science (JSPS) Grants-in-Aid for Scientific Research (KAKENHI) under Grant JP18K18044.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.