An efficient and globally optimal solution to perspective-n-line problem

2022-03-25 04:15:02QidaYUGuiliXUZhengshengWANGZhenhuaLI
Chinese Journal of Aeronautics 2022年3期

Qida YU, Guili XU, Zhengsheng WANG, Zhenhua LI

College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China

KEYWORDS Camera pose estimation;Gro¨bner basis;Machine vision;Perspective-n-Line (PnL)problem;Quaternion parameterization

Abstract This research develops an accurate and efficient method for the Perspective-n-Line(PnL)problem.The developed method addresses and solves PnL via exploiting the problem’s geometry in a non-linear least squares fashion. Specifically, by representing the rotation matrix with a novel quaternion parameterization,the PnL problem is first decomposed into four independent subproblems. Then,each subproblem is reformulated as an unconstrained minimization problem, in which the Kronecker product is adopted to write the cost function in a more compact form. Finally, the Gro¨bner basis technique is used to solve the polynomial system derived from the first-order optimality conditions of the cost function. Moreover, a novel strategy is presented to improve the efficiency of the algorithm.It is improved by exploiting structure information embedded in the rotation parameterization to accelerate the computing of coefficient matrix of a cost function. Experiments on synthetic data and real images show that the developed method is comparable to or better than state-of-the-art methods in accuracy, but with reduced computational requirements.

1. Introduction

While the Perspective-n-Line(PnL)problem has been of mathematical and practical interest for more than two decades,increasingly better techniques for solving it continue to be developed. This problem involves using a single photograph of n known lines in space to determine the pose (orientation and position, or rotation and translation) of a fully calibrated camera that took the photograph. It has numerous applications in the computer vision, photogrammetry, and robotics fields, such as camera tracking,structure from motion,and visual serving.Particularly, in various space missions, accurate pose estimation is an indispensable condition to safe close-proximity operations related to rendezvous and docking,assembly,and monitoring.

As is rather well known,when n=3,the PnL problem may yield multiple(up to eight)solutions,any of which could be the pose of the camera,while for n ≥4,this problem always leads to a single solution,yet its analytical form cannot be obtained.In the literature, some specialized algorithmshave been proposed to address the so-called Perspective-3-Line (P3L)problem,which is the minimal case of the PnL problem.However, considering that data redundancy generally contributes to the improvement of accuracy as well as robustness,and that practical applications typically require a pose solver being capable of handling arbitrary number of feature correspondences,most of the existing PnL solvers focus on finding optimal solutions to the PnL problem for the general case of n ≥4. From the point of view of optimization, these solvers can be categorized into two classes: iterative methods and noniterative methods.

Iterative methods generally cast the PnL problem as a non-Linear Least Squares (non-LLS) one through iteratively optimizing a properly defined cost function with respect to geometric or algebraic errors.For instance,Liu et al.decoupled the estimation of camera orientation and position by first eliminating the linear parameters, then computing the rotation matrix in an iterative fashion. Kumar and Hansondeveloped an improved iterative algorithm that optimizes the rotation and translation errors simultaneously, and verified its significant advantage over.However, these iterative methods are computationally expensive when dealing with large-scale tasks,and their convergence processes are sensitive to initialization.Thus, they are usually not suitable for use in real-time applications.

Noniterative methods,instead,attempt to directly minimize an algebraic error to lower computational cost and avoid initialization.Among these methods,the Direct Linear Transformation(DLT)methodis the most straightforward approach to the PnL solution.In Ref.10,a linear equation system is first constructed by mapping the transformation relation between n(n ≥6) given 3D-2D line correspondences, and the system is then solved by Singular Value Decomposition (SVD). Later on,Prˇibyl et al.proposed several variants of the DLT algorithm for the PnL problem relying on the Plu¨cker coordinates of 3D lines. Ansar and Daniilidisobtained another linear solution by using a lifting method to linearize the original quadratic system into a linear one. Recently, Xu et al.extended two State-Of-The-Art (SOTA) linear PnP solvers(EPnPand REPnP) to address the PnL problem, and developed a series of linear-formulation-based approaches.In general, linear methods can offer satisfactory solutions when n ≥6, but yield results with low accuracy for n=4 and n=5 due to the neglect of some non-linear constraints,e.g., orthogonality.

To address the limitations of linear methods, non-linear methods have been employed.Zhang et al.proposed Robust PnL (RPnL) method, which converts the PnL problem into a suboptimal problem by solving a sixteenth order polynomial in one variable. Xu et al.modified RPnL into Accurate Subset-based PnL (ASPnL) method by adding a post global refinement. The RPnL method and its derivative algorithmscan work well for both weakly redundant and overdetermined cases, but their run time increases rapidly when n gets large.Mirzaei and Roumeliotisproposed a globally optimal PnL solver,called AlgLS.This method is based on the Cayley-Gibbs-Rodriguez(CGR)parameterization of rotations, in which the calculation of rotation and translation are decoupled by building a polynomial cost function with regard to rotation parameters, and then all stationary points of this rotational function are retrieved by using a polynomial resultant technique. Although the AlgLS method can achieve high accuracy and linear speed in most cases, its accuracy degenerates significantly for specific values of the rotation angles(±π)due to the singular problem of CGR.To address this issue,the author of Ref. 18 designed an easy strategy that rotates 3D lines to a new randomly generated frame, then finds the all optimal solutions,and finally rotates them back to the original frame. Yu et al.also adopted this strategy and provided a more compact formulation for the PnL problem (named as OPnL).In their method, the solutions were extracted by using the Gro¨bner basis techniques. OPnL is among the most efficient and accurate PnL solvers, and its superior performance is confirmed by a comprehensive comparison with the leading methods. However, the efficiency of OPnL can be further improved, because applying random pre-rotations to 3D lines is not elegant and may produce repeated solutions.

In this research, we present an efficient and globally optimal algorithm for the PnL problem, referred to as EOPnL,which is the counterpart of the Accurate and Scalable PnP(ASPnP)method and the extension of our previous work.By representing the rotation matrix with a novel quaternion parameterization, we first decompose the PnL problem into four independent subproblems, where repeated solutions will never appear. Then, each subproblem is reformulated as an unconstrained minimization problem,and the polynomial system derived from the first-order optimality conditions is finally solved by deeply studied Gro¨bner basis techniques. As an improvement to ASPnP, we exploit structure information embedded in the rotation parameterization to accelerate the computing of coefficient matrix of a cost function. Experiments on synthetic data and real images show that our EOPnL is comparable to or slightly better than the SOTA methods in accuracy, but with significantly less computational cost. Our method is able to cope with weakly-redundant (n=4 or n=5) and over-constrained configurations of the PnL problem accurately as well as efficiently, making it desirable for aerospace applications.

The remainder of the paper is organized as follows.We first give a brief description of the considered problem in Section 2,and then introduce our method in Section 3.Next,we provide an exhaustive comparison with the leading PnL methods in Section 4. Finally, Section 5 concludes the paper.

2. Problem definition

Fig. 1 Geometry of PnL problem.

3. Proposed method

3.1. Least squares minimization problem

Similar to previous works,we recast the PnL problem in terms of a non-LLS minimization problem,but we write it in a more compact form using the Kronecker product (denoted by ⊗).To this end, we first make Eq. (1) have the same structure as Eq. (2), respecting:

3.2. Unconstrained problem

To remove the constraints of R, i.e., Eqs. (11) and (12), our previous workuse the CGR parameterization of rotations,along with a trick to avoid the singularity at every 180angle.However, applying random rotations to 3D lines as a preprocessing is not elegant and may produce repeated solutions.Here we adapt the ASPnP method to address the PnL problem, where the singularity-free unit-quaternion parameterization of rotations is adopted. Note that the unit-quaternion parameterization is also free of the constraints of R, the same as CGR. It is defined as:

With these, our formulation (Eq. (10)) can be transformed into four minimization subproblems. In the following, we will take Case 1 for example to solve one of these subproblems,and others can be processed in a similar manner.

Eq. (19) is a system of three ternary polynomial equations,and it can be efficiently solved via the Gro¨bner Basis(GB)and action-matrix methods which have been deeply studied in recent years.Notably,Kukelova et al.introduced a currently widespread-used automatic generator for GB solvers, which provides enormous convenience for the solving of polynomial equations in computer vision problems. It analyses and determines GBs in the finite field, then builds the elimination template and the action matrix by iteration, where the action matrix can provide the solution to the original system via Eigen factorization. By calling the automatic generator from Ref. 22, we can get at most 27 solutions from Eq. (19). The resulting elimination template and action matrix are of size 89×116 and 27×27, respectively. For each solution, we substitute s;s;sback into Eqs. (14) and (8) for recovering the pose.

Using a similar way, we can find at most 9, 3, and 1 solutions to the problems for Cases 2–4, respectively. Thereby,there are at most 40 solutions in total, where we choose the solution with the smallest error value on Eq. (10) as the final globally optimal solution.

3.3. Efficient computation

Fig. 2 Experimental results in case of centered ordinary configuration.

Fig. 3 Experimental results in case of un-centered ordinary configuration.

4. Experimental results

In this section, we compare our proposed EOPnL against the following PnL methods: Ansar,AlgLS,DLT-Lines,RPnL,ASPnL,LPnL-Bar-LS,LPnL-Bar-ENull,DLT-Combined-Lines,SRPnL,and OPnL.To make a comprehensive comparison, we use both simulated data and real-world images to conduct evaluation experiments.The proposed method and its competitors are compiled using MATLAB scripts and the tests are carried out on a desktop PC equipped with an Intel Core i9-10900K 3.7 GHz CPU and 64 GB RAM. The source code of each competitor is provided by its original authors.

4.1. Experiments with simulated data

We use the same experimental setup as in our previous workto generate simulated data.We first synthesize a virtual camera whose resolution is 640×480 pixels and focal length is 800 pixels. The pose (rotation and translation) of the camera is randomly generated in the world frame. With this, we then generate the following three types of 3D-2D line correspondences:centered ordinary configurations,un-centered ordinary configurations, and planar configurations. Specifically, in the first two settings,the endpoints of 3D lines are uniformly sampled from the interval of [-2;2]× [-2;2]× [4;8 ]in the camera frame. The difference is that their corresponding 2D lines are distributed in the whole image for the centered case but[0;320]× [0;240]for the un-centered case.In the planar configurations,the endpoints of 3D lines lie on a [-2;2]× [-2;2]×c plane,where c ∊[4;8 ]is a constant.Finally,we attack the endpoints of 2D lines with different levels of zero-mean Gaussian noise. For a particular experiment, we perform 500 independent trials to obtain accurate statistics. To evaluate the rotation and translation errors, we use:

where rand rare the j-th columns of the ground-truth and estimated rotation matrices, respectively.

Fig. 4 Experimental results in case of planar configuration.

Fig. 5 Comparison of our method against SOTA in computational efficiency. The right plot shows a close-up version of the left one.

(1) Ordinary configurations We first evaluate the performance of all compared methods in ordinary configurations.To this end, we conduct two experiments. In the first experiment,we vary the number of lines(n)from 4 to 20 with a fixed standard deviation of Gaussian noise(σ=5 pixels).In the second experiment,we set n to 10 and vary σ from 1 to 15 pixels.Each experiment is the result of multiple(as many as 500)trials of this sort, and we show the results (the mean and median errors of rotation and translation) from these experiments in Figs.2 and 3,where Fig.2 is for the centered ordinary configurations, while Fig. 3 is for the un-centered ordinary configurations. As we can see, un-centered case is obviously more challenging than the centered one,since the errors of all methods get larger in the un-centered case. In both cases, our EOPnL can achieve comparable accuracy with the SOTA methods and is even slightly better.

Fig. 6 Samples selected from the VGG dataset.

Table 1 Results of competing approaches on the VGG dataset with regard to rotation and translation errors.

(2) Planar configurations For planar configurations, we carry out the same experiments as in ordinary configurations,i.e., varying n ∊[4;20] while fixing σ=5 pixels, and setting n=10 while increasing σ from 1 to 15 pixels. As illustrated in Fig. 4, most of the compared methods fail to handle planar configurations, only EOPnL, SRPnL, and OPnL can achieve high accuracy. Notably, in both ordinary and planar configurations, our EOPnL has better performance for the cases of n=4 and n=5.This attribute makes it extremely appropriate for aerospace applications,where typically only 4 or 5 line features are used to estimate the pose.

We also test the run time of all compared methods by varying n from 4 to 2000. The results are presented in Fig. 5. It is obvious that our EOPnL is the fastest among the non-linear optimization approaches in the case of n ≥100,which is about 1 ms(a nearly constant execution time w.r.t.n)faster than our previous work (OPnL). Comparing with linear methods, our method is faster than the DLT-Lines method when n ≥300 and the DLT-Combined-Lines method when n ≥900, but it is slower than the LPnL-Bar-LS and LPnL-Bar-ENull methods.However,our method has a superiority over them in accuracy. Our EOPnL takes about 2 ms to deal with 2000 line correspondences. Therefore, it can be applied to practical applications.

4.2. Experiments with real data

To compare our method with other methods using real-world images, we follow the experiments in Ref. 19, where we select ten pictures (displayed in Fig. 6) from the VGG dataset with true value of camera parameters and matched line correspondences, and run each method on them. In this test, the translation error Δt is measured by Δt=‖t-t‖, while the rotation error is evaluated in the same way as the experiments with simulated data. The detailed results are listed in Table 1, where EOPnL have the best performance in most cases.

5. Conclusions

In this research, we have presented an accurate as well as efficient method for solving the PnL problem based on a novel quaternion parameterization. Our presented method utilizes the Kronecker product to formulate the PnL problem, which makes the subsequent derivation more compact and leads to an inhomogeneous system for the cost function. This system can be efficiently solved by GB technique without any additional constraints. To further improve the efficiency of the algorithm, we exploit the structure information embedded in the rotation parameterization to accelerate the computing of coefficient matrix of a cost function.Experiments on synthetic data and real images show that the proposed method is comparable to or better than state-of-the-art methods in accuracy,but with reduced computational requirements.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

This work was supported in part by the National Natural Science Foundation of China (Nos. 61905112 and 62073161),in part by the China Scholarship Council (Nos.201906830092), and in part by the Fundamental Research Funds for the Central University (No. NZ2020005).