Unsupervised change detection of man-made objects using coherent and incoherent features of multi-temporal SAR images

2022-09-03 08:26

FENG Hao, WU Jianzhong, ZHANG Lu*, and LIAO Mingsheng

1. State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; 2. Key Laboratory of Land Subsidence Monitoring and Prevention, Ministry of Land and Resources, Shanghai 200072, China; 3. Shanghai Engineering Research Center of Land Subsidence, Shanghai 200072, China; 4. Shanghai Institute of Geological Survey, Shanghai 200072, China

Abstract: Constrained by the complex imaging mechanism and unusual visual appearance of synthetic aperture radar (SAR) images, change detection with SAR data remains a difficult research topic, especially in urban areas. Although existing studies have extended from bi-temporal data pairs to multi-temporal datasets to derive richer information, two problems remain in practical applications. First, change indicators constructed from incoherent features alone cannot characterize the changed objects accurately. Second, the results of pixel-level methods are usually presented as noisy binary maps, which makes the spatial change unintuitive and the temporal change of a single pixel meaningless. In this study, we propose an unsupervised change detection framework for man-made objects using both coherent and incoherent features derived from multi-temporal SAR images. The coefficient of variation of the time-series incoherent features and the man-made object index (MOI) defined with coherent features are first combined to identify the initial change pixels. Afterwards, an improved spatiotemporal clustering algorithm is developed based on density-based spatial clustering of applications with noise (DBSCAN) and dynamic time warping (DTW). It transforms the initial results into noiseless object-level patches and takes the cluster center as the representative of each man-made object to determine the change pattern of each patch. An experiment with a stack of 10 TerraSAR-X images in Stripmap mode demonstrates that the method is effective in urban scenes and is potentially applicable to wide-area change detection.

Keywords: change detection, multi-temporal synthetic aperture radar (SAR) data, coherent and incoherent features, clustering.

1. Introduction

Change detection in remote sensing is a technique for identifying alterations of ground targets from multi-temporal earth observation datasets, including optical and synthetic aperture radar (SAR) images [1-4]. Compared with optical sensors, SAR sensors can acquire data day and night regardless of weather conditions. Furthermore, high-resolution SAR satellite missions launched in recent decades, such as TerraSAR-X/TanDEM-X and COSMO-SkyMed, acquire images with short revisit periods of several days. The time-series data acquired by these missions make it possible to detect and track changes of ground targets more frequently.

In the past decades, bi-temporal change detection has focused on detecting changes between two SAR observations by evaluating change indicators [5,6] constructed from statistical characteristics [7,8], spatial features [9,10], and coherence [11]. Multi-temporal images provide more abundant information, so they can be analyzed with different strategies for change detection. As the number of images increases, the coherence maps generated from all data pairs contribute more to the recognition of man-made structures. Incoherent and coherent features can be combined to identify construction change areas [12], detect the conversion of land use [13], and assess the aftermath of wars and disasters [14]. Joint analyses of multi-temporal images facilitate a more accurate description of the boundaries of human activities.

Several time-series change indicators and frameworks have been proposed to fully exploit the capability of multi-temporal data in capturing temporal variations. For example, the backscattering coefficient of permanent scatterers and its coefficient of variation were taken as key features to detect long events and point events under certain criteria [15]. Another example is a change detection framework based on cosegmentation that generates the change frequency map and the change moment map of buildings [16]. Statistical features and temporal clustering of time-series SAR data have also been applied in urban change analysis to obtain change patterns that help interpret complicated spatiotemporal dynamics in urban areas [17]. However, most existing studies ignore coherent features. Meanwhile, pixel-based binary results cannot directly highlight the changing objects and are contaminated by speckle noise, which makes interpretation unintuitive in real scenes.

It is noteworthy that, in recent years, supervised methods based on deep learning have also been applied to change detection of SAR images [18-21]. However, applying such methods involves a complicated procedure of building large sample datasets and training deep neural network models, which is usually time-consuming and highly demanding of computing resources. Since we mainly focus on unsupervised algorithms here, supervised methods are excluded from comparison and discussion in this paper.

In this paper, we propose a novel change detection framework using time-series SAR data in urban areas. First, the backscattering coefficient and coherence are taken as the incoherent and coherent features of SAR images, respectively. Two corresponding parameters, namely the coefficient of variation and the man-made object index (MOI), are jointly used to extract the initial change pixels, which still contain speckle noise. Second, to aggregate discrete pixels into meaningful real-world objects, an improved temporal clustering algorithm is developed by combining density-based spatial clustering of applications with noise (DBSCAN) and dynamic time warping (DTW) to segment pixels into patches. In this way, the spatiotemporal change of each artificial object can be obtained with noise removed. To demonstrate the effectiveness of the proposed change detection approach, an experiment with 10 TerraSAR-X SAR images acquired in the Stripmap mode is carried out to detect spatiotemporal changes of man-made objects in the Shanghai metropolitan area.

The rest of this paper is organized as follows. Section 2 describes the theoretical principles of the methods used in the framework. Section 3 gives a brief introduction of the study area and test data. Experimental results and analyses are elaborated in Section 4 to illustrate the effectiveness of the proposed method. Conclusions are given in Section 5.

2. Methodology

The proposed method consists of four major steps, i.e., SAR data preprocessing, change feature construction, MOI generation, and time-series clustering based on DBSCAN-DTW. The stack of single-look complex (SLC) SAR images is taken as the input data. The final outputs include the spatial change map and temporal change patterns. The overview of the processing chain is shown in Fig.1, and each step is detailed as follows.

Fig.1 Processing chain of the proposed framework

2.1 Multi-temporal SAR data preprocessing

Before detecting time-series changes, it is essential to perform fundamental preprocessing that converts the SLC SAR datasets acquired in the repeat-pass mode into an incoherent stack and a coherent stack. Conventional co-registration and resampling are first carried out to achieve spatial alignment among the multi-temporal images. Although our purpose is not to perform SAR interferometry, we still need to ensure registration accuracy at the sub-pixel level in order to reduce false alarms caused by registration error. Afterwards, refined-Lee filtering [22] is applied to suppress the speckle in SAR images. A small window size is suggested so that boundaries and small objects can be preserved as much as possible. Meanwhile, we avoid temporal filtering so as not to destroy the original dynamic change patterns.

In this study, we introduce the backscattering coefficient as the incoherent feature of SAR images. Sigma nought $\sigma^0$, the standard measure of the backscattering coefficient, represents the radar reflectivity per unit area on the ground surface. It is calculated through radiometric calibration of a single SAR image and expressed [23] as

$$\sigma^0 = k_s \cdot |DN|^2 \cdot \sin\theta_{loc}, \qquad \sigma^0_{dB} = 10\log_{10}\sigma^0$$

where $k_s$ is the calibration and processor scaling factor, $DN$ is the amplitude value of a pixel, $\theta_{loc}$ is the local incidence angle, and $\sigma^0_{dB}$ is the decibel (dB) format of $\sigma^0$. $\sigma^0$ usually shows a high value over urban areas due to artificial structures (rectangular edges/corners), uniform orientations of buildings, and materials with high dielectric constants.
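The calibration step can be illustrated with a short Python sketch. It assumes the simplified form σ0 = ks·|DN|²·sin(θloc) with any noise correction omitted; the function name `sigma_nought_db` and the clipping floor are illustrative choices, not part of the original processing chain.

```python
import numpy as np

def sigma_nought_db(dn, k_s, theta_loc_rad):
    """Convert pixel amplitude (DN) to sigma nought in dB.

    Assumes sigma0 = k_s * |DN|^2 * sin(theta_loc); noise terms are ignored.
    theta_loc_rad is the local incidence angle in radians.
    """
    power = np.abs(np.asarray(dn)) ** 2
    sigma0 = k_s * power * np.sin(theta_loc_rad)
    # Clip to a small floor so that empty pixels do not produce log10(0).
    return 10.0 * np.log10(np.maximum(sigma0, 1e-12))
```

In practice `k_s` and the incidence angle would be read from the product annotation; here they are plain arguments.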

As the primary coherent feature, interferometric coherence is the normalized correlation between two complex SAR images, and its mathematical expression is

$$\gamma = \frac{\left| E\left[ u_1 u_2^{*} \right] \right|}{\sqrt{E\left[ |u_1|^2 \right] E\left[ |u_2|^2 \right]}}$$

where $u_1$ and $u_2$ denote complex SAR images, $^{*}$ is the complex conjugate, and $E[\cdot]$ is the expectation, estimated in practice by averaging within a local window. For a SAR data stack of $N$ images, a coherent stack of $N-1$ coherence maps is generated from each pair of successive images. Typically, high coherence is maintained in man-made construction areas even over a long time span, so it is often used to distinguish stable man-made objects from natural targets. How to jointly use coherent and incoherent features to identify man-made objects is explained in detail in Subsection 2.3.
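A minimal sketch of windowed coherence estimation for successive image pairs is given below. It assumes co-registered complex images of equal size and replaces the expectation with a boxcar average; the helper names (`box_mean`, `coherence`, `coherence_stack`) are hypothetical, and no topographic phase compensation is applied.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_mean(img, win=3):
    """Boxcar average over a win x win window (win odd), reflect-padded."""
    pad = win // 2
    padded = np.pad(img, pad, mode="reflect")
    return sliding_window_view(padded, (win, win)).mean(axis=(-2, -1))

def coherence(u1, u2, win=3):
    """Coherence magnitude of two co-registered complex images."""
    cross = box_mean(u1 * np.conj(u2), win)
    p1 = box_mean(np.abs(u1) ** 2, win)
    p2 = box_mean(np.abs(u2) ** 2, win)
    return np.abs(cross) / np.sqrt(np.maximum(p1 * p2, 1e-20))

def coherence_stack(slc_stack, win=3):
    """N images -> N-1 coherence maps from successive pairs."""
    return np.stack([coherence(slc_stack[k], slc_stack[k + 1], win)
                     for k in range(len(slc_stack) - 1)])
```

A small window keeps the spatial resolution but gives a noisier, upward-biased estimate over low-coherence targets.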

2.2 Change feature construction

The change indicator designed for bi-temporal change detection is constructed from mathematical operations such as subtraction, ratio, logarithmic ratio, or a distance between local statistical characteristics. Because multi-temporal changes are diverse, a comparison between two observations cannot capture all changes during the whole time span. The change indicator for time-series data should be a statistical measurement that describes the degree of variation during a certain period. Consequently, we choose the coefficient of variation [15] of sigma nought as the change indicator, which is expressed as

$$c_v = \frac{\sigma}{\mu}$$

where $\sigma$ is the standard deviation of a group of samples, and $\mu$ is the mean value. $c_v$ is a normalized measurement of the dispersion of a probability distribution. A large value of $c_v$ indicates that the samples are highly dispersed or that there are outliers in an otherwise stable sequence. In other words, if changes occur at some moments across the time-series images, the coefficient of variation becomes larger. It is also noteworthy that the coefficient of variation has been adopted by time-series interferometric SAR (InSAR) technology to initially select candidates of persistent scatterers. In this study, we take the coefficient of variation calculated for each pixel in the preprocessed incoherent data stack as the multi-temporal change feature.
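The per-pixel change feature can be sketched as follows. The sketch assumes the incoherent stack is stored in linear (not dB) scale with shape (time, rows, cols), since a ratio of standard deviation to mean is ill-behaved when the mean can approach zero; the function name is illustrative.

```python
import numpy as np

def coefficient_of_variation(sigma0_stack):
    """Per-pixel c_v = std / mean along the time axis (axis 0).

    Assumes linear-scale, non-negative backscatter values.
    """
    stack = np.asarray(sigma0_stack, dtype=np.float64)
    mu = stack.mean(axis=0)
    sd = stack.std(axis=0)  # population standard deviation
    return sd / np.maximum(mu, 1e-12)
```

A pixel that is stable over time gives a value near zero, while a pixel with one strong jump in the sequence gives a clearly larger value.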

2.3 MOI generation

The change feature constructed in Subsection 2.2 records various types of changes, which makes it difficult to accurately identify changes of man-made objects using this incoherent change feature alone. Therefore, it is essential to combine incoherent and coherent features to separate man-made objects from natural targets as much as possible. To verify the feasibility of this idea, we make a tentative assessment by drawing random samples of the major land cover types from TerraSAR-X images and plotting them in the two-dimensional space of coherence and backscattering coefficient, as shown in Fig.2. In order to maintain the resolution, the window size for coherence estimation is fixed at 3×3.

Fig.2 Distribution of coherent and incoherent characteristics of main ground objects in urban scene

Water surfaces are usually easy to identify in SAR images because of their low backscattering intensity and low coherence, corresponding to the lower-left cluster in Fig.2. The upper-right cluster of man-made objects has high coherence and shows a clear border against the other types. Meanwhile, their backscattering coefficients span a wide range from −18 dB to 27 dB, suggesting a complex spatial pattern of mixed backscattering mechanisms rather than a single mechanism. In fact, Fig.2 shows that man-made objects, vegetation, and bare land are confused with one another in terms of backscattering coefficient. In contrast, coherent features separate man-made objects from the surrounding background much more effectively.

In our approach, MOI is defined on the coherence stack as a combination of its maximum and mean values, $\max(\Gamma_{n-1})$ and $\mathrm{mean}(\Gamma_{n-1})$, where $\Gamma_{n-1} = \{\gamma_{1,2}, \gamma_{2,3}, \cdots, \gamma_{n-1,n}\}$ is the coherence stack, and $\max(\cdot)$ and $\mathrm{mean}(\cdot)$ represent the maximum and mean operations, respectively. By combining the maximum and mean values of the time-series coherences, targets of two typical types can be detected with the MOI. One is the man-made object that exists only during a short period, and the other is the stable target that maintains moderate coherence around the predetermined threshold (0.6 in this study) over a long period. Although this index may introduce errors, the resulting discrete pixels will be identified as outliers and removed in the later procedure.

Subsequently, we use the maps of MOI and the change feature to select the initial change areas of man-made objects through thresholding and intersection operations. The result is a two-dimensional binary map that may still contain isolated pixels and false alarms. The change patterns in the temporal dimension are then obtained by clustering the initial result.
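The thresholding and intersection step can be sketched as below. Note that the exact MOI formula is not reproduced here: `moi_placeholder` simply averages the maximum and mean coherence, which is only an assumed stand-in for illustration. The threshold `cv_thr` is likewise a hypothetical value to be set by the user; only the 0.6 coherence threshold comes from the text.

```python
import numpy as np

def moi_placeholder(coh_stack):
    """ASSUMPTION: stand-in index, 0.5 * (max + mean) of the coherence stack.

    The paper combines max and mean of the stack; this equal-weight average
    is a guess used only to make the sketch runnable.
    """
    coh = np.asarray(coh_stack, dtype=np.float64)
    return 0.5 * (coh.max(axis=0) + coh.mean(axis=0))

def initial_change_mask(cv_map, moi_map, cv_thr, moi_thr=0.6):
    """Binary map: pixels that vary strongly AND look man-made."""
    changed = np.asarray(cv_map) > cv_thr      # threshold the change feature
    man_made = np.asarray(moi_map) > moi_thr   # threshold the index
    return changed & man_made                  # intersection of both masks
```

The resulting mask is pixel-level and may still contain isolated detections, which is why a clustering stage follows.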

2.4 Time-series clustering based on DBSCAN-DTW

The results derived from the MOI and change feature maps are pixel-level, so further processing is needed to produce refined object-level change maps. This can be achieved through time-series clustering. The primary objective of time-series clustering is to group similar sequences together and separate dissimilar ones into different clusters, under the assumption that the sequences of all pixels from the same object follow highly similar trends. The resultant cluster centers are usually used to represent change patterns.

To evaluate the inter-pixel similarity in terms of temporal sequences, we adopt the DTW algorithm to calculate the distance between them [24]. Its principle and capabilities have been demonstrated in multi-temporal image classification [25-27].

The process of DTW can be considered as an optimization problem. Suppose that we have vectors $Q = [q_1, q_2, \cdots, q_n]$ and $C = [c_1, c_2, \cdots, c_m]$, whose lengths are $n$ and $m$, respectively. To align these two sequences, an $n \times m$ matrix $\Delta$ is constructed, whose element $\delta_{i,j}$ is the Euclidean distance between $q_i \in Q$ ($\forall i = 1, 2, \cdots, n$) and $c_j \in C$ ($\forall j = 1, 2, \cdots, m$). Here we define a cumulative distance $d_{i,j}$. The two sequences $Q$ and $C$ are matched from the starting point, and for each point the distances of all previous points along the path are accumulated, i.e., $d_{i,j} = \delta_{i,j} + \min(d_{i-1,j}, d_{i,j-1}, d_{i-1,j-1})$. After reaching the end point $(n, m)$, this cumulative distance is the recursive sum of the minimal distances and serves as the similarity measure between vectors $Q$ and $C$, with a smaller value indicating more similar sequences. In addition, when optimizing the path, the starting points and ending points of the two vectors must match, and the principles of continuity and monotonicity must be followed. In the clustering operation, the cluster center is represented by the vector that has the smallest sum of DTW distances to all sequences in the current cluster.
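The recursion described above can be written as a short dynamic-programming sketch. It uses the absolute difference as the point-wise distance for scalar sequences and applies no warping-window constraint; `dtw_distance` and `cluster_center` are illustrative names.

```python
import numpy as np

def dtw_distance(q, c):
    """Cumulative DTW distance between 1-D sequences q (length n), c (length m)."""
    q = np.asarray(q, dtype=np.float64)
    c = np.asarray(c, dtype=np.float64)
    n, m = len(q), len(c)
    d = np.full((n + 1, m + 1), np.inf)
    d[0, 0] = 0.0  # start points must match
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(q[i - 1] - c[j - 1])
            # Continuity and monotonicity: only three predecessor cells allowed.
            d[i, j] = cost + min(d[i - 1, j], d[i, j - 1], d[i - 1, j - 1])
    return d[n, m]  # end points must match

def cluster_center(sequences):
    """Index of the sequence with the smallest summed DTW distance to all others."""
    k = len(sequences)
    totals = [sum(dtw_distance(sequences[a], sequences[b]) for b in range(k))
              for a in range(k)]
    return int(np.argmin(totals))
```

For example, `dtw_distance([0, 0, 1, 1], [0, 1, 1, 1])` is zero because warping absorbs the one-step shift, whereas a point-by-point comparison of the same two sequences would report a difference.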

For time-series data with the same vector length, the similarity of sequences could also be compared with the simple Euclidean distance. There are two reasons why we do not apply it. On the one hand, the Euclidean distance is sensitive to noise. On the other hand, the appearance or disappearance of man-made objects in reality does not happen overnight. This dynamic process may lead to different change times across large-scale buildings or dense building areas.

Like most clustering algorithms, clustering based on DTW requires the number of clusters to be set beforehand, which cannot be determined in advance in city-level change detection. To solve this problem, we introduce another clustering algorithm, DBSCAN [28,29]. DBSCAN is a spatial clustering algorithm based on the density of discretely distributed data. It can cluster datasets of arbitrary shape without setting the number of clusters, and can find outliers while clustering. A brief overview of the algorithm is as follows. Two parameters ($\epsilon$, MinPts) are set in advance, where $\epsilon$ is the neighborhood radius and MinPts is the density threshold. For each point, neighbors are searched within radius $\epsilon$. Points satisfying the density threshold are labeled as core points, and all associated points are connected into clusters. More details of this algorithm are not discussed here. The processing flow of the proposed method is given in Fig.3, in which the main logic of DBSCAN is illustrated.
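The overview of DBSCAN above can be sketched in a few lines. This is a brute-force illustration on pixel coordinates, not an optimized implementation; it assumes the neighborhood of a point includes the point itself, and labels noise as -1.

```python
import numpy as np

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN on an (n, d) array of coordinates; returns labels, -1 = noise."""
    pts = np.asarray(points, dtype=np.float64)
    n = len(pts)
    # Brute-force pairwise distances: fine for a sketch, too slow for full scenes.
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    neighbours = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    is_core = np.array([len(nb) >= min_pts for nb in neighbours])
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not is_core[p]:
                continue  # border points join a cluster but do not expand it
            for nb in neighbours[p]:
                if labels[nb] == -1:
                    labels[nb] = cluster
                    stack.append(nb)
        cluster += 1
    return labels
```

Applied to the row and column indices of the initial change pixels, each non-negative label corresponds to one spatial set, and pixels labeled -1 are discarded as isolated noise.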

Fig.3 Flow chart of the improved DBSCAN-DTW

The improved spatiotemporal clustering algorithm combines the advantages of these two approaches. In Fig.3, the left part depicts the flow of DBSCAN, and the initial change result is taken as the input data. The whole process of DBSCAN with the given radius and density is carried out as the first step to obtain object-level spatial sets. Each set does not necessarily contain only one target, because the size and distribution of man-made objects are inconsistent in a real scene. To adaptively determine the number of clusters for each set, we first assume that the number of clusters is 1, and use the DTW similarity measurement to calculate the sum of time-series distances between the backscattering coefficient sequences and the cluster center, i.e., $\Sigma_c$. Afterwards, the number of clusters is increased and the sum of distances is recalculated iteratively until its change rate is less than a predefined threshold. The change rate converges after one or a few iterations, and, with $N_c$ denoting the number of clusters at which the iteration stops, $N_c - 1$ clusters are used to generate the final outputs, including the spatial change map and the temporal change patterns represented by cluster centers.
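The adaptive choice of the cluster number within one spatial set can be sketched as follows. Several details are assumptions made only for illustration: the sketch takes a precomputed pairwise distance matrix (for instance, of DTW distances), uses a simple k-medoids loop as the temporal clustering, and defines the change rate as the drop in the summed distance relative to the single-cluster sum. The threshold `rate_thr` is a hypothetical parameter.

```python
import numpy as np

def k_medoids(dist, k, n_iter=50):
    """Simple k-medoids on a precomputed (n, n) distance matrix."""
    dist = np.asarray(dist, dtype=np.float64)
    medoids = [int(np.argmin(dist.sum(axis=1)))]
    while len(medoids) < k:  # farthest-point initialisation
        medoids.append(int(np.argmax(dist[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if len(members):
                sub = dist[np.ix_(members, members)]
                new[c] = members[np.argmin(sub.sum(axis=1))]
        if np.array_equal(new, medoids):
            break
        medoids = new
    labels = np.argmin(dist[:, medoids], axis=1)
    total = dist[np.arange(len(dist)), medoids[labels]].sum()
    return labels, medoids, total

def adaptive_clusters(dist, rate_thr=0.1, k_max=10):
    """Increase k until the summed distance stops dropping; keep the previous k."""
    labels, medoids, total = k_medoids(dist, 1)
    base = total  # single-cluster sum, used to normalise the change rate
    if base == 0:
        return labels, medoids
    for k in range(2, min(k_max, len(dist)) + 1):
        new_labels, new_medoids, new_total = k_medoids(dist, k)
        if (total - new_total) / base < rate_thr:
            break  # change rate below threshold: keep k - 1 clusters
        labels, medoids, total = new_labels, new_medoids, new_total
    return labels, medoids
```

The returned medoid indices point to the sequences that act as cluster centers, so their time series can be read directly as the temporal change patterns of the patches.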

3. Study area and dataset

The study area is located in the southern part of Shanghai, China, as shown in Fig.4(a), where the red and blue rectangles show the ground coverage of the data acquisition and the location of our study area, respectively. Shanghai is the largest metropolis in China, with rapid development and economic growth. Due to increasing human activities, land use in suburban and rural areas has been transformed from cultivated land to construction and industrial land. The appearance and disappearance of man-made infrastructure should be monitored to support urban planning and land use. In this test scene, the types of ground targets are diverse, including roads, bare land, cultivated land, residential buildings, factories, vegetation, rivers, and so on. In addition, the seasonal changes of cultivated land and vegetation and the irregular changes of various dynamic targets make the change detection of man-made objects challenging.

Ten SLC SAR images were acquired by the TerraSAR-X satellite over the study area during the period from February 2015 to February 2016. The basic parameters of the dataset are shown in Table 1. The mean amplitude image with a size of 5 000 × 3 000 pixels is shown in Fig.4(b), where the ground truths marked by red rectangles are generated from SAR images and from optical images acquired in December 2014 (Fig.4(c)) and July 2016 (Fig.4(d)) from Google Earth™ (GE).

Fig.4 Overview of the study area

Moreover, we labeled the ground truths of spatial changes (red rectangles in Fig.4(b)) by careful comparison among the time-series SAR images and two cloud-free optical images covering roughly the same period. Because elevated infrastructure loses its shape features in SAR images, pixel-level evaluation does not make much sense, and object-level ground truths are more suitable than pixel-level ones for visual interpretation. The locations of object-level patches are sufficient to indicate whether the change detection results are correct.

4. Experiments and analysis

4.1 Experimental setup

Before carrying out tests in the experimental area, the window sizes and thresholds should be determined properly. In this study, they are set empirically, as listed in Table 2. To illustrate the effectiveness of the proposed framework, comparative experiments are needed, but very few algorithms are designed to obtain the spatiotemporal changes of man-made objects with both incoherent and coherent datasets. We therefore choose the statistical features and temporal clustering (SFTC) approach [17] as the reference method, which generates the spatial change map and temporal change patterns from the incoherent stack only. The parameters of DBSCAN, which is also used in SFTC, are adjusted to improve its result at the object level, so that the two methods can be compared.

Table 2 Parameter settings for the experimental setup

We quantitatively evaluate the accuracy of the detected spatial change maps using the confusion matrices constructed by comparing the experimental results against the ground truth map. Four statistical indicators, i.e., precision (PR), accuracy (ACC), recall (RE), and Kappa coefficient (KAPPA), are derived from the confusion matrices and expressed [30] as follows:

$$\mathrm{PR} = \frac{TP}{TP + FP}, \qquad \mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{RE} = \frac{TP}{TP + FN},$$

$$\mathrm{KAPPA} = \frac{\mathrm{ACC} - P_e}{1 - P_e}, \qquad P_e = \frac{(TP + FP)(TP + FN) + (FN + TN)(FP + TN)}{(TP + TN + FP + FN)^2}$$

where true positive (TP) is the number of changed patches correctly classified, false negative (FN) is the number of changed patches misclassified as unchanged ones, false positive (FP) is the number of unchanged objects misclassified as changed ones, true negative (TN) is the number of unchanged patches correctly classified, and $P_e$ is the agreement expected by chance. Besides, we also qualitatively analyze the temporal patterns to assess whether the change curves are consistent with the real scenes.
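For reference, the four indicators can be computed from the confusion-matrix counts with the standard definitions, as in the sketch below. The function name is illustrative and the counts in the usage note are made-up numbers, not results from this study.

```python
def change_metrics(tp, fn, fp, tn):
    """Precision, accuracy, recall and Cohen's kappa from confusion counts."""
    total = tp + fn + fp + tn
    pr = tp / (tp + fp)
    re = tp / (tp + fn)
    acc = (tp + tn) / total
    # Expected chance agreement from the row and column marginals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (acc - pe) / (1 - pe)
    return {"PR": pr, "ACC": acc, "RE": re, "KAPPA": kappa}
```

With hypothetical counts TP = 40, FN = 10, FP = 5, TN = 45, this gives RE = 0.8, ACC = 0.85, and KAPPA = 0.7.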

4.2 Results and analysis

Fig.5(a) and Fig.5(b) give an overview of the spatial change detection results of the proposed method and the SFTC method in the study area. The red rectangles outline the patches identified as changed man-made objects. Fig.5(c) and Fig.5(d) are the first and last SAR images in the original data stack, from which significant changes can be visually identified. As a whole, clusters with different sizes and shapes are detected, almost without undesired effects of speckle noise. By comparison with the ground truth map, TP, FN, FP, and TN are counted to form the confusion matrices, which are then used to generate the four indicators for accuracy assessment listed in Table 3. All indicators suggest that the proposed method achieves better performance. Detailed analyses of the detection results in three aspects are given in the following subsections.

Fig.5 Spatial change detection results and reference SAR images

Table 3 Evaluation indicators of two methods

4.2.1 Analysis of missed detection

In practical applications, recall is the most commonly used evaluation index; it represents the proportion of actual changes that are detected. The recall values of the proposed method and the SFTC method are 0.864 and 0.659, respectively, so the SFTC result clearly contains more missed detections. The example in Fig.6 explains this phenomenon. Although the position and shape of the changed object patches can be visually identified from both results, the detected change pixels in Fig.6(d) are quite sparse, and some boundaries are even lost. The reason is that incoherent features are heterogeneous and discontinuous across building areas, with high sensitivity only at the dihedral corners of buildings, as shown in Fig.6(f). Consequently, the clustering algorithm discards such sparse pixels as noise. In contrast, the proposed method uses the coherent features in Fig.6(e) to avoid this problem.

Fig.6 Example of missed detection

After careful inspection, we find that most of the missed detections by the SFTC method are similar to the case shown in Fig.6. For the proposed method, some missed detections are also attributed to the discrete initial change pixels extracted, which mainly occur on objects undergoing continuous but slow changes with low coherence during the observation period. Such cases are occasional events rather than a defect in principle, so the recall of the proposed method is higher.

4.2.2 Analysis of false alarm

Another commonly used evaluation parameter is precision (PR), which reflects the false alarm rate: the higher the PR value, the lower the false alarm rate, and vice versa. As listed in Table 3, PR of the proposed method is higher than 0.9, while that of the SFTC method is lower than 0.83, which indicates that the SFTC result contains more changes that do not belong to man-made objects.

Fig.7 gives a typical example of a false alarm in the result of the SFTC method. A triangle-shaped land parcel covered by dense vegetation (possibly crops, vegetables, flowers, orchard, etc.) can be seen clearly in the center of Fig.7(a). The time-series images of backscatter intensity and coherence are shown in Fig.7(b) and Fig.7(c), respectively, with the change curves of the mean backscattering coefficient (BC) and coherence in the vegetated area plotted in Fig.7(d). The backscattering coefficient fluctuates visibly with the alternating seasons, with a maximum difference as high as 10 dB, while the coherence is stable at a low level around 0.2. Without coherent features, the seasonal backscatter variation of this land parcel is easily mistaken for man-made object changes. Furthermore, false alarms are also found on some dynamic targets in the SFTC result, such as stacked containers, which cannot be identified if coherence is disregarded.

Fig.7 Example of false alarm

4.2.3 Effects of temporal clustering

In some cases, both methods produce similar spatial change maps, but differences exist in the temporal changes. Fig.8 shows an example of such a situation. Two independent clusters are generated by the improved DBSCAN-DTW method, while all pixels are clustered into one patch by the density-based clustering algorithm of SFTC. It can be easily identified from the optical images that the two areas are actually separated by a wide road and a watercourse. According to our understanding of the real scene, the detection result of the proposed method appears to be more reasonable.

We also select 10 random pixels from each of the two regions and plot their temporal change curves to illustrate the effects of temporal clustering. The curves in Fig.8(d) and the time-series images in Fig.8(c) show that the right part begins to change at the sixth image and stays stable in the next four images, while most of the region on the left remains unchanged until the eighth image. The proposed DBSCAN-DTW method effectively distinguishes these two temporal evolution patterns from each other by using the time-series features of multi-temporal SAR images. Although adjusting the parameters can make the spatial clustering algorithm of SFTC produce the correct result in this case, it will also lead to incomplete patch detection in areas of low pixel density and introduce more false alarms in large-scale detection.

Fig.8 Example of temporal clustering effects

5. Conclusions

In this paper, we present an unsupervised change detection framework for man-made objects using both coherent and incoherent features derived from multi-temporal SAR images. The key idea is to identify man-made objects with the MOI defined by coherence. Then, the MOI is jointly used with the change feature constructed from BC to extract the initial spatial change detection results. Afterwards, in consideration of the spatiotemporal features of changed objects, an unsupervised DBSCAN-DTW algorithm is developed to cluster the initial results into patches of single objects and suppress speckle noise at the same time.

The proposed method is compared with the SFTC method, which uses incoherent features only, in experiments with 10 TerraSAR-X SAR images acquired in Stripmap mode over Shanghai. Through quantitative analyses of the results, we conclude that the proposed method achieves better performance in terms of reducing missed detections and false alarms, which benefits from the joint use of coherent and incoherent features. Meanwhile, compared with pixel-level detection results, the noiseless object-level ones make the interpretation intuitive, and the reconstructed temporal change patterns better characterize the real evolution trends.

Future research will focus on developing various change indicators based on multi-temporal coherent and incoherent features (e.g., polarimetric and interferometric information) to improve the accuracy and reliability of change detection results at low computational cost. Furthermore, the applicability of the proposed method to city-level change detection across wide areas will be investigated.