D. Mohanapriya, Dr. K. Mahesh
1 Department of Computer Applications, Alagappa University, Karaikudi, Tamil Nadu 630003, India
2 Department of Computer Applications, Alagappa University, Karaikudi, Tamil Nadu 630003, India
Abstract: Effective object tracking reveals the activity of an object of interest directly. Accurate tracking requires robust procedures that do not depend on hardware assistance, yet such approaches incur high computational complexity to track an object accurately within a stipulated processing time. Tracking is further degraded by factors such as occlusion, illumination changes and shadows. To address these shortcomings, this paper proposes a novel background normalization procedure based on textural patterns. After an acquired image is preprocessed, an Environmental Succession Prediction (ESP) algorithm discriminates the background environment through a background clustering approach. Textural characteristics are then abstracted with a Probability based Gradient Pattern (PGP) approach, which recognizes the similarity between the patterns obtained. Comparing the previously standardized frame with the processed patterns detects the motion of an object, and the object is identified within a blob. The system is therefore resistant to the illumination variations encountered when tracking an object within a dynamic background. The devised approach outperforms other object tracking methods such as Group Target Tracking (GTT), ViPER-GT, grabcut and snakes in terms of accuracy and average time. The proposed PGP-based texture analysis is also compared with the Gamifying Video Object (GVO) approach and outperforms it in terms of precision, recall and F1 measure.
Keywords: binary labeling; computer vision; gradient pattern; Laplacian operator; object tracking
Locating and tracking an object in a video through an online tracking mechanism is useful in a wide range of applications such as human-computer interaction, surveillance, activity recognition and motion analysis [1]. Once the position of an object has been located in an initial frame, it can be followed with ease in the subsequent frames [2]. Tracking thus yields useful information about the activity and the identity of the object concerned [3-5]. A vital requirement in object tracking is resistance to illumination changes in the foreground of an image while the background region is suppressed [5]. To meet this requirement, algorithms such as Robust Fragment-based Tracking (Frag-Track), Incremental Visual Tracking (IVT), Multiple Instance Learning (MIL) and Graph based Discriminative Learning (GBDL) have been employed, but they consume considerable resources and rely on complicated models to cope with the changing environments in which the object is tracked [6,7]. A robust optimization algorithm or a multifaceted appearance model is needed to localize the object within the entire search space.
Inventive tracking strategies that add specific structural information can track the object, but they bring other shortcomings, e.g., insufficient pixel values, color distributions and other texture descriptors. Hence, a large number of trackers is needed to track an object successfully [8], which in turn increases the overall system complexity. There is also a limit to how well the inner structure of an image can be recovered in the presence of occlusion and structural deformation. In online object tracking, difficulties arise in separating the background from the foreground in a frequently changing environment [9]. When the background interferes with the foreground target, the tracker regularly drifts towards other objects. The conventional bounding box approach does not suffice for separating the background in the presence of occlusion.
To resolve such issues, a sparse representation-based pattern recognition methodology has been introduced. It recognizes objects by finding an appropriate patch that is associated with the object concerned, but it handles occlusion only to a limited extent, so the problem persists. Hence, the overall quality of the tracked object is degraded.
The novel technical contributions of proposed work are listed as follows:
· A novel background normalization technique based on textural pattern analysis of the targeted object. This Laplacian Chime Pattern (LCP) technique enhances the visual quality of an image even under poor illumination.
· Environmental Succession Prediction (ESP), which suppresses the background region and thereby enhances the foreground region by producing a binary image through a masking procedure.
· Processing of the masked image with the Probability based Gradient Pattern (PGP) technique to track an object with high accuracy irrespective of illumination changes.
This paper is organized as follows: Section II describes related work on visual object tracking and its limitations. Section III discusses the proposed Laplacian Chime Pattern (LCP) for preprocessing and enhancing the image, followed by the separation of foreground and background regions through pattern formation and blob detection for tracking an object with the ESP-PGP methodology. Section IV presents the performance analysis of the proposed algorithm against existing object tracking techniques. Finally, Section V presents the conclusion.
This section discusses the major issues in existing object tracking techniques. [10] suggested an online object tracking method that combines sparse prototypes with Principal Component Analysis (PCA). Although it tracks objects better than earlier approaches, it cannot follow objects whose appearance changes dynamically. [11] proposed a flexible appearance model built on a sparse representation structure using the Block Orthogonal Matching Pursuit (BOMP) approach. Although it can filter noise from the targeted object, its convergence suffers when all kinds of features must be learned, assimilated and associated appropriately. [12] devised a sparse representation structure based on a complex nonlinear appearance model that uses a library of structured union subspaces to handle occlusions, but it incurs a heavy computational load.
[13] designed the Snooper Text detection mechanism, which recognizes candidate characters embedded in images through a toggle-mapping image segmentation approach. Its accuracy degrades considerably with reduced resolution and distorted shapes, and it is not robust to noise. [6] established three different cues to assess the characters in a particular scene. A Bayesian approach is combined with a Markov Random Field (MRF) to recognize the characters in the text along with their dependencies. The method abstracts image features through text recognition with enhanced robustness. However, it requires a more intensive assessment of distinct regions, which increases the processing complexity of the system. [14] introduced a mechanism that uses both spatial and temporal information to identify and track video text of diverse orientations. Appropriate text candidates are acquired, and subgraphs of the Delaunay triangulation are formed to track the text through the video. Although the mechanism improves on conventional state-of-the-art techniques in the quality of the traced object, its accuracy in distinguishing the appropriate text is reduced.
[15] formulated a unified approach for intelligent video surveillance with a pan-tilt-zoom (PTZ) camera that comprises three components: background modelling, registration of each perceived frame, and object tracking after analysis of the entire background. Precise results are obtained only by applying subtraction algorithms an increasing number of times, which raises the hardware dependency of the system. [16] used sparse representation to design an appearance model for visual tracking. The differences among tracked objects are recognized through an online dictionary learning approach, and a robust similarity metric is incorporated to track the video sequences. However, the trade-off between system complexity and tracking accuracy is uneven.
Figure 1. Flow diagram of proposed ESP-PGP.
[17] formulated a training module that solves a convex optimization problem with a cutting plane optimization strategy for tracking multiple objects in a monocular setting. However, although a large number of candidates is used to obtain an optimal match between identified objects, the method does not adapt sufficiently to environmental changes to provide accurate tracking. [18] addressed multi-target tracking with inter-object interactions of general degree by devising a polynomial-time solution based on binary integer programming. Although it improves the accuracy of information association, it requires a large number of objects as intermediate references, which increases the computational complexity of the system. [19] proposed a multiview learning strategy that incorporates multiple Support Vector Machines (SVMs). The different views, namely Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG) and gray scale values, are combined through an entropy criterion. An inappropriate choice of object samples during tracking leads to inaccurate object identification.
[20] suggested an exact kernel Slow Feature Analysis (SFA) for kernels in Krein space, covering both indefinite and positive definite kernels. It can proficiently segment and track objects in a temporal video. However, the precision of the tracked objects is reduced by falsely discriminated samples and by the difficulty of distinguishing background information from the foreground target. [21] employed a bidirectional communication flow between two processes through a Bayesian loop capable of identifying the changes that occur. Since it uses only two parameters, the computational complexity of the system is significantly reduced. However, it discriminates multiple simultaneous targets inadequately and treats shadows as objects instead of removing them from the processed frame. [22] suggested tracking visual objects by learning their visual features with a dual-fold convolutional neural network, but it suffers from sampling ambiguity. Hence, an optimal decision is required when selecting shifted samples from the different layers. [23] combined Structure Complexity Coefficients (SCC) with an observation dependent hidden Markov model (OD-HMM) to track a dynamic target, but it is still limited in tracking an object that changes over time.
Images are processed slowly when multiple objects are detected. The obtained resolution is very low, so objects cannot be predicted accurately, and objects located at longer distances cannot be processed. The detection rate on stationary objects is low. Capture speed is low when multiple images are processed, which increases the processing time. Experimental analyses of image enhancement for night-time detection have been reported, but they do not give high clarity when deployed in live conditions. The processing time is higher than that of histogram techniques because the denoising methods involve several steps.
Algorithm 1. Laplacian Chime Pattern Technique
Step 1: Perceive 'X' as the input image.
Step 2: Compute the row and column size of the image as i and j, respectively.
Step 3: Obtain the boundary coefficients for the image using (7).
Step 4: Find the fitness function using the Dragonfly optimization algorithm as
$\gamma_1 = X^{+} - X$, (1)
where $X^{+}$ denotes the position of the current individual and $X$ represents the position of the object.
Step 5: Normalize the image with the fitness function using (8).
Step 6: Initialize the filter value for the subsequent fitness updates as
$\Delta\gamma = 1.01$. (2)
Step 7: Successive fitness updates are given as
$\beta = \gamma_2 + (\beta \cdot (\Delta\gamma - \gamma_2))$. (3)
Step 8: Apply the FFT to identify and segregate noisy pixels. Compute the temporary variables
$\sigma_1 \leftarrow \log(X)$; $\sigma_2 \leftarrow \mathrm{FFT}(\sigma_1)$. (4)
Step 9: Compute the preprocessed image,
$Y = \beta\sigma_2$. (5)
Step 10: Restore the image by applying the inverse FFT,
$Y = e^{Y}$. (6)
The proposed Probability based Gradient Pattern (PGP) extraction mechanism follows a sequential procedure to track the target within a video frame fed as input. Figure 1 illustrates the overall workflow of the devised object tracking model.
Figure 2. Bright illumination.
The input frame obtained from the video is preprocessed with the LCP methodology. Here, the position of the object is identified by applying dragonfly optimization to normalize the acquired image. The image is then transformed into the frequency domain, where the noisy pixels are removed, and an image enhancement procedure corrects any illumination deficiencies. After enhancement, the ESP methodology suppresses the background region and retains the foreground region in which the object lies. This is done by converting the image into binary form and placing a mask over the recognized object. The masked portion is the input to the subsequent PGP methodology, which detects the object from the abstracted pattern.
The image with a pattern is compared with the standardized form of the input image obtained from the video sequence. If the pixel distributions of the two images are similar, the object is identified and a blob is laid over the image region where the motion of the object is recognized. If no similarity is found, a search for a new match is triggered.
Figure 3. Dark illumination.
Object tracking begins by acquiring information about an object from the input video file. The video file is processed with a Matlab command to obtain a sequence of images. After the image sequence is obtained, an optimization method is used to select better boundaries for the recognized object.
In the preprocessing stage, the noise pixels in an image abstracted from a video frame are removed. To segregate these noise pixels, the boundary coefficients are first computed with respect to the size of the object whenever it is recognized in the frame.
i, j - represent the boundary coordinates associated with an image pixel α.
r - denotes the row size of the image.
c - specifies the column size of the image.
The acquired images are matched with each other on the basis of the texture of each processed image. Comparing every image acquired from different video frames directly is generally infeasible because their texture patterns differ. Hence, all images to be processed are brought to a common standard. Initially, the image pixels inside and outside the boundary points are spatially registered. Afterward, the gray-level values of the image are transformed to a stable base with a similar intensity distribution at all pixels. The association between any two successive groups of images acquired from a video is recorded spatially by a linear model given as,
Where, γ1 - first fitness value computed for optimization.
The dragonfly optimization algorithm [24] is used to determine the appropriate position of the object. The navigating behavior of dragonflies in search of a solution is used to formulate a swarm intelligence procedure that predicts the actual position of the object. If the coefficients belong to an actual object, the dragonflies are attracted towards that point; if they do not, all dragonflies are distracted away from it. The initial fitness function attracts the successive fitness functions, and the same strategy is followed to find the fitness function that optimizes the position of the tracked object. The resulting fitness function is used to normalize the image, yielding fully normalized images with a standardized texture pattern that are ready for the subsequent processing stages. To segregate noise pixels, the image is transferred from the spatial domain to the frequency domain by means of the Fast Fourier Transform (FFT). In the frequency domain, the image is represented over a continuous range, so the noisy pixels are easily segregated with a high pass filter. After the noise has been removed, an inverse FFT returns the image to the spatial domain. Thus, the image is preprocessed by segregating the noisy pixels from the original image.
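The sequence of logarithm, Fourier transform, frequency-domain filtering, inverse transform and exponential used in this preprocessing stage has the same structure as what is commonly called homomorphic filtering. The following Python sketch illustrates that structure on a small image. It is a minimal illustration under stated assumptions and not the authors' implementation: the Gaussian high-emphasis gain, the parameter names low_gain, high_gain and cutoff, and the naive DFT (used instead of a library FFT only to keep the example self-contained) are hypothetical choices, and the dragonfly-based fitness update is not modeled.

```python
import cmath
import math


def dft2(img, inverse=False):
    """Naive 2-D discrete Fourier transform; adequate only for tiny demo images."""
    rows, cols = len(img), len(img[0])
    sign = 1 if inverse else -1
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            acc = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = sign * 2 * math.pi * (u * x / rows + v * y / cols)
                    acc += img[x][y] * cmath.exp(1j * angle)
            out[u][v] = acc / (rows * cols) if inverse else acc
    return out


def homomorphic_filter(img, low_gain=0.5, high_gain=1.5, cutoff=1.0):
    """log -> DFT -> high-emphasis gain -> inverse DFT -> exp (assumed filter shape)."""
    rows, cols = len(img), len(img[0])
    # Adding 1 before the logarithm avoids log(0) on black pixels.
    log_img = [[math.log(p + 1.0) for p in row] for row in img]
    spec = dft2(log_img)
    for u in range(rows):
        for v in range(cols):
            # Distance from the zero frequency, with wrap-around.
            du = min(u, rows - u)
            dv = min(v, cols - v)
            d2 = du * du + dv * dv
            # Low frequencies get low_gain, high frequencies approach high_gain.
            gain = low_gain + (high_gain - low_gain) * (1 - math.exp(-d2 / (2 * cutoff ** 2)))
            spec[u][v] *= gain
    restored = dft2(spec, inverse=True)
    return [[math.exp(c.real) - 1.0 for c in row] for row in restored]
```

With low_gain below 1 the slowly varying illumination component is attenuated, while high_gain above 1 emphasizes fine detail; setting both gains to 1 returns the input unchanged, which is a convenient sanity check.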
Algorithm 2. Image Enhancement algorithm.
Step 1: Obtain the filtered output image 'Y'.
Step 2: If $\mu_{low}$ and $\mu_{high}$ are of appropriate dimension,
$y = \max(\mu_{low}, \min(\mu_{high}, Y))$. (10)
Step 3: Compute z using (9).
Step 4: Repeat steps 1 to 3 until all output images are enhanced.
Figure 4. Bright illumination.
The preprocessed image contains a transformed luminance region that is not suitable for identifying an object reliably through visualization methods. Local contrast regions have luminance that is either very high or very low. Hence, the preprocessed image is enriched with a contrast enhancement approach. The outer dimensions of an image are assessed to decide whether that image should be enhanced. A conventional histogram equalization approach is then used to regulate the altered color intensities of the images processed by the preceding procedures. After the lower and higher row sizes of the image are measured, the image is enhanced as follows,
Where,
μlow - lower row size of an image
μhigh - higher row size of an observed image
Figure 5. Dark illumination.
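The clamping step in Equation (10), y = max(μlow, min(μhigh, Y)), can be sketched in Python as follows. This is a hedged illustration only: μlow and μhigh are treated here simply as a lower and an upper bound on pixel values, and the linear rescaling in stretch is an assumed follow-up step, since the enhancement formula of Equation (9) is not reproduced in the text. The function names are hypothetical.

```python
def clamp_image(image, mu_low, mu_high):
    """Apply y = max(mu_low, min(mu_high, Y)) to every pixel of a 2-D list."""
    if mu_low > mu_high:
        raise ValueError("mu_low must not exceed mu_high")
    return [[max(mu_low, min(mu_high, p)) for p in row] for row in image]


def stretch(image, mu_low, mu_high):
    """Clamp, then rescale linearly to [0, 1] (assumed step, not from the paper)."""
    clamped = clamp_image(image, mu_low, mu_high)
    span = mu_high - mu_low
    if span == 0:
        return [[0.0 for _ in row] for row in clamped]
    return [[(p - mu_low) / span for p in row] for row in clamped]
```

For example, clamp_image([[0, 50, 300]], 10, 200) limits the outliers to the two bounds and leaves the middle value untouched.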
The environment in which an object exists in the spatial domain is treated at a sub-object level. The area surrounding the detected object is divided into cells, and the occupancy of each cell is characterized by interlinked random variables. These variables are updated iteratively, so that the model follows the surrounding background of the image as it changes. The background of the processed image is modeled to create a binary image pattern with a mask for the recognized object. With a mask placed over the initial reference frame, the objects identified initially are recognized with reduced complexity in the following images, and the changes in object position are plotted precisely. The difference observed in pixel positions forms the basis of the difference method, in which the background is suppressed and the foreground is tracked as a binary image. The enhanced preprocessed image is the input to the ESP approach, which separates the foreground from the background. The input images are processed to identify the background portion, leaving out the positioned object.
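Algorithm 3. Environmental succession prediction [25].
Initialize $x_r = 1$, $y_r = 1$, $N_p$ and $BL = 1$
for i = 1 : $F_t$
  if i == 1 then
    $\in = Z$;
    $\omega = (\mu \cdot Z) + ((1 - \mu) \cdot \in)$. (11)
  end if
  for m = 1 to h - BL
    for n = 1 to w - BL
      $\cap_x = m + x_r \cdot \cos(6.2832 \cdot p / N_p) + 0.5$, (12)
      $\cap_y = n - y_r \cdot \sin(6.2832 \cdot p / N_p) + 0.5$. (13)
      compute:
      $\forall_q = \frac{1}{2\pi\rho\nexists} \cdot \left(e^{-0.5\,(\rho\,\cap_x\,\nexists\,(\rho^{2} + \cap_y))^{2}}\right) \cdot (\cos(\lambda \cdot \cap_x) + \nexists)$. (14)
Figure 6. Bright illumination.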
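Equations (12) and (13) of Algorithm 3 place sample points on a circle around each cell (m, n); the constant 6.2832 is 2π. The short Python sketch below computes those coordinates. It assumes that the index p runs from 0 to Np - 1, which the algorithm listing does not state, and the function name sample_points is hypothetical.

```python
import math


def sample_points(m, n, xr=1.0, yr=1.0, num_points=8):
    """Return the sample coordinates around cell (m, n) from Eqs. (12)-(13)."""
    points = []
    for p in range(num_points):  # assumed range: p = 0 .. Np - 1
        angle = 2 * math.pi * p / num_points  # 6.2832 * p / Np in the paper
        x = m + xr * math.cos(angle) + 0.5
        y = n - yr * math.sin(angle) + 0.5
        points.append((x, y))
    return points
```

With xr = yr = 1 the points lie on a unit circle around the cell, offset by 0.5 in both coordinates as in the listing.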
The illumination-enhanced image is taken as input and the background region is suppressed via ∩x and ∩y. After the position of the object is determined from the enhanced images, only the pixels that belong to the object are masked and the entire image is converted into a binary image. The ESP feature extraction procedure is listed in Algorithm 3.
The spatial and temporal correlations in the video images allow the finite difference method to extract a small moving target from a complex background.
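The background update of Equation (11), ω = μZ + (1 - μ)∈, combined with a frame difference gives a simple way to obtain a binary foreground mask. The Python sketch below illustrates this idea under stated assumptions: ∈ is read as the current background estimate and Z as the current frame, and the values of mu and threshold, as well as the function names, are hypothetical and not taken from the paper.

```python
def update_background(background, frame, mu=0.1):
    """Blend the current frame into the background: mu*Z + (1 - mu)*background."""
    return [[mu * z + (1 - mu) * b for z, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def foreground_mask(background, frame, threshold=25):
    """Binary mask: 1 where the frame differs from the background by more than threshold."""
    return [[1 if abs(z - b) > threshold else 0 for z, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def track_masks(frames, mu=0.1, threshold=25):
    """Return one binary mask per frame; the first frame initialises the background."""
    background = [row[:] for row in frames[0]]
    masks = []
    for frame in frames:
        masks.append(foreground_mask(background, frame, threshold))
        background = update_background(background, frame, mu)
    return masks
```

A pixel that changes sharply between frames is marked 1 in the mask, while slowly varying background pixels are absorbed into the running estimate and stay at 0.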
A moving object is recognized through a predefined arrangement that expresses, on a standard grid, the likelihood of dynamic pixels captured in a static environment as a spatial occupancy evolution. This criterion tracks the movement of an object by tracing the area in which the previously detected pixel was positioned.
Likewise, the entire image is sectioned and processed to acquire the abstracted patterns. The masked pattern is responsible for exposing the movement of the object. The final outcome of object tracking is obtained by comparing the patterns obtained for every image. An object is tracked by matching the neighboring pixels obtained from each pattern recognized from the processed images. The matching proceeds hierarchically: the pattern abstracted from an image is matched with the successive images obtained sequentially from the video.
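Algorithm 4. Probability based gradient pattern algorithm.
Step 1: Obtain $\forall_q$ for two images.
Step 2: Compare all extracted patterns:
if $\forall_{q_i} > \forall_{q_j}$ (15)
then $I_{pgp} = \vartheta \cdot \forall_{q_j} \cdot (\forall_{q_i})^{-1}$. (16)
else if $\forall_{q_j} > \forall_{q_i}$ (17)
then $I_{pgp} = \vartheta \cdot \forall_{q_i} \cdot (\forall_{q_j})^{-1}$. (18)
end if
Blob detection $= I_{pgp} + \left(\frac{1}{6.28 \cdot s} \cdot e^{-\frac{r^{2}+c^{2}}{2s^{2}}}\right)$. (19)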
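The comparison in Algorithm 4 amounts to a scaled ratio of the smaller pattern response to the larger one, followed by the addition of a Gaussian-shaped term in Equation (19). The Python sketch below illustrates this reading. It assumes that the second branch of the algorithm covers the case in which the response of pattern i is smaller than that of pattern j, that the responses are non-negative, and that two zero responses count as identical; these are assumptions, and the function names are hypothetical.

```python
import math


def pgp_similarity(q_i, q_j, theta=1.0):
    """Scaled ratio of the smaller to the larger response (Eqs. 16 and 18)."""
    small, large = sorted((q_i, q_j))
    if large == 0:
        return theta  # assumption: two zero responses are treated as identical
    return theta * small / large


def blob_response(i_pgp, r, c, s=1.0):
    """Eq. (19): similarity plus a Gaussian-shaped weight at offset (r, c)."""
    return i_pgp + (1.0 / (6.28 * s)) * math.exp(-(r * r + c * c) / (2.0 * s * s))
```

Because the ratio is always taken as smaller over larger, the similarity is symmetric in its two arguments and lies between 0 and theta.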
The neighborhood pixels of the target object in the inferred image frame are obtained by a centroid matching procedure performed for every image matrix assessed. A mismatch between patterns switches the comparison to the other images obtained. When a similar pattern is found between the processed binary images and the original input image, a blob is plotted over the input image to mark the detected object. The procedure for identifying the object and plotting the blob over it is given in Algorithm 4.
The pattern abstracted from the masked image is compared with the original input image. Within the masked area, the recognized objects are classified from the locally stabilized images. Those images are converted into binary form and the motion of the object is detected. After the displacement of the identified object is determined, a blob is plotted on the respective image and the object is tracked, as shown in Figure 8 and Figure 9.
Figure 8. Bright illumination.
Figure 9. Dark illumination.
This section presents the performance validation of the proposed Environmental Succession Prediction - Probability based Gradient Pattern (ESP-PGP) method against the existing Group Target Tracking (GTT), ViPER-GT, grabcut and snakes methods. The validation parameters are accuracy and the average time taken to detect the object in the image sequence. In addition, the Gamifying Video Object (GVO) segmentation approach is used for comparison on other metrics, namely Pascal Overlap Measure (POM), precision, recall and F1-measure [26]. The overall performance is examined by applying these methods to a standard dataset in the MATLAB environment, with video frames maintained at a similar standard. The videos collected from the URL are converted to a standardized resolution of similarly annotated instances exposed in 3 chains of 6 windows at 1920 X 1080 resolution [27].
Figure 10. Accuracy analysis of the existing [26] and the proposed method.
The proposed work uses the MOT 17 (https://motchallenge.net/data/MOT17/) benchmark dataset, which consists of 11 training and testing sets. ESP-PGP, ETH-Bahnhof, ADL-Rundle 8 and KITTI-13 are used for training, and the remaining five training sets, including TUD-Stadtmitte, TUD-Campus, ADL-Rundle 6 and KITTI-17, are used for evaluation. The CEM method also offers evaluation results on the training set.
4.1.1 Accuracy
The precision with which a target region is identified is measured by accuracy. Figure 10 compares the existing mechanisms with the proposed ESP-PGP method.
Among the existing methods, the snakes approach shows the highest accuracy, while the devised ESP-PGP mechanism outperforms all of these tracking methods by 8%. This improvement arises because the method uses the PGP approach to compare the acquired patterns and recognize their similarity in order to place a blob.
4.1.2 Precision (Pr), Recall (R) and F1-measure (F1)
The overall performance of the methods is assessed by measuring the True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) counts obtained by comparing the processed images with the original images acquired from the video sequence.
Precision is defined as the ratio of correctly associated patterns to all patterns retrieved at a given instant.
Recall is defined as the ratio of correctly retrieved patterns to all associated patterns.
The F1-measure combines precision and recall as their harmonic mean.
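In terms of the TP, FP and FN counts introduced above, the standard definitions are Pr = TP / (TP + FP), R = TP / (TP + FN) and F1 = 2 · Pr · R / (Pr + R). The Python sketch below computes the three values from these counts; returning 0.0 when a denominator is zero is an assumed convention, and the function name is hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

For instance, 8 true positives with 2 false positives and 2 false negatives give a precision and recall of 0.8 each, and therefore an F1-measure of about 0.8 as well.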
Table 1 presents the comparative analysis of precision, recall and F1 measure for GVO [28] and ESP-PGP on the standard dataset.
Table 1. Comparison analysis.
The devised ESP-PGP methodology outperforms the existing GVO approach [28]. Since environmental occlusions and other distracting factors are removed from the acquired image, the background region is suppressed and the foreground portion of the image is obtained clearly, which makes tracking the object more feasible. The ESP-PGP approach improves precision, recall and F1-measure by 43%, 2% and 29%, respectively.
4.1.3 Pascal Overlap Measure(POM)
POM is defined as the intersection between the ground truth and the output segmentations reported for the processed images. Figure 11 illustrates the POM analysis of the GVO [28] and ESP-PGP techniques.
Figure 11. POM analysis.
Figure 12. Comparative analysis between proposed ESP-PGP and conventional object tracking methodologies [26].
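The Pascal overlap criterion is commonly computed as the area of intersection of the two segmentations divided by the area of their union. The Python sketch below evaluates that ratio on two binary masks of equal size. Normalizing by the union follows the usual PASCAL VOC convention and is an assumption here, since the text mentions only the intersection; returning 1.0 for two empty masks is likewise an assumed convention.

```python
def pascal_overlap(mask_a, mask_b):
    """Intersection over union of two equally sized binary masks (2-D lists of 0/1)."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return inter / union if union else 1.0
```

A value of 1.0 means the output segmentation coincides with the ground truth, and 0.0 means they share no pixels.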
The binary image obtained by masking the object segments the object completely from the background region, so the true pixels of the object are segregated robustly. The devised ESP-PGP approach outperforms the GVO approach by 2% in terms of POM.
4.1.4 Average Time
The tracking success rate is the ratio of the number of successful frames to the total number of frames.
The comparison between the proposed ESP-PGP method and the other existing approaches with respect to the average time required to recognize an object is illustrated in Figure 12. The proposed ESP-PGP approach reduces the average time required to recognize and detect the object owing to the optimization strategy used to confine its boundaries. The average time is reduced by 5% relative to the other methods.
Many computer vision applications require reliable object tracking to monitor the activity of an object. A video, which comprises a sequence of images, is processed to track an object by discriminating the background from the object in the foreground. The limitations observed in tracking an object are alleviated by a novel background normalization methodology based on abstracting the textural pattern of an image. An Environmental Succession Prediction approach is devised to suppress the background region by generating a binary image. The binary image is masked to discriminate the background and is then compared with the standardized frame acquired from the input image. Subsequently, the textural pattern of the image is recognized with the Probability based Gradient Pattern (PGP) approach, and the similarity between the patterns obtained is recognized. Comparing the previously standardized frame with the processed patterns detects the motion of an object, and the object is identified within a blob. Hence, the system is resistant to illumination variations, and an object within a dynamic background is tracked robustly. The devised approach outperforms other object tracking methods such as GTT, ViPER-GT, grabcut and snakes [26] in terms of accuracy and average time. The proposed PGP-based texture analysis is compared with the GVO [29] approach and outperforms it in terms of precision, recall and F1 measure. Owing to the improvement in all these measures, the accuracy of tracking an object increases by 8% and the average time taken to track an object decreases by 5%, due to the prior background normalization and the novel gradient patterns.
Funding: There is no funding.
Conflict of Interest: The authors declare that they have no conflict of interest.
Ethical approval: This article does not contain any studies with human participants or animals performed by any of the authors.