Device-Free Through-the-Wall Activity Recognition Using Bi-Directional Long Short-Term Memory and WiFi Channel State Information

2022-01-08 13:06

Zi-Yuan Gong | Xiang Lu | Yu-Xuan Liu | Huan-Huan Hou | Rui Zhou

Abstract—Activity recognition plays a key role in health management and security. Traditional approaches are based on vision or wearables, which either work only under line of sight (LOS) or require the targets to carry dedicated devices. Because human bodies and their movements influence WiFi propagation, this paper proposes recognizing human activities by analyzing the channel state information (CSI) from the WiFi physical layer. The method requires only commodity WiFi transmitters and receivers, operates through a wall as well as under LOS and non-line of sight (NLOS) conditions, and does not require the targets to carry dedicated devices. After CSI is collected, the discrete wavelet transform is applied to reduce the noise, followed by outlier detection based on the local outlier factor to extract the activity segment. Activity recognition is performed by a bi-directional long short-term memory network that takes the sequential features into consideration. Experiments in through-the-wall environments achieve recognition accuracy above 95% for six common activities (standing up, squatting down, walking, running, jumping, and falling), outperforming existing work in this field.

Index Terms—Activity recognition, bi-directional long short-term memory (Bi-LSTM), channel state information (CSI), device-free, through-the-wall.

1. Introduction

Home security and health management have attracted considerable attention for many years. Monitoring human activity indoors is important because abnormal behaviors of an individual can be identified and appropriate measures taken. Current mainstream approaches to indoor activity recognition are based on wearables [1], [2] or vision [3], [4]. Vision-based solutions deploy cameras to collect images and realize activity recognition through image processing. They need a well-lit environment to work effectively and can only work under line of sight (LOS). In addition, cameras pose privacy concerns. Activity recognition based on wearables overcomes the LOS limitation and does not pose privacy concerns. These solutions infer activities by analyzing sensor data through machine learning. However, they require the targets to wear dedicated devices, which may be inconvenient.

With the development of wireless communications, WiFi signals are no longer limited to serving as a communications medium. In an indoor scenario, human presence alters the propagation of WiFi signals, so the signals carry useful information about human activities. By extracting and analyzing this information from WiFi signals, activity recognition can be realized [5], [6]. Owing to the ubiquity of wireless signals, WiFi-based activity recognition avoids the inconvenience of wearables and the privacy invasion of cameras. It can work under both LOS and non-line-of-sight (NLOS) conditions. As WiFi signals can penetrate walls, activities can potentially be monitored through walls. Some pioneering research has been conducted on activity recognition based on WiFi. However, two problems exist: 1) most existing approaches use traditional machine learning algorithms, which require manual feature selection, leading to low efficiency and accuracy, and 2) existing solutions achieve high accuracy under LOS and NLOS but are relatively inaccurate in through-the-wall scenarios. To overcome these limitations, we propose applying bi-directional long short-term memory (Bi-LSTM) to extract the features and infer activities, taking into account the time-sequential features in the vectors of WiFi channel state information (CSI). To remove noise, we applied a discrete wavelet transform (DWT) to filter the raw CSI data. To obtain the activity segment in the CSI sequence, we applied outlier detection based on the local outlier factor (LOF). In the evaluation of six common activities, the proposed method achieved an accuracy above 95% through a wall. The method also performed well under LOS and NLOS, achieving accuracies of 98.3% and 96.7%, respectively. The main contributions of our work are summarized as follows:

1) We applied Bi-LSTM to perform activity recognition, which selects the time-sequential features automatically. Compared with traditional machine learning algorithms, Bi-LSTM avoids the complex feature extraction process and improves recognition accuracy.

2) We accurately identified daily activities, such as squatting down, standing up, jumping, walking, running, and falling, in different scenarios. Compared with existing work, the recognition accuracy improved by a large margin, achieving the goal of device-free through-the-wall activity recognition with WiFi.

3) We applied DWT to reduce noise and LOF to detect the activity segment, which improved the recognition accuracy.

Section 2 reviews the related work on activity recognition with WiFi. Section 3 introduces the preliminaries of CSI and the rationale behind activity recognition. Section 4 elaborates on the proposed activity recognition method based on Bi-LSTM. Section 5 evaluates the proposed method and reports the experimental results. Section 6 concludes the work.

2. Related Work

Much research on wireless sensing has emerged in recent years, some of it related to indoor activity recognition based on WiFi. The CSI-based activity recognition and monitoring system (CARM) [7] quantitatively builds the correlation between CSI value dynamics and specific human activities as its profiling mechanism. WiseFi [8] is a localization and activity recognition system that leverages the amplitude and phase of CSI and the angle-of-arrival of blocked signals. WiFall [9] employs the time variability and spatial diversity of CSI as the indicator to detect human activities and falls, with initial denoising by a weighted moving average algorithm, abnormal CSI series detection by the LOF algorithm, and activity classification through support vector machines (SVM). RT-Fall [10] is a fall detection system that uses the phase and amplitude of CSI to segment and detect falls. Chang et al. [11] converted the received CSI into an image by observing the similarity of CSI to texture, applied vision-based methods to extract the features, and identified activities through SVM. NotiFi [12] employs a non-parametric Bayesian model and a dynamic hierarchical Dirichlet process to recognize abnormal activities. FALAR [13] leverages CSI to recognize activities regardless of the location and small environmental changes; its extraction and classification process is based on non-negative matrix factorization. EI [14] is an environment-independent activity recognition system that exploits adversarial networks to remove environmental effects and subject-specific information, learning transferable features of activities. Chen et al. [15] proposed attention-based Bi-LSTM for activity recognition using CSI, leveraging an attention mechanism to assign different weights to the learned features. WiSDAR [16] achieves spatial diversity-aware activity recognition by extending the multiple antennas of modern WiFi devices to construct multiple separated antenna pairs for activity observation. Sheng et al. [17] realized activity recognition by integrating spatial features learned from a convolutional neural network into Bi-LSTM. The original model was fine-tuned in new environments, such as different rooms.

3. Preliminary and Rationale

3.1. Channel State Information

Transmission of wireless signals is affected by the environment through reflection, diffraction, etc. CSI is the fine-grained information from the physical layer that describes the channel frequency response from the transmitter to the receiver. In the frequency domain, the narrow-band flat-fading channel is modeled as $y = Hx + n$, where $y$ is the received vector, $x$ is the transmitted vector, $H$ is a complex matrix consisting of CSI values, and $n$ is the channel noise. In orthogonal frequency division multiplexing (OFDM) systems, by using a commodity network interface card with modified firmware and driver, the amplitude and phase of each OFDM subcarrier within the channel can be revealed to the upper layers from each packet, in the format of the CSI matrix

$$H = \begin{bmatrix} H_{1,1} & H_{1,2} & \cdots & H_{1,N_{rx}} \\ H_{2,1} & H_{2,2} & \cdots & H_{2,N_{rx}} \\ \vdots & \vdots & \ddots & \vdots \\ H_{N_{tx},1} & H_{N_{tx},2} & \cdots & H_{N_{tx},N_{rx}} \end{bmatrix}$$

where $N_{tx}$ and $N_{rx}$ are the numbers of transmitting and receiving antennas, respectively; $H_{i,j}$ is the CSI of the channel from transmitting antenna $i$ to receiving antenna $j$, containing $N_s$ subcarriers, and is expressed as

$$H_{i,j} = (h_1, h_2, \ldots, h_{N_s}).$$

Subcarrier $k$ can be expressed as $h_k = |h_k| e^{j\angle h_k}$, $k \in [1, N_s]$, where $|h_k|$ is the amplitude and $\angle h_k$ is the phase.
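The polar form of a subcarrier can be illustrated with a minimal Python sketch. The three complex values are made-up numbers chosen for easy arithmetic, not measured CSI, and the helper name `to_polar` is hypothetical.

```python
import cmath

# Hypothetical CSI values for three subcarriers of one antenna pair (made-up numbers).
csi = [complex(3.0, 4.0), complex(0.0, 2.0), complex(-1.0, 0.0)]

def to_polar(h):
    """Return (amplitude, phase) of a complex subcarrier value h = |h| * exp(j * angle(h))."""
    return abs(h), cmath.phase(h)

polar = [to_polar(h) for h in csi]
amplitudes = [a for a, _ in polar]   # |h_k|
phases = [p for _, p in polar]       # angle of h_k, in radians

# Rebuilding each value from its amplitude and phase recovers the original number.
rebuilt = [cmath.rect(a, p) for a, p in polar]
```

The amplitude list is the quantity used for recognition in the rest of the paper; the phase is computed here only to show that the two together carry the full complex value.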

3.2. Rationale of Activity Recognition

In an indoor environment, there are obstacles such as floors, walls, and furniture. These obstacles affect wireless transmission through reflection, diffraction, and so on, resulting in a multipath effect. In a static environment, the CSI amplitude waves from the transmitter to the receiver are relatively stable. When an individual is present, the body alters the paths along which the wireless signals propagate. If the individual performs activities, the reflection and diffraction points on the human body change frequently, causing the CSI amplitude values to vary. Fig. 1 illustrates the CSI amplitude waves through a wall when an individual performs activities. It is therefore feasible to recognize activities by classifying the CSI amplitude patterns they induce.

Fig. 1. CSI amplitude variation of the through-the-wall signal induced by six human activities: (a) standing up, (b) squatting down, (c) jumping, (d) walking, (e) running, and (f) falling.

4. Device-Free Activity Recognition

In this work, device-free activity recognition was performed in an environment with one WiFi transmitter and one WiFi receiver, each equipped with a few antennas. There was no need for the target to wear a device. Activity recognition was realized by analyzing the variations of the wireless signals induced by the activities of the target. Deep learning algorithms can be used to classify the activities. As performing an activity takes some time, a sequence of CSI amplitude vectors was received as the sample for each activity. We propose using Bi-LSTM for activity recognition so as to make use of the sequential data. The process comprised four stages: CSI data collection, CSI data preprocessing, recognition model training, and activity recognition. The entire process is illustrated in Fig. 2.

Fig. 2. Process of device-free activity recognition.

4.1. CSI Data Collection

The WiFi transmitter sends packets continuously and the WiFi receiver receives the packets. The raw CSI data are retrieved from the received packets, each of which yields a high-dimensional vector, expressed as $R = (h_1, h_2, \ldots, h_i, \ldots, h_N)^T$, where $h_i$ represents subcarrier $i$, which is a complex number containing the amplitude and the phase, and $N = N_{tx} \times N_{rx} \times N_s$ represents the dimension of the vector. As the phase information in $h_i$ is not accurate, we exploit only the amplitude information to realize activity recognition. Hence, we obtain the vector of CSI amplitudes as $(|h_1|, |h_2|, \ldots, |h_i|, \ldots, |h_N|)^T$, where $|h_i|$ is the amplitude of subcarrier $i$.
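As a rough illustration of how one packet becomes an amplitude vector of dimension N, the sketch below flattens a nested packet structure. The nested-list layout `packet[i][j][k]`, the function name `amplitude_vector`, and the constant CSI value are assumptions made for the example; the antenna and subcarrier counts are those reported in the experimental setup.

```python
NTX, NRX, NS = 3, 3, 30  # antennas and subcarrier groups per link, as in the experimental setup

def amplitude_vector(packet):
    """Flatten packet[i][j][k] (tx antenna i, rx antenna j, subcarrier k) into |h| values."""
    return [abs(h) for tx in packet for link in tx for h in link]

# Hypothetical packet: every subcarrier is 3+4j, so every amplitude equals 5.0.
packet = [[[complex(3, 4)] * NS for _ in range(NRX)] for _ in range(NTX)]
vec = amplitude_vector(packet)

# One activity sample is the time-ordered sequence of such vectors (four packets here).
sequence = [amplitude_vector(packet) for _ in range(4)]
```

Each element of `sequence` has length N = 3 × 3 × 30 = 270, which is the per-time-step input dimension a sequence model would see.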

4.2. CSI Data Preprocessing

Raw CSI data contain noise and inactive data, and need to be preprocessed before classification. We first applied DWT to filter the raw CSI data and reduce the noise, and then extracted the activity segment from the denoised CSI data through anomaly detection. The extracted activity segments are used in the activity recognition model.

1) Noise reduction: Noise in the raw CSI data causes recognition errors, so a good noise reduction algorithm helps improve recognition accuracy. We applied DWT to filter the CSI amplitude waves (i.e., the sequence of CSI amplitude vectors) and reduce the high-frequency noise. The wavelet transform provides good time-frequency resolution and is computationally efficient.

2) Activity segment extraction: When there is no activity, the CSI amplitude waves are relatively stable. When an individual performs an activity, the wireless transmission is affected and the CSI amplitude fluctuates significantly, as shown in Fig. 3(a). This fluctuation window is the activity segment. We applied an anomaly detection algorithm based on LOF [18] to detect and extract the activity segment. The abnormal data points were first found by measuring their local reachability density. If a data point was far away from the other data points, its local reachability density was low; otherwise it was high. We then used LOF to measure the anomaly degree of a data point based on its density relative to its neighboring points. An LOF value of about 1 indicates that the data point is located in a region of homogeneous density, i.e., in the static part of the data; a higher LOF value indicates an outlier, which means that the data point is in the active part of the data, as shown in Fig. 3(b). The data points in the active part constituted the activity segment, which was extracted as the sample to be used in the activity recognition model.
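The noise reduction step can be sketched as a wavelet decomposition, shrinkage of the high-frequency (detail) coefficients, and reconstruction. The text does not state the wavelet family, the decomposition depth, or the thresholding rule, so the Haar wavelet, `levels=2`, the soft threshold `thr=0.5`, and all function names below are assumptions chosen to keep the example short and dependency-free.

```python
import math

def haar_dwt(signal):
    """One Haar DWT level: return (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Invert one Haar DWT level."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def soft(x, thr):
    """Soft threshold: shrink x toward zero by thr."""
    return math.copysign(max(abs(x) - thr, 0.0), x)

def dwt_denoise(signal, levels=2, thr=0.5):
    """Decompose, soft-threshold the detail coefficients, and reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append([soft(x, thr) for x in d])
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx

# Hypothetical amplitude wave: a constant level of 10 with small alternating noise.
noisy = [10.0 + 0.2 * (-1) ** i for i in range(8)]
clean = dwt_denoise(noisy, levels=2, thr=0.5)  # the alternating ripple is removed
```

With `thr=0.0` the function reconstructs the input unchanged, which is a quick way to check that the transform pair is consistent before tuning the threshold.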

Fig. 3. Extraction of the activity segment by LOF: (a) CSI amplitude wave and (b) LOF of the activity segment.
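The LOF-based extraction can be sketched as follows, using the standard LOF definition (k-distance, reachability distance, local reachability density). How the method maps CSI data to LOF input points is not detailed in the text, so scoring scalar amplitude values, the neighborhood size `k=3`, the threshold of 3.0, the rule "segment = first to last flagged index", and the sample wave are all assumptions for illustration.

```python
def lof_scores(points, k=3):
    """Local outlier factor of each scalar in points (standard definition, ties included)."""
    n = len(points)
    dist = [[abs(p - q) for q in points] for p in points]
    k_dist, neighbors = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]            # distance to the k-th nearest neighbor
        k_dist.append(kd)
        neighbors.append([j for j in order if dist[i][j] <= kd])
    lrd = []
    for i in range(n):
        # Reachability distance of i from each neighbor j: max(k-distance(j), d(i, j)).
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / max(sum(reach), 1e-12))  # guard against duplicate points
    return [sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
            for i in range(n)]

def active_segment(scores, threshold):
    """Return (first, last) index whose LOF exceeds threshold, or None if none does."""
    flagged = [i for i, s in enumerate(scores) if s > threshold]
    return (flagged[0], flagged[-1]) if flagged else None

# Hypothetical denoised amplitude wave: static around 10, strong fluctuation at indices 5-8.
wave = [10.0, 10.1, 9.9, 10.05, 9.95, 14.0, 6.0, 15.0, 5.5, 10.02, 9.98, 10.08, 9.92]
scores = lof_scores(wave, k=3)
segment = active_segment(scores, threshold=3.0)
```

Taking the span between the first and last flagged points keeps samples inside the activity that momentarily pass through the static amplitude range, which would otherwise score close to 1 and be dropped.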

4.3. Activity Recognition with Bi-LSTM

Since an activity sample is composed of a sequence of CSI vectors, we propose applying Bi-LSTM to classify the activities. The preprocessed training samples were used to train the network. For activity recognition, the preprocessed testing samples were fed into the trained Bi-LSTM model, which output the recognized activity by classification. The activity samples are time sequences, hence a recurrent neural network (RNN) is suitable for analysis [19]. As the actions in an activity are highly coherent, i.e., each action at a given moment is connected with the preceding and succeeding actions, we applied a bi-directional RNN (Bi-RNN), which makes use of both historical and future information simultaneously, to recognize an activity. During training of the Bi-RNN model, the gradient tends to vanish or explode after many stages of propagation. Therefore, we adopted Bi-LSTM, which introduces self-loop units that create paths along which the gradient can flow for long durations. The time scale of accumulation can change dynamically. In a Bi-LSTM unit, $h(t-1)$ represents the output of the previous unit, and $x(t)$ represents the input. $x(t)$ and $h(t-1)$ are concatenated to generate an output $f(t)$, which lies between 0 and 1, through the sigmoid function:

$$f(t) = \sigma\left(W_f \cdot [h(t-1), x(t)] + b_f\right)$$

where $f(t)$ represents the forget gate in Bi-LSTM. If $f(t)$ is 1, the last cell state is completely remembered. If $f(t)$ is 0, the last cell state is completely forgotten. A value between 0 and 1 determines the proportion of the cell state that is retained. $W_f$ and $b_f$ are the weight and bias coefficients. $i(t)$ and $\tilde{C}(t)$ are two small neural network layers, calculated as

$$i(t) = \sigma\left(W_i \cdot [h(t-1), x(t)] + b_i\right), \qquad \tilde{C}(t) = \tanh\left(W_c \cdot [h(t-1), x(t)] + b_c\right)$$

where $\sigma(\cdot)$ is the logistic sigmoid function and $\tanh(\cdot)$ maps values to between −1 and 1; $W_i$, $b_i$, $W_c$, and $b_c$ are the corresponding weight and bias coefficients. The $C(t-1)$ vector passed from the previous time step is linearly superimposed with the $\tilde{C}(t)$ vector, calculated as

$$C(t) = f(t) \odot C(t-1) + i(t) \odot \tilde{C}(t)$$

and $h(t)$ is the final output, which is passed to the next unit of the same layer and to the corresponding unit of the next layer.
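A single step of the unit described above can be traced with a scalar sketch (one input value, one hidden value). The forget gate, input gate, candidate state, and cell update follow the description in the text; the output gate `o` and the rule h(t) = o(t)·tanh(C(t)) come from the standard LSTM formulation and are an assumption here, since the text does not spell them out. The parameter layout `(w_h, w_x, b)` and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step for scalar input and state.

    p maps a gate name ('f', 'i', 'c', 'o') to (w_h, w_x, b), the scalar form of
    W . [h(t-1), x(t)] + b.
    """
    def affine(name):
        w_h, w_x, b = p[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(affine("f"))          # forget gate, in (0, 1)
    i_t = sigmoid(affine("i"))          # input gate, in (0, 1)
    c_hat = math.tanh(affine("c"))      # candidate cell state, in (-1, 1)
    c_t = f_t * c_prev + i_t * c_hat    # new cell state
    o_t = sigmoid(affine("o"))          # output gate (standard LSTM; assumed, not from the text)
    h_t = o_t * math.tanh(c_t)          # output passed to the next unit
    return h_t, c_t

# All-zero parameters: every gate equals 0.5 and the candidate state is 0.
zero = {g: (0.0, 0.0, 0.0) for g in "fico"}
h, c = lstm_step(1.0, 0.0, 2.0, zero)

# Forget gate saturated near 1 and input gate near 0: the previous cell state is kept.
keep = dict(zero, f=(0.0, 0.0, 50.0), i=(0.0, 0.0, -50.0))
_, c_keep = lstm_step(1.0, 0.0, 2.0, keep)
```

The second call shows the behavior described for f(t) = 1: the cell state passes through unchanged, which is what lets information and gradients persist over many time steps.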

The Bi-LSTM model proposed in this paper is shown in Fig. 4.

Fig. 4. Activity recognition model based on Bi-LSTM.

The first part comprises the hidden layers, which contain the forward LSTM layers and the backward LSTM layers. The activation functions used by the hidden layers are the sigmoid function and the tanh function. The last layer is the output layer, which outputs the classification results using sigmoid as the activation function. The loss function is the cross-entropy function. The optimizer is Adam.
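The bi-directional arrangement of the hidden layers can be sketched independently of the cell type: one recurrent pass reads the sequence forward, another reads it backward, and the two states are paired at each time index. In this sketch a toy tanh cell stands in for the LSTM unit, and the names `run`, `bidirectional`, and `toy_cell`, together with the weights, are hypothetical.

```python
import math

def run(step, sequence, h0=0.0):
    """Apply step(x, h) along the sequence and return the state after each element."""
    h, states = h0, []
    for x in sequence:
        h = step(x, h)
        states.append(h)
    return states

def bidirectional(step_fwd, step_bwd, sequence):
    """Pair, at every time index, the forward state with the backward state."""
    fwd = run(step_fwd, sequence)
    bwd = run(step_bwd, sequence[::-1])[::-1]  # re-align to the original time order
    return list(zip(fwd, bwd))

def toy_cell(w_x, w_h):
    """A simple tanh recurrent cell used as a stand-in for an LSTM unit."""
    return lambda x, h: math.tanh(w_x * x + w_h * h)

# A running sum as the step function makes the two directions easy to follow.
pairs = bidirectional(lambda x, h: h + x, lambda x, h: h + x, [1, 2, 3])

# The same wrapper with the toy recurrent cells (hypothetical weights).
states = bidirectional(toy_cell(0.5, 0.1), toy_cell(0.3, 0.2), [0.2, -0.4, 0.9])
```

With the running-sum step, the forward state at each index summarizes the past and the backward state summarizes the future, which is the property the bi-directional layers exploit for coherent activity sequences.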

5. Evaluations

5.1. Experimental Setup

Two laptops equipped with the Intel Wireless Link 5300 were deployed, one as the transmitter and the other as the receiver. Using the CSI tools [20], the amplitude of the subcarriers was retrieved. The transmitter and the receiver had 3 antennas each, giving 9 wireless links. Each link had 30 groups of subcarriers, giving 270 subcarriers in total. We conducted experiments under four scenarios, as shown in Fig. 5. Our focus was on through-the-wall activity recognition, and we also evaluated the method under LOS and NLOS. Six common activities were evaluated: standing up, squatting down, jumping, walking, running, and falling. Fig. 5(a) shows the real experimental scene within the research laboratory, which measures 10 m × 8 m. Fig. 5(b) illustrates the setup through a wall made of bricks and steel that separates the laboratory from the outside corridor, referred to as the A-wall. The transmitter was placed in the corridor and the receiver was placed in the laboratory. The participant performed activities in the laboratory. For each activity, 50 samples were collected, giving 300 samples in total, 80% of which were taken as training samples (240 training samples) and 20% as testing samples (60 testing samples). Fig. 5(c) illustrates the through-the-wall scenario in which a brick wall, referred to as the B-wall, separates two laboratories. The transmitter was placed in the neighboring laboratory and the receiver was placed in this laboratory. Six participants took part in the experiment. Each participant performed each activity 10 times, giving a total of 360 samples, half of which were used as training samples and the other half as testing samples. Figs. 5(d) and (e) illustrate the LOS and NLOS experimental scenarios in the research laboratory. In each setup, 50 samples were collected for each of the six activities, 80% of which were taken as training samples and 20% as testing samples. All the packets were sent at a rate of 100 Hz. Training of the activity recognition model took about 16 min and activity recognition took less than 1 s.
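The counts quoted in the setup follow from simple arithmetic, reproduced below as a check. The helper name `split` is hypothetical; all numbers are taken from the text.

```python
NTX, NRX, GROUPS = 3, 3, 30          # antennas per device and subcarrier groups per link

links = NTX * NRX                    # wireless links between antenna pairs
subcarriers = links * GROUPS         # total subcarriers per packet

def split(total, train_percent):
    """Return (training, testing) sample counts for an integer training percentage."""
    train = total * train_percent // 100
    return train, total - train

a_wall = split(6 * 50, 80)           # 6 activities x 50 samples, 80/20 split
b_wall = split(6 * 6 * 10, 50)       # 6 participants x 6 activities x 10 repetitions, 50/50 split
```

The same 80/20 split of 300 samples applies to the LOS and NLOS setups, so each of those is also evaluated on 60 testing samples.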

Fig. 5. Experimental scenarios: (a) real scene, (b) through A-wall, (c) through B-wall, (d) LOS, and (e) NLOS.

5.2. Experimental Results

The training samples were used to train the activity recognition model based on Bi-LSTM, which was then evaluated with the testing samples. The Bi-LSTM model exhibited strong classification capability for sequential activity data, achieving a recognition accuracy of 95% in the through-the-A-wall scenario and 95.5% in the through-the-B-wall scenario for the six activities by one participant. The recognition accuracy was 98.3% under LOS and 96.7% under NLOS. The confusion matrix of each setup is shown in Fig. 6. Fig. 6(a) shows the confusion matrix when activity recognition was performed through the A-wall, in which 3 errors occurred out of 60 tests. Fig. 6(b) shows the confusion matrix when activity recognition was performed through the B-wall, in which 8 errors occurred out of 180 tests. Figs. 6(c) and (d) show the confusion matrices under LOS and NLOS, respectively.
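The reported accuracies can be related to the error counts with a short calculation. The function name `accuracy` is hypothetical; the error counts for the two wall scenarios are from the text, whereas the LOS and NLOS error counts are not stated and are only inferred here from the reported percentages and the 60 testing samples.

```python
def accuracy(errors, tests):
    """Fraction of testing samples classified correctly."""
    return (tests - errors) / tests

a_wall = accuracy(3, 60)     # 57 of 60 correct
b_wall = accuracy(8, 180)    # 172 of 180 correct, about 95.56% (reported as 95.5%)

# Inferred, not stated in the text: 98.3% and 96.7% on 60 tests are consistent
# with one and two misclassifications, respectively.
los = accuracy(1, 60)
nlos = accuracy(2, 60)
```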

Fig. 6. Confusion matrix of activity recognition: (a) through A-wall, (b) through B-wall, (c) under LOS, and (d) under NLOS.

5.3. Comparison with Existing Methods

To demonstrate the effectiveness of the proposed method, we compared it with two existing methods, WiFall [9] and NotiFi [12], using the same training and testing sets. WiFall [9] uses a moving average filter (MAF) to filter the CSI data, LOF to detect the activity segment, and SVM to classify activities. NotiFi [12] employs a non-parametric Bayesian model and a dynamic hierarchical Dirichlet process to recognize abnormal activities. The comparison between these methods and the proposed method for the four experimental scenarios (through the A-wall, through the B-wall, under LOS, and under NLOS) is shown in Fig. 7. The proposed method achieved the highest accuracy in all four scenarios. Through the A-wall, our method achieved an accuracy of 95.0%, compared with 80.0% for WiFall and 75.0% for NotiFi. Through the B-wall, our method achieved 95.5%, compared with 79.3% for WiFall and 73.5% for NotiFi. Under LOS, our method achieved 98.3%, compared with 90.0% for WiFall and 88.3% for NotiFi. Under NLOS, our method achieved 96.7%, compared with 86.6% for WiFall and 85.0% for NotiFi.

Fig. 7. Comparison with existing methods.

Fig. 8. Impact of number of participants.

5.4. Number of Participants

Owing to different physical features, such as height and weight, different people affect wireless propagation differently. We performed activity recognition on six different participants (four males and two females, with body mass indices of 16.6, 18.0, 19.9, 20.8, 24.0, and 26.2). The experiment was carried out in the through-the-B-wall scenario, and the number of participants was increased from one to six. The experimental results are shown in Fig. 8. As the number of participants increased, the recognition accuracy decreased slightly but remained acceptable. An accuracy of 95.5% was achieved with one participant, and accuracies of 94.3%, 92.2%, 91.3%, 91.0%, and 90.0% were achieved with two, three, four, five, and six participants, respectively. The accuracy decreased because participants differ in physical features and in the way they perform activities, making it difficult to extract cross-participant features.

5.5. Variation of Device Location

When the location of the WiFi transmitter or receiver is varied, the wireless transmission path changes. To evaluate whether the Bi-LSTM model could adapt to this change, we moved the WiFi receiver by one meter and performed two experiments. In the first experiment, the Bi-LSTM model was trained and tested with the receiver at the original location. In the second experiment, the Bi-LSTM model was trained with 240 training samples collected with the receiver at the original location plus 18 training samples collected with the receiver at the new location, and was then tested with the receiver at the new location. The recognition accuracy was 93.3% with device location variation and 98.3% without it. The recognition accuracy of each activity is shown in Fig. 9, indicating that if the environment changes slightly, high accuracy can still be achieved by adding a few new samples.

Fig. 9. Impact of variation of device location.

5.6. Classifiers

In the proposed method, we applied Bi-LSTM to classify human activities. To demonstrate its effectiveness, we compared Bi-LSTM with other classifiers on the same datasets: RNN (without LSTM), deep neural network (DNN), and SVM. Bi-LSTM was superior to RNN, DNN, and SVM in all the experimental scenarios, as shown in Fig. 10. Through the A-wall, Bi-LSTM achieved an accuracy of 95.0%, compared with 90.0% for RNN, 90.0% for DNN, and 75.0% for SVM. Through the B-wall, Bi-LSTM achieved 95.5%, compared with 90.2% for RNN, 88.7% for DNN, and 75.3% for SVM. Under LOS and NLOS, Bi-LSTM also outperformed RNN, DNN, and SVM. By making use of bi-directional sequential data, Bi-LSTM exhibited strong classification capability, even in through-the-wall scenarios.

Fig. 10. Comparison of classifiers.

5.7. Filters

In the proposed method, we applied DWT to denoise the data. To demonstrate its effectiveness, we compared DWT with two common filters: a median filter (MF) and MAF. DWT outperformed MF and MAF under LOS, under NLOS, and in the through-the-A-wall scenario. In the through-the-A-wall scenario, DWT achieved an accuracy of 95.0%, while MF achieved 86.6% and MAF 88.3%. Under LOS, DWT achieved 98.3%, while MF achieved 90% and MAF 95%. Under NLOS, DWT achieved 96.7%, while MF achieved 93.3% and MAF 81.6%.
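For reference, the two baseline filters can be sketched in a few lines. The window size of 3, the shrunken windows at the edges, and the function names are assumptions; the text does not give the settings used in the comparison.

```python
from statistics import median

def median_filter(signal, window=3):
    """Replace each sample with the median of a centred window (shrunken at the edges)."""
    half = window // 2
    return [median(signal[max(0, i - half): i + half + 1]) for i in range(len(signal))]

def moving_average_filter(signal, window=3):
    """Replace each sample with the mean of a centred window (shrunken at the edges)."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        win = signal[max(0, i - half): i + half + 1]
        out.append(sum(win) / len(win))
    return out

# Hypothetical amplitude wave with a single impulsive spike.
spike = [1, 1, 9, 1, 1]
mf = median_filter(spike)           # the spike is removed entirely
maf = moving_average_filter(spike)  # the spike is spread over its neighbors
```

The example shows the characteristic difference between the two: the median filter suppresses isolated spikes, while the moving average smears them into adjacent samples.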

6. Conclusions

By analyzing the influence of human bodies on WiFi signals, it is feasible to identify human activities, such as standing up, squatting down, jumping, running, walking, and falling. To achieve high recognition accuracy, we exploited the multipath effect and fine-grained CSI to identify activities by extracting and analyzing the amplitude values of the subcarriers in the wireless channel. The raw CSI samples were first denoised by DWT, after which LOF was used to detect and extract the activity segment. The preprocessed activity samples were classified by Bi-LSTM, which extracts the features automatically and exploits the bi-directional sequential features of the samples. A recognition accuracy of 95% or higher was achieved with Bi-LSTM in two different through-the-wall scenarios despite the attenuated wireless signals, outperforming RNN, DNN, and SVM. Experiments also showed that the proposed method achieved high recognition accuracy under LOS and NLOS conditions.

Disclosures

The authors declare no conflicts of interest.