End-to-end aspect category sentiment analysis based on type graph convolutional networks

2023-09-12 07:30:04 SHAO Qing, ZHANG Wenshuang, WANG Shaojun
High Technology Letters, 2023, Issue 3

SHAO Qing (邵清), ZHANG Wenshuang, WANG Shaojun

(School of Optoelectronic Information and Intelligent Engineering, University of Shanghai for Science and Technology, Shanghai 200093, P.R.China)

Abstract Most existing research on aspect category sentiment analysis assumes that the aspects are given before sentiment is extracted, and this pipeline method is prone to error accumulation. In addition, existing uses of graph convolutional neural networks for aspect category sentiment analysis do not fully exploit the dependency type information between words, which limits feature extraction. This paper proposes an end-to-end aspect category sentiment analysis (ETESA) model based on type graph convolutional networks. The model uses the bidirectional encoder representation from transformers (BERT) pretraining model to obtain aspect category and word vectors that carry contextual dynamic semantic information, which addresses the problem of polysemy. When the graph convolutional network (GCN) is used for feature extraction, fusing the word vectors with an initialized tensor of dependency types yields importance values for the different dependency types and enhances the text feature representation. By transforming aspect category and sentiment pair extraction into multiple single-label classification problems, aspect category and sentiment can be extracted simultaneously in an end-to-end way, which avoids error accumulation. Experiments on three public datasets show that the ETESA model achieves higher Precision, Recall and F1 values, demonstrating the effectiveness of the model.

Key words: aspect-based sentiment analysis (ABSA), bidirectional encoder representation from transformers (BERT), type graph convolutional network (TGCN), aspect category and sentiment pair extraction

0 Introduction

With the rapid development of Internet shopping platforms, integrated online and offline marketing strategies have gradually become a new trend, and research on sentiment analysis and opinion mining[1] has also made great progress, becoming an active field of significant value in natural language processing. Traditional document-level sentiment analysis[2] and sentence-level sentiment analysis[3-4] assume in advance that a document or sentence carries only one sentiment tendency, namely positive, neutral or negative. However, because the sentiments expressed in texts are complex and varied, such coarse-grained analysis cannot effectively extract all the sentiments attached to different aspects of a sentence. Therefore, more fine-grained aspect-based sentiment analysis (ABSA)[5-6] has attracted growing attention from researchers.

Aspect-based sentiment analysis usually consists of two subtasks: aspect term sentiment analysis (ATSA)[7] and aspect category sentiment analysis (ACSA)[8]. The ATSA task focuses on detecting the sentiment polarity of explicit aspect terms in a sentence, whereas aspects in real review texts are often implicit and do not appear directly in the sentence. The number of aspect categories in a domain is usually fixed, and aspect category annotations are often easier to obtain than aspect term annotations. Therefore, aspect category sentiment analysis is better suited to real-world scenarios, and it is the main subject of this paper.

Aspect category sentiment analysis includes two subtasks: aspect category extraction (ACE) and sentiment analysis for the extracted aspect categories. Most current research treats these two subtasks separately and chains them in a pipeline, which causes errors to accumulate and limits model performance. This paper transforms aspect category sentiment analysis into the problem of aspect-sentiment pair extraction (ASPE), trained in an end-to-end manner.

In recent years, methods based on deep learning have been widely used in ABSA tasks and have achieved good results owing to their ability to learn and extract features autonomously. Currently, model frameworks that combine recurrent neural networks with attention mechanisms are the most common in aspect-level sentiment analysis research. These models can handle long-range text dependencies, mine the sequential and semantic information in the text, and assign different weights to different words, which allows them to capture the emotional tendencies in sentences more accurately. However, most of these studies consider neither the dependency relations nor the dependency types among the words in a sentence, so the results of syntactic analysis are underused and sentiment analysis performance remains unsatisfactory.

Therefore, a graph convolutional network (GCN) with joined word dependency types is proposed to enhance feature extraction by combining the dependency types between words in a text with the dependencies themselves. For the aspect-sentiment pair extraction task, this paper proposes a method that combines every aspect category in the dataset with the sentence as input and then outputs the sentiment polarity of each aspect category individually. The output labels are converted into a four-category problem, with the categories positive, neutral, negative, and none. For example, to extract all aspect-sentiment pairs contained in the sentence 'while it was large and a bit noisy, the drinks were fantastic, and the food was superb', each aspect category in this dataset (service, price, ambience, food, miscellaneous) is fed into the network model jointly with the sentence, and the model outputs a sentiment polarity or the no-sentiment label for each aspect category. The focal loss function is used to address the imbalance in the dataset caused by this input format. This approach makes end-to-end aspect category sentiment analysis (ETESA) possible and avoids the error accumulation produced by the pipelined method. The main contributions of this paper are as follows.
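The conversion described above can be illustrated with a minimal sketch. The helper name `build_samples`, the plain-string input template, and the example annotation are assumptions made for illustration; they are not taken from the authors' code or from the dataset files.

```python
# Hypothetical helper; the label set follows the four-way scheme in the text.
LABELS = ("positive", "neutral", "negative", "none")


def build_samples(sentence, categories, gold):
    """Pair every aspect category with the sentence and attach one label.

    `gold` maps the categories expressed in the sentence to their polarity;
    every other category receives the label "none".
    """
    samples = []
    for category in categories:
        label = gold.get(category, "none")
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        # Plain-string rendering of the '[CLS] + a_i + [SEP] + S + [SEP]' input.
        text = f"[CLS] {category} [SEP] {sentence} [SEP]"
        samples.append((text, label))
    return samples


categories = ["service", "price", "ambience", "food", "miscellaneous"]
sentence = ("while it was large and a bit noisy, the drinks were fantastic, "
            "and the food was superb")
# Illustrative annotation only (assumed, not read from the dataset).
gold = {"ambience": "negative", "food": "positive"}
samples = build_samples(sentence, categories, gold)
```

One sentence thus yields five single-label samples, three of which carry the label "none" in this example, which is the source of the class imbalance mentioned above.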

(1) Through data preprocessing, aspect-sentiment pair extraction is transformed into multiple single-label classification problems, and the loss values obtained from aspect extraction and sentiment extraction are fed forward simultaneously. This allows the aspect category and its corresponding sentiment polarity to be judged at the same time and avoids the error accumulation caused by the traditional pipeline approach.

(2) The dependency type matrix is constructed in the type graph convolutional network (TGCN), and the word vector and the dependency type tensor are fused to distinguish the importance of each dependency type. Combined with the binary weight matrix, this enhances the text feature representation during semantic analysis.

(3) To realize a secondary interaction between aspect categories and contextual semantic information, the association between aspect categories and contextual semantics is first obtained through the bidirectional gated recurrent unit (BiGRU) network. On this basis, an attention module is designed to capture the sentiment words in the context that are closely related to the aspect, thereby describing more complex contextual relationships of aspect words.

The rest of this paper is organized as follows. Section 1 introduces related work. Section 2 describes the proposed ETESA model in detail. Section 3 discusses the experimental details and results. Section 4 summarizes the paper and outlines future directions.

1 Related work

This section reviews the state of research on aspect term sentiment analysis, aspect category sentiment analysis, and multi-task joint learning, all of which fall under aspect-level sentiment analysis.

1.1 Aspect term sentiment analysis

For aspect term sentiment analysis, many researchers use a basic framework of neural networks and attention mechanisms for sentiment recognition, and model the interaction between aspect terms and context semantics to extract the sentiment words related to each term. For example, Ma et al.[9] proposed an interactive attention network based on long short term memory (LSTM) and attention, which performs sentiment analysis by interactively learning the target and context representations through the attention mechanism to obtain their respective feature weight representations. Tian et al.[10] proposed a TGCN model for ABSA built on a graph convolutional network over dependency types. For edges of different dependency types, attention assigns different weights so that the graph neural network learns from both edges and nodes to obtain context semantic information, and an attention mechanism also learns the weights of the different TGCN layers. Wang et al.[11] proposed a relational graph attention network model, which obtains an aspect-oriented dependency structure by pruning the dependency graph and thereby addresses the incorrect matching of aspect information and sentiment information caused by multiple aspects in a sentence. Wang et al.[12] proposed a syntactic information-aware aspect-level sentiment classification model, which uses a memory network that combines an attention mechanism, part-of-speech (POS), position, text semantics and aspect information, and uses a GCN to improve classification performance.

1.2 Aspect category sentiment analysis

For aspect category sentiment analysis, since aspect categories are abstractions of terms and do not necessarily appear in the text, it is essential to focus on category-specific sentiment information in the context. For example, Wang et al.[8] proposed an attention-based LSTM network model that applies the aspect embedding twice consecutively, which models the interdependence between the aspect information and the context input more fully and thereby improves sentiment analysis. Wang et al.[13] proposed a context-aware network model, which addresses polysemy by introducing a multi-dimensional attention mechanism and combines sentence semantic information with aspect category semantic information to strengthen the interaction between aspect words and sentiment words, further improving sentiment classification performance. Xue and Li[14] proposed a network model based on gating mechanisms and convolution, which uses a Tanh gate and a ReLU gate to process the word embedding and the aspect embedding from the upper convolutional neural network (CNN) layer to achieve sentiment analysis of a given aspect; the model can be computed in parallel and is more efficient. Cai et al.[15] proposed a hierarchical graph convolutional network model to capture the relationship between the two subtasks of aspect category detection and sentiment recognition. First, a GCN models the internal relationships among multiple categories, and then a second GCN identifies the sentiment of the categories extracted by the previous layer.

1.3 Multi-task joint learning

In current research, most scholars focus on predicting the sentiment of a given aspect term or category. However, a large share of real-life reviews do not specify aspect information, so the aspect information must be extracted first and sentiment analysis performed afterwards. This way of classifying sentiment easily leads to error accumulation: if the aspect extraction is wrong, the sentiment extracted for that aspect will also be wrong. Therefore, multi-task joint extraction frameworks, which are better suited to real-world scenarios, have been proposed recently. For example, Wang et al.[16] proposed a hierarchical multi-task learning framework for end-to-end aspect-level sentiment analysis. It first performs aspect term extraction and sentiment word detection, then performs aspect sentiment detection with the help of an attention mechanism, and finally uses LSTM and a conditional random field (CRF) to extract aspect sentiment; end-to-end aspect-level sentiment analysis is realized by letting the upper-level main task make full use of the information from the lower-level auxiliary tasks. Zeng et al.[17] proposed an end-to-end network model based on joint learning, which uses a CNN to obtain word-level and character-level feature information as the embedding for aspect category detection and aspect emotion recognition. Fu et al.[18] proposed a multi-perspective attention framework based on a two-layer LSTM, which transforms aspect category sentiment analysis into an aspect category sentiment pair extraction task, uses the two LSTM layers to extract contextual semantics and aspect context semantics respectively, and finally extracts the sentiment of each category through the attention mechanism. To handle the data imbalance caused by this transformation of category sentiment, they proposed a dedicated loss function that improves model performance.

To sum up, research on aspect term sentiment analysis and aspect category sentiment analysis depends heavily on well-labeled aspect datasets. However, in real application scenarios, the aspect categories and terms of each sentence are often unknown and can only be obtained through labeling, which costs considerable human and material resources. In practice, it is often necessary to extract both the aspect categories and their corresponding sentiment information. Therefore, when the aspect categories are not given in advance in the dataset, studying the aspect category sentiment pair extraction task via multi-task joint learning is challenging. On this basis, this paper proposes an end-to-end aspect category sentiment analysis model based on the type graph convolutional network. A GCN that incorporates dependency types is used to enhance feature extraction, and the input format is converted and focal loss is applied to address the data imbalance problem. This framework provides a more accurate modeling approach for the aspect category sentiment extraction task.

2 Model description

This section introduces in detail the end-to-end aspect category sentiment analysis model ETESA based on the type graph convolutional network. The structure of the model is shown in Fig.1. The ETESA model consists of an embedding layer, a BiGRU encoding layer, a type graph convolutional network layer, an aspect attention layer, and an output layer. The model uses the bidirectional encoder representation from transformers (BERT) pre-training model to vectorize aspects and words, and then uses the BiGRU network layer to initially connect the aspect and context semantic information and learn the features shared between aspects and contexts. The learned context information is then sent to the TGCN network layer, which further strengthens the extraction of contextual features. Next, the attention mechanism gives high weight to aspect-related sentiment words, and finally a fully connected (FC) layer followed by softmax performs sentiment analysis.

2.1 BERT-based embedding layer

Fig.1 Architecture of ETESA model

In this paper, the BERT pre-training model is used to implement the vector representation of words, with an input format that differs from the one commonly used in aspect category sentiment analysis. All aspect categories contained in the dataset are taken as input together with the sentence, and the sentiment label of each aspect category is then output separately, which turns the task into multiple single-label classification problems. The BERT model is based on the Encoder structure of the bidirectional Transformer for feature extraction and can distinguish texts with polysemy. To take part in training, text data must first be vectorized into a trainable format, and this paper uses the pre-training model BERT to vectorize the dataset. For a dataset containing $m$ aspect categories $AC=\{a_i\}_{i=1}^{m}$ and a given sentence $S=\{w_1,w_2,\dots,w_n\}$ consisting of $n$ words, '[CLS] + $a_i$ + [SEP] + $S$ + [SEP]' is used as the input of the BERT embedding layer, where $a_i$ represents the aspect category word, $S$ represents a piece of data, and [CLS] and [SEP] respectively represent the start character and the aspect or sentence end character of a new input to the BERT model. After encoding by the pre-trained model, the embedded representation of a sentence is obtained as $I_e\in\mathbb{R}^{(n+3+x_a)\times d_w}$, where $n$ is the number of words in the sentence, the number of input characters is fixed to 3, $x_a$ is the number of aspect category words, and $d_w$ is the dimension of the BERT model embedding. Since the character embedding does not participate in the calculation of the next layer, the word vector input to the next layer is represented by $E_w\in\mathbb{R}^{(n+x_a)\times d_w}$. Each word in the sentence can obtain its corresponding part-of-speech tag through the AllenNLP tool, but the aspect category has no part-of-speech information. To keep the feature dimensions balanced, a virtual tag $p_T$ is added to represent the part-of-speech of the aspect category. The part-of-speech embedding is then represented as $E_p\in\mathbb{R}^{(n+x_a)\times d_p}$, where $d_p$ is the dimension of the part-of-speech embedding. The feature vector $E_i\in\mathbb{R}^{(n+x_a)\times(d_w+d_p)}$ obtained by splicing the word embedding and part-of-speech embedding vectors is sent to the BiGRU layer.
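The dimension bookkeeping in this subsection can be checked with a short sketch. The function name `embedding_shapes` is hypothetical, the defaults use the 768-dimensional BERT embedding and the 332-dimensional part-of-speech embedding reported in the experiment setup, and the word counts are idealized in that they ignore sub-word splitting by the BERT tokenizer.

```python
def embedding_shapes(n_words, n_aspect_words, d_w=768, d_p=332):
    """Return the (rows, columns) shape of each tensor named in the text."""
    special = 3  # [CLS] plus the two [SEP] tokens
    rows = n_words + n_aspect_words
    return {
        "I_e": (rows + special, d_w),  # full BERT output
        "E_w": (rows, d_w),            # special-token rows dropped
        "E_p": (rows, d_p),            # POS embedding, virtual tag for the aspect
        "E_i": (rows, d_w + d_p),      # concatenation fed to the BiGRU layer
    }


# 'Delicious sushi and poor service' (5 words) paired with the category 'food'.
shapes = embedding_shapes(n_words=5, n_aspect_words=1)
```

Under these assumptions the BiGRU layer receives a 6 x 1100 matrix for this sentence and category.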

2.2 BiGRU encoding layer

As a variant of the recurrent neural network (RNN), the gated recurrent unit (GRU) can capture long-distance semantic information together with sequential information, and it effectively alleviates the problems of gradient explosion and gradient vanishing during forward and backward propagation. In addition, because it has only update gates and reset gates, a GRU can match the performance of an LSTM while shortening training time to a certain extent. Therefore, this paper uses the BiGRU network to initially associate aspect categories with sentences and to learn the hidden layer output vector containing aspect and context information. The formula is as follows.
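A standard BiGRU formulation consistent with this description can be sketched as follows, assuming $e_t$ denotes the $t$-th row of $E_i$. The symbols $\overrightarrow{h_t}$, $\overleftarrow{h_t}$ and $h_t$ are illustrative and may differ from the authors' notation.

```latex
\begin{aligned}
\overrightarrow{h_t} &= \overrightarrow{\mathrm{GRU}}\big(e_t,\ \overrightarrow{h_{t-1}}\big),\\
\overleftarrow{h_t} &= \overleftarrow{\mathrm{GRU}}\big(e_t,\ \overleftarrow{h_{t+1}}\big),\\
h_t &= \big[\overrightarrow{h_t};\ \overleftarrow{h_t}\big], \qquad t = 1,\dots,n+x_a
\end{aligned}
```

The concatenated states $h_t$ form the hidden layer output that is passed on to the TGCN layer.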

2.3 Type graph convolutional network layer

Graph convolutional neural networks can efficiently extract features from graph structure information. Deriving graph structure information from syntactic dependency features and then applying it to a GCN for aspect-level sentiment analysis has become a common approach in recent years. Ref.[10] proposed a TGCN network for aspect term sentiment analysis. At present, no one has applied dependency type information to aspect category sentiment analysis. Therefore, this paper uses a type graph convolution network to enhance feature extraction for aspect category sentiment analysis. First, the AllenNLP tool is used to obtain the dependencies between words, and the adjacency matrix $Adj=\{a_{xy}\}_{n\times n}$ and relation matrix $Rel=\{r_{xy}\}_{n\times n}$ are established from this dependency information. The two matrices are generated by the following rules: if there is a dependency between word $x$ and word $y$, let $a_{xy}=1$ and $r_{xy}=r_i$ (where 0
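The construction of the two matrices can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `build_matrices` is hypothetical, edges are treated as undirected, entries without a dependency are left at 0, type indices start at 1, and the parse below is hand-written rather than produced by AllenNLP.

```python
def build_matrices(n, dependencies, type_to_id):
    """Build the adjacency matrix Adj and the relation matrix Rel.

    `dependencies` is a list of (head_index, dependent_index, type) triples.
    """
    adj = [[0] * n for _ in range(n)]
    rel = [[0] * n for _ in range(n)]
    for head, dep, dep_type in dependencies:
        type_id = type_to_id[dep_type]
        # Assumption: the relation is recorded in both directions.
        for x, y in ((head, dep), (dep, head)):
            adj[x][y] = 1
            rel[x][y] = type_id
    return adj, rel


# Words: 0 Delicious, 1 sushi, 2 and, 3 poor, 4 service (illustrative parse).
type_to_id = {"amod": 1, "conj": 2, "cc": 3}
dependencies = [(1, 0, "amod"), (1, 4, "conj"), (4, 2, "cc"), (4, 3, "amod")]
adj, rel = build_matrices(5, dependencies, type_to_id)
```

Each non-zero entry of `rel` can then be used as an index into a trainable dependency type embedding, which is what lets the network weight edges by their type.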

Finally,the hidden layer output of thelth layer of the wordxis calculated as
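One plausible form of such a type-weighted graph convolution, sketched in the spirit of the type-aware GCN of Ref.[10], is given below. Apart from $a_{xy}$ and $r_{xy}$, every symbol is an assumption made for illustration: $e_{r_{xy}}$ is the embedding of the dependency type $r_{xy}$, $\oplus$ is concatenation, $W^{(l)}$ and $b^{(l)}$ are the trainable parameters of layer $l$, and $\sigma$ is a non-linear activation. It should not be read as the authors' exact equation.

```latex
\begin{aligned}
s_{xy}^{(l)} &= \big(h_x^{(l-1)} \oplus e_{r_{xy}}\big)\cdot\big(h_y^{(l-1)} \oplus e_{r_{xy}}\big),\\
p_{xy}^{(l)} &= \frac{a_{xy}\,\exp\big(s_{xy}^{(l)}\big)}{\sum_{k=1}^{n} a_{xk}\,\exp\big(s_{xk}^{(l)}\big)},\\
h_x^{(l)} &= \sigma\Big(\sum_{y=1}^{n} p_{xy}^{(l)}\big(W^{(l)} h_y^{(l-1)} + b^{(l)}\big)\Big)
\end{aligned}
```

In this form the binary entries $a_{xy}$ mask out word pairs without a dependency, while the type embeddings shift the attention weights $p_{xy}^{(l)}$ toward the more informative dependency types.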

2.4 Aspect attention layer

The attention mechanism can quickly extract important features of data, so it has been widely used in natural language processing. In the ACSA task, the model should pay more attention to words that are semantically associated with the aspect category words. For example, in the sentence 'Delicious sushi and poor service', 'Delicious' modifies 'sushi', while 'poor' modifies 'service'. Therefore, when judging the sentiment tendency of the category 'food', 'Delicious' should be given a higher weight. The main purpose of this layer is to capture the dependencies between aspect category words and contextual semantics and to enhance aspect-related feature extraction. This paper uses the bilinear attention score function as
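A generic bilinear attention can be sketched as follows, assuming the score of the $i$-th context vector $h_i$ against an aspect vector $a$ is $h_i^{\top} W a$ and that the scores are normalized with softmax. The function name, argument names and toy numbers are hypothetical, and the authors' exact score function may differ.

```python
import math


def bilinear_attention(hidden, aspect, weight):
    """Bilinear attention over context vectors.

    hidden: n vectors of size d_h; aspect: vector of size d_a;
    weight: d_h x d_a matrix W. Returns (attention weights, context vector).
    """
    # W a, a vector of size d_h
    projected = [sum(w * a for w, a in zip(row, aspect)) for row in weight]
    # score_i = h_i . (W a)
    scores = [sum(h * p for h, p in zip(vec, projected)) for vec in hidden]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Attention-weighted sum of the context vectors
    context = [sum(alpha * vec[k] for alpha, vec in zip(alphas, hidden))
               for k in range(len(hidden[0]))]
    return alphas, context


hidden = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy context vectors
aspect = [1.0, 0.0]                            # toy aspect vector
weight = [[1.0, 0.0], [0.0, 1.0]]              # identity W for illustration
alphas, context = bilinear_attention(hidden, aspect, weight)
```

With these toy inputs the first context vector, which is aligned with the aspect vector, receives the largest weight, mirroring how 'Delicious' should dominate for the category 'food'.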

2.5 Output layer

The output feature vector $Z(k_j,q)\in\mathbb{R}^{(d_{hid}+d_a)}$ of the previous layer is passed through the fully connected layer, whose output tensor has size 4, and the final sentiment prediction is then obtained through the softmax function. The specific formula is as follows.
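A form consistent with the symbols defined for this layer is given here as a sketch, assuming $Z$ abbreviates the feature vector from the previous layer and $c$ indexes the $d_c$ output categories:

```latex
\hat{o} = \mathrm{softmax}(W Z + b), \qquad
\hat{y} = \arg\max_{c \in \{1,\dots,d_c\}} \hat{o}_c
```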

where $W\in\mathbb{R}^{d_c\times(d_{hid}+d_a)}$ and $b\in\mathbb{R}^{d_c}$ are the learnable parameter matrix and bias, respectively; $d_c$ is the number of categories; $\hat{o}$ is the set of predicted probabilities over the four sentiment values; and $\hat{y}$ is the final prediction. In this paper, sentiment analysis is performed on the converted aspect-category sentiment pairs. Although this improves the interpretability of the model, it also causes data imbalance. Therefore, the focal loss function is used to mitigate the loss of model performance caused by data imbalance. The specific expression is

where $\hat{o}$ is the output, a set of probabilities over the four sentiment values, and $\alpha_t$ and $\gamma$ are adjustable factors.
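As a minimal sketch, the widely used focal loss form $FL(p_t) = -\alpha_t (1-p_t)^{\gamma}\log p_t$ can be written as below, where $p_t$ is the probability that $\hat{o}$ assigns to the true class. Treating $\alpha_t$ as a single scalar and defaulting $\gamma$ to 2 are assumptions made for illustration; the function name is hypothetical.

```python
import math


def focal_loss(probs, target, alpha_t=0.5, gamma=2.0):
    """Focal loss for one sample.

    probs: predicted probabilities over the four classes;
    target: index of the true class.
    """
    p_t = max(probs[target], 1e-12)  # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct sample contributes far less than a poorly predicted one.
easy = focal_loss([0.90, 0.05, 0.03, 0.02], target=0)
hard = focal_loss([0.10, 0.20, 0.30, 0.40], target=0)
```

Setting `gamma=0` and `alpha_t=1` reduces the expression to ordinary cross-entropy, which makes the role of the two factors easy to see: $\gamma$ down-weights easy samples and $\alpha_t$ rescales the loss.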

3 Experiment

3.1 Data set

This paper conducts a series of experiments with the ETESA model on three public datasets to verify its interpretability and effectiveness.

The first is the SemEval2014 restaurant dataset (Rest14). This dataset comes from the restaurant review data in SemEval2014 task 4[19], which is commonly used for ACSA tasks. It includes 5 aspect categories: 'service', 'price', 'food', 'ambience' and 'miscellaneous'. It contains 4 sentiment categories: 'positive', 'negative', 'neutral' and 'conflict'. Since there are conflicting label items, this experiment removes the data with the 'conflict' label.

The other two datasets are the SemEval2015 and SemEval2016 restaurant datasets (Rest15 and Rest16). These two datasets come from the restaurant review datasets in SemEval2015 task 12[20] and SemEval2016 task 5[21], respectively, with 12 and 13 aspect categories each, and both contain 3 sentiment labels: 'positive', 'negative' and 'neutral'. The detailed statistics of the three datasets are shown in Table 1.

Table 1 Statistics of the three datasets

Dataset  Split  Aspect  Sentences  Positive  Negative  Neutral  Total
Rest14   train  5       2885       2179      839       500      3518
Rest14   test   5       767        657       222       94       973
Rest15   train  13      1117       1083      369       52       1504
Rest15   test   13      580        413       329       45       787
Rest16   train  12      1702       1501      697       100      2298
Rest16   test   12      585        513       195       42       750

3.2 Experiment setup and evaluation metrics

This paper adopts the PyTorch deep learning framework and uses the 768-dimensional dynamic pre-training model BERT provided by Devlin et al.[22] as the word embedding method. The part-of-speech embedding vector dimension is 332, and the BiGRU hidden layer has 250 units. In addition, the AllenNLP tool is used to obtain information such as dependency syntactic relations and part-of-speech tags, and an example syntactic dependency tree is visualized in Fig.2. This paper uses a 2-layer TGCN and sets dropout to 0.5 for the BiGRU. The Adam optimizer is used with a learning rate of 2e-5, a batch size of 8, and 30 epochs, and the reported result is the best value among all epochs. In the experiments, Precision, Recall and F1 values are used as the evaluation metrics.
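The three metrics can be computed over sets of extracted pairs as in the sketch below. Scoring at the level of (sentence id, category, sentiment) triples with micro-averaging is an assumption, since the averaging scheme is not specified here, and the function names and toy data are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(predicted, gold):
    """Precision, recall and F1 over sets of extracted items."""
    predicted, gold = set(predicted), set(gold)
    true_positive = len(predicted & gold)
    precision = true_positive / len(predicted) if predicted else 0.0
    recall = true_positive / len(gold) if gold else 0.0
    return precision, recall, f1_score(precision, recall)


# Toy example: items are (sentence id, category, sentiment).
gold = {(0, "food", "positive"), (0, "service", "negative"), (1, "price", "neutral")}
predicted = {(0, "food", "positive"), (0, "service", "positive"), (1, "price", "neutral")}
p, r, f1 = precision_recall_f1(predicted, gold)
```

For the ACE task the same function can be applied to (sentence id, category) pairs. As a consistency check, `f1_score(94.96, 88.39)` rounds to 91.56, which matches the F1 value reported for ETESA on the REST14 ACE task.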

Fig.2 An example of syntactic dependency tree

3.3 Models comparison

To comprehensively evaluate the performance of the proposed ETESA model on the ACSA task, this paper selects a series of representative baseline models for comparison, including:

TAN[23] proposes a model network based on GRU encoding and topic attention, which detects different aspect categories by attending to distinct parts of a sentence based on various topics.

MTNA[24] proposes a multi-task learning model based on LSTM and CNN networks that simultaneously addresses both aspect category and aspect term extraction.

BERT-pair-NLI-B[25] constructs auxiliary sentences based on aspects, converts ABSA into a sentence pair classification task, and implements sentiment analysis by fine-tuning the pre-trained BERT model.

TAS-BERT[26] proposes a target-aspect-sentiment triplet joint detection model based on the pre-trained BERT model.

MEJD[27] proposes a model based on BERT, LSTM and a graph attention convolutional network (GACN) to capture the dependencies between aspects and sentences, and realizes a target-aspect-sentiment triplet multi-task sentiment analysis model.

3.4 Experiment results

To verify the effectiveness of the proposed method, this paper compares it with related research from recent years on each dataset. As can be seen from Table 2, on the ASPE task ETESA achieves the best F1, Precision and Recall values on the REST14 and REST16 datasets, with F1 values 0.41 and 0.70 percentage points higher than those of the MEJD model, while its results on the REST15 dataset are higher than those of MEJD but slightly lower than those of the TAS-BERT model. Analysis of the results shows that adding dependency type information is what makes this model outperform MEJD. At the same time, because the distribution of sentiment polarities differs considerably between the training set and the test set of REST15, the F1 value of ETESA on this dataset is slightly lower than that of the BERT-based TAS-BERT model. On the ACE task, the experimental results show that ETESA achieves the best F1 value among the compared models. Comparing the results across the three datasets, the model performs better on REST14 and REST16. Table 1 suggests that this is mainly because REST15 contains too little data for the model to learn good features. Therefore, the performance of the model on this dataset is inferior to its performance on the other datasets.

Table 2 Results on the ACE and ASPE tasks (P: Precision, R: Recall)

Task  Model            REST14                 REST15                 REST16
                       P      R      F1       P      R      F1       P      R      F1
ACE   BERT-pair-NLI-B  93.91  88.70  91.23    75.63  66.51  70.78    87.04  74.44  80.25
ACE   TAN              -      -      90.61    -      -      -        -      -      78.38
ACE   MTNA             -      -      88.91    -      -      65.97    -      -      76.42
ACE   TAS-BERT         92.68  88.15  90.36    82.32  71.17  76.34    88.92  75.34  81.57
ACE   MEJD             94.29  88.16  91.12    82.74  72.07  77.04    89.07  78.80  83.62
ACE   ETESA            94.96  88.39  91.56    83.48  74.08  78.50    90.93  78.52  84.27
ASPE  BERT-pair-NLI-B  82.72  78.85  80.74    66.73  60.88  63.67    77.51  68.45  72.70
ASPE  TAS-BERT         84.79  80.20  82.43    71.55  65.70  68.50    78.96  69.84  74.12
ASPE  MEJD             85.82  80.72  83.19    71.28  64.64  67.80    79.73  71.17  75.21
ASPE  ETESA            86.33  81.04  83.60    71.93  64.86  68.21    80.40  71.90  75.91

Overall, ETESA achieves relatively state-of-the-art results on the ASPE task, demonstrating the effectiveness of the end-to-end type graph convolutional network approach.

3.5 Ablation experiment

To investigate the effectiveness of the different modules in the model (part-of-speech, BiGRU, type-dependent), ablation experiments were carried out and compared with the full model. Table 3 shows the experimental results using different modules on the 3 datasets. The full model proposed in this paper has the highest F1 value, and removing any one module degrades performance on all three datasets. The comparison of models 1 and 4 shows that the dependency type information from dependency parsing improves model performance to a certain extent, because different dependency types in a sentence differ in importance. Giving them different weights through attention improves the capture of aspects and their related sentiment words. The results of models 2 and 4 show that the BiGRU module is crucial to this task, as it extracts contextual semantic information and provides an initial interaction between aspect categories and sentences. The results of models 3 and 4 on all datasets show that adding part-of-speech information improves the aspect-sentiment pair extraction task more than the other modules do, which indicates that part-of-speech has a greater impact on the sentiment classification task. To sum up, part-of-speech, BiGRU, and type-dependent all play an important role in the aspect-sentiment pair extraction task, and each module makes its own contribution to the model.


3.6 Parametric study

This section explores the impact of different hyperparameters on model performance, including the number of TGCN layers and the value of $\alpha_t$ in the loss function.

Fig.3 shows the experimental results for 0 to 4 TGCN layers. The model obtains the optimal F1 value with 2 layers. With 0 layers, that is, without the TGCN layer, the result is the worst, which demonstrates the effectiveness of the TGCN layer. With 1 or 3 layers, the model performs worse than with 2 layers. A possible reason is that with 1 layer the model does not capture the dependencies between all words, whereas with 3 layers the model overfits, so performance on the test set drops.

Fig.3 The influence of different TGCN layers

Fig.4 shows the study of the hyperparameter $\alpha_t$, which balances samples in the loss function; three common values at intervals of 0.25 were chosen for the experiments. The results show that the model performs best when $\alpha_t$ is 0.5. A likely reason is that samples are drawn by weighted random sampling during training, so sentiment-bearing and non-sentiment samples are already evenly represented in each batch, and an evenly balanced value of 0.5 therefore works best.
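The weighted random sampling mentioned above can be sketched with the standard library. Grouping samples into "sentiment" and "none" and weighting each sample by the inverse of its group size is an assumed scheme chosen for illustration; the function names are hypothetical and the batch size of 8 follows the experiment setup.

```python
import random
from collections import Counter


def group_of(label):
    """Map the four labels to two groups: with or without sentiment."""
    return "none" if label == "none" else "sentiment"


def balanced_weights(labels):
    """Weight each sample by the inverse size of its group."""
    counts = Counter(group_of(label) for label in labels)
    return [1.0 / counts[group_of(label)] for label in labels]


def sample_batch(samples, labels, batch_size=8, seed=0):
    """Draw a batch with replacement so both groups are equally likely."""
    rng = random.Random(seed)
    return rng.choices(samples, weights=balanced_weights(labels), k=batch_size)


labels = ["positive", "none", "none", "none", "negative", "none"]
samples = list(range(len(labels)))  # stand-ins for the real inputs
weights = balanced_weights(labels)
batch = sample_batch(samples, labels)
```

With these weights each group has the same total sampling probability, so a batch contains sentiment-bearing and "none" samples in roughly equal numbers on average.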

Fig.4 The influence of balanced sample scale factor αt

4 Conclusions

Current research on aspect category sentiment analysis does not fully use syntactic dependency information, lacks effective mechanisms to distinguish the importance of different dependency types, and mostly relies on a pipeline method that is prone to error accumulation. To address these problems, this paper designs and proposes an end-to-end aspect category sentiment analysis (ETESA) model based on type graph convolutional networks. The model converts the three-category sentiment problem into a four-category problem. First, the pre-trained BERT model is used to obtain word embedding information, which addresses the problem of polysemy in the text. Then, the BiGRU network is used to preliminarily combine the aspect and context semantic features. After that, contextual features are further extracted through the TGCN network, and finally the attention mechanism is used to obtain the sentiment related to the aspect category. The comparative experimental results on three public datasets show that the proposed ETESA model can effectively improve the performance of aspect category sentiment analysis.

Recent related research on sentiment analysis[26-27] achieves more fine-grained sentiment analysis by jointly extracting triples of aspect categories, aspect sentiment, and aspect terms, compared with the pairs extracted in this paper. Therefore, future work will further study joint triplet extraction so that the model can jointly extract more tasks.