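Wang Da-fang
(Renmin University of China, Beijing 100872)
Valid Assessment of Discourse Entities’ Salience Weight in the Light of Anaphora Resolution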
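A general consensus in anaphora research is that the discourse entity at the center of the addressee's attention, that is, the most salient one, tends to be replaced by a reduced form (in most cases a pronoun) when it recurs in the discourse. A key question in anaphora resolution is therefore: which factors influence the salience of discourse entities and help the addressee assess it effectively? Drawing on a range of examples, this paper examines how five factors (referential distance, information status, viewpoint effect, referential form, and parallel effect) influence the salience of discourse entities, and tests their validity for resolution on a body of English texts.
anaphora resolution; salience; referential distance; parallel effect
Anaphora resolution is the process of retrieving, for an anaphor occurring in natural discourse (mainly a third-person pronoun), the antecedent it refers to (mainly a nominal expression). Many scholars have found that discourse entities with higher salience are more likely to be replaced by reduced forms (such as pronouns) when they recur in discourse (Givón 1983, Ariel 1990, Gundel et al. 1993). From the perspective of anaphora resolution, therefore, whether the salience of different discourse entities can be assessed with reasonable accuracy is the key to successful resolution. With the operability of machine processing of natural language in mind, this paper uses examples to examine the salience of discourse entities along five dimensions (referential distance, information status, viewpoint effect, referential form, and parallel effect) and the role of these dimensions in anaphora resolution.
The notion of salience originates in cognitive psychology's analysis of figure and ground. After Talmy (1975) first introduced it into cognitive linguistics, Langacker (1987) took the groundbreaking step of treating salience as an important component of construal, and Chiarcos et al. (2011) extended the "modes of salience" in human cognition to the level of discourse analysis, broadening its range of application. In reference resolution, some scholars focus on the discourse function of referring expressions and use it as the basis for resolution (Fox 1987, Cristea et al. 2000, Xu 2004), while others attend more to the activation level of the referent in memory and the role of its cognitive status in resolution (Ariel 1990, Chafe 1994, Gundel et al. 1993, Kibrik 1999, Almor 2004). Centering theory, which provides an important theoretical basis for anaphora resolution in computational linguistics, ranks salience by the grammatical role that a referring expression plays in its clause, from high to low: subject >> direct object >> indirect object >> modifier (Grosz et al. 1983, 1995). This method of assessment is convenient for computer processing of natural language, but it is somewhat coarse and to some extent lowers the accuracy of resolution. Since readers must keep revising earlier presuppositions and inferences as they take in new information, this paper treats discourse processing as a dynamic process in which the reader's attentional state is constantly updated. It assesses the salience of different discourse entities more comprehensively and finely along the five dimensions of referential distance, information status, viewpoint effect, referential form, and parallel effect, so as to identify the current center of attention effectively and to interpret referential cohesion within the discourse correctly.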
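The role-based ranking attributed to centering theory above can be illustrated with a minimal sketch. The role labels, the `rank_by_role` helper, and the hand-assigned roles are assumptions made for illustration only; the two candidates are borrowed from example (14) in section 2.3.

```python
# Grammatical-role ranking, highest salience first (subject >> direct object
# >> indirect object >> modifier). The label strings are illustrative.
ROLE_RANK = ["subject", "direct_object", "indirect_object", "modifier"]


def rank_by_role(entities):
    """Sort (name, role) pairs from most to least salient by role alone."""
    return sorted(entities, key=lambda entity: ROLE_RANK.index(entity[1]))


# Hand-assigned roles for the two candidates in example (14): an assumption.
candidates = [("her eldest daughter", "direct_object"), ("Mrs. Bennet", "subject")]
print(rank_by_role(candidates)[0][0])  # Mrs. Bennet
```

Ranking by grammatical role alone puts Mrs. Bennet first, whereas section 2.3 argues that her eldest daughter is the intended referent. This is one instance of the coarseness noted above.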
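2.1 Referential distance: linear distance and rhetorical distance
Referential distance is an important indicator of the salience of potential antecedents in discourse. Existing anaphora resolution schemes compute referential distance in two ways: one traces back linearly over one or more sentences to find a noun phrase that matches the pronoun (Lappin, Leass 1994; Ge et al. 1998; Beaver 2004); the other uses the hierarchical rhetorical structure of discourse to determine the search scope (Fox 1987, Cristea et al. 2000). Using data from the UCREL anaphoric treebank, McEnery et al. (1997) found that 85.64% of anaphors find their antecedents within a span of three sentences and that 94.91% can be correctly resolved within five sentences. Although the linear model is more common in automatic pronoun interpretation because it is simple to implement, the hierarchical model better matches the way information is organized in discourse and therefore has the edge in resolution accuracy.
On how rhetorical distance can be used to judge the salience of discourse entities, the distribution pattern of anaphors in written discourse that Fox (1987) proposed on the basis of Rhetorical Structure Theory is highly instructive. Based on the semantic relations among clauses in rhetorical structure, Fox points out that if the antecedent does not occur in the clause containing the pronoun, it usually occurs in an active clause or a controlling clause of the pronoun's clause (an "active clause" is another functional clause within the smallest functional segment containing the pronoun; a "controlling clause" is a clause that governs that segment). Discourse entities occurring in these two kinds of functional clauses therefore tend to have higher salience. Take pronoun resolution in passage (1) as an example. Ethan_i has relatively high salience because its clause (1①) is an active clause of (1②), and it thus becomes the antecedent of He_j. In the controlling pattern, Ethan_k in (1③) occurs in the controlling clause nearest to the pronoun He_l in (1⑦), so it is more salient than our smiling, bow-tied pediatrician in (1⑤) and is the correct antecedent of He_l.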
(1) [①Ethan_i was never a typical baby.] [②He_j was colicky and allergic, beset from the start by skin rashes and a chronic runny nose.] [③Ethan_k was also late to the milestones first-time parents anxiously wait for.] [④He smiled at nine weeks, crawled at nine months, and walked at 16 months.] [⑤“The late end of normal,” our smiling, bow-tied pediatrician said.] [⑥But as time passed, the list grew:] [⑦He_l had words by two years, but didn’t combine them.] [⑧He didn’t point, didn’t wave bye-bye, and blinked stupefied at a knot of doting adults clustered around him.] [⑨Worse still, he seemed happiest playing alone, dribbling sand through his fingers.] (Boy Wonder, from Reader’s Digest, 2006 (8):138)
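The contrast between the two ways of computing referential distance can be sketched as follows. This is a minimal illustration under stated assumptions: the clause inventory, the active and controlling links for clause ⑦, and the set of candidates agreeing with "He" are all annotated by hand here, and the function names are hypothetical.

```python
def resolve_linear(pronoun_clause, clauses, compatible, window=3):
    """Search the pronoun's clause and up to `window` preceding clauses, nearest first."""
    for cid in range(pronoun_clause, pronoun_clause - window - 1, -1):
        for entity in clauses.get(cid, []):
            if entity in compatible:
                return entity, cid
    return None


def resolve_rhetorical(pronoun_clause, clauses, structure, compatible):
    """Search the pronoun's clause, then its active clauses, then its controlling clauses."""
    links = structure[pronoun_clause]
    for cid in [pronoun_clause] + links["active"] + links["controlling"]:
        for entity in clauses.get(cid, []):
            if entity in compatible:
                return entity, cid
    return None


# Hand-made annotation of part of passage (1); every value is an assumption.
clauses = {3: ["Ethan", "milestones"], 5: ["pediatrician"], 6: ["list"], 7: ["words"]}
structure = {7: {"active": [6], "controlling": [3]}}
male_singular = {"Ethan", "pediatrician"}  # candidates agreeing with "He"

print(resolve_linear(7, clauses, male_singular))                 # ('pediatrician', 5)
print(resolve_rhetorical(7, clauses, structure, male_singular))  # ('Ethan', 3)
```

With this annotation the linear search stops at the pediatrician in clause ⑤, while the search guided by rhetorical structure skips it and reaches Ethan in the controlling clause ③.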
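2.2 Information status: topic and focus
The information structure of discourse refers to the way a speaker, in verbal communication, encodes the information to be conveyed with linguistic signs into a continuous pattern of organization made up of given and new information. Two core concepts figure in the analysis of information structure: topic and focus. The topic is given information and provides the subject matter and background for the discussion; the focus is new information and supplements the topic. In anaphora resolution, because nominal expressions serving as topic or as focus both presuppose the existence of their referents, either may become the center of the reader's attention and hence the referent of a pronoun.
The referent of a pronoun is often the core topic of the current discourse. In sentences without special syntactic marking, the subject is usually taken by default to be the topic, as with Mr. Bingley in example (2); the determination of the preferred center in traditional centering theory rests on exactly this idea. In some marked constructions, readers can identify the topic from the surface syntactic organization. According to Givón's (1983) study of a variety of languages, common syntactic topic-marking devices include the existential-presentative construction (example 3), topicalization (example 4), left dislocation (example 5), right dislocation (example 6), and raising structures (examples 7 and 8). Besides syntactic devices, some expressions of "aboutness" can also mark the topic, such as as for in example (9). English has many similar expressions, for example speaking of X, with regard to X, considering X, and about X. Thanks to these syntactic devices and markers, identifying the sentence topic is fairly operable and clear-cut.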
(2) Mr. Bingley was good-looking and gentlemanlike; he had a pleasant countenance, and easy, unaffected manners. (Jane Austen: Pride and Prejudice)
(3) And there was my aunt, all the time I was dressing, preaching and talking away just as if she was reading a sermon. (Jane Austen: Pride and Prejudice)
(4) In principle, he is now capable of carrying out or determining the accuracy of any computation. Some computations he may not be able to carry out in his head. (Noam Chomsky, 1980:221)
(5) The woman you were just talking to, I don’t know where she went.
(6) Below the waterfall, a whole mass of enormous glass pipes were dangling down into the river from somewhere high up in the ceiling! They really were enormous, those pipes. (R. Dahl: Charlie and the Chocolate Factory)
(7) a. It seems [that light energy will be an important subject of scientific research in the future].
b. Light energy seems to be an important subject of scientific research in the future.
(8) a. Many people believe [that poinsettias are poisonous].
b. Many people believe poinsettias to be poisonous.
(9) ... as for Mr. Hurst, by whom Elizabeth sat, he was an indolent man, who lived only to eat, drink, and play at cards; who, when he found her to prefer a plain dish to a ragout, had nothing to say to her. (Jane Austen: Pride and Prejudice)
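In anaphora resolution, topic nouns are usually regarded as default referents, whereas focus nouns, which introduce new information into the discourse, are often regarded as marked referents and, because they attract the reader's attention more readily, have the edge in salience. Many scholars have found that if the referent of a pronoun is in the focus position of the sentence, readers establish the coreference relation between the two in less time (Cutler, Fodor 1979; Almor 1999). As with topics, readers can rely on marked syntactic structures and forms to locate quickly the discourse entity that serves as focus. The cleft construction is the most common and effective syntactic device in English for highlighting a focus noun. It splits a proposition into two clauses and thereby emphasizes one part of the sentence. Clefts can be further divided, by their formal subject, into it-clefts (example 10) and pseudo-clefts (example 11).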
(10) It was legendary SNL creator Lorne Michaels who recommended him as Letterman’s replacement. (Reader’s Digest, 2006:120)
(11) Out of all the episodes we did, the one that really worked was the one Jeff wrote entirely himself. (Reader’s Digest, 2006(9):140)
(12) By using the Magic Formula you can be certain of gaining attention and focusing it upon the main point of your message. It cautions against indulgence in vapid opening remarks, such as: “I didn’t have time to prepare this talk very well”, or “When your chairman asked me to talk on this subject, I wondered why he selected me”. (Dale Carnegie: The Quick and Easy Way to Effective Speaking)
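In written discourse, besides syntactic devices, some English verbs have a focusing function: the first noun after such a verb tends to receive more of the reader's attention and thus has higher salience. On the basis of empirical research, the computational linguist Mitkov gives a brief list of these indicating verbs, which mainly includes: analyze, assess, check, consider, cover, define, describe, develop, discuss, examine, explore, highlight, identify, illustrate, investigate, outline, present, report, review, show, study, summarize, survey, and synthesize (Mitkov 2002:146). In processing discourse information, these verbs can be treated as focus markers that assist the assessment of salience. In addition, changes in presentation such as italics, boldface, underlining, and capitalization can be regarded as signals from the author that draw the reader's attention to focal information. In example (12), the author makes Magic Formula the focal information of the discourse by capitalizing its initial letters. When the pronoun It at the start of the following sentence is interpreted, Magic Formula is therefore more salient than attention and the main point of your message and wins out among the candidates.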
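The use of indicating verbs as focus markers can be sketched in a few lines. This is an illustration under stated assumptions, not Mitkov's implementation: the input is a hand-tagged list of (lemma, tag) pairs, the tag names are invented for the example, and `focused_nouns` is a hypothetical helper.

```python
# Indicating verbs as listed above (Mitkov 2002:146).
INDICATING_VERBS = {
    "analyze", "assess", "check", "consider", "cover", "define", "describe",
    "develop", "discuss", "examine", "explore", "highlight", "identify",
    "illustrate", "investigate", "outline", "present", "report", "review",
    "show",  # printed as "sow" in the source list; assumed to be a typo
    "study", "summarize", "survey", "synthesize",
}


def focused_nouns(tagged):
    """Return each noun that is the first noun after an indicating verb."""
    focused = []
    awaiting_noun = False
    for lemma, tag in tagged:
        if tag == "VERB":
            # Any verb resets the state; only an indicating verb opens a slot.
            awaiting_noun = lemma in INDICATING_VERBS
        elif tag == "NOUN" and awaiting_noun:
            focused.append(lemma)
            awaiting_noun = False
    return focused


# "This paper examines salience and reports results." (hand-tagged, illustrative)
sentence = [("paper", "NOUN"), ("examine", "VERB"), ("salience", "NOUN"),
            ("report", "VERB"), ("result", "NOUN")]
print(focused_nouns(sentence))  # ['salience', 'result']
```

A fuller system would feed such marks into the salience score of each candidate rather than return them directly; the sketch only shows how the marker is detected.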
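2.3 Viewpoint effect: objective and subjective
Just as a photographer can choose different angles to bring out different parts of a scene, the viewpoint a speaker adopts in describing an event has an important effect on the salience of discourse entities. On the one hand, the speaker can keep some distance from the characters in the discourse and describe the event from an objective viewpoint; on the other hand, the speaker can move closer to one character and narrate more subjectively from that character's viewpoint. In the formal-functional tradition of linguistics, Kuno (1987) uses the concept of empathy to describe the viewpoint and identity the speaker adopts. Specifically, Kuno expresses the speaker's empathy with a discourse entity x, written E(x), as a value ranging from 0 to 1. When the speaker identifies completely with x, E(x) is 1, and this subjective viewpoint lets the speaker see into x's perceptions and inner life; when the speaker has no connection with x, E(x) is 0, which indicates a fully objective narration independent of x. Many scholars have noted the positive correlation between empathy and salience and, through analyses of data from Japanese, Turkish, and other languages, have confirmed the important influence of viewpoint choice on anaphora resolution (Kameyama 1985, Walker et al. 1994, Turan 1995).
The speaker's choice of viewpoint is closely related to changes in the addressee's attentional state. When the speaker adopts a subjective viewpoint, the salience of the character who is the experiencer in the discourse rises to some degree. In interpreting the pronoun she in (13④), the phrase in Elizabeth’s mind makes the character Elizabeth the most salient experiencer in the current discourse; although the other candidate, Mrs. Reynolds, is closer in linear distance, Elizabeth is still the true referent of she. Many expressions of cognition and mental activity, such as feel, appear, remember, interest, consider, think, and cross one’s mind, are effective markers of a subjective viewpoint. Their occurrence often means that the author's identity partly or completely overlaps with that of a character.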
(13) [①There was certainly at this moment, in Elizabeth’s mind, a more gentle sensation towards the original than she had ever felt at the height of their acquaintance.] [②The commendation bestowed on him by Mrs. Reynolds was of no trifling nature.] [③What praise is more valuable than the praise of an intelligent servant?] [④As a brother, a landlord, a master, she considered how many people’s happiness were in his guardianship!] (Jane Austen: Pride and Prejudice)
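Note that English also has perception verbs and phrases (such as see, look at, catch sight of, hear, listen to, and notice) which, although they also reflect the author's subjective viewpoint to some extent, tend to direct the reader's attention to the object that immediately follows, raising the salience of the object rather than the subject. In such cases the focusing effect outweighs the viewpoint effect in resolution. For example, the pronoun her in (14②) has two potential referents: Mrs. Bennet and her eldest daughter. Although Mrs. Bennet is in subject position, the focusing function of the perception verb seen directs the reader's attention to her eldest daughter, which makes that entity more salient and the correct interpretation of her.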
(14) [①Mrs. Bennet had seen her eldest daughter much admired by the Netherfield party.] [②Mr. Bingley had danced with her twice,] [③and she had been distinguished by his sisters.] (Jane Austen: Pride and Prejudice)
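2.4 Referential form: nouns and pronouns
Besides information status and empathy, different referential forms in discourse reflect different degrees of topic continuity and can likewise help us compare and assess the salience of different discourse entities. In general, the more reduced the referential form an author uses in mentioning a discourse entity, the more salient that entity is and the more likely it is to be mentioned again in the following text. This hypothesis has been supported by text-based statistical analysis and by psycholinguistic methods (Chafe 1976; Garrod, Sanford 1983; Givón 1995; Kameyama 1999). Building on his research on topic continuity, Givón states that coding a referent as a zero pronoun or an unstressed pronoun signals that the referent is the currently active topic, that this activation should be maintained, and that incoming information should be filed under the label of that referent (Givón 1992:5). In interpreting the pronoun she in (15③), we find two viable candidates: Miss Bingley and Elizabeth (her). Although Miss Bingley occurs in subject position, Elizabeth is more salient because it appears in the discourse as an object pronoun, and it is the correct antecedent. If the wider context is taken into account, it is easy to see that Elizabeth is the central character of the whole passage: through the topic chain formed by the pronoun her in (15②) and (15④) and the pronoun she in (15③), she stays at the center of the reader's attention throughout, which further corroborates the resolution.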
(15) [①When the clock struck three, Elizabeth felt that she must go, and very unwillingly said so.] [②Miss Bingley offered her (=Elizabeth) the carriage,] [③and she only wanted a little pressing to accept it...] [④when Jane testified such concern in parting with her,] [⑤that Miss Bingley was obliged to convert the offer of the chaise to an invitation to remain at Netherfield for the present.] (Jane Austen: Pride and Prejudice)
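2.5 Parallel effect: continuation and parallelism
Besides factors internal to the clause, semantic relations between clauses also have a profound effect on the salience of discourse entities and on anaphora resolution. In centering theory, Grosz et al. (1983) distinguish three types of transition of the language user's center of attention according to the degree and manner of change in the backward-looking center: continuation, retaining, and shifting. On this basis Brennan et al. (1987) further divide shifting into smooth-shifting and rough-shifting (example (16)). From the standpoint of local discourse coherence, the preference order of transition states is: continuation > retaining > smooth-shifting > rough-shifting.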
(16) a. Brennan drives an Alfa Romeo.
b. She drives too fast. (She = Brennan) (CONTINUATION)
c. Friedman races her on weekends. (her = Brennan) (RETAINING)
d. She often beats her. (She = Friedman, her = Brennan) (SMOOTH-SHIFTING)
d’. She often beats her. (She = Brennan, her = Friedman) (ROUGH-SHIFTING) (Brennan et al. 1987:159)
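The four transition labels in example (16) can be reproduced with a short sketch. The criteria in terms of the backward-looking center (Cb) and the preferred center (Cp) are not spelled out in the text above; they are supplied here from the standard formulation of centering and should be read as an assumption, as should the hand-ranked forward-looking center lists and the function names.

```python
def backward_center(cf_prev, cf_current):
    """Cb: the highest-ranked entity of the previous utterance realized in the current one."""
    for entity in cf_prev:
        if entity in cf_current:
            return entity
    return None


def classify_transition(cb_prev, cb, cp):
    """Label a transition from the previous Cb, the current Cb, and the current Cp."""
    cb_unchanged = cb_prev is None or cb == cb_prev
    if cb_unchanged:
        return "CONTINUATION" if cb == cp else "RETAINING"
    return "SMOOTH-SHIFTING" if cb == cp else "ROUGH-SHIFTING"


def transition(cf_prev, cb_prev, cf_current):
    """Return (label, Cb) for the current utterance; Cp is the first element of its Cf list."""
    cb = backward_center(cf_prev, cf_current)
    return classify_transition(cb_prev, cb, cf_current[0]), cb


# Forward-looking centers of example (16), ranked by grammatical role by hand.
cf_a = ["Brennan", "Alfa Romeo"]
cf_b = ["Brennan"]
cf_c = ["Friedman", "Brennan"]
cf_d = ["Friedman", "Brennan"]        # She = Friedman, her = Brennan
cf_d_prime = ["Brennan", "Friedman"]  # She = Brennan, her = Friedman

label_b, cb_b = transition(cf_a, None, cf_b)
label_c, cb_c = transition(cf_b, cb_b, cf_c)
label_d, _ = transition(cf_c, cb_c, cf_d)
label_d_prime, _ = transition(cf_c, cb_c, cf_d_prime)
print(label_b, label_c, label_d, label_d_prime)
# CONTINUATION RETAINING SMOOTH-SHIFTING ROUGH-SHIFTING
```

Under the preference order given above, a resolver comparing readings d and d’ would keep the smooth-shifting one, in which She refers to Friedman.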
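It is worth noting that although the linear continuation of the center of attention matches readers' expectations, when readers detect a coordinate or contrastive relation between two or more clauses, they naturally apply the parallel relation between the clauses to anaphora resolution. This parallel effect is clearly shown in example (17): because of the clear contrast between the two clauses, readers naturally link the pronoun it with the noun in the corresponding position, so that its referent changes from Prolog in (a) to C in (b).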
(17) a. The chef successfully combined Prolog with C, but he had combined it with Pascal last time.
b. The programmer successfully combined Prolog with C, but he had combined Pascal with it last time.
(Mitkov 2002:43)
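The parallel preference in example (17) amounts to matching the pronoun with the candidate that fills the same position in the first clause. A minimal sketch, assuming hand-assigned role labels and a hypothetical `parallel_antecedent` helper:

```python
def parallel_antecedent(candidates, pronoun_role):
    """Return the first candidate whose role in the first clause matches the pronoun's role."""
    for name, role in candidates:
        if role == pronoun_role:
            return name
    return None


# First clause of (17): "combined Prolog with C" (roles assigned by hand).
first_clause = [("Prolog", "direct_object"), ("C", "with_object")]

print(parallel_antecedent(first_clause, "direct_object"))  # (17a): Prolog
print(parallel_antecedent(first_clause, "with_object"))    # (17b): C
```

The sketch returns no antecedent when no role matches, leaving the decision to the other salience factors.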
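Moreover, the parallel effect can reinforce or override the role of topic continuity in anaphora resolution (Sidner 1981; Kameyama 1986; Gordon, Scearce 1995; Mitkov 2002). In example (18), the marker pair on the one hand ... on the other lets readers easily link Mr. Giuliani in (18①) with the pronoun he in (18②). Since Mr. Giuliani happens to be the subject, the parallel effect here reinforces topic continuity. In example (19), although the wild rose is not the subject, the parallel effect still makes it, rather than the green Whitierleaf, the referent of the pronoun it.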
(18) [①On the one hand, Mr. Giuliani wants to cut into Mr. Dinkins’s credibility.] [②On the other, he seeks to convince voters he’s the new Fiorello LaGuardia -- affable, good-natured and ready to lead New York out of the mess it’s in.] (Wall Street Journal)
(19) a. The green Whitierleaf is most commonly found near the wild rose.
b. The wild violet is found near it too. (Sidner 1981:228)
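To verify the role of these five salience factors in anaphora resolution, the author tested their resolution validity on a body of English texts. To ensure variety, the material was drawn from literary works and newspaper articles: five short stories each by Mark Twain and O. Henry, and five articles each from The New York Times and The Wall Street Journal. Table 1 summarizes the 1,201 third-person pronouns in the corpus. Because possessive and reflexive pronouns are resolved mainly by linear referential distance and syntactic constraints, we focus on the 979 referential nominal personal pronouns among them.
Table 1 Summary of third-person pronouns
Two types of anaphora are particularly hard to resolve: long-distance anaphora and ambiguous anaphora (tough anaphora). Many earlier theoretical models start only from the linear development of discourse and look for the antecedent in the sentence immediately preceding the anaphor; this search scope is too narrow and often hits a bottleneck with long-distance anaphora. With rhetorical distance introduced, of all 979 personal pronouns, 882 have their antecedents in the same clause or in an adjacent active clause, another 71 find their antecedents in their controlling clauses by tracing back two or more clauses, and only 26 have antecedents beyond the scope of the controlling clauses. For ambiguous anaphora, where several candidate antecedents agree with the pronoun in gender and number, the other four factors play a more important role in filtering the candidates. The corpus collected for this study contains 94 cases of ambiguous anaphora. If the salience of discourse entities is assessed by grammatical role alone, the success rate of resolution is 92.4%; if information status, viewpoint effect, referential form, and parallel effect are added to the antecedent filtering system, the success rate rises to 95.2%. A finer and more accurate assessment of salience can thus appreciably improve the accuracy of anaphora resolution.
In anaphora resolution, whether the salience of different discourse entities can be assessed accurately is the key to success. Using the filtering mechanism of the evaluator in the framework of Centering Optimality Theory, this paper has analyzed five factors with an important influence on salience (referential distance, information status, viewpoint effect, referential form, and parallel effect) and verified their role in anaphora resolution. Introducing these factors improves the accuracy of resolution but inevitably reduces operability for natural language processing; how to strike a better balance between the functionality and the operability of the theoretical system needs further study. It is hoped that the theoretical exploration in this study can support larger-scale empirical research on anaphora resolution in natural language processing.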
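The distribution of antecedents reported above can be restated as shares of the 979 pronouns. The counts are taken from the text; the percentages are derived here and do not appear in the source, and the category labels are shortened for the example.

```python
total = 979
counts = {
    "same clause or adjacent active clause": 882,
    "controlling clause": 71,
    "beyond controlling clauses": 26,
}

# The three categories should account for every pronoun examined.
assert sum(counts.values()) == total

shares = {label: round(100 * n / total, 1) for label, n in counts.items()}
for label, share in shares.items():
    print(f"{label}: {share}%")

# Share of pronouns resolved inside active or controlling clauses.
within_structure = round(100 * (882 + 71) / total, 1)
print(f"active or controlling clause: {within_structure}%")
```

The three shares come to 90.1%, 7.3%, and 2.7%, so about 97.3% of the pronouns find their antecedents in an active or controlling clause.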
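Xu, Yulong (许余龙). A Functional-Pragmatic Study of Discourse Anaphora (篇章回指的功能语用探索)[M]. Shanghai: Shanghai Foreign Language Education Press, 2004.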
Almor, A. Noun-phrase Anaphora and Focus: The Informational Load Hypothesis[J]. Psychological Review, 1999(106).
Almor, A. A Computational Investigation of Reference in Production and Comprehension[A]. In: Trueswell, J.C., Tanenhaus, M.K. (Eds.), Approaches to Studying World-situated Language Use: Bridging the Language-as-product and Language-as-action Traditions[C]. Cambridge: MIT Press, 2004.
Ariel, M. Accessing Noun-phrase Antecedents[M]. London: Routledge, 1990.
Beaver, D. The Optimization of Discourse Anaphora[J]. Linguistics and Philosophy, 2004(1).
Brennan, S., Friedman, M., Pollard, C. A Centering Approach to Pronouns[A]. In: Sidner, C. (Ed.), Proceedings of the 25th Annual Meeting of the ACL (ACL’87)[C]. Stanford: Association for Computational Linguistics, 1987.
Chafe, W. L. Givenness, Contrastiveness, Definiteness, Subjects, Topics, and Point of View[A]. In: Li, C. (Ed.), Subject and Topic[C]. New York: Academic Press, 1976.
Chafe, W. L. Discourse, Consciousness, and Time[M]. Chicago: The University of Chicago Press, 1994.
Chiarcos, C., Claus, B., Grabski, M. Salience: Multidisciplinary Perspectives on Its Function in Discourse[M]. Berlin: Mouton de Gruyter, 2011.
Chomsky, N. Rules and Representations[M]. New York: Columbia University Press, 1980.
Cristea, D., Ide, N., Marcu, D., Tablan, V. An Empirical Investigation of the Relation between Discourse Structure and Co-reference[A]. In: Bulmahn, E. (Ed.), Proceedings of the 18th International Conference on Computational Linguistics (COLING)[C]. Saarbrücken: International Committee on Computational Linguistics, 2000.
Cutler, A., Fodor, J.A. Semantic Focus and Sentence Comprehension[J]. Cognition, 1979(7).
Fox, B. Discourse Structure and Anaphora[M]. Cambridge: Cambridge University Press, 1987.
Garrod, S., Sanford, A. Topic Dependent Effects in Language Processing[A]. In: Flores d’Arcais, G., Jarvella, R. (Eds.), The Process of Language Understanding[C]. Chichester: John Wiley, 1983.
Ge, N., Hale, J., Charniak, E. A Statistical Approach to Anaphora Resolution[A]. In: Isabelle, P. (Ed.), Proceedings of the Workshop on Very Large Corpora[C]. Montreal: Association for Computational Linguistics, 1998.
Givón, T. Topic Continuity in Discourse: An Introduction[A]. In: Givón, T. (Ed.), Topic Continuity in Discourse: A Quantitative Cross-language Study[C]. Amsterdam: John Benjamins Publishing Company, 1983.
Givón, T. The Grammar of Referential Coherence as Mental Processing Instructions[J]. Linguistics, 1992(30).
Givón, T. Functionalism and Grammar[M]. Amsterdam: John Benjamins Publishing Company, 1995.
Gordon, P. C., Scearce, K. A. Pronominalization and Discourse Coherence, Discourse Structure and Pronoun Interpretation[J]. Memory & Cognition, 1995(23).
Grosz, B., Joshi, A., Weinstein, S. Providing a Unified Account of Definite Noun Phrases in Discourse[A]. In: Marcus, M. (Ed.), Proceedings of the 21st Annual Meeting of the Association for Computational Linguistics[C]. Stroudsburg: Association for Computational Linguistics, 1983.
Grosz, B., Joshi, A., Weinstein, S. Centering: A Framework for Modeling the Local Coherence of Discourse[J]. Computational Linguistics, 1995(21).
Gundel, J., Hedberg, N., Zacharski, R. Cognitive Status and the Form of Referring Expressions in Discourse[J]. Language, 1993(69).
Kameyama, M. Zero Anaphora: The Case of Japanese[D]. Stanford University, 1985.
Kameyama, M. A Property-sharing Constraint in Centering[A]. In: Bierman, A.W. (Ed.), Proceedings of the 24th Annual Meeting of the Association for Computational Linguistics[C]. New York: Association for Computational Linguistics, 1986.
Kameyama, M. Stressed and Unstressed Pronouns: Complementary Preferences[A]. In: Bosch, P., van der Sandt, R. (Eds.), Focus: Linguistic, Cognitive, and Computational Perspectives[C]. Cambridge: Cambridge University Press, 1999.
Kibrik, A. A. Reference and Working Memory: Cognitive Inferences from Discourse Observations[A]. In: van Hoek, K. (Ed.), Discourse Studies in Cognitive Linguistics[C]. Amsterdam: John Benjamins Publishing Company, 1999.
Kuno, S. Functional Syntax: Anaphora, Discourse and Empathy[M]. Chicago: University of Chicago Press, 1987.
Langacker, R. W. Foundations of Cognitive Grammar: Theoretical Prerequisites[M]. Stanford: Stanford University Press, 1987.
Lappin, S., Leass, H. An Algorithm for Pronominal Anaphora Resolution[J]. Computational Linguistics, 1994(20).
McEnery, A., Tanaka, I., Botley, S. Corpus Annotation and Reference Resolution[A]. In: Cohen, P.R., Wahlster, W. (Eds.), Proceedings of the ACL’97/EACL’97 Workshop on Operational Factors in Practical, Robust Anaphora Resolution[C]. Madrid: Association for Computational Linguistics, 1997.
Mitkov, R. Anaphora Resolution[M]. London: Pearson Education, 2002.
Sidner, C. L. Focusing for Interpretation of Pronouns[J]. Computational Linguistics, 1981(7).
Talmy, L. Figure and Ground in Complex Sentences[A]. In: Cogen, C., Thompson, H., Thurgood, G., Whistler, K., Wright, J. (Eds.), Proceedings of the First Annual Meeting of the Berkeley Linguistics Society[C]. Berkeley: Berkeley Linguistics Society, 1975.
Turan, U. D. Null vs. Overt Subjects in Turkish Discourse: A Centering Analysis[D]. University of Pennsylvania, 1995.
Walker, M.A., Iida, M., Cote, S. Japanese Discourse and the Process of Centering[J]. Computational Linguistics, 1994(20).
Valid Assessment of Discourse Entities’ Salience Weight in the Light of Anaphora Resolution
Wang Da-fang
(Renmin University of China, Beijing 100872, China)
There is a general consensus that the most salient discourse entities, those currently at the center of attention, tend to be referred to with the most reduced referring expressions, in most cases pronouns. This raises a question that is crucial for anaphora resolution: what factors influence a referent’s salience weight, and how can it be assessed validly? This paper examines, with a range of examples, five factors that have been claimed to influence salience: (1) referential distance; (2) information status; (3) viewpoint effect; (4) referential form; and (5) parallel effect. Finally, the effectiveness of the five factors in anaphora resolution is tested on a number of English texts.
anaphora resolution; salience; referential distance; parallel effect
CLC number: H030
Document code: A
Article ID: 1000-0100(2016)05-0068-6
DOI: 10.16263/j.cnki.23-1071/h.2016.05.020
Final manuscript date: 2016-06-17
[Editor in charge: Sun Ying]