Investigating Disfluencies in E-C Sight Translation

2021-01-19 10:48 Junwen Cao
Language and Semiotic Studies, 2020, Issue 4

Junwen Cao

Soochow University, China

Abstract Spoken language is marked by para-verbal and non-verbal dimensions, such as silent pauses, hesitation, and intonation. Fluency is considered an essential parameter for evaluating the quality of oral output. Sight translation is a hybrid of written translation and oral interpreting; however, fluency in sight translation remains under-researched. This paper examines the disfluencies in the sight translation of professional and novice translators working from English to Chinese. It adopts a statistical approach to compare the silent pauses and hesitation fillers in the delivery of professional and student participants. According to the independent samples t-test results, the two groups differed significantly in the occurrence and ratio of hesitation. It can be inferred that professional translators are more adept at coping tactics, such as pausing for several seconds rather than inserting hesitation fillers unconsciously. Furthermore, the corpus-assisted analysis suggested a higher lexical density among professional participants. A think-aloud approach was then used to reveal the causes of the disfluencies. Drawing upon the Speech Production Theory, the paper identified the six most influential factors: vocabulary, emotion, syntactic category, speaking habit, lexical ambiguity, and topical difficulty. It is hoped that translation and interpreting studies will not be confined to linguistic dimensions. Paralinguistic signs should have their place in the domain of translatology. Semiotics of translation, which views translation as a pure semiotic act, thereby provides a valuable perspective for translation and interpreting studies.

Keywords: translation process and product, E-C sight translation, disfluency, Speech Production Theory, lexical density

1. Introduction

Translation is a dynamic transfer of verbal and non-verbal signs. From the perspective of semiotics, translation is studied as a semiotic act that involves the transition from one semiotic system (source language) to another (target language). “A new trend has emerged within semiotics, namely that of audio visual translation. Scholars have employed semiotics as a tool for the analysis of audiovisual translation, because audiovisual texts are multimodal as they require the combined deployment of a wide range of semiotic sources or modes for their production and development” (Kourdis, 2015, p. 311).

Sight Translation (ST) involves multimodal communication: the visual signs of the source language system and the vocal signs of the target language system. As “a hybrid form of language mediation that partially resembles both the translation and the interpreting process” (Chen, 2015, p. 144), it can be defined as the reading of a text from the source language into the target language, consisting of reading comprehension, information processing and oral delivery. ST is the most commonly used form in settings such as courtrooms, and is often considered a necessary step before training in simultaneous interpreting, in that simultaneous interpreters have the luxury of reading a speech script or PowerPoint slides from time to time. When a speaker delivers the speech with the help of a written transcript, which the sight translator has access to while listening to the original speech, the act of sight translation becomes sight interpreting (Qin & He, 2009; Pöchhacker, 2004) or sight interpretation (Lambert, 2004; see Chen, 2015, p. 144).

The past three decades have witnessed remarkable development in translation studies. Translation studies have focused much attention on the products of translation; however, Bell argues that “advances in translation theory can only be achieved through a study of process of translation” (Bell, 1991, p. 22), and a descriptive rather than a prescriptive approach should be adopted in investigating the process (Bassnett-McGuire, 1980, p. 37; see Note 1). With the development of psycholinguistics and cognitive linguistics, scholars have used varied approaches, such as Think-Aloud Protocols (TAPs) (Li, 2008; Bayer-Hohenwarter, 2013), eye-tracking (Jakobsen & Jensen, 2008; Pavlović & Jensen, 2009), ERP (Ma, 2018), and fMRI (Mouthon et al., 2020), to look at what happens in the “black box”, i.e., the cognitive pattern of the human brain, during translational activities.

Spoken language has its para-verbal and non-verbal dimensions. While in ST the message is conveyed by text, interpreting involves not only linguistic elements, but also intonation, voice quality, changes in pitch and loudness, pauses and non-linguistic elements (Tissi, 2000). This paper attempts to examine the disfluencies of English-Chinese sight translation by presenting a quantitative analysis of the silent pauses and hesitation fillers of professional and student translators. AntConc, a corpus-assisted tool, served to analyze the end products of E-C sight translation, supplemented by the think-aloud approach, which revealed the causes of the disfluencies.

2. Process vs. Product

The literature on ST research is rather scant. Scholars such as Weber (1990), Moser-Mercer (1995), Angelelli (1999), Agrifoglio (2004), Pöchhacker (2004), and Hong (2010) describe ST as neglected or unexplored (see Li, 2014). Although recent years have witnessed a rise in ST research, it is still an under-researched area in the translation and interpreting community. ST studies mainly hinge on its relations with other interpreting modes (Agrifoglio, 2004), pedagogy (Fatollahi, 2016), assessment (Chen & Ko, 2010), cognitive pattern (Jakobsen & Jensen, 2008), and reading pattern (Dragsted & Hansen, 2009). Modern technologies such as eye-tracking have been used to study the ST process (Dragsted & Hansen, 2009). Shreve, Lacruz and Angelone (2011, p. 93) believe that the analysis of speech disfluencies occurring during sight translation provides key information about cognitive activities associated with sight translation, such as visual interference.

In the past three decades, Chinese scholars have employed the ontological approach, pedagogical approach and cognitive approach to carry out ST studies. According to Deng Wei (2017, p. 98), the ST ontological studies possessed the largest scale and longest duration; however, cognitive approaches were least adopted by scholars. The non-empirical studies account for a large proportion of the ST studies. Much interest has been given to its end product instead of its process, in other words, the “content” rather than the “package” (way of delivery). In recent years, due to the rapid development of interpreting studies and cognitive linguistics, empirical contributions have been on the rise. However, there still exist the following issues worth exploration:

First, more emphasis should be laid on the internal integration of the translation discipline. Translatology is strongly interdisciplinary, drawing upon expertise from linguistics, social studies, psychology, semiotics and cognitive sciences. Nevertheless, academia has not given due attention to the internal integration of the discipline. Investigations into translation pedagogy, process and product are conducted separately without a holistic view. Few articles have shed light on the three aspects in an integrated way.

Secondly, ST studies are to be broadened. Although a growing number of scholars have taken a great interest in sight translation/interpretation, the studies on it still lag behind those on other types of translational activities in terms of quantity and quality. Sight translation/interpretation is frequently adopted on varied occasions, for instance, “a conference setting where live speeches are delivered, translators may be given the speech texts in advance, allowing them to perform sight interpretation”, or specifically simultaneous interpreting with text (SIT) or consecutive interpreting with text (CIT) (Chen, 2015, p. 144). It is therefore of practical significance to delve into both the process and the product of sight translation/interpretation.

Lastly, cognitive approaches to translation studies are to be diversified. Various methods and technologies have been used for translation studies, including questionnaires, TAPs, eye-tracking, ERP, and fMRI. Scholars such as McDonald and Carpenter (1981), Tommola and Niemi (1986), Tommola and Hyönä (1990), Sjørup (2008), and Pavlović and Jensen (2009) use eye-tracking technology to study ST. However, few studies have cross-checked the findings of one method against those of another. Zheng Binghan (2008) proposes a triangulation module, which mainly consists of TAPs and Translog, for process-oriented translation studies. The methodology of triangulation can contribute to the validity of research findings by reducing the deficiencies of a single approach. Bayer-Hohenwarter (2013) triangulates translational creativity scores by using both product and process data (think-aloud data), and reports convergent results as well as certain methodological risks and benefits of the combined approach.

3. ST as an Act of Speech Production

Speech Production Theory provides important implications for interpreting studies, as it focuses on issues in language production, such as fluency, hesitation and self-repair, which are rarely addressed in traditional interpreting studies. Caroll (2008, p. 193) divides the process of speech production into two stages: 1. conceptualization and formulation of linguistic plans; 2. implementation of linguistic plans by articulating and self-monitoring. For Fromkin (1973), we alternate between planning speech and implementing our plans. Henderson (1966) finds that all the participants showed a cycle of hesitation and fluency, although the ratio of speech to silence varied among speakers. Duez (1982) explores the frequency, duration and distribution of pauses in French political interviews, casual interviews, and carefully prepared political speeches. Corley and Stewart (2008) have reviewed the production and comprehension of fillers such as “um” and “uh” to determine whether they are “words” with “meanings”.

Hesitations during speech production are caused by various factors. Humans tend to hesitate more when they are uncertain. Self-monitoring and correction may come after the hesitation and pause. After detecting an error in speech, we may interrupt ourselves by uttering editing expressions such as uh, sorry, I mean, and the like, and finally repair the utterance. Interpreting is an instant act of language planning and formulation, in which fluency has gradually become a topic that receives much attention (Zeng, 2002; Xu, 2010; Dai, 2011). Western scholars often use the notion “Disfluency” to study fluency in the process of interpreting, regarding it as a useful tool to probe into the psychological and cognitive mechanism of interpreting behavior. However, Chinese scholars prefer to use the notion “Fluency” to discuss the same issue (Wang, 2016), because “不流畅性” (non-fluency) does not strike one as an idiomatic Chinese expression. The impediments to fluency often include blank, pause, repair, omission, hesitation and substitution. Nonetheless, this paper will not discuss repair, omission and substitution, which do not necessarily lead to disfluencies, but can be strategies deliberately used by translators/interpreters. In what follows, the paper will focus on two obstacles to achieving fluency in sight translation/interpretation, namely silent pauses and hesitation fillers.

4. Experiment

4.1 Participants

The participants comprised 15 students from a comprehensive university and 6 professional translators. The students, aged 23-25 years, were in the first year of a master’s program in interpreting. They had taken courses in consecutive interpreting and written translation between English and Chinese for more than 10 months. However, they had received no training in sight translation or simultaneous interpreting prior to the experiment. The professional participants, aged 26-40 years, all of whom held a master’s degree, had performed simultaneous interpreting and consecutive interpreting for about 100 meetings, conferences and forums before the present study.

4.2 Materials and method

The material used in the experiment was an English text on environmental protection. The text was an excerpt from a conference address delivered at an international forum on emission reduction and environmental protection (247 words). As there was no time constraint, the participants could sight-translate the text into Chinese at their normal speed. It is to be noted that, as no live speech was delivered, the term used to describe the translational activity in this experiment is “sight translation”, i.e., the oral translation of a written text, according to the definition given by Chen (2015, p. 144).

The think-aloud protocols originated from the oral report in psychological tests, which requires the subjects to speak out their thoughts when performing a certain task. Ericsson and Simon (1984) find that the information processing activity in working memory can be reported orally unless it is an extremely automatic action. Li (2011) insists that the synchronized introspection method, which is often adopted in written translation research (Miao, 2005), can fail to reflect the mental activities of interpreters, because it is difficult for them to shift between interpreting and reporting. Instead, the general introspection and instant introspection methods are highly recommended. The former requires the subject to make oral reports after the completion of an interpreting task, whereas in the latter, the subject makes an instant oral report during the pauses after segments, sentences or paragraphs.

This paper holds that TAPs are more applicable and reliable for ST experiments than for simultaneous interpreting experiments. The attention-splitting burden in ST is much lighter than that in simultaneous interpreting. Sight translators do not need to cope with listening, reading and speaking simultaneously. Therefore, it is more convenient for them to review the ST strategies and process. To preserve the continuity and integrity of the ST activity, the study adopted the retrospective TA approach, requiring the participants to recall the process and mental activities after the completion of the ST task. The retrospections of participants were recorded and transcribed, followed by some questions concerning the ST process and product.

4.3 Procedure

The experiment was divided into three stages: a training period, an ST period and a TA period. First, the participants were informed of the requirements and details of the ST experiment. The researcher assured the participants that the collected data would be kept confidential and used only for the purpose of the experiment. Then the researcher explained the operating procedure in detail and showed two examples. The participants first sight-translated a short text, and then did the TAPs. They were given suggestions on the problems that might occur in the ST process. The experiment then began. The data collected include the ST recordings and TA data. Afterwards, all the recordings were transcribed carefully. A pause is an interruption of oral delivery, including the discontinuity of semantic coherence and grammatical structure. Following previous studies (Duez, 1982; Tissi, 2000), grammatical pauses longer than 1.4 seconds and semantic pauses longer than 0.56 seconds were recorded as negative pauses. Hesitation fillers refer to hesitating expressions such as uh and um. For the purpose of the study, the number of pauses longer than 1 second and the number of hesitation fillers were counted. The statistics were saved as .txt files in Unicode format.
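The counting rules described above can be illustrated with a minimal Python sketch. It is not the tooling used in the study: the function names, the representation of pauses as durations in seconds, and the assumption that fillers appear in the transcript as the tokens "uh" and "um" are all hypothetical and serve only to make the thresholds concrete.

```python
import re

# Thresholds quoted in the text, in seconds.
GRAMMATICAL_THRESHOLD = 1.4
SEMANTIC_THRESHOLD = 0.56

def is_negative_pause(kind, duration):
    """Return True if a pause exceeds the threshold for its kind."""
    if kind == "grammatical":
        return duration > GRAMMATICAL_THRESHOLD
    if kind == "semantic":
        return duration > SEMANTIC_THRESHOLD
    raise ValueError(f"unknown pause kind: {kind}")

def count_long_pauses(durations, minimum=1.0):
    """Count pauses strictly longer than `minimum` seconds."""
    return sum(1 for d in durations if d > minimum)

# Assumption: fillers are transcribed as the standalone tokens "uh" / "um".
FILLER_PATTERN = re.compile(r"\b(?:uh|um)\b", re.IGNORECASE)

def count_fillers(transcript):
    """Count standalone hesitation fillers in a transcript string."""
    return len(FILLER_PATTERN.findall(transcript))
```

The word-boundary match keeps words that merely begin with a filler, such as "umbrella", from being counted. A transcript in Chinese would need a different filler inventory.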

5. Results

The statistics of SP4 and SP6 were discarded because these participants failed to translate the main idea of the text. The effectiveness of TA data is vital to the study as well. According to the standards set by Guo (2007, p. 17), the accumulated silent time should not account for over 10% of the total experiment time. Therefore, the statistics of SP14 were discarded as well. In total, the recordings of 18 participants (12 students and 6 professionals) were collected, transcribed and analyzed. The students were coded as 0, and the professionals as 1. The silent pauses, hesitation fillers and total words of each participant’s delivery were counted. The silent pause ratio means the percentage of silent pauses in the delivery word count, whereas the hesitation ratio refers to the percentage of hesitations in the delivery word count. An independent samples t-test was conducted to compare the silent pause ratio, hesitation ratio and delivery word count. The statistics are as follows:

5.1 Independent t-test

Table 1. Professional and student participants’ silent pauses and hesitation fillers

Table 2. Comparison between professional and student translators’ silent pauses and hesitation

According to the independent samples t-test results, there was a significant difference in the delivery word count between student and professional participants (t=-2.68, df=16, p=0.019<0.05); the word count of student participants’ delivery was significantly lower than that of professional participants (MD=-41.5). There was also a significant difference in hesitation occurrence between student and professional participants (t=2.67, df=16, p=0.018<0.05); the number of hesitations among student participants exceeded that among professional participants (MD=1.67). There was also a significant difference in the hesitation ratio between student and professional participants (t=2.89, p=0.013<0.05); the hesitation ratio among student participants was higher (MD=0.005106). However, there was no significant difference in silent pause occurrence (t=1.371, df=16, p=0.189) or ratio (t=1.71, df=16, p=0.107). The transcripts of the ST recordings were then imported into AntConc to analyze the lexical density.
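For readers unfamiliar with the test, the following Python sketch shows how an independent samples t statistic can be computed. It assumes the equal-variance (pooled) form, which is consistent with the reported df=16 for groups of 12 and 6 but is not confirmed by the text. The function name and the sample values are hypothetical placeholders, not the study's data.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Return (t, df) for an equal-variance independent samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical hesitation ratios, for illustration only (12 students, 6 professionals).
students = [0.012, 0.009, 0.015, 0.011, 0.008, 0.013,
            0.010, 0.014, 0.009, 0.012, 0.011, 0.010]
professionals = [0.006, 0.004, 0.007, 0.005, 0.008, 0.006]

t_value, degrees_of_freedom = pooled_t(students, professionals)
print(f"t = {t_value:.2f}, df = {degrees_of_freedom}")
```

The mean difference (MD) reported in the text corresponds to the numerator of the statistic, that is, the student mean minus the professional mean.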

5.2 Lexical density

The term “lexical density” was first proposed by Ure (1971), whose formula for lexical density is “Lexical Density = Content Words / Total Words × 100%”. Lexical density is an important index for measuring the information distribution in a text. Adjectives, adverbs, nouns and verbs were identified as content words in this study. The number of concordance hits in the student participants’ delivery was 1663, and the number of total word tokens was 2563; according to the formula, the lexical density of student participants was 64.9%. The number of concordance hits in the professional participants’ delivery was 932, and the number of total word tokens was 1402, giving a lexical density of 66.5%. The lexical density of professionals was thus slightly higher than that of students; in other words, the professionals’ delivery had a higher percentage of content words and a lower percentage of function words (such as particles and conjunctions), suggesting a more fluent flow of information output among professional translators.
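The arithmetic behind the two percentages can be reproduced with a short Python sketch of Ure's formula. The function name is illustrative; the counts are the concordance hits and word tokens reported above.

```python
def lexical_density(content_words, total_words):
    """Lexical density as a percentage: content words / total words * 100."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return content_words / total_words * 100

# Counts reported in the text (concordance hits, total word tokens).
student_density = lexical_density(1663, 2563)
professional_density = lexical_density(932, 1402)

print(f"students: {student_density:.1f}%")            # students: 64.9%
print(f"professionals: {professional_density:.1f}%")  # professionals: 66.5%
```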

Figure 1. Concordance hits of student translators’ ST delivery

Figure 2. Concordance hits of professional translators’ ST delivery

Figure 3. Word tokens of student translators’ ST delivery

Figure 4. Word tokens of professional translators’ ST delivery

In all, the independent samples t-test and lexical density were used to examine disfluencies of ST delivery. In this experiment, no significant difference was revealed in silent pause occurrence and ratio between the professional and student participants. However, the average word count of professional participants was significantly higher than that of student participants, with lower hesitation occurrence and ratio. The lexical density of professional participants was higher than that of student participants, suggesting that the professionals used more content words, and rendered more fluently. It can be inferred that professional translators are more apt to use some coping tactics; for instance, they would pause and stay silent for several seconds, rather than insert hesitation fillers unconsciously, when encountering difficult lexicon and syntax. By doing so, they can reduce pet phrases and unnecessary repetitions, so as to achieve higher fluency.

6. Discussion

Given the professionals’ lower hesitation ratio and higher lexical density, what might be the underlying reasons for their greater fluency? I adopted the retrospective TAPs, requiring the participants to recall the reasons for the silent pauses and hesitation fillers. Here are some examples:

Table 3. TA data for silent pauses and hesitation fillers

Based on the TA data, the factors leading to the silent pauses and hesitation fillers were categorized according to different stages of speech production (Caroll, 2008, p. 209). Due to the features of ST, three more factors were found, namely “external influence”, “emotion”, and “speaking habit”. In a real ST circumstance, external influence (e.g. noise, disturbance), emotion (e.g. nervousness and excitement) and speaking habit (e.g. psellism), can find a way to affect the speech production as well.

Table 4. Influential factors on speech production

According to Table 4, the influential factors on the fluency of E-C sight translation can be ranked by percentage as follows: 1. Vocabulary (14%); 2. Emotion (14%); 3. Syntactic Categories (12.4%); 4. Speaking Habit (10.1%); 5. Lexical Ambiguity (10.1%); 6. Topical Difficulty (9.6%); 7. Morphological Complexity (9%); 8. External Influence (6.7%); 9. Cognitive Load (5.6%); 10. Semantic Priming (5.6%); 11. Phonological Factors (2.8%); 12. Gestures (0). As “Gesture” did not occur in the process of ST, it was removed from the table. Besides, the E-C ST experiment did not involve listening comprehension or time constraints, and accordingly, cognitive load was not as critical as expected. In this study, the fluency of E-C sight translation was mainly dependent upon the language proficiency, interpreting skills, psychological quality and speaking habits of the participants, as well as topical difficulty. Another finding was that the participants produced the most non-fluent expressions because of difficulties in the first stage of speech production, i.e., the conceptualizing period. According to the speech production model proposed by Kormos (2006), vocabulary and topical difficulty fall into the domain of mental lexicon and long-term memory. Lexical ambiguity, syntactic category and morphological complexity can be attributed to grammatical decoding, whilst speaking habit is a matter of the articulator. Negative changes in the influential factors tend to obstruct speech production (in this study, the ST delivery) and eventually result in disfluencies.
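As a quick consistency check, the percentages quoted above can be sorted in descending order with a few lines of Python. The dictionary below simply restates the figures from the text; the variable names are illustrative, and ties keep the order in which the factors are listed.

```python
# Percentages as quoted in the text ("Gestures", at 0, is omitted).
factor_share = {
    "Vocabulary": 14.0,
    "Emotion": 14.0,
    "Syntactic Categories": 12.4,
    "Speaking Habit": 10.1,
    "Lexical Ambiguity": 10.1,
    "Topical Difficulty": 9.6,
    "External Influence": 6.7,
    "Morphological Complexity": 9.0,
    "Cognitive Load": 5.6,
    "Semantic Priming": 5.6,
    "Phonological Factors": 2.8,
}

# sorted() is stable, so tied factors keep their listed order.
ranked = sorted(factor_share.items(), key=lambda item: item[1], reverse=True)

for rank, (factor, share) in enumerate(ranked, start=1):
    print(f"{rank}. {factor} ({share}%)")
```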

7. Conclusion

The paper carried out an E-C sight translation experiment with 12 student and 6 professional translators, and investigated the disfluencies in their delivery by drawing upon the Speech Production Theory. The independent samples t-test suggested that the number and percentage of hesitation fillers of professional participants were significantly lower than those of student participants. According to AntConc, the lexical density of professional participants was also higher than that of student participants. Nouns, verbs, adjectives, and adverbs were identified as content words, as opposed to function words, in this study. The percentage of content words in the total word tokens is termed lexical density, which is often regarded as an important indicator of the fluency of speech. Both the independent samples t-test and AntConc indicated greater fluency among professional translators.

After the E-C sight translation experiment, the TA approach was adopted to explore the factors affecting the fluency of E-C sight translation. Based on the retrospective reports by student translators on the non-fluent expressions, it can be inferred that professional translators were more capable of adopting coping tactics. Pauses can be exploited in a tactical way. Confronted with complex lexicons or syntactic structures, professional translators would pause for a few seconds, rather than uttering hesitation fillers such as pet phrases or unnecessary repetitions. By doing so, they can achieve higher fluency and a better effect on the audience. The paper found the six most influential factors for fluency in E-C sight translation to be vocabulary, emotion, syntactic category, speaking habit, lexical ambiguity and topical difficulty. Based on the ranking, it can be deduced that the obstruction of conceptualization, i.e., the first stage in the process of speech production, has the most considerable impact on the fluency of E-C sight translation.

Scholars have long focused on the product of translation. Yet, due attention has not been paid to the process of translation. In ST, the oral output of the translator is delivered in spoken language, consisting of both para-verbal and non-verbal signs. Silent pauses and hesitation fillers fall into the category of para-verbal signs. It is hoped that translation and interpreting studies will not be confined to the linguistic dimension, because para-linguistic signs, such as pauses, hesitation, tone, intonation, speed of speaking, voice quality, changes in pitch and loudness, and other non-linguistic elements, should have their place in the domain of translatology. Semiotics of translation, which views translation as a pure semiotic act, thereby provides a valuable perspective for translation and interpreting studies.

Note

1 This view was also endorsed by Li Defeng, Professor of Translation Studies and Director of the Centre for Studies of Translation, Interpreting and Cognition (CSTIC) of the University of Macau, at the 2nd Workshop on Corpus-Assisted Approaches to Translation Studies held at Nanjing Agricultural University in July 2018.

Acknowledgements

This paper is part of the research project “A TAPs-Based Cognitive Approach to Interpreting Studies” (2016SJB740029) funded by Jiangsu Provincial Department of Education. The author would like to acknowledge with gratitude the comments on this paper from the anonymous reviewers.