Wang Tao
Sichuan University
Abstract: Based on a comprehensive construction of a linguistic ontological category system, the basic categorical units for nouns and verbs in Chinese can be defined. Furthermore, logical symbols can be applied to the combination of categories and to the calculation of lexical meaning, thereby externalizing and formalizing the calculation procedure. This could contribute to Chinese computational linguistics and international Chinese teaching.
Keywords: informatization of lexical explanations, uncertainty, philosophical categories, logical calculation
The accuracy and calculation of word meaning have always been a focus of research in Chinese linguistics. From “morpheme analysis” to “sememe comparison” and “semantic feature analysis,” scholars have been exploring how to use more accurate, objective and scientific methods to analyze the meaning of words. However, constrained by traditional research methods and the characteristics of Chinese linguistics, the precision of interpretation is difficult to advance further. Therefore, it is necessary to absorb interdisciplinary achievements from a broader perspective and apply them to the study of word meaning, especially to the exploration of new theories and methods of interpretation.
In recent years, many new paradigms of word interpretation have emerged in the study of linguistics, which are based on the view of breaking through the traditional framework of word meaning. WordNet, Generative Lexicon Theory and Ontol-MT are the most representative.
WordNet is an English dictionary based on cognitive linguistics designed by psychologists, linguists and computer engineers at Princeton University. In WordNet, nouns, verbs, adjectives and adverbs are organized into a network of synonym sets, each synonym set representing a basic semantic concept and linked to others by various relationships.
Generative Lexicon Theory was put forward by the American scholar James Pustejovsky in the 1990s. Drawing on the category system of Aristotle's “Four Causes” Theory, he established a basic framework for interpretation by setting up four qualia roles, namely the formal role, the constitutive role, the agentive role and the telic role (Pustejovsky, 1996, p. 145).
When introducing Generative Lexicon Theory into China, Yuan Yulin made a comprehensive analysis of Chinese words on the basis of its theoretical framework, and re-proposed a method for constructing deep lexical structure from the perspective of qualia roles. He expanded the four roles in Pustejovsky's original qualia structure to ten.
Chinese computational linguist Feng Zhiwei claimed that if we abstract the knowledge of all fields into a system of concepts, we can create a glossary that represents this concept system and clearly describes word meaning. When the relationships among the words in this glossary reflect a consensus among experts in the field, a knowledge ontology is constituted. Feng created a knowledge ontology system based on Chinese-Japanese machine translation: Ontol-MT. In this system, Chinese words and Japanese words are decomposed into concepts to form a deep “conceptual semantic layer.”
The above research on word meaning has broken through the framework of linguistics itself and gone deep into the level of ontological category to explore new interpretive models, pushing the research on word meaning to a new theoretical height. In general, two essential steps need to be carefully discussed in the interpretation of words at the level of “ontological category”: A. establishing a category system; and B. constructing a computational method through categories.
In recent years, with the rapid development of the internet, “knowledge graph” technology has come into being. In essence, a knowledge graph is a kind of semantic network that emphasizes semantic retrieval; its key techniques extract entities, attributes and relationships from text. The goal is to find the right machine representation of everything in order to record knowledge about the world.
Aristotle's “Four Causes” Theory is a philosophical category theory that is often applied tacitly because of its conciseness and high generality. But do the categories reflected in the lexical system of a language correspond exactly to Aristotle's “Four Causes” Theory? Answering this question requires in-depth investigation and analysis. We believe that it is necessary to comb through the representative category theories in the history of philosophy and to critically assimilate their various viewpoints on ontology. Combined with the “ontological presupposition” of the modern Chinese lexical system itself, this paper exhaustively investigates and analyzes a core lexical group to construct a category system that is more comprehensive and more appropriate for the Chinese lexical system.
To develop this category system, it is necessary to explore the internal logical relations among various categories and construct a category computing system of Chinese word meanings from the bottom up by using corresponding logical operation symbols. In the ideal state, the calculation results for each word in the system would be represented by the combination of “category” and “operational symbol.”
The introduction of logical symbols can improve the accuracy of the interpretation of words. More importantly, it can enhance the objectivity of interpretation on the basis of ontological categories, and is expected to build a consensus platform for interpretation. Such a platform could serve machine processing of natural language and international Chinese language teaching, two major applied fields of contemporary Chinese language research.
From the perspective of linguistic research, it is necessary to define words accurately. The essential attribute of language is to transmit information. From the perspective of information theory, the process of interpreting words is the process of decoding the information that words convey. Information theory holds that the transmission of information from source to destination is bound to be accompanied by interference, so the complete transmission of information exists only in theory. From this point of view, it is impossible to have a perfect definition for a word. However, because language is a kind of information, it is entirely possible to apply the methods of information theory to the study of language and further improve the accuracy of interpretation at the level of words.
The earliest attention to the concept of “information” in modern society comes from the field of communication. The representatives in this field are Norbert Wiener and Claude Elwood Shannon, who stated “the purpose of communication is to eliminate the indeterminations of the recipient” and that “the amount of information is equal to the amount of uncertainty eliminated.”
From the viewpoint of information theory, it is reasonable to believe that one of the main functions of a sentence (which expresses a complete meaning) is to eliminate the many uncertain meanings of its words (which refer to or describe objects), so that information is precise and consistent between communicators.
Since both a sentence and a word can be regarded as units of information transmission, what kind of information (i.e. object) does the word meaning refer to or describe once it is fixed in the sentence? What are the uncertainties it removes in the process?
To answer these questions, we need to understand that after hundreds of thousands of years of evolution, human language is not only a communication tool, but also an important tool humans use to know and understand the world. People perceive external information using language, classify everything one by one through words, and then convey and deepen each other's understanding of the external world through verbal and written communication.
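The idea that a sentence removes uncertainty about a word can be illustrated with Shannon entropy. The following is only a minimal sketch: the number of senses and their probabilities are hypothetical values chosen for illustration and are not taken from any corpus or from the analysis in this paper.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical: a word with four equally likely senses when seen in isolation.
senses_in_isolation = [0.25, 0.25, 0.25, 0.25]
# Hypothetical: a sentence context that leaves only two senses plausible.
senses_in_sentence = [0.5, 0.5, 0.0, 0.0]

uncertainty_before = entropy_bits(senses_in_isolation)
uncertainty_after = entropy_bits(senses_in_sentence)
# The information supplied by the context is the uncertainty it eliminates.
information_gained = uncertainty_before - uncertainty_after

print(uncertainty_before, uncertainty_after, information_gained)
```

Under these assumed probabilities, the context supplies one bit of information, which is exactly the amount of uncertainty it removes.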
Even from the internal system of linguistics, we can find many ideas for “eliminating uncertainty in the use of language” by looking at our predecessors. The only difference lies in the different methods of expression.
For example, in Ma Shi Wen Tong (Mr. Ma's Compleat Grammar), a foundational work of Chinese grammar, the definition of “sentence” is “whether the words match and the meaning of the words has been complete,” which clearly expresses the idea that “a sentence is a linguistic unit with complete meaning.” In addition, for the use of words in sentences, the book states, “A word has no fixed meaning, so there is no fixed category, which should be identified from the context.” The statement can be understood in another way: context helps eliminate the uncertainty of words.
For another example, Gottlob Frege, a forerunner of the philosophy of language, also proposed the linguistic principle of “eliminating uncertainty.” Frege explicitly rejected the linguistic idea, once prevalent in his day, that the basic unit of meaning should be a noun or a name, and that by virtue of the existence of something to which the noun refers (the object), a sentence containing the noun has meaning. On the contrary, Frege drew on the method of logic and claimed that nouns in a sentence only play a functional role and can be replaced. A sentence without a noun may be incomplete, unsaturated, but “What really determines the meaning of a sentence is the sentence itself, which is the logical structure of the predicate.”
Therefore, we can at least suppose that the words of Chinese may be examined through the various categories of concepts explained in ontology. When we examine semantic information, we must go beyond the language itself and analyze it at the level of ontology.
To accurately describe the information conveyed by the language, we first need to describe the information of lexical meaning: investigating the information which corresponds, and is borne and encapsulated in the lexicon; investigating where the information comes from, how it enters the vocabulary, how it is classified, how it is measured, and ultimately how it is conveyed and understood in the form of “meaning.”
We use the methods of information theory to explore the precise interpretation of words, which is no doubt a specific problem at the “word” level of language. More importantly, we hope to bridge the gap between the concepts of “semantics” and “information,” which belong to different disciplines, and thus to achieve the integration and complementarity of disciplines at the methodological level. Through rigorous analysis and subsequent refinement, the understanding of lexical and semantic information can then be incorporated into a broader research paradigm.
In general, one of the main components of each vocabulary system is the most basic core meaning expressed by the words in the system, for example, “fire,” “river,” “go” and “sleep.” This part of the lexical meaning objectively reflects objects and events, so it can also be called “objective information.” In addition, people can add their own emotional experience to the objective meaning of words, which constitutes another source of word meaning, such as:
“Animal”—“pet”—“cute pet”
“Animal” is a concept which denotes a kind of living organism that exists objectively in nature. “Pet” is a definition of a secondary group of animals from the perspective of human beings and emotions. Based on this, “cute pet” further reflects the subjective emotional experience that people impose on a sub-group of pets. This part of lexical meaning information is different from the “objective information” of lexical meaning.
In addition, most lexical systems have been deposited over a long history. To fully grasp the information reflected by a lexical system, one should make an in-depth investigation of the etymology and diachronic evolution of its words. At the same time, due to the effect of context, the lexical meaning in a specific sentence may shift to a greater or lesser degree, so it cannot be said to correspond exactly, one to one, to the dictionary meaning. This part of the lexical meaning is also different from “objective information.” Therefore, the emotional experience of people, the origin of the lexical system and the internal relations of lexical meaning information are collectively referred to as “subjective information.” According to the definition in psychology, emotions are a series of internal and external physiological and psychological reactions caused by the cognition of objective events and the grasp of the relationships between them. In language, however, emotion reflects another kind of human activity, one that stands in contrast to the outside world. We can therefore suggest (not quite precisely) that all the information transmitted by a word is composed of its objective information and its subjective information.
Ludwig Wittgenstein believed that the most important way for people to know the world is to describe various events in the world through a series of sentences. Therefore, the objective information of words reflects many facts about the objective external world, which makes objective information the main part of word meaning information. The analysis of objective information is the foundation of the analysis of the entire word meaning information. The purpose of this paper is to analyze and deduce the meaning of words from the perspective of information, and on this basis, to summarize the material information reflected by the Chinese word system. Therefore, the above analysis and methods focus on the investigation of objective word meaning information (for convenience, we use the general term “word meaning information” rather than “objective word meaning information”).
As for the subjective information of words, this reflects the internal relations of the language system. If we want to make an in-depth analysis of the subjective information, we need to use completely different research routes and methods. Generally speaking, we must use statistical methods and conduct a certain number of statistical analyses on the sentences where the word is applied, and obtain relevant data from these examples to ensure our overall planning and analysis of the overall state of the word. However, it is a huge project to describe the semantic information internally and externally as a whole, which we are not able to accomplish at present. We must then analyze and calculate the objective information of words step by step. We look forward to having time and energy to complete the remaining work in the future.
There are two essential steps in the approach to word interpretation from the perspective of information theory, which have been mentioned above: A. establishing a category system; and B. constructing a computational method through categories. We illustrate these two basic steps with examples below.
In Western philosophy, it is generally believed that Aristotle was the first to introduce the concept of “category” systematically into philosophy; in his Categories he posited 10 categories of existing things: “Substance, quantity, quality, relative, where, when, being-in-a-position, having, doing, being affected.”
In the period of modern philosophy, Immanuel Kant started from four aspects and established a complete, interlinked system. He claimed that when all the content of a judgment is removed, the functions of judgment can be unified under the four aspects of “quality, quantity, relation and modality.”
From this point of view, Kant divides the category system into four broad categories. Each of them is divided into three smaller categories, making a total of twelve categories. As for the concept of category, Aristotle is concerned with the ontological nature of objects, while Kant is more concerned with the transcendental epistemic framework that organizes experience. They belong to different historical contexts, but their interpretations of categories can to some extent enter into dialogue with each other.
After Kant, Georg Wilhelm Friedrich Hegel, Franz Clemens Brentano and Edmund Husserl successively delved into the research of categories. In the 20th century, the school of modern analytic philosophy also developed its category theory. (Note ①: This mainly refers to the philosophical viewpoints represented by the theories of Russell and Wittgenstein.)
Based on the above theories, we now focus on the information contained in the Chinese lexical system. After careful consideration, we surveyed the 8,802 words of four grades, A, B, C and D, included in the Outline of Chinese Vocabulary Levels and Chinese Character Grades; the selected data reflect the composition, distribution and frequency of the modern Chinese vocabulary system. Among them, the 1,033 Grade-A words and 2,018 Grade-B words listed in the Outline represent the most central and most frequently used word groups in the Chinese vocabulary system, and they form an ideal survey sample for the research objective of this paper.
Taking the 3,051 core Chinese words at grade A and grade B as samples, we exhaustively analyzed and examined the deep semantic structure of nouns, verbs and adjectives, and then extracted and summarized the category system of Chinese semantic information at four levels: “the philosophical level,” “the physical level,” “the social dimension level” and “the psychological level” (see Appendix 1).
Overall, this category system is built on a general survey of category theory, combined with the characteristics of the Chinese word system itself, and there are organic links between categories at the same and at different levels. On this basis, we can expect it to reflect the Chinese word meaning information system objectively. Below are some examples of the category information contained in nouns.
(1) “apple”
Since it refers to a kind of plant, we can extract the category [plant] from the “physical level” and assign it as a value to “apple,” thus defining the meaning information of “apple” as a kind of “plant.”
(2) “south”
The word reflects a direction in space, to which we can assign the category of “physical level”; in addition, since “directionality” is highlighted in “south,” the subcategory of [space] (direction) needs to be further invoked.
(3) “sweat”
It can be analyzed into three categories: [energy], [animate] and [nature].
(4) “vigor”
It can be analyzed into two categories: [animal] and [energy].
The categories mentioned above are the basic Chinese semantic elements summarized and extracted after a comprehensive investigation of the Outline, within the framework of the category theories explained above. Having described the basic Chinese word system in these semantic terms, we also expect that this category system can be developed and improved in subsequent research.
When we use the category method to deduce word meaning, we can use the terms and calculation methods of Set Theory in mathematics to make the calculation process systematic and formal. Therefore, we selected seven basic logical symbols to participate in the calculation (see also Appendix 2).
① “=” : “equivalence relation.”
In this paper, “=” means “to assign” the categories and relations obtained from the analysis to the word being paraphrased, so it is more like the assignment operator in programming, reflecting a dynamic process.
② “∈” : “membership relation.”
This represents a binary relation between a concrete thing X and set A: If X is a member of A, it can be expressed as X∈A.
③ “⊂”: “subset relationship.”
If all the elements in set A are also elements of set B, then set A is said to be a subset of B, denoted as A⊂B.
④ “∩”: “intersection relationship.”
If set C consists of the elements that appear in both set A and set B, then set C is called the intersection of sets A and B, denoted as C=A∩B.
⑤ “∪”: “union relationship.”
This symbol denotes a relationship between two sets. If set C contains all the elements of set A and all the elements of set B, and no others, then set C is the union of sets A and B, denoted as C=A∪B.
⑥ “~” : “negation relationship.”
Strictly speaking, “~” is a symbol of propositional logic rather than of Set Theory, but in this paper we need to borrow this concept for the treatment of lexical categories, so we define it here. Specifically, it means the negation of a category value.
⑦ “F(x)” : “functional relationship.”
Gottlob Frege was the first scholar to apply the concept of functional relationship to language research. He re-examined the functional relations in mathematics from a new perspective and put forward that concepts are functions that represent the truth values of objects.
The above seven logical operations are the basis of Set Theory calculation and also the basis of analyzing the category of word meaning information in this paper. We can use them to deduce the logical relations of each category inside the word meaning. When categories and operation symbols are combined, they constitute an effective calculation of a word's meaning.
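As a rough illustration, the seven symbols can be mirrored with built-in set operations. This is only a sketch under stated assumptions: the member words of each toy category are hypothetical, and modelling “~” as a complement relative to a small universe is our simplification rather than a definition given in this paper.

```python
# Hypothetical toy categories, modelled as sets of example words.
PLANT = frozenset({"apple", "tree", "flower"})
ARTIFACT = frozenset({"pen", "dish", "building"})
FOOD = frozenset({"apple", "dish"})
UNIVERSE = PLANT | ARTIFACT | FOOD

category_of_apple = PLANT                     # "=": assign a category to a word
is_member = "apple" in PLANT                  # "∈": membership
is_subset = frozenset({"apple"}) <= PLANT     # "⊂": subset (<= also allows equality)
intersection = PLANT & FOOD                   # "∩": elements in both sets
union = PLANT | ARTIFACT                      # "∪": elements in either set
negation = UNIVERSE - FOOD                    # "~": assumed complement within UNIVERSE

def is_food(x):
    """F(x): a concept treated as a function from an object to a truth value."""
    return x in FOOD

print(is_member, is_subset, sorted(intersection), sorted(negation), is_food("pen"))
```

Treating a concept as a Boolean function follows Frege's view quoted above; the other lines simply restate the set-theoretic definitions.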
On this basis, according to the internal logical relations among the categories, we can construct the category calculation system of Chinese word meaning from bottom to top by using the corresponding logical operation symbols. In the ideal state, the calculation result of each word in the system is represented by the combination of “category” and “operational symbol.”
On the basis of the above discussion, the semantic information of words can be calculated completely by introducing logical symbols into the derivation process. For example:
(5) “length” (“the distance between the two ends”) (Note ①: The part in parentheses is the definition of the interpreted word in the 7th edition of Modern Chinese Dictionary; this convention is used throughout this paper.)
Step 1: The lexical meaning of “length” is a physical property. As discussed above, “length,” as one of the seven basic physical quantities, is also one of the basic categories of “physics” defined in this paper. Therefore, we can use the category [length] directly as the definition of the word.
Step 2: From this, we can rewrite the noun “length” as follows:
“Length” =[length]
Here “=” represents an equivalence and assignment relationship. In this example, we assign the category value [length] to the paraphrased word “length.” Since both “=” and [length] have already been defined, applying them to the interpreted word “length” amounts to using known information to describe unknown information, which is in line with the principles of information theory and the general law of human cognition.
Using this derivation method, we can rewrite Examples (1) to (4):
“Apple” = [plant];
“South” = [space] (direction);
“Sweat” = [energy] ([animal]) ([natural]);
“Vigor” = [animal] ([energy]).
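The four rewritten entries can also be stored as structured data and rendered back into the bracket notation. The sketch below is illustrative only: the class names `Category` and `Entry`, and the split of each formula into a leading category followed by parenthesized categories, are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Category:
    name: str
    value: Optional[str] = None  # subcategory value, e.g. "direction"

    def render(self) -> str:
        text = f"[{self.name}]"
        return f"{text} ({self.value})" if self.value else text

@dataclass(frozen=True)
class Entry:
    word: str
    head: Category                        # first category in the formula
    arguments: Tuple[Category, ...] = ()  # categories written in parentheses

    def render(self) -> str:
        parts = [self.head.render()]
        parts += [f"({arg.render()})" for arg in self.arguments]
        return f"{self.word} = " + " ".join(parts)

LEXICON = [
    Entry("apple", Category("plant")),
    Entry("south", Category("space", "direction")),
    Entry("sweat", Category("energy"), (Category("animal"), Category("natural"))),
    Entry("vigor", Category("animal"), (Category("energy"),)),
]

for entry in LEXICON:
    print(entry.render())
```

Keeping the formulas as data rather than as strings makes it possible to check later that every category name used actually appears in the category system.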
(6) “audience” (a person who attends a performance, a game, or a movie.)
Step 1: From the lexicographic meaning of “audience,” we can see that the word refers to people, so we put the category of [people] into the analysis process.
Step 2: In addition, along with [people] comes the concept of “seeing.” As mentioned above, vision is one of the main ways for people to receive information from the outside world. We define it as [vision]∪[hearing] in the category of “six senses.” In this case, we need to bring perception into the analysis process.
Step 3: In addition to [people] and perception, we should also see that the word “audience” refers to a “group,” because its morpheme meaning “all” denotes “a group of people” rather than “an individual” (in linguistic terms, a “collective noun”). Therefore, we need to invoke the category [quantity] (group) in the “philosophical level” to express the concept of “group.”
Step 4: After the categories are established, we need to establish the logical relationships between them. From the previous steps, the main object expressed by the word meaning information of “audience” is [people], and the other categories all restrict [people]. Therefore, there is, first of all, a “functional relationship” between them.
Step 5: Apart from the main object [people], the perception category and [quantity] (group) both restrict [people], so there should be an “intersection relationship” between these two, and we substitute the symbol “∩” into the analysis process.
Step 6: So we can rewrite the noun “audience” as follows:
“Audience” = (([vision]∪[hearing])∩[number] (group)) ([people])
In this case, we established three categories of values for the analysis and used multiple logical relationships in the computation. First, we established the functional relationship between [people] and the other two categories; second, we identified the “intersection relationship” between the perception category ([vision]∪[hearing]) and [number] (group); together these yield the complete formula expressing the meaning information of “audience.” Through this analysis, we can, to some extent, express the internal relations between the components that make up the meaning of words.
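The composition of the “audience” formula can be sketched as a small expression tree. The class names and the `applied_to` method below are hypothetical choices made for illustration; Python's `|` and `&` operators stand in for “∪” and “∩,” and application of a restricting expression to a main category stands in for “F(x).”

```python
from dataclasses import dataclass
from typing import Optional

class Expr:
    def __or__(self, other):       # "∪": union of two category expressions
        return Binary("∪", self, other)

    def __and__(self, other):      # "∩": intersection of two category expressions
        return Binary("∩", self, other)

    def applied_to(self, target):  # "F(x)": this expression restricts the target
        return Apply(self, target)

@dataclass(frozen=True)
class Cat(Expr):
    name: str
    value: Optional[str] = None

    def __str__(self):
        return f"[{self.name}]" + (f" ({self.value})" if self.value else "")

@dataclass(frozen=True)
class Binary(Expr):
    op: str
    left: Expr
    right: Expr

    def __str__(self):
        return f"{wrap(self.left)}{self.op}{wrap(self.right)}"

@dataclass(frozen=True)
class Apply(Expr):
    modifier: Expr
    target: Expr

    def __str__(self):
        return f"({self.modifier}) ({self.target})"

def wrap(expr):
    """Parenthesize nested binary expressions so the grouping stays explicit."""
    return f"({expr})" if isinstance(expr, Binary) else str(expr)

perception = Cat("vision") | Cat("hearing")
audience = (perception & Cat("number", "group")).applied_to(Cat("people"))
print(audience)  # (([vision]∪[hearing])∩[number] (group)) ([people])
```

Building the formula from typed nodes keeps the parentheses balanced by construction, which is harder to guarantee when formulas are written by hand.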
(7) “landscape”: An area of flowers, trees, buildings and some natural phenomena (such as rain and snow) that can be enjoyed by people.
Step 1: From the dictionary meaning of “landscape,” it can be seen that the “flowers, trees, buildings, rain, snow” in the definition correspond to our classification of substances in the “physical level.” Among them, “flowers and trees” correspond to the category [plants]; “rain and snow” correspond to the category [inanimate]; and “buildings” corresponds to the category [artifact]. Therefore, we first bring these three categories into the analysis process.
Step 2: On the basis of establishing the above categories, we analyze the logical relationship between them. Because they jointly constitute what the interpreted word signifies and are in a state of coexistence, there exists a “union relationship” between them. We invoke the symbol “∪” and substitute [plants]∪[inanimate]∪[artifact] into the analysis.
Step 3: In addition, the “within a certain region” part of the definition reveals the “space” feature of the “physical level” reflected by the word, that is, the category [space] that we have defined. Therefore, we bring the [space] category into the analysis process.
Step 4: The “available for viewing” part of the definition reveals the “beauty” attribute of these categories. As discussed above, “beauty” is a category of virtue shared by human society, which is defined as [virtue] (beauty). Therefore, we substitute [virtue] (beauty) from the “social level” into the analysis process.
Step 5: We explore the logical relationships between [space], [virtue] (beauty) and [plants]∪[inanimate]∪[artifact]. It can be seen from the definition that [virtue] (beauty) describes [plants]∪[inanimate]∪[artifact], and [space] describes them as a whole, so there is a functional relationship of containing and being contained between them. So, first, with the symbol “F(x),” we apply [virtue] (beauty) to the union [plants]∪[inanimate]∪[artifact]; then we invoke “F(x)” again and apply [space] to [virtue] (beauty) ([plants]∪[inanimate]∪[artifact]).
Step 6: Therefore, we can categorize the noun “landscape” as follows:
“Landscape” = [space] ([virtue] (beauty) ([plants]∪[inanimate]∪[artifact]))
In this example, we use the union of related categories in the “physical level” and the category [virtue] (beauty) in the “social level” to form the calculation process of “landscape,” and the calculation process corresponds strictly to the definition in Modern Chinese Dictionary. Therefore, the analysis of this word is objective and effective to a certain extent, both in terms of semantic motivation and in terms of the matching of lexical meanings, which shows the advantages of the method of “using categories to analyze the meaning of the word.” This relative advantage also rests on an accurate grasp of the concept of “category” and its classification, as well as on the accurate definition of the categories at each level.
Let us list some more categorical analyses of words. The specific steps of analysis are omitted due to space limitation, but the process of analysis can be inferred from the above content.
(8) “father” = [relationship] ([person] (sex));
(9) “class” = (trade + group) (person);
(10) “pen” = ([purpose] [knowledge]) ([artificiality]);
(11) “dish” = [Artifact] (food);
(12) “everyone” = ([number] (many)) ([people]);
(13) “broadcast” = ([power] [listening]∩[region]) ([electricity]);
(14) “skill” = (degree) (high) (intellectual);
(15) “family” = ([relationship]∩[number] (many)) ([people]);
(16) “history” = ([people]∪[nature]) ([time] (origin)∩[time] (stop));
(17) “canteen” = ([power] ([Artifact] (food))) ([Artifact] (shelter));
(18) “classmate” = ([industry]∩[relationship]) ([person]);
(19) “love” = ([people] (sex)) ([happiness]∩[trust]);
(20) “child” = ([space] (direction) ([length])) ([person]);
(21) “contribution” = ([purpose] ([relationship]∩[number] (many)) ([people])) (~ [benefit]);
(22) “environment” = ([relationship] [nature] [artificial objects] [people]) ([space]).
In the above equations, the content after the equal sign, as the elements of the meaning of the word before the equal sign, was derived after we had fully investigated the Grade-A and Grade-B words of the Outline of Chinese Vocabulary Levels and Chinese Character Grades. The goal is to reflect the semantic primitives of the most basic Chinese core vocabulary and their relationships, with a view to extending the system to a larger vocabulary. The system is certainly incomplete, and we hope to improve it in subsequent research.
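Because formulas of this kind nest square brackets and parentheses, a simple well-formedness check is useful before any further processing. The function below is a minimal sketch: the name `is_balanced` is hypothetical, it checks only bracket pairing and not whether the categories exist in the category system, and the second example string is a deliberately malformed formula invented for illustration.

```python
PAIRS = {")": "(", "]": "["}

def is_balanced(formula: str) -> bool:
    """Return True if every '(' and '[' in the formula is closed in the right order."""
    stack = []
    for ch in formula:
        if ch in "([":
            stack.append(ch)
        elif ch in PAIRS:
            # A closing bracket must match the most recent unclosed opening bracket.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack

well_formed = "([relationship]∩[number] (many)) ([people])"  # Example (15)
malformed = "([number] (many) ([people])"                    # hypothetical, one ')' missing

print(is_balanced(well_formed), is_balanced(malformed))
```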
In both Chinese and English, verbs are divided into transitive verbs and intransitive verbs (Japanese calls these two categories “他動詞” and “自動詞”).
Semantically speaking, these two categories of verbs are based on the relationship between the “initial state” and the “terminal state” of “motion.” For example, the Chinese sentence “风在吹” can be translated into English as “the wind is blowing” and into Japanese as “風が吹く.” In these examples, the verbs (“blowing,” “吹く”) describe the motion of “wind”; the motion is spontaneous, and no other participant is involved in the process. Such verbs can be classed as “intransitive verbs,” while verbs without this property are “transitive verbs” or verbs of other kinds.
Depending on the types of events that make up our world, verbs can be roughly divided into three categories: state, activity, and event (Pustejovsky, p. 12). (Note ①: In order to be true to the original meanings of these three words, we keep their forms in the original text and do not translate them into Chinese.)
State means a continuous action or state. Examples in Chinese are “sleep” (“睡”), “work” (“工作”), and so on. The start and end of these actions are not instantaneous, implying a kind of continuity in time. In contrast, verbs in the activity class refer to momentary, fleeting actions, such as “knock” (“敲”), “kick” (“踢”), and “collide” (“相撞”).
In addition, verbs of the event class often express the sense of “completion,” that is, of “ending a previous process.” For example, the verb “establish” (“建立”) means to complete a process whose goal (a “house” [“房屋”], a “regime” [“政权”], etc.) has already been accomplished; “discover” (“发现”) means that the process of “searching” (“寻找”) has been completed and that a goal (something, someone, a “new continent”) has been achieved.
It can be seen that the three-way classification of “state, activity and event” can effectively analyze verbs from the perspectives of time and persistence. If placed within the framework of this paper, we can say that this classification method divides verbs by the category of “time” and its subordinate categories of “time” (beginning) and “time” (end). However, from the perspective of category, this classification method is not enough to complete an effective description of “movement” in our world. We need to analyze “movement” and classify verbs from another perspective.
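The three-way classification can be recorded as a simple lookup table. This is only a sketch: the enum and dictionary names are hypothetical, and the assignments simply restate the example verbs given in the text above.

```python
from enum import Enum

class VerbClass(Enum):
    STATE = "state"        # continuous; start and end are not instantaneous
    ACTIVITY = "activity"  # momentary, fleeting action
    EVENT = "event"        # completion of a previous process

# Assignments taken from the examples discussed in the text.
VERB_CLASSES = {
    "sleep": VerbClass.STATE,
    "work": VerbClass.STATE,
    "knock": VerbClass.ACTIVITY,
    "kick": VerbClass.ACTIVITY,
    "collide": VerbClass.ACTIVITY,
    "establish": VerbClass.EVENT,
    "discover": VerbClass.EVENT,
}

def verbs_in(verb_class):
    """List, in alphabetical order, the example verbs assigned to a class."""
    return sorted(w for w, c in VERB_CLASSES.items() if c is verb_class)

print(verbs_in(VerbClass.ACTIVITY))
```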
We have mentioned the binary classification of verbs in traditional linguistics: “transitive and intransitive verbs,” or “automatic and other verbs.” To make the calculation more formal, we can rewrite them using the symbol “○” for “coordinates” and the symbol “→” for “motion”:
① “○→”: in this paper, this means that “coordinate ○” is the sender of “movement →” (in other words, “○” is the initial coordinate of “→”). At the same time, no coordinates other than “○” are involved in the motion process (in other words, “→” has no terminating coordinate). This type of movement corresponds to the “intransitive verb” in traditional linguistics.
② “○1→○2”: in this paper, this means that “coordinate ○1” is the sender of “movement →” and that this type of movement requires an endpoint or an object that bears it; “coordinate ○2” is the bearer of “movement →” (in other words, “○2” is the terminating coordinate of “→”). This type of movement corresponds to the “transitive verb” in traditional linguistics.
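The correspondence between the two schemas and the traditional verb classes can be sketched in a few lines of Python. This is only an illustration: the function name `classify` and the choice to represent a schema as a plain string are assumptions made here, not part of the paper's formalism.

```python
def classify(schema: str) -> str:
    """Map a motion schema to the traditional verb class it corresponds to."""
    # Split around the motion symbol: what stands before it is the starting
    # coordinate, what stands after it (if anything) is the terminating one.
    source, arrow, target = schema.partition("→")
    if arrow != "→" or not source.startswith("○"):
        raise ValueError(f"not a motion schema: {schema!r}")
    # A terminating coordinate after the arrow marks the transitive type.
    return "transitive" if target.startswith("○") else "intransitive"


print(classify("○→"))     # schema ① above
print(classify("○1→○2"))  # schema ② above
```

The only point of the sketch is that the presence or absence of a terminating coordinate is enough to separate the two types.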
In this way, we can analyze verbs that commonly express general actions, such as “run” and “wear.”
(23) “run”: Moving quickly on two or four legs.
Step 1: We first need to determine the starting coordinate of the action. From the phrase “two or four legs” in the lexical meaning, we can see that the action is performed by an animal or a person. As discussed in the first chapter, “animal” can be analyzed with the category [animal] at the “physical level,” whereas “people” must be analyzed with the category [people] at the “social level.” (In biology, “people” and “animal” belong to the same “kind,” but our analysis of [people] is based on the category level, and this example indirectly supports that choice.) At the same time, “foot” and “leg” are parts of the body of a human or an animal. Therefore, we need to call on the category [part] at the “philosophical level” and combine it with [people] and [animal].
Step 2: Having called on several categories in the previous step, we need to analyze the logical relationships between them. According to our analysis, the relationship between [part] and each of [people] and [animal] is a “functional relationship,” so, using the corresponding symbol F(x), we write [part]([people]) and [part]([animal]).
Step 3: With [part]([people]) and [part]([animal]) both serving as the starting coordinate, we need to analyze the logical relationship between them. Since the action is performed by one or the other of the two, not necessarily by both at the same time, there is a “union relationship” between them. We therefore bring the symbol “∪” into the analysis and take [part]([people])∪[part]([animal]) as the starting coordinate “○.”
Step 4: No other element in the lexical meaning of the interpreted word serves as the “receiver” or “terminating coordinate” of the action. In other words, the motion represented by the word falls into the “○→” type analyzed above, so we do not need to invoke an additional coordinate symbol.
Step 5: From this, we can summarize the meaning analysis of the interpreted word “run”:
“run” = ○→
“○” = [part]([people])∪[part]([animal]).
“→” means that “○” undergoes a process of movement, and it can also serve as the mark of a verb. (In the following examples, we will not explain “→” unless there is a special need.)
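The five steps above can be sketched as a small expression tree. The class names `Cat` and `UnionOf` and the function `run_formula` are hypothetical names introduced for illustration; the sketch assumes that a category applied to another category, as in [part]([people]), can be modelled as a name with one argument.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cat:
    """A category such as [part]; `arg` models the applied form [part]([people])."""
    name: str
    arg: Optional[object] = None

    def __str__(self) -> str:
        base = f"[{self.name}]"
        return base if self.arg is None else f"{base}({self.arg})"


@dataclass(frozen=True)
class UnionOf:
    """Two expressions joined by the union symbol."""
    left: object
    right: object

    def __str__(self) -> str:
        return f"{self.left}∪{self.right}"


def run_formula() -> dict:
    # Steps 1-3: [part] applied to [people] and to [animal], joined by union.
    start = UnionOf(Cat("part", Cat("people")), Cat("part", Cat("animal")))
    # Step 4: no terminating coordinate, so the schema is the one-place type.
    return {"verb": "run", "schema": "○→", "○": str(start)}


print(run_formula())
```

Keeping the expression as a tree rather than a string means the same categories can be recombined for other verbs without re-parsing.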
In this example, we can see how the two symbols “○” and “→” are applied to the formal interpretation of words and expressions, and how the verb form “○→” defined in the previous section works in practice. By unfolding the process step by step, we rewrite the unknown information elements (the interpreted words) one by one in terms of known information elements (the defined categories and logical symbols), and so obtain an accurate interpretation and a formalized process, thereby achieving the goal of effectively drawing on the concepts of information theory.
(24) “wear”: Putting clothes, shoes, socks, etc. on the body.
Step 1: First, we need to determine the starting coordinate of the action. The dictionary meaning of the interpreted word shows that the sender of the action is human, so “human” should be taken as the starting coordinate. Since the category [people] was defined at the “social level” in the previous chapter, we simply take [people] as the starting coordinate, mark it as “○1,” and substitute it into the analysis.
Step 2: The dictionary meaning also includes “clothes, shoes, socks, etc.,” which form the verb's other coordinate beyond [people]: the terminating coordinate. So we need the symbol “○2.” A careful analysis of “clothes, shoes, socks, etc.” shows that they belong to the category [artifact](clothing), defined at the “social level” in the previous chapter. Therefore, we can directly bring [artifact](clothing) into the analysis as a component of the terminating coordinate “○2.”
Step 3: The [artifact](clothing) analyzed in the previous step is only one component of the terminating coordinate: it is what is added to the initial coordinate [people] once the movement has taken place. We therefore need to determine the logical relationship between the two. From the meaning of the word, we can see that at the end of the movement process the person has the clothing on. The result is thus the combination of [people] and [artifact](clothing), and there is an “intersection relationship” between them. We bring the symbol “∩” into the analysis and make “[people]∩[artifact](clothing)” the terminating coordinate of the action.
Step 4: Therefore, we can summarize the analysis of the meaning of the interpreted word “wear”:
“wear” = ○1→○2
(○1=[people]; ○2=[people]∩[artifact](clothing))
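The substitution of categories for coordinate symbols can be illustrated with plain string rewriting. The helper `expand` and the table `WEAR_BINDINGS` are hypothetical names, and treating the formula as a string is a simplifying assumption made only for this sketch.

```python
from typing import Dict


def expand(schema: str, bindings: Dict[str, str]) -> str:
    """Replace each coordinate symbol in `schema` with its category expression."""
    out = schema
    # Longest symbols first, so "○1" is handled before any bare "○".
    for symbol in sorted(bindings, key=len, reverse=True):
        out = out.replace(symbol, bindings[symbol])
    return out


# Bindings for "wear": a person before the movement, a person combined
# with clothing after it.
WEAR_BINDINGS = {
    "○1": "[people]",
    "○2": "[people]∩[artifact](clothing)",
}

print(expand("○1→○2", WEAR_BINDINGS))
```

Read left to right, the expanded form states the starting coordinate, the movement, and the state that holds once the movement has ended.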
Through this example, we gain a deeper understanding of how the other verb form, “○1→○2,” is interpreted. The two notions of “starting coordinate” and “terminating coordinate” both appear in the example; by analyzing them, we can interpret the corresponding verbs more objectively. All of these processes rest on our definition and classification of categories.
For Aristotle, the task of philosophy is to explain why things come into being and why they move. From the perspective of category, one important form of “movement” is the increase, decrease and reorganization of the number and form of categories. If we investigate this process more deeply, what is the force that causes such changes, and how does it intervene in the process of change? This is exactly what the “efficient cause” in Aristotle's “theory of four causes” covers.
For example, the verb “crow” is used only with the noun “rooster,” so we can say that the verb “crow” contains its efficient cause: “rooster.” Similarly, the verb “lay eggs” is reserved for female birds, which are the efficient cause behind the word.
In addition to the “motive force that causes motion” expressed by the “efficient cause,” another important question is: what is the purpose of this motion process, and what will be the result when it is done? The answer to such questions is exactly what Aristotle's “final cause” is about.
For example, the verb “extinguish” is often used with the noun “flame,” which is both the reason for the action and its final aim, and can therefore be its “final cause.” The verb “publish” is specific to an industry within human society (we often call such words exclusive verbs), and it also strongly suggests the “final cause” it involves: printed products or audio and video products.
“Efficient cause” and “final cause” are contained in some verbs that reflect the complex process of “motion,” and they can be brought out through the categorization analysis of specific verbs. By adding “efficient cause” and “final cause” to the verb analysis process, we can conduct a complete informatization analysis of Chinese verbs. We specify two new operator symbols to represent them: let the “efficient cause” be “<” and the “final cause” be “>,” thus embodying the basic attribute of the world, “movement in universal connection.”
(25) “protect”: Try to take care of, so as not to be hurt.
Step 1: We first need to determine the starting coordinate of the action. The dictionary definition of the interpreted word does not say what the object of the action is, but we can reason that anything in the world can be a protected object. Therefore, the starting coordinate should include everything in the world: we call on the categories [animate] and [inanimate] from the “physical level” and make them together the starting coordinate ○1.
Step 2: Here we need to analyze the logical relationship between [animate] and [inanimate]. Their combination constitutes what we need to analyze, so there is a “union relationship” between them, and we take [animate]∪[inanimate] as the starting coordinate ○1 in the analysis.
Step 3: Furthermore, the phrase “so as not to be hurt” in the definition helps us determine the terminating coordinate of the interpreted word: careful analysis shows that it falls into the category [harm] defined in the previous chapter. Therefore, [harm] and the result of the previous step, [animate]∪[inanimate], jointly constitute the terminating coordinate ○2.
Step 4: As for the logical relationship between [harm] and [animate]∪[inanimate], which together constitute the terminating coordinate, we find that [harm] is one of the internal attributes of [animate]∪[inanimate] after the end of the movement process, so there is a “functional relationship” between them. Therefore, we need the symbol F(x), and we bring [harm]([animate]∪[inanimate]) into the analysis as the terminating coordinate ○2.
Step 5: Regarding the component “not subject to...” in the interpretation, traditional lexical meaning studies classify it as a “negative adverb” that modifies the verb. In the framework of this paper, it corresponds to the negation symbol “~” that we have defined, which also forms a functional relation with the action symbol “→.” So we use the symbol F(x) and bring “(~→)” into the analysis.
Step 6: In addition, we should note that the movement of the interpreted word from the starting coordinate to the terminating coordinate is driven by a force outside the coordinates. In other words, the word is a “transitive verb” in the traditional sense. Therefore, we still need to find the driver. According to our analysis in this section, this component is what we have defined as the “efficient cause” of the movement process.
Since all living things can protect others, the efficient cause of the interpreted word should be [animate], a category at the physical level. We bring the [animate] category into the analysis as the “efficient cause” “<” of the interpreted word.
Step 7: In this way, we can summarize the meaning analysis of the verb “protect”:
“protect” = <○1(~→)○2
(< = [animate]; ○1=[animate]∪[inanimate]; ○2=[harm]([animate]∪[inanimate]); (~→) means negated “movement”)
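The way the efficient cause and the negation enter the schema can be sketched with a small record type. `VerbFrame` and its field names are assumptions introduced for illustration. The sketch covers only the symbols used in this example, and it leaves out the final cause “>” because the text does not show where that symbol is placed in a formula.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VerbFrame:
    start: str                       # starting coordinate
    end: Optional[str] = None        # terminating coordinate, if any
    efficient: Optional[str] = None  # efficient cause "<", if any
    negated: bool = False            # True when the movement carries "~"

    def schema(self) -> str:
        parts = []
        if self.efficient is not None:
            parts.append("<")
        # One-place frames use a bare "○"; two-place frames number them.
        parts.append("○" if self.end is None else "○1")
        parts.append("(~→)" if self.negated else "→")
        if self.end is not None:
            parts.append("○2")
        return "".join(parts)


protect = VerbFrame(
    start="[animate]∪[inanimate]",
    end="[harm]([animate]∪[inanimate])",
    efficient="[animate]",
    negated=True,
)
print(protect.schema())
```

The schema is derived from which fields are filled in, so a frame with no terminating coordinate, no efficient cause and no negation reduces to the simple one-place form.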
In this case, because the “efficient cause” of the interpreted word is prominent in the dictionary definition, we must reflect this relationship with the corresponding analysis and symbol, so “<” is applied. At the same time, since the movement expressed by this word contains the meaning of “negation,” the “~” symbol is needed to express it. Compared with the previous examples, the final expression has a more complex structure, which is determined by the number of categories and logical relations within the interpreted word.
This paper focuses on the analysis and calculation of objective word meaning information, which we regard as only a small first step toward the macro problem of “semantic information.” As the research deepens, subjective word meaning information should also be brought within its scope. On this basis, the information reflected in other aspects of language should gradually receive attention as well. Specifically, these are:
(1) Objective word meaning information. This is the main content of this paper.
(2) Subjective word meaning information. The research methods have been discussed above.
(3) Phrase information. This mainly concerns the information types and structures at the phrasal level, and requires further study that brings in semantic roles, semantic orientation, diachronic grammar, rhetoric and other relevant knowledge.
(4) Sentence information. Focusing on the types and structures of information reflected at the sentence level requires comprehensive research that draws on the achievements of pragmatics, analytical philosophy, big data statistics and other disciplines.
(5) Semantic information. After a certain foundation has been accumulated in the research of language informationization at all levels, we can explore the “semantic information,” a category that spans many disciplines. This process will make extensive use of the results and methods of information science, cognitive science and other disciplines to conduct multi-pronged research.
In this paper, a preliminary exploration is made in the field of word meaning information based on the results of multidisciplinary research. Within the language discipline, it mainly refers to WordNet, Generative Lexicon Theory, Ontol-MT and “knowledge graph” technology.
Comparatively speaking, this paper describes and calculates word meanings in a hierarchical and phased way. The main idea is to start with the analysis of the most basic and core objective word meaning information. As for the highly controversial subjective information and information at other levels, we expect to solve it gradually in the follow-up research. The advantage of this method is that as long as the limited categories and logical symbols are marked or grasped, the core components of the word meaning can be formally deduced and calculated. Therefore, the research results can be quickly and directly applied to machine recognition, international Chinese language teaching and other fields.
In terms of application in the field of computational linguistics, the method can be used to label words at the level of meaning, and so provides a tool for the formal transformation of word meaning information. In the field of international Chinese teaching, it can help learners master the core meaning of words and sentences and save a great deal of repetitive work.
APPENDIX 1: SYMBOL SYSTEM OF LEXICAL SEMANTIC CATEGORIES
0. The Philosophical level
[coordinates] (or “○”), [motion] (or “→”), [entity], [material], [form], [efficient] (or “<”), [final] (or “>”), [number] ([number](single-), [number](multi-)), [degrees] ([degrees](high), [degrees](low)), [range], [part], [relationship] ([relationship](order), [relationship](disorder)).
1. The Physical level
[Length], [quality], [time] ([time](origin), [time](terminus)), [electricity], [temperature], [photometrics], [energy], [space], [space](direction), [color], [sound], [nature], [animate], [inanimate], [Animal], [Plant], [Microorganism].
2. The Social Dimension level
[people] ([people](sex)), [vision], [hearing], [smell], [taste], [touch], [intellectual], [Artifact] ([Artifact](clothing), [Artifact](food), [Artifact](shelter), [Artifact](transportation)), [Industry], [Status], [Region], [Date], [benefit], [harm], [Evaluation] ([Evaluation](Positive), [Evaluation](Negative)).
3. The Psychological level
[Happiness], [Sadness], [Anger], [Fear], [Trust], [Disgust], [Expectation], [Surprise].
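For machine labeling, the inventory above could be stored as a lookup table. The sketch below is a minimal illustration under stated assumptions: the table `LEVELS` is an abridged copy of the appendix, the short level keys and the function `level_of` are hypothetical, and category names are compared case-insensitively.

```python
from typing import Dict, Optional, Set

# Abridged copy of Appendix 1; only part of each level is listed here.
LEVELS: Dict[str, Set[str]] = {
    "philosophical": {"coordinates", "entity", "form", "number", "range", "part"},
    "physical": {"time", "space", "color", "sound", "animate", "inanimate", "animal"},
    "social": {"people", "artifact", "industry", "status", "benefit", "harm"},
    "psychological": {"happiness", "sadness", "anger", "fear", "trust"},
}


def level_of(category: str) -> Optional[str]:
    """Return the level a bracketed category belongs to, or None if unlisted."""
    name = category.strip("[]").lower()
    for level, names in LEVELS.items():
        if name in names:
            return level
    return None


for cat in ("[part]", "[animate]", "[harm]"):
    print(cat, "->", level_of(cat))
```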
APPENDIX 2: SYMBOL SYSTEM OF LOGICAL RELATIONS
(1) “=”: “equivalence relation”
(2) “∈”: “membership relation”
(3) “⊂”: “subset relationship”
(4) “∩”: “intersection relationship”
(5) “∪”: “union relationship”
(6) “~”: “negation relationship”
(7) “F(x)”: “function relationship”
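As an illustration of how formulas written with these symbols might be read by a program, the following sketch splits a formula into category tokens, relation symbols and parentheses. The names `RELATIONS`, `TOKEN`, `tokenize` and `relations_in` are hypothetical, the relation labels are shorthand, and the functional relationship F(x) is assumed to be signalled simply by a category followed by a parenthesized argument.

```python
import re
from typing import List

# Relation symbols from the list above, with shorthand labels.
RELATIONS = {
    "=": "equivalence",
    "∈": "membership",
    "⊂": "subset",
    "∩": "intersection",
    "∪": "union",
    "~": "negation",
}

# A token is a bracketed category, a bare label such as "clothing",
# a relation symbol, or a parenthesis.
TOKEN = re.compile(r"\[[^\[\]]+\]|\w+|[=∈⊂∩∪~()]")


def tokenize(expr: str) -> List[str]:
    expr = expr.replace(" ", "")
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected character {expr[pos]!r} at position {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens


def relations_in(expr: str) -> List[str]:
    """List the labels of the relation symbols that occur in `expr`, in order."""
    return [RELATIONS[t] for t in tokenize(expr) if t in RELATIONS]


print(tokenize("[harm]([animate]∪[inanimate])"))
print(relations_in("[people]∩[artifact](clothing)"))
```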
Contemporary Social Sciences, 2021, Issue 3