Rule-Based Machine Translation (RBMT)
Machine translation systems developed alongside the rise of corpus linguistics, and most of them adopt a rule-based strategy. Rule-based systems can be divided into four types: grammatical, semantic, knowledge and intelligent.
Different types of machine translation system have different components. In the abstract, however, processing in every machine translation system includes the following steps: analyzing or understanding the source language, transferring at some level of linguistic representation, and generating the target language according to its structural rules. The technical differences between systems show up mainly in the level at which transfer takes place.
1. Grammatical Type System. The research focus is morphology and syntax, typified by context-free grammars; most early systems fall into this category. A grammatical system has three parts: source analysis, transfer from the source language to the target language, and target language generation. Source analysis processes the input text and can be divided into morphological analysis, syntactic analysis and semantic analysis; its result is an internal representation of the source. Transfer converts this internal representation, which is independent of the source surface expression, into an internal representation corresponding to the target language. Target language generation converts the target-language internal representation into the target-language surface structure.
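The three parts of a grammatical system can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the four-word English-Spanish lexicon, the function names analyze, transfer and generate, and the single reordering rule. A real system would use a full morphological analyzer and a parser rather than a flat list of tagged words.

```python
# Toy lexicon (hypothetical data): English word -> (part of speech, Spanish word).
LEXICON = {
    "the": ("DET", "el"),
    "black": ("ADJ", "negro"),
    "cat": ("NOUN", "gato"),
    "sleeps": ("VERB", "duerme"),
}


def analyze(sentence):
    """Source analysis: look up each word and build a flat internal representation."""
    source_repr = []
    for word in sentence.lower().split():
        if word not in LEXICON:
            raise KeyError(f"unknown word: {word}")
        pos, _ = LEXICON[word]
        source_repr.append({"form": word, "pos": pos})
    return source_repr


def transfer(source_repr):
    """Transfer: replace lexical items, then apply one structural rule
    (English ADJ NOUN becomes Spanish NOUN ADJ)."""
    target = [{"form": LEXICON[n["form"]][1], "pos": n["pos"]} for n in source_repr]
    i = 0
    while i < len(target) - 1:
        if target[i]["pos"] == "ADJ" and target[i + 1]["pos"] == "NOUN":
            target[i], target[i + 1] = target[i + 1], target[i]
            i += 2  # skip past the pair that was just reordered
        else:
            i += 1
    return target


def generate(target_repr):
    """Generation: linearize the target representation into a surface string."""
    return " ".join(n["form"] for n in target_repr)


def translate(sentence):
    return generate(transfer(analyze(sentence)))


print(translate("the black cat sleeps"))  # el gato negro duerme
```

Keeping the reordering rule inside transfer, rather than in analyze or generate, mirrors the division of labor described above: analysis and generation each deal with one language only, while transfer holds the knowledge about how the two languages differ.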
2. Semantic Type System. The research focus is the use of semantic feature information in machine translation, typified by the semantic grammar proposed by Burton and the case frame grammar proposed by Charles Fillmore. The theories and methods of semantic analysis mainly aim to reconcile form and logic. The system first segments the input source text into semantic units (metacomponents). Through semantic conversion, such as keyword matching, it finds the internal semantic representation corresponding to each unit. It then checks the relations between the units, establishes the logical relations among them, and forms a semantic representation of the whole text. This process relies mainly on lookups in a semantic lexicon. The semantic representation is usually a case frame or a conceptual dependency representation. Finally, the system interprets this intermediate semantic representation and generates the corresponding translation.
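As a rough illustration of how keyword matching against a semantic lexicon can produce a case frame, consider the following Python sketch. The lexicon entries, the category labels (HUMAN, THING), the case names and the function build_case_frame are all hypothetical; the sketch fills each case slot with the first remaining entity of the required category, which is far simpler than what a real semantic analyzer would do.

```python
# Hypothetical semantic lexicon: keyword -> semantic entry.
SEMANTIC_LEXICON = {
    "gave": {
        "type": "predicate",
        "concept": "GIVE",
        # Case slots and the semantic category each slot requires.
        "cases": {"agent": "HUMAN", "recipient": "HUMAN", "object": "THING"},
    },
    "john": {"type": "entity", "concept": "JOHN", "category": "HUMAN"},
    "mary": {"type": "entity", "concept": "MARY", "category": "HUMAN"},
    "book": {"type": "entity", "concept": "BOOK", "category": "THING"},
}


def build_case_frame(sentence):
    """Segment the sentence, match keywords in the lexicon, and fill a case frame."""
    # Segmentation and keyword matching: words missing from the lexicon are ignored.
    units = [SEMANTIC_LEXICON[w] for w in sentence.lower().split()
             if w in SEMANTIC_LEXICON]
    predicate = next(u for u in units if u["type"] == "predicate")
    entities = [u for u in units if u["type"] == "entity"]

    # Relate the units: each case takes the first unused entity of the right category.
    frame = {"predicate": predicate["concept"]}
    for case, required_category in predicate["cases"].items():
        for entity in entities:
            if entity["category"] == required_category:
                frame[case] = entity["concept"]
                entities.remove(entity)
                break
    return frame


print(build_case_frame("John gave Mary a book"))
```

The resulting frame is language independent in the sense that it records who did what to whom rather than word order, which is what allows the final step to generate a translation from it.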
3. Knowledge Type System. The research focus is to equip machines with common sense so as to achieve understanding-based translation, typified by the knowledge type machine translation system proposed by Tomita. A knowledge type system uses a very large semantic knowledge base to convert the source text into an intermediate semantic representation, refines that representation using domain knowledge and commonsense knowledge, and finally converts it into one or more translation outputs.
4. Intelligent Type System. The research focus is to adopt the latest AI results, implementing multi-path dynamic selection and automatic reorganization so that different sentences are converted at different levels. In this way it connects the levels of grammar, semantics and common sense, inherits the merits of traditional systems and allows the system to grow by itself. This type of system is represented by IMT/EC, developed by the Institute of Computing Technology, Chinese Academy of Sciences.
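The refinement step of a knowledge type system (item 3) can be sketched in Python as choosing among candidate concepts for an ambiguous word by consulting a knowledge base. The knowledge base contents, the concept names, the Spanish target words and the overlap-based scoring are all assumptions made for this sketch; they are not taken from any particular system.

```python
# Hypothetical knowledge base: each sense of an ambiguous word lists related words.
KNOWLEDGE_BASE = {
    "bank": {
        "FINANCIAL_INSTITUTION": {"money", "loan", "account", "deposit"},
        "RIVER_SIDE": {"river", "water", "fishing", "shore"},
    },
}

# Hypothetical Spanish realization of each concept.
TARGET_WORDS = {"FINANCIAL_INSTITUTION": "banco", "RIVER_SIDE": "orilla"}


def choose_concept(word, context_words):
    """Pick the sense whose related words overlap most with the context."""
    senses = KNOWLEDGE_BASE[word]
    context = set(context_words)
    return max(senses, key=lambda sense: len(senses[sense] & context))


def translate_word(word, sentence):
    """Map an ambiguous word to a concept, then to a target-language word."""
    context = [w for w in sentence.lower().split() if w != word]
    return TARGET_WORDS[choose_concept(word, context)]


print(translate_word("bank", "he sat on the bank of the river fishing"))  # orilla
print(translate_word("bank", "she opened an account at the bank"))        # banco
```

The point of the sketch is only that the choice between outputs is driven by knowledge about the world rather than by grammar: both sentences are syntactically unremarkable, and only the related-word sets separate the two senses.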