Abstract
Text analysis involves the deconstruction of information within a text. This includes text structure, text patterns, linguistic features, lexical analysis, and syntactic analysis. This research took a bottom-up approach, analysing the lexical, syntactic, and textual features of patent abstracts in order to cover text analysis comprehensively. Several tools were applied in the analysis of the patent abstracts. This three-fold analysis covers sentence statistics, segmentation statistics, word frequencies, lexical densities, and readability levels. It was found that the English translated texts used short sentences more consistently than the original Chinese texts, and shorter words were also common in the translated texts. Although short sentences, short words, and high word repetition characterise easily readable texts, findings from the readability tests indicated that readers need at least 14 years of education to understand patent abstracts without difficulty.
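The kinds of measures listed in the abstract can be illustrated with a minimal sketch. The abstract does not name the tools or readability tests used, so everything here is an assumption for illustration only: the function name `text_stats` is hypothetical, the sentence and word splitting is deliberately naive and suited to English text, the type-token ratio stands in as a rough indicator of word repetition (it is not necessarily the study's lexical density measure), and the Automated Readability Index (ARI) is just one readability formula that maps word and sentence length to an approximate US school grade.

```python
import re


def text_stats(text):
    """Rough sentence, word, and readability statistics for English text.

    Illustrative only: not the tools or tests used in the study.
    """
    # Naive sentence split on ., ! or ? followed by whitespace or end of text.
    sentences = [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]
    # Naive tokenisation: runs of letters, digits, and apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    n_sent, n_words = len(sentences), len(words)
    chars = sum(len(w) for w in words)
    types = {w.lower() for w in words}  # distinct word forms

    return {
        "sentences": n_sent,
        "words": n_words,
        "mean_sentence_length": n_words / n_sent,  # words per sentence
        "mean_word_length": chars / n_words,       # characters per word
        # Lower values mean more repetition of the same words.
        "type_token_ratio": len(types) / n_words,
        # Automated Readability Index (assumed here, not named in the source).
        "ari": 4.71 * chars / n_words + 0.5 * n_words / n_sent - 21.43,
    }


if __name__ == "__main__":
    for name, value in text_stats("The cat sat. The cat ran away.").items():
        print(name, value)
```

Shorter sentences and shorter words both lower the ARI score, which matches the general relationship between those features and readability described in the abstract. A score near 14 would correspond roughly to the education level the abstract reports, although the study's own tests may be calibrated differently.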
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright (c) 2010 Yvonne Tsai